cassandra workshop - cassandra from scratch in one day
TRANSCRIPT
![Page 1: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/1.jpg)
@calonso
CASSANDRA WORKSHOP
Cassandra from scratch in one day.
![Page 2: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/2.jpg)
• Introductions
• Cassandra Core concepts
• CQL
• Data modelling
• More Cassandra Concepts
• Hardware Considerations
![Page 3: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/3.jpg)
CARLOS ALONSO
• Spanish Londoner
• MSc Salamanca University, Spain
• Software Engineer @MyDrive Solutions
• Cassandra certified developer
• Cassandra MVP 2015
• @calonso / http://mrcalonso.com
![Page 4: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/4.jpg)
MYDRIVE SOLUTIONS
• World leading driver profiling company
• Using technology and data to understand how to improve driving behaviour
• Recently acquired by the Generali Group
• @_MyDrive / http://mydrivesolutions.com
• We are hiring!!
![Page 13: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/13.jpg)
AND YOU?
• I’m a DB admin (ORACLE?) and I want to learn Cassandra
• I’ve never heard about NoSQL
• I’ve never heard about SQL
• I don’t know what I’m doing here
• I’ve heard about Cassandra and want to get my hands on it
• I’ve been using Cassandra for some tests and want to go deeper
• I’m evaluating Cassandra as a potential solution
• I’m rolling in production with Cassandra
![Page 14: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/14.jpg)
CASSANDRA
• A.k.a Alexandra or Kassandra
• Daughter of King Priam and Queen Hecuba of Troy.
• Apollo gave her the power of prophecy to seduce her. She refused, and Apollo then spat in her mouth, cursing her never to be believed.
• https://en.wikipedia.org/wiki/Cassandra
![Page 15: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/15.jpg)
CASSANDRA
• Open Source distributed database management system
• Initially developed at Facebook
• Inspired by Amazon’s Dynamo and Google BigTable papers
• Became an Apache top-level project in February 2010
• Nowadays developed by the Apache community, with DataStax as a major contributor
![Page 16: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/16.jpg)
WHY CASSANDRA?
“Cassandra is the cursed ORACLE”
![Page 18: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/18.jpg)
CASSANDRA CORE CONCEPTS
Technical introduction to Apache Cassandra
![Page 19: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/19.jpg)
NOSQL
![Page 20: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/20.jpg)
BIG DATA REQUIREMENTS
• Everywhere
• Fast
• Always available
• Consistent
+
Ingestion Consumption
![Page 21: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/21.jpg)
THE CAP THEOREM
![Page 22: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/22.jpg)
SCALING
• Vertical
• Horizontal
![Page 23: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/23.jpg)
![Page 24: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/24.jpg)
CASSANDRA
• Fast Distributed NoSQL Database
• High Availability
• Linear Scalability => Predictability
• No SPOF
• Multi-DC
• Horizontally scalable => $$$
• Not a drop-in replacement for an RDBMS
![Page 25: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/25.jpg)
CASSANDRA CLUSTER
![Page 26: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/26.jpg)
REPLICATION FACTOR
How many copies (replicas) of your data are stored
![Page 27: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/27.jpg)
CONSISTENCY LEVEL
How many replicas of your data must respond OK?
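The relationship between replication factor and consistency level comes down to simple arithmetic. The sketch below is illustrative only: the helper names `quorum` and `is_strongly_consistent` are invented for this example and belong to no Cassandra driver. It assumes the usual rules that QUORUM means a majority of the replicas, and that a read is guaranteed to see the latest write when the replicas read plus the replicas written exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (QUORUM)."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(f"RF={rf}: QUORUM needs {q} replicas to respond")
    print("QUORUM reads + QUORUM writes overlap:", is_strongly_consistent(q, q, rf))
    print("ONE reads + ONE writes overlap:", is_strongly_consistent(1, 1, rf))
```

With a replication factor of 3, QUORUM needs two replicas, so a request at that level can still succeed with one replica down.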
![Page 28: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/28.jpg)
CASSANDRA DATA MODEL
• Query-driven data model
• Column-family, non-relational DB
![Page 29: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/29.jpg)
CQL
Familiar row-column SQL-like approach.
CREATE TABLE users (id UUID, name VARCHAR, surname VARCHAR, birthdate TIMESTAMP, PRIMARY KEY (id));
INSERT INTO users (id, name, surname, birthdate) VALUES (uuid(), 'Carlos', 'Alonso', '1985-03-19');
SELECT * FROM users WHERE id = f81d4fae-7dec-11d0-a765-00a0c91e6bf6;
ALTER TABLE users ADD address VARCHAR;
![Page 30: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/30.jpg)
DISTRIBUTIONS: Apache Cassandra
• Latest features
• JIRA
• Support via mailing list & IRC
• http://cassandra.apache.org
![Page 31: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/31.jpg)
DISTRIBUTIONS: DataStax Enterprise
• Integrated Solr for Multi-DC Search
• Integrated Spark for Analytics
• Free Startup Program
• Expert support
• Focused on stable releases for enterprises
• http://www.datastax.com/products/datastax-enterprise
![Page 32: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/32.jpg)
CASSANDRA: YES
• If you need:
• No SPOF
• Linear horizontal scalability on commodity hardware
• Real-time writes
• Reliable data replication across distributed data centres
• Clearly defined schema in a NoSQL environment
![Page 33: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/33.jpg)
CASSANDRA: NO
• If you need:
• ACID transactions with rollback
• Justification for high-end software
![Page 37: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/37.jpg)
REVIEW QUESTIONS
What do consistency, availability and partition tolerance mean?
Consistency: All clients see the exact same value for the whole data set at any given point.
Availability: All clients can read and write to the system at any given point.
Partition tolerance: Whether or not the system keeps working when some nodes are disconnected from the rest of the system.
![Page 39: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/39.jpg)
REVIEW QUESTIONS
Where does Cassandra fit within the CAP Theorem?
AP: Cassandra trades off consistency in order to guarantee availability and partition tolerance, but in a configurable way, so it’s up to the developer where to sit for each query.
![Page 41: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/41.jpg)
REVIEW QUESTIONS
What are the technological roots of Cassandra?
Google BigTable and Amazon Dynamo, pulled together by developers at Facebook.
![Page 43: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/43.jpg)
REVIEW QUESTIONS
What technology does Cassandra use to model data?
CQL: Cassandra Query Language
![Page 44: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/44.jpg)
INSTALLATION
Installing, configuring and running Cassandra
![Page 45: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/45.jpg)
REQUIREMENTS
Java >= 1.7.0_25
Clocks synchronised on all nodes (NTP)
![Page 46: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/46.jpg)
INSTALLATION
http://cassandra.apache.org/download/
![Page 47: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/47.jpg)
CONFIGURATION
conf/cassandra.yaml
• cluster_name
• listen_address
• rpc_address
• commitlog_directory
• data_file_directories
• saved_caches_directory
![Page 48: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/48.jpg)
CONFIGURATION
conf/cassandra-env.sh
• MAX_HEAP_SIZE
  • if system memory < 2G => 1/2 of it
  • if between 2G and 4G => 1G
  • if > 4G => 1/4 of it but no more than 8G
• HEAP_NEWSIZE
  • 1/4 of MAX_HEAP_SIZE
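The heap sizing rule of thumb above can be written as a small function. This is only a sketch of the rule as stated on the slide, not a copy of what cassandra-env.sh does (the real script may apply further limits); the function name `default_heap_mb` is made up for illustration, and sizes are in megabytes.

```python
from typing import Tuple


def default_heap_mb(system_memory_mb: int) -> Tuple[int, int]:
    """Return (MAX_HEAP_SIZE, HEAP_NEWSIZE) in MB, following the slide's rule."""
    if system_memory_mb < 2048:
        # Less than 2G of RAM: use half of it.
        max_heap = system_memory_mb // 2
    elif system_memory_mb <= 4096:
        # Between 2G and 4G: a flat 1G.
        max_heap = 1024
    else:
        # More than 4G: a quarter of RAM, capped at 8G.
        max_heap = min(system_memory_mb // 4, 8192)
    # Young generation: a quarter of the total heap.
    heap_new = max_heap // 4
    return max_heap, heap_new


if __name__ == "__main__":
    for ram_mb in (1024, 3072, 16384, 65536):
        print(ram_mb, default_heap_mb(ram_mb))
```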
![Page 49: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/49.jpg)
START/STOP
• Background: start with sudo bin/cassandra; stop with ps aux | grep cassandra, then sudo kill <pid>
• Foreground: start with sudo bin/cassandra -f; stop with Ctrl-C
• Service: start with sudo service cassandra start; stop with sudo service cassandra stop
![Page 50: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/50.jpg)
START/STOP
The node is up once the log shows: Node localhost/127.0.0.1 state jump to normal
![Page 52: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/52.jpg)
REVIEW QUESTIONS
Which setting determines a node’s cluster? Where is it configured?
cluster_name: In conf/cassandra.yaml
![Page 54: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/54.jpg)
REVIEW QUESTIONS
How would you stop a Cassandra instance running in the background on a Unix-based machine?
1. Get the PID: ps aux | grep cassandra
2. Kill the process: kill <pid>
![Page 56: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/56.jpg)
REVIEW QUESTIONS
Which settings would you adjust to tune how much memory Cassandra uses?
In which file?
MAX_HEAP_SIZE in conf/cassandra-env.sh
![Page 57: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/57.jpg)
BASIC TOOLS
The tools required for basic Cassandra management
![Page 58: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/58.jpg)
NODETOOL
The command-line Swiss Army knife.
![Page 59: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/59.jpg)
NODETOOL
status: displays cluster state, load, host ID and token
info: displays node memory use, disk load, uptime …
ring: displays node status and cluster ring state
help: displays all possible commands and description
![Page 60: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/60.jpg)
CQLSH
Our data management and first exploration tool
![Page 61: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/61.jpg)
CQLSH
DESC[RIBE]: shows information about the object passed as its argument
SOURCE: executes a file containing CQL statements
TRACING: enables/disables the tracing mode
help: shows available cqlsh + CQL commands
SELECT, ALTER, INSERT, …
![Page 62: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/62.jpg)
CASSANDRA-STRESS
Our tool to assess performance
![Page 63: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/63.jpg)
CASSANDRA-STRESS
read: to execute a read-only workload
mixed: executes mixed workload
user: user defined schema and workloads
write: to execute a write-only workload
![Page 64: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/64.jpg)
CCM
One tool to manage them all.
![Page 65: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/65.jpg)
CCM
github: pcmanus/ccm
Prerequisites
• Python 2.7+
• PyYAML
• Six
• Ant
• Loopback IP aliases (Mac OS)
Limitations
• Testing tool
• Communicates with localhost only
![Page 66: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/66.jpg)
CCM
create: downloads, compiles and builds cluster
start/stop: starts/stops all nodes in cluster
status: shows current cluster status
<node> <command>: runs command connecting to node, e.g. ccm node1 cqlsh
![Page 68: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/68.jpg)
REVIEW QUESTIONS
Which tool/command would I use to check whether a particular node of my cluster is up or down?
nodetool status
![Page 70: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/70.jpg)
REVIEW QUESTIONS
Name and describe two non-CQL commands allowed in cqlsh.
CAPTURE, COPY, DESCRIBE, EXPAND, PAGING, SOURCE
CONSISTENCY, DESC, EXIT, HELP, SHOW, TRACING
![Page 72: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/72.jpg)
REVIEW QUESTIONS
Can I manage my production cluster remotely using CCM?
No, that’s CCM’s biggest limitation: it only connects to localhost.
![Page 73: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/73.jpg)
REVIEW QUESTIONS
What happens if, in a cqlsh session, I type DESCRIBE KEY and press TAB?
![Page 74: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/74.jpg)
![Page 75: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/75.jpg)
INTERNAL ARCHITECTURE
Internal processes that make Cassandra work
![Page 76: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/76.jpg)
CLUSTER COMPONENTS
• Column: The smallest key-value pair.
• Row: Collection of columns. Identified by a row key.
• Partition: Bucket containing several rows. Identified by a token.
• Node: A Cassandra instance. Contains a token range.
• Rack: A logical set of nodes.
• Data Center: A logical set of racks.
• Cluster: The full set of nodes. Covers a whole token ring.
![Page 77: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/77.jpg)
CONSISTENT HASHING
Which node holds this data?
![Page 78: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/78.jpg)
“Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value.”
– Wikipedia
![Page 79: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/79.jpg)
CONSISTENT HASHING
• Data is stored in partitions, each identified by a unique token within the range -2^63 to 2^63 - 1
• Each node owns a range of tokens, and therefore the partitions that fall within it.
![Page 80: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/80.jpg)
THE PARTITIONER
• System running on each node that computes hashes through a hash function.
• Various partitioners available.
• Default is Murmur3
• All nodes MUST use the same!!!!
[Diagram: hash function mapping “Carlos” to 185664; other sample outputs shown: 1773456738847666528349, -894763734895827651234]
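To make the token ring idea concrete, here is a minimal sketch of how a partition key can be hashed to a token and mapped to the node that owns it. Several things here are assumptions for illustration: the names `token_for` and `TokenRing` are hypothetical, the node tokens are invented, and MD5 truncated to 64 bits stands in for Murmur3, since the Python standard library has no Murmur3 implementation. The tokens it produces will therefore not match a real cluster.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Hash a key to a signed 64-bit token (MD5 here as a stand-in for Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


class TokenRing:
    """Toy ring: each node owns the range (previous token, own token]."""

    def __init__(self, node_tokens: dict):
        # Sort (token, node) pairs so the ring can be binary-searched.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner(self, token: int) -> str:
        # First node whose token is >= the given token; wrap around at the end.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    ring = TokenRing({"node1": -2**62, "node2": 0, "node3": 2**62})
    token = token_for("Carlos")
    print(f"'Carlos' -> token {token} -> {ring.owner(token)}")
```

Because every node runs the same hash function, any node can work out which node owns a given key, which is why all nodes must use the same partitioner.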
![Page 81: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/81.jpg)
VNODES
![Page 83: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/83.jpg)
![Page 84: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/84.jpg)
REQUEST COORDINATION
How are client requests coordinated?
![Page 85: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/85.jpg)
THE COORDINATOR
• The node designated to coordinate a particular query.
• ANY node can coordinate ANY request.
• No SPOF: One of Cassandra’s main principles.
• The driver chooses which node will coordinate
![Page 92: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/92.jpg)
A FULL EXAMPLE
CREATE KEYSPACE test WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
CREATE TABLE users (id UUID, name VARCHAR, surname VARCHAR, birthdate TIMESTAMP, PRIMARY KEY (id));
CONSISTENCY QUORUM;
SELECT * FROM users WHERE id = f81d4fae-7dec-11d0-a765-00a0c91e6bf6;
[Diagram: Client, Driver and Partitioner; f81d4fae-… hashes to token 834]
![Page 93: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/93.jpg)
REPLICATION
How many copies of your data?
![Page 94: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/94.jpg)
WHY REPLICATION?
• Disaster recovery
• Bring data closer to users (to reduce latencies)
• Workload segregation (analytical vs transactional)
![Page 95: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/95.jpg)
REPLICATION
Defined at keyspace level
CREATE KEYSPACE <my_keyspace> WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 2 };
CREATE KEYSPACE <my_keyspace> WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'dc-east': 2, 'dc-west': 3 };
![Page 99: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/99.jpg)
@calonso
SIMPLESTRATEGY
DriverClient
CREATE KEYSPACE <my_keyspace> WITH REPLICATION = { “class”: “SimpleStrategy”, “replication_factor”: 3 };
Token: 834
![Page 100: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/100.jpg)
@calonso
NETWORKTOPOLOGYSTRATEGY
CREATE KEYSPACE <my_keyspace> WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'dc-east': 2, 'dc-west': 3 };
Diagram labels: Driver, Client, Token: 834
dc-east: rack-1, rack-2
dc-west: rack-1, rack-2
![Page 106: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/106.jpg)
@calonso
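WHAT IF A NODE OR DC IS DOWN?
Hinted Handoff to the rescue!
Diagram: Driver, Client, and a replica that is down (X). The coordinator keeps a hint for the write to token 834 and replays it once the replica is back up.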
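A toy model of hinted handoff, as a sketch only. The `Node` and `Coordinator` classes are hypothetical and keep everything in memory; real Cassandra persists hints and expires them after a configurable window.

```python
class Node:
    """A replica with an up/down flag and an in-memory key-value store."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """Writes to live replicas and keeps hints for the ones that are down."""

    def __init__(self):
        self.hints = []  # list of (target node, key, value)

    def write(self, replicas, key, value):
        acks = 0
        for node in replicas:
            if node.up:
                node.data[key] = value
                acks += 1
            else:
                # Replica unreachable: remember the write for later.
                self.hints.append((node, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to any target that is back up."""
        remaining = []
        for node, key, value in self.hints:
            if node.up:
                node.data[key] = value
            else:
                remaining.append((node, key, value))
        self.hints = remaining


replicas = [Node("n1"), Node("n2"), Node("n3")]
replicas[2].up = False
coordinator = Coordinator()
acks = coordinator.write(replicas, 834, "some row")
replicas[2].up = True
coordinator.replay_hints()
```

After the replay, the replica that was down holds the write it missed, and the hint list is empty.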
![Page 116: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/116.jpg)
CONSISTENCY
How much consistency do we want?
![Page 117: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/117.jpg)
@calonso
CONSISTENCY LEVEL
How many nodes must acknowledge a write for the write to succeed?
How many nodes must send their data for the read to succeed?
![Page 124: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/124.jpg)
@calonso
CONSISTENCY LEVEL
RF = 3
ANY (Only writes)
ONE, TWO, THREE
QUORUM = floor(RF / 2 + 1)
ALL
LOCAL_ONE
LOCAL_QUORUM
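The QUORUM formula above is plain integer arithmetic. A minimal sketch (the function name `quorum` is just for illustration):

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 5):
    # rf - quorum(rf) is how many replicas can be down while QUORUM still succeeds.
    print(f"RF={rf} quorum={quorum(rf)} tolerated_failures={rf - quorum(rf)}")
```

Note that RF = 2 gives a quorum of 2, so QUORUM tolerates no replica failure there; RF = 3 is the smallest value where it tolerates one.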
![Page 125: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/125.jpg)
@calonso
CONSISTENCY LEVEL
Availability / Partition tolerance ←→ Consistency
![Page 126: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/126.jpg)
@calonso
DEMO
Play with RFs, CLs and hints
![Page 127: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/127.jpg)
REPAIR
Strengthening consistency.
![Page 128: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/128.jpg)
@calonso
DIGEST QUERY
In consistent reads, only one node is asked for the full data; the others are asked only for a digest of it.
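A sketch of the digest comparison, assuming MD5 as a stand-in hash and hypothetical helper names; Cassandra's actual digest format is not reproduced here.

```python
import hashlib


def digest(value):
    """Hash of a replica's answer; much cheaper to ship than the full data."""
    return hashlib.md5(repr(value).encode("utf-8")).hexdigest()


def replicas_agree(full_response, digests):
    """True when every digest matches the one replica that sent full data."""
    expected = digest(full_response)
    return all(d == expected for d in digests)


print(replicas_agree("Madrid", [digest("Madrid"), digest("Madrid")]))
print(replicas_agree("Madrid", [digest("Madrid"), digest("London")]))
```

A mismatch tells the coordinator that the replicas disagree and that the full data has to be fetched and reconciled.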
![Page 133: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/133.jpg)
@calonso
READ REPAIR
What if nodes disagree?
CL >= QUORUM
SELECT city FROM …
Madrid: 123
Salamanca: 125
London: 150
Result: London (the value with the most recent write timestamp wins and is written back to the out-of-date replicas)
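The reconciliation step can be sketched as "latest timestamp wins". This assumes the numbers on the slide are write timestamps; the function name `resolve` and the node names are invented for the example.

```python
def resolve(responses):
    """Pick the most recent value and list the replicas that need repairing.

    `responses` maps a node name to a (value, write_timestamp) pair.
    """
    winner_value, winner_ts = max(responses.values(), key=lambda r: r[1])
    stale = sorted(node for node, (_, ts) in responses.items() if ts < winner_ts)
    return winner_value, stale


responses = {
    "node-1": ("Madrid", 123),
    "node-2": ("Salamanca", 125),
    "node-3": ("London", 150),
}
value, to_repair = resolve(responses)
print(value, to_repair)
```

The client receives the winning value, and the stale replicas are the ones a read repair would overwrite.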
![Page 134: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/134.jpg)
@calonso
READ REPAIR
And if CL < QUORUM?
The coordinator will issue a read repair with a probability given by the table's read_repair_chance property.
CREATE TABLE users ( …) WITH read_repair_chance = 0.1;
![Page 135: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/135.jpg)
@calonso
MANUAL REPAIR
Last defense against data entropy.
The nodetool repair command makes all data on a node consistent with the most recent replicas in the cluster.
--partitioner-range (-pr): option to restrict the repair to the node’s primary range only
![Page 136: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/136.jpg)
@calonso
MANUAL REPAIR
• Run nodetool repair:
    • When recovering a failed node
    • When increasing RF
    • Periodically on every node
        • Sequentially
        • Once a week
![Page 137: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/137.jpg)
GOSSIP
Nodes gossip among themselves
![Page 138: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/138.jpg)
@calonso
GOSSIP
• Runs every second
• Each node gossips with up to three other nodes
• Exchanges a heartbeat plus versioned information about the whole cluster.
![Page 139: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/139.jpg)
@calonso
GOSSIP
• Provide a consistent list of seeds
• At least one per DC
Nodes prefer (10%) to gossip with their seeds
![Page 140: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/140.jpg)
@calonso
SNITCH
• Allows the node to know its rack and data center topology.
• Enables replication across different racks
![Page 141: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/141.jpg)
@calonso
SNITCH
• GossipingPropertyFileSnitch: config from cassandra-rackdc.properties and propagated by gossiping
• Ec2Snitch: Amazon EC2 aware. Single region. Single DC. Availability zone = Rack
• Ec2MultiRegionSnitch: Multiple regions. Region = DC.
• …
![Page 143: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/143.jpg)
@calonso
REVIEW QUESTIONS
Describe the relationship of nodes, racks, data centers and clusters.
node ⊂ rack ⊂ data center ⊂ cluster (nodes are grouped into racks, racks into data centers, and data centers into a cluster)
![Page 145: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/145.jpg)
@calonso
REVIEW QUESTIONS
What is the function of the partitioner?
The partitioner’s function is to hash keys. Then the rest of the cluster uses that output to determine where the data should live.
![Page 147: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/147.jpg)
@calonso
REVIEW QUESTIONS
Can a node hold a partition with a token outside its primary range?
Yes, if it’s replicating data for some other node, or if it’s holding a hint.
![Page 149: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/149.jpg)
@calonso
REVIEW QUESTIONS
In a 3-node cluster with RF = 2, how much of the total data volume does each node own?
About 67% (RF / number of nodes = 2/3)
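The arithmetic behind that answer, as a small sketch. It assumes token ranges are evenly balanced across nodes; the function name `ownership` is illustrative.

```python
from fractions import Fraction


def ownership(node_count, replication_factor):
    """Fraction of the total data each node holds, assuming balanced tokens."""
    # A node can never hold more than one full copy of the data.
    return Fraction(min(replication_factor, node_count), node_count)


share = ownership(3, 2)
print(share, f"{float(share):.1%}")
```

With 3 nodes and RF = 3 the same function gives 1, meaning every node holds the whole data set.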
![Page 151: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/151.jpg)
@calonso
REVIEW QUESTIONS
What is the function of the nodetool repair operation?
Synchronising replicas. Ensuring the node’s data is the most recent.
![Page 153: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/153.jpg)
@calonso
REVIEW QUESTIONS
What is a remote coordinator?
When using multiple DCs and NetworkTopologyStrategy, the coordinator sends the data to a single node in the second DC. That node coordinates the request within its own DC: it is the remote coordinator.
This avoids transmitting all the data to every replica from DC to DC.
![Page 155: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/155.jpg)
@calonso
REVIEW QUESTIONS
How could RF and CL be tuned to ensure immediate consistency?
• RF >= 3
• CL Write = ONE and CL Read = ALL
• CL Write = ALL and CL Read = ONE
• CL Write = QUORUM and CL Read = QUORUM
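All three combinations satisfy the same rule: the read and write replica sets must overlap, i.e. write replicas + read replicas > RF. A sketch that checks this, with hypothetical helper names and only the non-local consistency levels:

```python
def required_replicas(level, replication_factor):
    """Replicas that must respond for a given consistency level."""
    table = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return table[level]


def is_immediately_consistent(write_cl, read_cl, replication_factor):
    """True when every read is guaranteed to see the latest acknowledged write."""
    writes = required_replicas(write_cl, replication_factor)
    reads = required_replicas(read_cl, replication_factor)
    return writes + reads > replication_factor


for write_cl, read_cl in [("ONE", "ALL"), ("ALL", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ONE")]:
    print(write_cl, read_cl, is_immediately_consistent(write_cl, read_cl, 3))
```

ONE/ONE fails the check with RF = 3, which is why it only gives eventual consistency.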
![Page 156: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/156.jpg)
@calonso
CQL
The Cassandra Query Language
![Page 157: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/157.jpg)
@calonso
PHYSICAL DATA STRUCTURE
![Page 158: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/158.jpg)
DDL + DML
Defining our data shape and actually using it
![Page 159: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/159.jpg)
@calonso
DEV CENTER
![Page 160: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/160.jpg)
@calonso
DDL
CREATE KEYSPACE musicdb WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
DROP KEYSPACE musicdb;
USE musicdb;
![Page 162: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/162.jpg)
@calonso
PRACTICE TIME!
We need to build a system for an online electronic books reading site.
CREATE KEYSPACE e_library WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
![Page 163: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/163.jpg)
@calonso
DDL
CREATE TABLE performer ( name VARCHAR, type VARCHAR, country VARCHAR, style VARCHAR, founded INT, born TIMESTAMP, died TIMESTAMP, PRIMARY KEY (name) );
![Page 164: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/164.jpg)
@calonso
PRIMARY KEY
PARTITION KEY + CLUSTERING COLUMN(S)
![Page 165: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/165.jpg)
@calonso
PRIMARY KEY
• Simple partition key, no clustering columns:
    • PRIMARY KEY (name)
• Composite partition key, no clustering columns:
    • PRIMARY KEY ((album_title, year))
• Simple partition key and clustering columns:
    • PRIMARY KEY (album_title, number)
• Composite partition key and clustering columns:
    • PRIMARY KEY ((album_title, year), number)
![Page 166: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/166.jpg)
@calonso
PRIMARY KEYS
CREATE TABLE tracks_by_album ( album_title VARCHAR, year INT, performer VARCHAR STATIC, genre VARCHAR STATIC, number INT, track_title VARCHAR, PRIMARY KEY ((album_title, year), number));
CREATE TABLE albums_by_track ( track_title VARCHAR, performer VARCHAR, year INT, album_title VARCHAR, PRIMARY KEY ( track_title, performer, year, album_title));
![Page 167: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/167.jpg)
| CQL Type | Constants | Description |
|---|---|---|
| ASCII | strings | US-ASCII character strings |
| BIGINT | integers | 64-bit signed long |
| BLOB | blobs | Arbitrary bytes (no validation), as hexadecimal |
| BOOLEAN | booleans | true or false |
| COUNTER | integers | Distributed counter value (64-bit long) |
| DECIMAL | integers, floats | Variable precision decimal |
| DOUBLE | integers, floats | 64-bit IEEE-754 floating point |
| FLOAT | integers, floats | 32-bit IEEE-754 floating point |
| INET | strings | IP address string in IPv4 or IPv6 format |
| INT | integers | 32-bit signed integer |
| LIST | n/a | A collection of one or more ordered elements |
| MAP | n/a | A JSON style array of literals { literal: literal, literal: literal, … } |
| SET | n/a | A collection of one or more elements |
| TEXT | strings | UTF-8 encoded text |
| TIMESTAMP | integers, strings | Date and time, as milliseconds since the epoch |
| TUPLE | n/a | Up to 32k fields |
| UUID | uuids | Standard UUID |
| VARCHAR | strings | UTF-8 encoded string |
| VARINT | integers | Arbitrary precision integer |
| TIMEUUID | uuids | Type 1 UUID |
![Page 168: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/168.jpg)
@calonso
INSERT
• CQL INSERTs are:
    • Atomic: either all the values are inserted or none are
    • Isolated: two inserts on the exact same PK happen one after the other, with no mixed values.
INSERT INTO albums_by_performer (performer, year, title, genre) VALUES ('The Beatles', 1966, 'Revolver', 'Rock');
![Page 169: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/169.jpg)
@calonso
UPDATE
• Primary key columns cannot be changed.
• The full primary key is required as predicate.
• CQL UPDATEs are:
    • Atomic: either all the values are updated or none are
    • Isolated: two updates on the exact same PK happen one after the other, with no mixed values.
UPDATE albums_by_performer SET genre = 'Rock' WHERE performer = 'The Beatles' AND year = 1966 AND title = 'Revolver';
![Page 170: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/170.jpg)
@calonso
UPSERT
INSERT INTO albums_by_performer (performer, year, title, genre) VALUES ('The Beatles', 1966, 'Revolver', 'Rock');
==
UPDATE albums_by_performer SET genre = 'Rock' WHERE performer = 'The Beatles' AND year = 1966 AND title = 'Revolver';
![Page 171: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/171.jpg)
@calonso
LWT (Lightweight Transactions)
• Use at your own discretion:
    • Cassandra uses the Paxos algorithm to determine whether the record exists or not.
    • In total, a 6x performance penalty.
INSERT INTO albums_by_performer (performer, year, title, genre) VALUES ('The Beatles', 1966, 'Revolver', 'Rock') IF NOT EXISTS;
![Page 172: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/172.jpg)
@calonso
PRACTICE TIME!
We need to design a system that holds users. Users will have a name, an ID card (unique), a list of phones (home, mobile and work), a birth date and an email address.
NOTE: As we haven’t studied SELECT, use SELECT * FROM <table name>; to inspect your data.
![Page 173: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/173.jpg)
@calonso
PRACTICE TIME!
CREATE TABLE users ( ID VARCHAR PRIMARY KEY, name VARCHAR, home_phone VARCHAR, work_phone VARCHAR, mobile_phone VARCHAR, email VARCHAR, birth_date TIMESTAMP
);
![Page 174: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/174.jpg)
@calonso
MORE DDL
ALTER TABLE album ADD cover_image VARCHAR;
ALTER TABLE album DROP cover_image;
ALTER TABLE album ALTER cover_image TYPE BLOB;
![Page 175: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/175.jpg)
@calonso
MORE DDL
CREATE TABLE albums_by_genre ( genre VARCHAR, performer VARCHAR, year INT, album_title VARCHAR, PRIMARY KEY (genre, performer, year, album_title) ) WITH CLUSTERING ORDER BY (performer ASC, year DESC, album_title ASC);
![Page 176: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/176.jpg)
@calonso
SECONDARY INDEXES
• Tables are indexed on the columns in the PK
    • Search on a partition key is very efficient
    • Search on a partition key and clustering column is very efficient
    • Search on other things is not supported
• Secondary indexes allow other columns to be indexed and queried.
    • One index per column
![Page 177: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/177.jpg)
@calonso
SECONDARY INDEXES
CREATE TABLE performer ( name VARCHAR, type VARCHAR, country VARCHAR, style VARCHAR, founded INT, born TIMESTAMP, died TIMESTAMP, PRIMARY KEY (name) );
CREATE INDEX performers_by_style ON performer (style);
DROP INDEX performers_by_style;
![Page 178: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/178.jpg)
@calonso
SECONDARY INDEXES
• Same recommendations as for RDBMS
• Use indexes on low cardinality fields
• Beware of the write overhead
• Every node indexes its local data, therefore a read hits all nodes!!
• Don’t use them. Use lookup tables instead.
![Page 180: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/180.jpg)
@calonso
PRACTICE TIME!
We need to query the users by name.
CREATE INDEX users_by_name ON users (name);
![Page 181: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/181.jpg)
@calonso
UUID
• Type 4 UUID
• Our way to ensure uniqueness in a distributed system.
7ffa4040-9132-4e0b-b04f-610e869d8717
![Page 184: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/184.jpg)
@calonso
PRACTICE TIME!
Our system has another entity: Books. Books have a title and an author. There is no guarantee that either of them, or even their combination, is unique.
CREATE TABLE books ( uid TIMEUUID PRIMARY KEY, title VARCHAR, author VARCHAR);
![Page 185: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/185.jpg)
@calonso
TIMEUUID
• Timestamp + UUID
• Type 1 UUID
• Generated with CQL now() function
• Can extract the Timestamp with CQL dateof() function
c9cc9e60-711c-11e5-9d70-feff819cdc9f
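Outside CQL, the timestamp embedded in a TIMEUUID can be recovered with Python's standard `uuid` module. This is a sketch of the idea rather than what `dateof()` does internally; the helper name `timeuuid_to_datetime` is made up for the example.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count from the start of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(value):
    """Return the UTC datetime embedded in a version 1 (time-based) UUID."""
    parsed = uuid.UUID(value)
    if parsed.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    # parsed.time is the number of 100-nanosecond intervals since the epoch above.
    return GREGORIAN_EPOCH + timedelta(microseconds=parsed.time // 10)


print(timeuuid_to_datetime("c9cc9e60-711c-11e5-9d70-feff819cdc9f"))
```

Passing a type 4 UUID such as 7ffa4040-9132-4e0b-b04f-610e869d8717 raises an error, because random UUIDs carry no timestamp.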
![Page 187: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/187.jpg)
@calonso
TIMEUUID
CREATE TABLE track_ratings_by_user ( user UUID, activity TIMEUUID, rating INT, album_title VARCHAR, album_year INT, track_title VARCHAR, PRIMARY KEY (user, activity) ) WITH CLUSTERING ORDER BY (activity DESC);
![Page 188: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/188.jpg)
@calonso
TTL
• Time To Live for columns, specified in seconds.
• After the TTL expires, the column is marked with a tombstone.
INSERT INTO albums_by_performer (performer, year, title, genre) VALUES ('The Beatles', 1966, 'Revolver', 'Rock') USING TTL 30;
![Page 190: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/190.jpg)
@calonso
PRACTICE TIME!
We are in the BigData era and therefore we want to measure absolutely everything our users do in our portal. Actions will be defined by a type (string) and a receiver (int).
CREATE TABLE user_action ( user_ID VARCHAR, time TIMESTAMP, type VARCHAR, receiver INT, PRIMARY KEY(user_ID, time) );
![Page 191: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/191.jpg)
@calonso
DELETE
• A whole partition:
    • DELETE FROM <table> WHERE <partition_key> = value;
• A row:
    • DELETE FROM <table> WHERE <primary key> = value;
• A column:
    • DELETE <column name> FROM <table> WHERE <primary key> = value;
• Deleted things are marked with a tombstone, not actually removed.
![Page 192: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/192.jpg)
@calonso
TRUNCATE
TRUNCATE albums_by_performer;
![Page 193: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/193.jpg)
@calonso
COUNTERS
• Implement distributed counters
• The value can only be updated (incremented or decremented), never set
• Cannot be part of the PK
• If present on a table, all non-counter columns in the same table must be part of the PK
CREATE TABLE ratings_by_track (
  album_title VARCHAR,
  album_year INT,
  track_title VARCHAR,
  num_ratings COUNTER,
  sum_ratings COUNTER,
  PRIMARY KEY (album_title, album_year, track_title)
);
![Page 194: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/194.jpg)
@calonso
COUNTERS
• Performance considerations
• Each update requires a read before the write
• Accuracy considerations
• Counter updates are not idempotent, so retrying after a false failure (the write was applied but reported as failed) leads to a wrong value.
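The accuracy problem can be reproduced with a small Python sketch. `FlakyCounterStore` is a hypothetical stand-in for a node that applies an increment but loses the acknowledgement, so the client believes the write failed and retries.

```python
class FlakyCounterStore:
    """Hypothetical store whose first acknowledgement can be lost."""

    def __init__(self, lose_first_ack=True):
        self.value = 0
        self._lose_next_ack = lose_first_ack

    def increment(self, delta):
        self.value += delta  # the write is applied on the server side
        if self._lose_next_ack:
            self._lose_next_ack = False
            # The client only sees a timeout: a "false failure".
            raise TimeoutError("acknowledgement lost")


def increment_with_retry(store, delta, attempts=2):
    for _ in range(attempts):
        try:
            store.increment(delta)
            return
        except TimeoutError:
            continue  # retrying applies the same delta a second time


store = FlakyCounterStore()
increment_with_retry(store, 1)
print(store.value)  # 2, although the client asked for a single +1
```

Setting a value (`x = 5`) can be retried safely because repeating it changes nothing; adding to a value (`x = x + 1`) cannot.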
![Page 195: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/195.jpg)
@calonso
COUNTERS
• No INSERT
• No value set, only update.
CREATE TABLE stats (
  performer VARCHAR,
  albums COUNTER,
  concerts COUNTER,
  PRIMARY KEY (performer)
);
UPDATE stats SET albums = albums + 1, concerts = concerts + 10
WHERE performer = 'The Beatles';
![Page 197: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/197.jpg)
@calonso
PRACTICE TIME!
We need to keep track of the number of times a specific book has been read by a specific user.
CREATE TABLE books_read_by_user (
  book_uid UUID,
  user_ID VARCHAR,
  times COUNTER,
  PRIMARY KEY (book_uid, user_ID)
);
![Page 198: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/198.jpg)
@calonso
COLLECTIONS
• Set: Uniqueness
• email_addresses SET<VARCHAR>
• List: Order
• email_addresses LIST<VARCHAR>
• Map: Key-Value pairs
• email_addresses MAP<VARCHAR, VARCHAR>
Our users can have several email addresses…
![Page 199: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/199.jpg)
@calonso
SETS
• Insert:
• INSERT INTO band (name, members) VALUES ('The Beatles', {'John', 'Paul', 'George'});
• Union (duplicates deletion managed transparently):
• UPDATE band SET members = members + {'John', 'Ringo'} WHERE name = 'The Beatles';
• Difference:
• UPDATE band SET members = members - {'Ringo'} WHERE name = 'The Beatles';
• Deletion:
• DELETE members FROM band WHERE name = 'The Beatles';
![Page 200: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/200.jpg)
@calonso
LISTS
• Insert:
• INSERT INTO song (name, songwriters) VALUES ('Hold your hand', ['John', 'Paul']);
• Append:
• UPDATE song SET songwriters = songwriters + ['Paul'] WHERE name = …;
CREATE TABLE song (
  name VARCHAR,
  songwriters LIST<VARCHAR>,
  PRIMARY KEY (name)
);
![Page 201: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/201.jpg)
@calonso
LISTS
• Prepend:
• UPDATE song SET songwriters = ['Paul'] + songwriters WHERE name = …;
• Update:
• UPDATE song SET songwriters[1] = 'Jonathan' WHERE name = …;
• Subtract:
• UPDATE song SET songwriters = songwriters - ['Jonathan'] WHERE name = …;
• Delete:
• DELETE songwriters[0] FROM song WHERE name = …;
![Page 202: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/202.jpg)
@calonso
MAPS
• Insert:
• INSERT INTO album (title, tracks) VALUES ('Revolver', { 1: 'Taxman', 2: 'Eleanor' });
• Update:
• UPDATE album SET tracks[3] = 'Yellow Submarine' WHERE title = …;
• Delete:
• DELETE tracks[3] FROM album WHERE title = …;
CREATE TABLE album (
  title VARCHAR,
  tracks MAP<INT, VARCHAR>,
  PRIMARY KEY (title)
);
![Page 204: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/204.jpg)
@calonso
PRACTICE TIME!
Our users can define a set of preferences in the portal: TimeZone, Language and Currency
ALTER TABLE users ADD preferences MAP<VARCHAR, VARCHAR>;
![Page 205: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/205.jpg)
@calonso
USER DEFINED TYPES
CREATE TYPE track (
  album_title VARCHAR,
  album_year INT,
  track_title VARCHAR
);
CREATE TABLE track_ratings_by_user (
  user UUID,
  activity TIMEUUID,
  rating INT,
  song FROZEN<track>,
  PRIMARY KEY (user, activity)
) WITH CLUSTERING ORDER BY (activity DESC);
FROZEN: the value has to be written as a whole; a single field (e.g. album_year) cannot be updated on its own.
![Page 206: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/206.jpg)
@calonso
USER DEFINED TYPES
CREATE TYPE track (
  album_title VARCHAR,
  album_year INT,
  track_title VARCHAR
);
CREATE TABLE track_ratings_by_user (
  user UUID,
  activity TIMEUUID,
  rating INT,
  song FROZEN<track>,
  PRIMARY KEY (user, activity)
) WITH CLUSTERING ORDER BY (activity DESC);
INSERT INTO track_ratings_by_user (user, activity, rating, song)
VALUES (6ed4f220…, now(), 10, { album_title: 'Let it be', album_year: 1970, track_title: 'Let it be' });
![Page 207: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/207.jpg)
@calonso
USER DEFINED TYPES
• Update:
• UPDATE track_ratings_by_user SET song = { album_title: 'Let it be', album_year: 1970, track_title: 'Two of us' } WHERE user = … AND activity = …;
• Delete:
• DELETE song FROM track_ratings_by_user WHERE user = … AND activity = …;
![Page 208: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/208.jpg)
@calonso
TUPLES
CREATE TABLE user (
  id UUID PRIMARY KEY,
  email TEXT,
  name TEXT,
  preferences SET<TEXT>,
  equalizer FROZEN<TUPLE<FLOAT, FLOAT, FLOAT, INT, VARCHAR>>
);
INSERT INTO user (id, equalizer) VALUES (6ed4f220…, (3.0, 1.1, 5.1, 3, 'Pop-Rock'));
![Page 210: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/210.jpg)
@calonso
PRACTICE TIME!
Our users can have an e-reader, defined by brand and model.
CREATE TYPE e_reader (
  brand VARCHAR,
  model VARCHAR
);
ALTER TABLE users ADD reader FROZEN <e_reader>;
![Page 211: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/211.jpg)
BATCH
Grouping and atomising queries.
![Page 212: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/212.jpg)
@calonso
BATCH
• Combines multiple INSERT, UPDATE and DELETE operations into a single logical operation:
• Saves on client - coordinator communication
• Atomic: if one operation succeeds, all of them will
• No isolation: other clients can read and write data affected by a partially applied batch.
![Page 213: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/213.jpg)
@calonso
BATCH
• All modified cells share the same timestamp, so when read they look atomic => No order guarantee between the statements!!
• Don't use BATCHES with several operations on the same PK, as in this example:
BEGIN BATCH
  DELETE FROM albums WHERE name = 'Let it be';
  INSERT INTO albums (name) VALUES ('Let it be');
APPLY BATCH;
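Why the order inside the batch does not help can be sketched in Python. This is a toy reconciliation function, not Cassandra's implementation; the tie-break rules (a delete beats a live value, and between two live values the greater one wins) are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str]  # None stands for a tombstone (a delete)


def reconcile(a: Cell, b: Cell) -> Cell:
    # The cell with the higher timestamp always wins.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Assumed tie-break for this sketch: a delete beats a live value.
    if a.value is None or b.value is None:
        return a if a.value is None else b
    # Assumed tie-break between two live values: the greater value wins.
    return a if a.value >= b.value else b


# Both statements of the batch carry the same timestamp.
delete = Cell(timestamp=100, value=None)
insert = Cell(timestamp=100, value="Let it be")

print(reconcile(delete, insert).value)
print(reconcile(insert, delete).value)
```

Both calls return the tombstone, whichever statement came first in the batch, so the "delete then insert" intent is lost.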
![Page 214: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/214.jpg)
@calonso
BATCH + LWT
• The whole BATCH will only run if conditions for all LWT are met.
• All operations in the BATCH will run sequentially.
BEGIN BATCH
  UPDATE user SET lock = true IF lock = false;
  DELETE FROM albums WHERE name = 'Let it be';
  INSERT INTO albums (name) VALUES ('Let it be');
  UPDATE user SET lock = false;
APPLY BATCH;
![Page 215: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/215.jpg)
![Page 216: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/216.jpg)
@calonso
ROLLBACK
• Not necessary
• RDBMS cannot know, at the beginning of a transaction, if all queries will be able to succeed
• Cassandra can, so if they won't, the batch doesn't even start
![Page 217: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/217.jpg)
SELECT
Searching for data
![Page 218: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/218.jpg)
@calonso
SELECT
• All rows:
• SELECT * FROM album;
• Specific columns:
• SELECT performer, title, year FROM album;
• Specific field from a UDT:
• SELECT performer.lastname FROM album;
• Count:
• SELECT COUNT(*) FROM album;
![Page 219: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/219.jpg)
@calonso
WHERE
• Equality matches:
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND year = 1966;
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND year = 1966 AND number = 6;
• IN:
• Only applicable to the last column restricted in the WHERE clause
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND year = 1966 AND number IN (2, 3, 4);
![Page 220: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/220.jpg)
@calonso
WHERE
• Range search:
• Only on clustering columns.
• SELECT * FROM tracks_by_album WHERE album_title = 'Revolver' AND year = 1966 AND number >= 2 AND number < 6;
• ALLOW FILTERING:
• Allows scanning through all partitions => potentially very time consuming
• SELECT * FROM tracks_by_album WHERE number = 2 ALLOW FILTERING;
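The difference between a range search inside one partition and ALLOW FILTERING can be sketched in Python. The dictionary layout below (partition key mapped to rows sorted by clustering column) is a simplified assumption, and the track titles are placeholders.

```python
from bisect import bisect_left

# Hypothetical in-memory layout: partition key -> rows sorted by clustering column.
tracks_by_album = {
    ("Revolver", 1966): [(1, "track 1"), (2, "track 2"), (3, "track 3"), (6, "track 6")],
    ("Let it be", 1970): [(1, "track 1"), (6, "track 6")],
}


def range_in_partition(table, partition_key, low, high):
    """Rows with low <= number < high, read from a single partition."""
    rows = table.get(partition_key, [])
    numbers = [number for number, _ in rows]
    # Rows are sorted, so the range is one contiguous slice.
    return rows[bisect_left(numbers, low):bisect_left(numbers, high)]


def filter_all_partitions(table, number):
    """Rough equivalent of ALLOW FILTERING: every partition is visited."""
    return [
        (key, row)
        for key, rows in table.items()
        for row in rows
        if row[0] == number
    ]


print(range_in_partition(tracks_by_album, ("Revolver", 1966), 2, 6))
print(filter_all_partitions(tracks_by_album, 6))
```

The first function touches one partition and a contiguous slice of it; the second has to walk through all the data, which is why it gets slower as the table grows.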
![Page 221: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/221.jpg)
@calonso
DATA MODELLING
Processes and good practices to design our schema.
![Page 222: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/222.jpg)
@calonso
DATA MODELLING
• Understand your data
• Decide how you’ll query the data
• Define column families to satisfy those queries
• Implement and optimize
![Page 223: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/223.jpg)
@calonso
DATA MODELLING
[Diagram: Conceptual Model → Logical Model → Physical Model; the Query-Driven Methodology drives the first step, Analysis & Validation the second]
![Page 224: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/224.jpg)
@calonso
DATA MODELLING
[Diagram: E-R Diagram → Chebotko Diagram → Physical-level Chebotko Diagram; the Query-Driven Methodology drives the first step, Analysis & Validation the second]
![Page 225: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/225.jpg)
@calonso
CONCEPTUAL MODEL
![Page 226: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/226.jpg)
@calonso
QUERY DRIVEN METHODOLOGY
• Spread data evenly around the cluster
• Minimize the number of partitions read
• Follow the mapping rules:
• Entities and relationships: map to tables
• Equality search attributes: must be at the beginning of the primary key
• Inequality search attributes: become clustering columns
• Ordering attributes: become clustering columns
• Key attributes: map to primary key columns
![Page 227: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/227.jpg)
@calonso
LOGICAL MODEL
![Page 228: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/228.jpg)
@calonso
ANALYSIS & VALIDATION
• Are write conflicts (overwrites) possible?
• How large are partitions?
• N_cells = N_rows × (N_cols − N_pk − N_static) + N_static < 1M
• How much data duplication? (batches)
• Client side joins or new table?
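The partition size check above is simple arithmetic, so it can be scripted. A minimal Python sketch follows; the function name and the example table are hypothetical, and the one million cell limit is the rule of thumb from the slide.

```python
def cells_in_partition(n_rows, n_cols, n_pk, n_static):
    """Cells stored in one partition, following the formula on the slide."""
    return n_rows * (n_cols - n_pk - n_static) + n_static


# Hypothetical table: 5 columns, 2 of them in the primary key, no static columns,
# and an expected 100,000 rows per partition.
cells = cells_in_partition(n_rows=100_000, n_cols=5, n_pk=2, n_static=0)
print(cells)              # 300000
print(cells < 1_000_000)  # True: within the limit
```

Static columns are counted once per partition rather than once per row, which is why they are subtracted inside the parentheses and added back at the end.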
![Page 229: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/229.jpg)
@calonso
PHYSICAL MODEL
![Page 231: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/231.jpg)
@calonso
REVIEW QUESTIONS
What is the relationship between a column family and a CQL table?
Terminologically they’re the same, but technically a column family refers to the physical representation while table refers to the logical tabular representation when queried from CQL.
![Page 233: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/233.jpg)
@calonso
REVIEW QUESTIONS
How are clustering columns ordered by default? How can we modify it?
Ascending by default. We can modify it by adding WITH CLUSTERING ORDER BY… in the CQL table definition.
![Page 235: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/235.jpg)
@calonso
REVIEW QUESTIONS
Which is the biggest reason for using UUIDs in Cassandra?
Distributed uniqueness. UUIDs are, for practical purposes, unique even when they are generated independently on different nodes.
![Page 237: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/237.jpg)
@calonso
REVIEW QUESTIONS
What is the difference between an UUID and a TIMEUUID?
A TIMEUUID (a version 1 UUID) has date and time information embedded, so it can also be ordered chronologically.
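The embedded timestamp can be read back with Python's standard `uuid` module. This is a small sketch, not driver code: `unix_seconds_from_timeuuid` is a hypothetical helper name, and `uuid.uuid1()` stands in for a TIMEUUID generated by Cassandra.

```python
import time
import uuid

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def unix_seconds_from_timeuuid(value: uuid.UUID) -> float:
    """Extract the creation time from a version 1 (time-based) UUID."""
    if value.version != 1:
        raise ValueError("only version 1 UUIDs embed a timestamp")
    # value.time counts 100 ns intervals since the UUID epoch.
    return (value.time - UUID_EPOCH_OFFSET) / 1e7


time_based = uuid.uuid1()    # comparable to a TIMEUUID
random_based = uuid.uuid4()  # a plain random UUID, no timestamp inside

print(time.ctime(unix_seconds_from_timeuuid(time_based)))
```

Passing `random_based` to the helper raises `ValueError`, which is the practical difference between the two types: only the time-based one can tell you when it was created.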
![Page 239: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/239.jpg)
@calonso
REVIEW QUESTIONS
When should secondary indexes be used?
Very rarely. Only when the indexed column holds values with very low cardinality and a lookup table is truly inconvenient.
![Page 241: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/241.jpg)
@calonso
REVIEW QUESTIONS
Are CQL COUNTERS 100% accurate?
No, not 100%, because counter updates are not idempotent, so retrying an update after a false failure produces a wrong value.
![Page 243: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/243.jpg)
@calonso
REVIEW QUESTIONS
What does it mean that Cassandra does UPSERTs?
That INSERT and UPDATE operations are equivalent: both write the given values whether or not the row already exists.
![Page 245: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/245.jpg)
@calonso
REVIEW QUESTIONS
What predicates are allowed in a CQL query?
Equality, Inequality and IN
![Page 247: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/247.jpg)
@calonso
REVIEW QUESTIONS
When should the ALLOW FILTERING clause be used?
Typically never. Only in development to scan through all your data.
![Page 249: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/249.jpg)
@calonso
REVIEW QUESTIONS
How can data from two tables be combined in a CQL query?
Cassandra doesn't support JOIN statements, so we can:
• Nest dependent data in the same table.
• JOIN at application level.
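A join at application level usually means running two queries and combining the rows in client code. The Python sketch below shows the combining step only; the row dictionaries are hypothetical data shaped like what a driver might return, and `join_on` is an illustrative helper, not a driver function.

```python
# Rows as they might come back from two separate queries (hypothetical data).
users = [
    {"user_id": "u1", "name": "Ann"},
    {"user_id": "u2", "name": "Bob"},
]
books_read = [
    {"user_id": "u1", "book": "b-10", "times": 3},
    {"user_id": "u1", "book": "b-11", "times": 1},
    {"user_id": "u3", "book": "b-12", "times": 7},
]


def join_on(left, right, key):
    """Inner join of two lists of row dictionaries on a shared key."""
    index = {}
    for row in right:
        # Group the right-hand rows by key so each lookup is a dict access.
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


joined = join_on(users, books_read, "user_id")
for row in joined:
    print(row)
```

This works for small result sets; when the combined data is read often, nesting it in a single table designed for that query is usually the better option.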
![Page 253: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/253.jpg)
@calonso
REVIEW QUESTIONS
What is the purpose of Chebotko Diagrams?
Capture our entities and properties as tables along with the query access patterns expected on them.
![Page 255: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/255.jpg)
@calonso
REVIEW QUESTIONS
Which is the most important thing to keep in mind when designing our data models?
Minimize the number of accessed partitions.
![Page 256: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/256.jpg)
@calonso
MORE CASSANDRA CONCEPTS
Write and read paths and compactions.
![Page 257: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/257.jpg)
WRITE PATH
The writing process
![Page 258: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/258.jpg)
@calonso
WRITE PATH
RDBMS vs. CASSANDRA
![Page 259: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/259.jpg)
@calonso
WRITE PATH
• Memtable: in-memory tables corresponding to CQL tables.
• CommitLog: append-only log to make writes durable.
• SSTables: Memtable snapshots periodically flushed to disk. Never updated.
• Compaction: Periodic process to merge and streamline SSTables.
![Page 260: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/260.jpg)
![Page 261: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/261.jpg)
@calonso
FLUSH PROCESS
• Dumps a Memtable to a new SSTable on disk and its summary index.
• Marks associated commit log entries as flushed
• Triggered by:
• memtable_total_space_in_mb reached
• commitlog_total_space_in_mb reached
• nodetool flush
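The write path and the flush can be pictured with a toy Python model. It is a deliberately simplified sketch: the `Node` class is hypothetical, the flush threshold counts entries instead of megabytes, and the commit log is a plain list rather than a file.

```python
class Node:
    """Toy model of the write path: commit log, memtable and SSTables."""

    def __init__(self, memtable_limit):
        self.commit_log = []  # append-only, makes writes durable
        self.memtable = {}    # in memory, latest value per key
        self.sstables = []    # immutable snapshots, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # sequential append first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # A sorted tuple stands for an SSTable: written once, never updated.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed entries are no longer needed for recovery.
        self.commit_log.clear()


node = Node(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)  # reaches the limit and triggers a flush
node.write("c", 3)
print(node.sstables)    # [(('a', 1), ('b', 2))]
print(node.memtable)    # {'c': 3}
print(node.commit_log)  # [('c', 3)]
```

Calling `node.flush()` by hand plays the role of `nodetool flush` in this model.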
![Page 262: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/262.jpg)
READ PATH
The reading process
![Page 263: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/263.jpg)
@calonso
READ PATH
• Memtable: in-memory table. Serves data as part of the merge process
• RowCache: in-memory cache. Stores recently read columns
• BloomFilter: predicts whether a partition key may be in its corresponding SSTable
• KeyCaches: maps recently read partition keys to specific SSTable offsets
• Partition summaries: indexes the partition indexes.
• Partition indexes: Sorted partition keys mapped to their SSTables offsets
• SSTables: static files containing data.
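The Bloom filter is the least intuitive piece of the read path, so here is a minimal Python sketch of the idea. It is not Cassandra's implementation: the sizes, the use of SHA-256 and the class itself are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Answers 'definitely not present' or 'maybe present' for a key."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive several positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key was never added; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("The Beatles")
print(sstable_filter.might_contain("The Beatles"))  # True
```

One such filter per SSTable lets a read skip every SSTable whose filter answers "definitely not present", at the cost of occasionally checking one that does not hold the key.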
![Page 264: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/264.jpg)
![Page 265: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/265.jpg)
@calonso
READ/WRITE STATS
nodetool cfstats <keyspace>.<column family>
![Page 266: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/266.jpg)
COMPACTIONS
Streamlining tables on disk
![Page 267: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/267.jpg)
@calonso
DELETES
• When a column is deleted a Tombstone is applied to the column in its Memtable
• Tombstoned columns are ignored on reads
• Tombstoned columns are kept around for gc_grace_seconds before they can be evicted.
• gc_grace_seconds is configurable, but beware of "Zombies"
![Page 268: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/268.jpg)
@calonso
COMPACTIONS
• Merges most recent partition keys and columns
• Evicts tombstoned columns
• Creates new SSTable
• Rebuilds partition indexes and summaries
• Deletes old SSTables
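The merge and eviction steps can be sketched in a few lines of Python. This is a toy model under stated assumptions: each SSTable is a dictionary of key to (timestamp, value), `TOMBSTONE` is a marker object, and the `compact` function is a hypothetical name.

```python
TOMBSTONE = object()  # marker for a deleted value


def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables into one, keeping the newest cell per key."""
    merged = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    # Evict tombstones that are older than gc_grace_seconds.
    return {
        key: (timestamp, value)
        for key, (timestamp, value) in merged.items()
        if not (value is TOMBSTONE and now - timestamp >= gc_grace_seconds)
    }


old = {"a": (1, "x"), "b": (1, "y")}
new = {"a": (5, TOMBSTONE), "b": (6, "z")}

print(compact([old, new], now=100, gc_grace_seconds=10))  # {'b': (6, 'z')}
```

When the same call is made with `now=8`, the tombstone for `"a"` is still within its grace period and survives the compaction, so other replicas still get a chance to learn about the delete.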
![Page 269: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/269.jpg)
![Page 270: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/270.jpg)
@calonso
COMPACTIONS
• SizeTieredCompactionStrategy
• LeveledCompactionStrategy
• DateTieredCompactionStrategy
CREATE TABLE user (
  id UUID PRIMARY KEY,
  email TEXT,
  name TEXT,
  preferences SET<TEXT>
) WITH COMPACTION = { 'class': '<strategy>', <params> };
![Page 271: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/271.jpg)
@calonso
SIZE TIERED COMPACTION
![Page 272: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/272.jpg)
@calonso
SIZE TIERED COMPACTION
• Fast to complete
• SSTable sizes keep increasing
• Potentially inconsistent read latency for updated data
• May waste disk space, as we don't know when deleted data will be merged away
• Requires free disk space of 2x the size of the largest table
• Recommended for write-once, read-many use cases
![Page 273: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/273.jpg)
@calonso
LEVELED COMPACTION
![Page 274: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/274.jpg)
@calonso
LEVELED COMPACTION
• Continuously compacting (more I/O)
• 10 x sstable_size_in_mb (160 MB) as max required disk space
• Ensures low read latency
• Recommended with overwrites (updates) and tombstones
![Page 275: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/275.jpg)
@calonso
DATETIERED COMPACTION
• Compacts together data that was written near in time
• Recommended for time series
![Page 277: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/277.jpg)
@calonso
REVIEW QUESTIONS
What happens when a Memtable is flushed?
We create a new SSTable on disk. Also the corresponding CommitLog entries are marked as flushed.
![Page 279: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/279.jpg)
@calonso
REVIEW QUESTIONS
What causes a Memtable to flush?
• memtable_total_space_in_mb reached
• commitlog_total_space_in_mb reached
• nodetool flush manually executed
![Page 281: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/281.jpg)
@calonso
REVIEW QUESTIONS
Do disk seeks happen during writes?
No. During writes we only write to the commit log, which is an append-only log, so writes happen sequentially on disk.
![Page 283: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/283.jpg)
@calonso
REVIEW QUESTIONS
What benefit do Bloom Filters provide to the read process?
They allow us to skip reading SSTables that do not contain the data we're looking for.
![Page 285: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/285.jpg)
@calonso
REVIEW QUESTIONS
Is the partition summary read for partition keys found in the key cache?
No. The key cache allows us to skip the partition summary and partition index and go straight to the SSTable.
![Page 287: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/287.jpg)
@calonso
REVIEW QUESTIONS
What is the relationship between the partition summary and the partition index?
The partition summary is an index on the partition index.
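The "index on an index" idea can be sketched as a two-step lookup. This is a simplification: keys are compared as plain strings here (Cassandra orders partitions by token), and `partition_index`, `partition_summary`, `SAMPLE_EVERY` and `find_offset` are hypothetical names chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical partition index: sorted (partition key, offset in data file).
partition_index = [(f"key{n:03d}", n * 100) for n in range(20)]

SAMPLE_EVERY = 4  # assumed sampling interval, chosen for illustration

# The summary samples every Nth index entry: (partition key, position in the index).
partition_summary = [
    (partition_index[i][0], i) for i in range(0, len(partition_index), SAMPLE_EVERY)
]
summary_keys = [key for key, _ in partition_summary]


def find_offset(key):
    """Return the data-file offset for key, or None if it is not indexed."""
    # Step 1: the in-memory summary narrows the search to one slice of the index.
    slot = bisect_right(summary_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first indexed key
    start = partition_summary[slot][1]
    # Step 2: scan only that slice of the (on-disk) partition index.
    for indexed_key, offset in partition_index[start:start + SAMPLE_EVERY]:
        if indexed_key == key:
            return offset
    return None


print(find_offset("key010"))  # 1000
print(find_offset("nope"))    # None
```

A key cache hit would skip both steps, since the cache already holds the offset.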
![Page 289: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/289.jpg)
REVIEW QUESTIONS
What are zombie columns and how do you prevent them?
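Zombie columns are deleted columns that reappear after bringing up a node that was down for longer than gc_grace_seconds, so it never saw the tombstone before the other replicas purged it.
To prevent them, run repair on every node at least once within each gc_grace_seconds window.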
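A toy simulation of how a zombie appears, under loud assumptions: two replicas, integer timestamps, a tiny `GC_GRACE_SECONDS`, and a `repair` function that simply keeps the newest cell per key. None of these names are Cassandra APIs; the point is only the ordering of delete, tombstone purge and repair.

```python
GC_GRACE_SECONDS = 10  # illustrative value; far smaller than a realistic setting


class ToyReplica:
    def __init__(self):
        self.cells = {}  # key -> (write_time, value); value None marks a tombstone

    def write(self, key, value, now):
        self.cells[key] = (now, value)

    def delete(self, key, now):
        self.cells[key] = (now, None)  # a delete is a write of a tombstone

    def compact(self, now):
        # Tombstones older than gc_grace_seconds are purged for good.
        self.cells = {
            k: (t, v) for k, (t, v) in self.cells.items()
            if v is not None or now - t <= GC_GRACE_SECONDS
        }


def repair(a, b):
    """Make both replicas agree; for each key the newest cell wins."""
    for key in set(a.cells) | set(b.cells):
        candidates = [r.cells[key] for r in (a, b) if key in r.cells]
        newest = max(candidates, key=lambda cell: cell[0])
        a.cells[key] = b.cells[key] = newest


def scenario(repair_in_time):
    up, down = ToyReplica(), ToyReplica()
    up.write("k", "v", now=0)
    down.write("k", "v", now=0)
    up.delete("k", now=1)   # 'down' is offline and misses the delete
    if repair_in_time:
        repair(up, down)    # tombstone reaches 'down' within gc_grace_seconds
    up.compact(now=100)
    down.compact(now=100)
    repair(up, down)        # 'down' is back and the replicas are reconciled
    return up.cells.get("k")


print(scenario(repair_in_time=False))  # (0, 'v') -> the deleted value is back
print(scenario(repair_in_time=True))   # None -> the delete held
```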
![Page 291: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/291.jpg)
REVIEW QUESTIONS
What are the benefits of SizeTieredCompaction?
• Enables fast write operations
• Less disk I/O pressure
![Page 293: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/293.jpg)
REVIEW QUESTIONS
What are the benefits of LeveledCompaction?
• Predictable, fast read performance
• Does not need a lot of free disk space to run
![Page 294: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/294.jpg)
HARDWARE CONSIDERATIONS
![Page 295: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/295.jpg)
MEMORY
• Memory helps reads
• Recommendations
• Dedicated machines: 16GB - 64GB. Never below 8GB
• Virtual machines: 8GB - 16GB. Never below 4GB
• Testing machines: Virtual machines ~256 MB
![Page 296: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/296.jpg)
CPU
• CPU helps writes
• Recommendations
• Dedicated machines: 8-core processors
• Virtual machines: 8 cores + CPU burst
![Page 297: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/297.jpg)
DISK
• SizeTieredCompaction: 50% free disk space
• LeveledCompaction: 10% free disk space
• Recommendations
• 500 GB to 1 TB per node
• Two drives: One for data, one for CommitLog
• SSDs if possible
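As a quick worked example of the free-space figures above, the sketch below estimates how much data a node can hold per compaction strategy. The percentages come straight from the slide and should be read as rules of thumb; `FREE_SPACE_PCT` and `usable_data_gb` are hypothetical names.

```python
# Free-space percentages taken from the slide; treat them as rules of thumb.
FREE_SPACE_PCT = {
    "SizeTieredCompaction": 50,
    "LeveledCompaction": 10,
}


def usable_data_gb(disk_gb, strategy):
    """Data a node can hold while keeping the headroom its compaction strategy needs."""
    return disk_gb * (100 - FREE_SPACE_PCT[strategy]) / 100


for strategy in FREE_SPACE_PCT:
    print(strategy, usable_data_gb(1000, strategy))
# SizeTieredCompaction 500.0
# LeveledCompaction 900.0
```

So a 1 TB data drive would hold roughly 500 GB under SizeTieredCompaction and roughly 900 GB under LeveledCompaction.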
![Page 298: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/298.jpg)
NETWORK
• Gigabit ethernet or faster
![Page 299: Cassandra Workshop - Cassandra from scratch in one day](https://reader038.vdocuments.us/reader038/viewer/2022102322/587841151a28ab707b8b6789/html5/thumbnails/299.jpg)
THANK YOU!