cassandra - doag deutsche oracle-anwendergruppe e.v. · • cql creates a common languageso that...

55
BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH Apache Cassandra Big Data: Why do I need Cassandra Jan Ott @ jhsott

Upload: others

Post on 26-Sep-2019

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH

Apache CassandraBig Data: Why do I need Cassandra

Jan Ott

@jhsott

Page 2: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Our company.

Trivadis DOAG16: Cassandra2 17.11.16

Trivadis is a market leader in IT consulting, system integration, solution engineeringand the provision of IT services focusing on andtechnologiesin Switzerland, Germany, Austria and Denmark. We offer our services in the followingstrategic business fields:

Trivadis Services takes over the interacting operation of your IT systems.

O P E R A T I O N

Page 3: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

COPENHAGEN

MUNICH

LAUSANNE

BERN

ZURICH

BRUGG

GENEVA

HAMBURG

DÜSSELDORF

FRANKFURT

STUTTGART

FREIBURG

BASEL

VIENNA

With over 600 specialists and IT experts in your region.

Trivadis DOAG16: Cassandra3 17.11.16

14 Trivadis branches and more than600 employees

200 Service Level Agreements

Over 4,000 training participants

Research and development budget:CHF 5.0 million

Financially self-supporting andsustainably profitable

Experience from more than 1,900 projects per year at over 800customers

Page 4: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

17.11.16 Trivadis DOAG16: Cassandra4

Technology on its own won't help you.You need to know how to use it properly.

Page 5: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Agenda

1. Introduction - What is Apache Cassandra?

2. Cassandra Data Model

3. First Steps in the Cassandra World

4. Cassandra Query Language - CQL

5. Summary

17.11.16 Trivadis DOAG16: Cassandra5

Page 6: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

What is Apache Cassandra?

17.11.16 Trivadis DOAG16: Cassandra6

Page 7: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

The Cassandra Elevator Pitch

„Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneablyconsistent, row-oriented database that bases its distributiondesign on Amazon’s Dynamo and its data model on Google’sBigtable. Created at Facebook, it is now used at some of themost popular sites on the Web.“

Cassandra: The Definitive Guide by Jeff Carpenter and Eben Hewitt

17.11.16 Trivadis DOAG16: Cassandra7

Page 8: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

History of Cassandra

Bigtable Dynamo

17.11.16 Trivadis DOAG16: Cassandra8

Page 9: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Highlights

• Apache Cassandra™ is free• Distributed & Decentralized• Elastic Scalable• High Available• Fault Tolerance• Tunable data consistency• CQL language (like SQL)

17.11.16 Trivadis DOAG16: Cassandra9

Page 10: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Elastic Scalable

• Capable of comfortably scaling to petabytes• New nodes = Linear performance increases• Add and remove nodes online

100'000

txns/sec

200'000

txns/sec

400'000

txns/sec

17.11.16 Trivadis DOAG16: Cassandra10

Page 11: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Continuous availability

• All nodes the same

• Customized replication affords tunable data redundancy

• Read/write from any node

• Can replicate data among different physical data center racks

Node1

Node2

Node3

Node4[26-50]

[0-25]

[51-75]

[76-100] [0-25]

[0-25]

[26-50]

[26-50]

[51-75]

[51-75]

[76-100]

[76-100]

client

17.11.16 Trivadis DOAG16: Cassandra11

Page 12: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Multi Data-centers (DCs)

• Out-of-the-box (config only)

• Client writes local

• Data syncs across WAN

• Replication per Data Center

• Cloud platforms support

• AWS config for multi-regions DCs

• Google Compute Engine (GCE) Support

• Microsoft Azure support

• Cloud Stack support

17.11.16 Trivadis DOAG16: Cassandra12

Page 13: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Multi DC usages

• Data locality• Disaster recovery

Node1

Node2

Node3

Node4

Node1

Node2

Node3

Node4

WestEastClient

Trivadis DOAG16: Cassandra17.11.1613

Page 14: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Write Requests

coordinator sends a write request to all replicas that own the row being written

17.11.16 Trivadis DOAG16: Cassandra14

Page 15: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Read Requests

There are two types of read requests that a coordinator can send to a replica:

• A direct read request

• A background read repair request

The number of replicas contacted by a direct read request is determined by the consistency level specified by the client.

17.11.16 Trivadis DOAG16: Cassandra15

Page 16: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Tunable Data Consistency

What do I need?

• Writes• Reads• Consistency 1

2

3

4

5

6

� Any

� One/Two/Three

� Local_One

� Quorum

� Local_Quorum

� Each_Quorum

� All

Writes

� One/Two/Three

� Local_One

� Quorum

� Local_Quorum

� Each_Quorum

� All

Reads

17.11.16 Trivadis DOAG16: Cassandra16

Page 17: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Who is using Cassandra?

Largest infrastructure running over 75,000 Cassandra nodes, storing more than 10 petabytes of data with one cluster was over 1,000 nodes

17.11.16 Trivadis DOAG16: Cassandra17

Page 18: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Cassandra Data Model

17.11.16 Trivadis DOAG16: Cassandra18

Page 19: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Cassandra Data Model – 3. Normalform

• Forget Normalization – DENORMALIZE

• Design by Query

• No Joins – Denormalize

- Model

- Materialized Views

- Do it on the client side – not recommended

• No Referential Integrity

- Possible to define but not enforced

17.11.16 Trivadis DOAG16: Cassandra19

Page 20: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

How Cassandra stores data

• Model brought from Google Bigtable• Row Key and a lot of columns• Column names sorted (UTF8, Int, Timestamp, etc.)

ColumnName … Column Name

ColumnValue ColumnValue

Timestamp Timestamp

TTL TTL

RowKey

1 2Billion

BillionofRows

Trivadis DOAG16: Cassandra17.11.1620

Page 21: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Static Column Family – "Skinny Row"

21

rowkey

CREATE TABLE skinny (rowkey text, c1 text PRIMARY KEY,c2 text,c3 text,

PRIMARY KEY (rowkey));

GrowsuptoBillionofRows

rowkey-1 c1 c2 c3

value-c1 value-c2 value-c3

rowkey-2 c1 c3

value-c1 value-c3

rowkey-3 c1 c2 c3

value-c1 value-c2 value-c3

c1 c2 c3

PartitionKey

17.11.16 Trivadis DOAG16: Cassandra

Page 22: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Dynamic Column Family – "Wide Row"

22

rowkey

BillionofRows rowkey-1 ckey-1:c1 ckey-1:c2

value-c1 value-c2

rowkey-2

rowkey-3

CREATE TABLE wide (rowkey text, ckey text,c1 text,c2 text,

PRIMARY KEY (rowkey, ckey) WITH CLUSTERING ORDER BY (ckey ASC);

ckey-2:c1 ckey-2:c2

value-c1 value-c2

ckey-3:c1 ckey-3:c2

value-c1 value-c2

ckey-1:c1 ckey-1:c2

value-c1 value-c2

ckey-2:c1 ckey-2:c2

value-c1 value-c2

ckey-1:c1 ckey-1:c2

value-c1 value-c2

ckey-2:c1 ckey-2:c2

value-c1 value-c2

ckey-3:c1 ckey-3:c2

value-c1 value-c2

1 2Billion

PartitionKey Clustering Key

17.11.16 Trivadis DOAG16: Cassandra

Page 23: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

First Steps

17.11.16 Trivadis DOAG16: Cassandra23

Page 24: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Getting Cassandra

Apache Cassandra Distribution• http://cassandra.apache.org/

DataStax Distribution• DataStax Enterprice 5.0 - Sandbox - VM

https://academy.datastax.com/downloads/welcomeVM with Cassandra (1 node), DataStax DevCenter, DataStax OpsCenter

• Oracle Virtual Boxhttp://www.oracle.com/technetwork/server-storage/virtualbox/downloads/index.html

• Login – datastax/datastax

17.11.16 Trivadis DOAG16: Cassandra24

Page 25: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

DataStax OpsCenter

• At-a-Glance Cluster Management

• Point-and-Click Provisioning andAdministration

• Secured Administration

• Always On Management and Monitoring

• Visual Monitoring and Tuning

• Best Practice Advice

• Proactive Assistance

• Smart Data Protection17.11.16 Trivadis DOAG16: Cassandra25

Page 26: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

DataStax OpsCenter

Point-and-Click Provisioning and Administration

17.11.16 Trivadis DOAG16: Cassandra26

Page 27: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

DataStax OpsCenter

Visual Monitoring and Tuning

17.11.16 Trivadis DOAG16: Cassandra27

Page 28: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

CQL – Cassandra Query Language

17.11.16 Trivadis DOAG16: Cassandra28

Page 29: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Introducing CQL

• CQL is a reintroduction of schema so that you don't have to read code to understand the data model.

• CQL creates a common language so that details of the data model can be easily communicated.

• CQL is a best-practices Cassandra interface and hides the messy details.

17.11.16 Trivadis DOAG16: Cassandra29

Page 30: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

CQL Language

• SQL like syntax

• Data Definition Language – DDL CREATE / ALTER / DROP / …

• Data Manipulation Language – DML INSERT, UPDATE, DELETE

• Query data with SELECT

• Build in Functions – COUNT, MIN, MAX, sum, avg, LIMIT, ...

• UDF – User Defined Function / UDA - User Defined Aggregate

17.11.16 Trivadis DOAG16: Cassandra30

Page 31: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

CQL Shell for Apache Cassandra

cqlsh is the command line utility for execution CQL commands (think of SQL*Plus for Cassandra)

CQL3 is default since Cassandra 1.2

$ cqlshConnected to DataStaxCluster at localhost:9160.[cqlsh 4.1.0 | Cassandra 2.0.5.24 | CQL spec 3.1.1 | Thrift protocol 19.39.0]Use HELP for help.cqlsh>

17.11.16 Trivadis DOAG16: Cassandra31

Page 32: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

CQL Shell for Apache Cassandra

$ cat create-table.cql | cqlsh

$ cqlsh –f create-table.cql

cqlsh> SOURCE '~/cassandra_training/cql/create-table.cql'

Execute a script with the –f option

Alternatively pie scripts into cqlsh

Source files inside cqlsh

17.11.16 Trivadis DOAG16: Cassandra32

Page 33: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Creating a Keyspace

Create a keyspace with SimpleStrategy and replication factor option

Make the new keyspace the active one

cqlsh> CREATE KEYSPACE my_spaceWITH REPLICATION = {'class':'SimpleStrategy',

'replication_factor':1};

cqlsh> USE my_space;

cqlsh:my_space>

17.11.16 Trivadis DOAG16: Cassandra33

Page 34: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Describing a Keyspace

Use the DESCRIBE KEYSPACE to show the metadata of the keyspace

cqlsh> DESCRIBE KEYSPACE my_space;

CREATE KEYSPACE training WITH replication = {'class': 'SimpleStrategy','replication_factor': '1'

};

cqlsh>

17.11.16 Trivadis DOAG16: Cassandra34

Page 35: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Create a Static table Dept

Use CREATE TABLE to create a static column family (table) named ”dept"

cqlsh:my_space> CREATE TABLE dept(deptno int,dname varchar,loc varchar,PRIMARY KEY (deptno));

dname loc

10 ACCOUNTING NEWYORK

20 RESEARCH DALLAS

30 SALES CHICAGO

40 OPERATIONS BOSTON

17.11.16 Trivadis DOAG16: Cassandra35

Page 36: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Create a Dynamic table (wide-row) Employee

A Dynamic Table is also created with the CREATE TABLE statement but using a composite partition key cqlsh:training> CREATE TABLE emp(

empno int,ename varchar,…deptno int,primary key (dname,ename));

KING:empno ... CLARK:empno ...

10 7839 ... 7782 ...

JONES:empno ... SCOTT:empno ... FORD:empno ...

20 7566 ... 7788 ... 7902 ...

17.11.16 Trivadis DOAG16: Cassandra36

Page 37: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Truncate / Drop Table

Use TRUNCATE to truncate the data

Use DROP TABLE to drop the whole table, operation is irreversible and removes all information within the specified table!

• Will raise an error, if it does not exist, use IF EXISTS to prevent (new in 2.0):

cqlsh:training> TRUNCATE employee;

cqlsh:training> DROP TABLE employee;

cqlsh:training> DROP TABLE IF EXISTS employee;

17.11.16 Trivadis DOAG16: Cassandra37

Page 38: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Insert data into Dept

• PRIMARY KEY is always required• Insert with same primary key => update

cqlsh:training> INSERT INTO dept (deptno, dname, loc)VALUES (10, 'ACCOUNTING', 'NEW YORK');

17.11.16 Trivadis DOAG16: Cassandra38

Page 39: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Retrieving data from Dept table

SELECT statement returns rows and columns, just as in SQLIt can optionally also have a WHERE clause, an ORDER BY clause and a LIMIT clause

cqlsh:training> SELECT deptno, dname FROM dept LIMIT 2;

deptno | dname--------+------------

10 | ACCOUNTING30 | SALES

(2 rows)

17.11.16 Trivadis DOAG16: Cassandra39

Page 40: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Retrieving data from Dept table (II)

Restriction on column other than PRIMARY KEY won't work

Can be solved with an Index (but be careful, better use de-normalization)

cqlsh:my_space> SELECT * FROM dept WHERE loc = 'NEW YORK';InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

cqlsh:my_space> CREATE INDEX ON dept(loc);cqlsh:my_space> SELECT * FROM dept WHERE loc = 'NEW YORK';deptno | dname | loc

--------+------------+----------10 | ACCOUNTING | NEW YORK

17.11.16 Trivadis DOAG16: Cassandra40

Page 41: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Update data in Dept

• WHERE over Primary Key• If Primary Key does not exist => INSERT

cqlsh:my_space> UPDATE dept SET loc = 'LOS ANGELES' WHERE deptno = 10;

17.11.16 Trivadis DOAG16: Cassandra41

Page 42: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Cassandra Data TypesCategory CQLDataType Description

String ascii US-ASCIIcharacterstring

text UTF-8encodedstring, usedmostofthetimefor

storingStringdata.

varchar UTF-8Strings.

inet Used forstoringIPaddresses

Numeric int 32-bitsignedinteger

float 32-bitIEEE-754floatingpoint

double 64-bitIEEE-754floatingpoint

varint Arbitraryprecision integers

bigint 64-bitnumber,equivalenttolong.

decimal Variable-precisiondecimal

counter Distributedcountervalue(64-bitlong)

17.11.16 Trivadis DOAG16: Cassandra42

Page 43: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Cassandra Data Types (II)

Category CQLDataType Description

UUIDs uuid AUUIDinstandardUUIDformat

timeuuid Type1UUIDonly,forstoringuniquetime-base IDs

Collections list Ordered collectionofoneormoreelements

map Collectionofarbitrary key-valuepairs

set Unorderedcollectionofoneormoreunique

elements

Miscellaneous boolean Boolean(true/false)

blob Usedfor storingbinarydatawritteninhexadecimal

timestamp Date/Time

17.11.16 Trivadis DOAG16: Cassandra43

Page 44: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Batch operation

• COMMIT ?• BEGIN BATCH … APPLY BATCH – execute multiple mutations – single operation

BEGIN BATCHINSERT INTO dept (deptno, dname, loc)VALUES (50, 'IT', 'ZURICH');UPDATE emp SET sal = 9000 WHERE empno = 9000;

APPLY BATCH;

17.11.16 Trivadis DOAG16: Cassandra44

Page 45: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Alter Table

• ALTER TABLE change meta data

• CQL is quickflexible schema, not changes to existing data

cqlsh:my_space> ALTER TABLE deptADD operational BOOLEAN;

cqlsh:training> DESCRIBE TABLE employee;

CREATE TABLE my_space.dept (deptno int PRIMARY KEY,dname text,loc text,operational boolean

) WITH...;

cqlsh:my_space> SELECT * FROM dept LIMIT 2;deptno | dname | loc | operational

--------+------------+-------------+-------------

50 | IT | ZURICH | null10 | ACCOUNTING | LOS ANGELES | null

(2 rows)

17.11.16 Trivadis DOAG16: Cassandra45

Page 46: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Collections

CQL3 also supports collections for storing complex data structures• Set {value,…}, List [value,…], Map {key:value,…}

cqlsh:training> CREATE TABLE collection_sample(id int PRIMARY KEY,

string_set set<text>,string_list list<text>,string_map map<text, text>);

cqlsh:training> INSERT INTO coll(id, string_set, string_list, string_map) VALUES (1,

{'text1','text2','text1'}, ['text1','text2','text1'], {'key1':'value1'});

17.11.16 Trivadis DOAG16: Cassandra46

Page 47: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Collections (II)

cqlsh:training> SELECT * FROM collection_sample;

id | string_list | string_map | string_set----+-----------------------------+--------------------+--------------------1 | ['text1', 'text2', 'text1'] | {'key1': 'value1'} | {'text1', 'text2'}

(1 rows)

17.11.16 Trivadis DOAG16: Cassandra47

Page 48: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

UDF – User Defined Function

Sample CodeCREATE FUNCTION count_if_true(input boolean) RETURNS NULL ON NULL INPUT RETURNS intLANGUAGE java AS 'if (input) return 1; else return total;';

SELECT door_number, count_if_true(is_open)FROM my_doors;

„CREATE OR REPLACE” or “IF NOT EXSITS” Syntax possible

17.11.16 Trivadis DOAG16: Cassandra48

Page 49: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

UDA – User Defined Aggregate

Sample CodeCREATE FUNCTION state_count_if_true(total int, input boolean) RETURNS NULL ON NULL INPUT RETURNS intLANGUAGE java AS 'if (input) return total+1; else return total;';

CREATE AGGREGATE total_open (boolean)SFUNC state_count_if_trueSTYPE intINITCOND 0;

SELECT door_number, total_open(is_open)FROM my_doors;

„CREATE OR REPLACE” or “IF NOT EXSITS” Syntax possible

17.11.16 Trivadis DOAG16: Cassandra49

Page 50: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Materialized Views

Relieve the pain of manual denormalization

cqlsh:training> CREATE MATERIALIZED VIEW employee_by_role (AS SELECT role, name, ageFROM employeeWHERE role IS NOT NULLPRIMARY KEY (role, name);

cqlsh:training> CREATE TABLE employee_by_role (role text, name text, age int,PRIMARY KEY (role, name));

Cassandra3.0

17.11.16 Trivadis DOAG16: Cassandra50

Page 51: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Time-to-Live (TTL) on Insert

• Insert a row with a TTL in seconds (30s) • after that the row is deleted

cqlsh:my_space> UPDATE emp USING ttl 15 SET sal = 8000 WHERE empno = 9000;

cqlsh:my_space> SELECT ename, sal, ttl(sal) FROM emp WHERE empno = 9000;

ename | sal | ttl(sal)-------+------+----------

null | 8000 | 15

(1 rows)

17.11.16 Trivadis DOAG16: Cassandra51

Page 52: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Summary

• Just great J• No single point of failure – Ring Model• Distribution over nodes / rack’s / data center’s• Tuneable Consistency• CQL• Spark / Cassandra Integration

• CQL limited

• Forget 20 years of experience in relational modelling L => DENORMALIZE J

17.11.16 Trivadis DOAG16: Cassandra52

Page 53: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Jan OttSenior Consultant

[email protected]

17.11.16 Trivadis DOAG16: Cassandra53

Page 54: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

References

• BooksCassandra: The Definite Guide, 2nd Edition

• Apache – Cassandra CQL Documentationhttps://cassandra.apache.org/doc/latest/cql/index.html

• DataStax – CQL Documentationhttp://docs.datastax.com/en/cql/3.3/cql/cql_using/useAboutCQL.html

• Netflix and Cassandrahttp://techblog.netflix.com/search/label/Cassandra

17.11.16 Trivadis DOAG16: Cassandra54

Page 55: Cassandra - DOAG Deutsche ORACLE-Anwendergruppe e.V. · • CQL creates a common languageso that details of the data model can be easily communicated. • CQL is a best-practices

Trivadis @ DOAG 2016

Booth: 3rd Floor – next to the escalatorKnow how, T-Shirts, Contest and Trivadis Power to goWe look forward to your visitBecause with Trivadis you always win !

17.11.16 Trivadis DOAG16: Cassandra55