version 4svoboda/courses/2016... · datamodel columnvalues • emptyvalue null • atomicvalue...

48
B4M36DS2: Database Systems 2 hƩp://www.ksi.mff.cuni.cz/˜svoboda/courses/2016-1-B4M36DS2/ Lecture 10 Column-Family Stores: Cassandra MarƟn Svoboda [email protected]ff.cuni.cz 5. 12. 2016 Charles University in Prague, Faculty of MathemaƟcs and Physics Czech Technical University in Prague, Faculty of Electrical Engineering

Upload: others

Post on 25-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

B4M36DS2: Database Systems 2h p://www.ksi.mff.cuni.cz/˜svoboda/courses/2016-1-B4M36DS2/

Lecture 10

Column-Family Stores: CassandraMar n [email protected]

5. 12. 2016

Charles University in Prague, Faculty of Mathema cs and PhysicsCzech Technical University in Prague, Faculty of Electrical Engineering

Page 2: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Lecture OutlineColumn-family stores

• General introduc onApache Cassandra

• Data model• Cassandra query language

DDL statementsDML statements

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 2

Page 3: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Wide Column StoresData model

• Column family (table)Table is a collec on of similar rows (not necessarily iden cal)

• RowRow is a collec on of columns

– Should encompass a group of data that is accessed togetherAssociated with a unique row key

• ColumnColumn consists of a column name and column value(and possibly other metadata records)Scalar values, but also flat sets, lists or mapsmay be allowed

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 3

Page 4: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Wide Column StoresQuery pa erns

• Create, update or remove a row within a given column family• Select rows according to a row key or simple condi ons

Warning• Wide column stores are not just a special kind of RDBMSs

with a variable set of columns!

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 4

Page 5: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Wide Column StoresSuitable use cases

• Event logging, content management systems, blogs, …I.e. for structured flat data with similar schema

When not to use• ACID transac ons are required• Complex queries: aggrega on (SUM, AVG, …), joining, …• Early prototypes: i.e. when database design may change

Representa ves• Apache Cassandra, Apache HBase, Apache Accumulo,

Hypertable, Google Bigtable

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 5

Page 6: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Wide Column StoresRepresenta ves

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 6

Page 7: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Apache Cassandra

Page 8: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Apache CassandraColumn-family database

• h p://cassandra.apache.org/• Features

Open-source, high availability, linear scalability, sharding(spanning mul ple datacenters), peer-to-peer configurablereplica on, tunable consistency, MapReduce support

• Developed by Apache So ware Founda onOriginally at Facebook

• Implemented in Java• Opera ng systems: cross-pla orm• Ini al release in 2008

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 8

Page 9: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Data ModelDatabase system structure

Instance→ keyspaces→ tables→ rows→ columns

• Keyspace• Table (column family)

Collec on of (similar) rowsTable schema must be specified, yet can be modified later on

• RowCollec on of columnsRows in a table do not need to have the same columnsEach row is uniquely iden fied by a primary key

• ColumnName-value pair + addi onal data

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 9

Page 10: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Data ModelColumn values

• Empty valuenull

• Atomic valueNa ve data types such as texts, integers, dates, …Tuples

– Tuple of anonymous fields, each of any type (even different)User defined types (UDT)

– Set of named fields of any type

• Collec onsLists, sets, andmaps

– Nested tuples, UDTs, or collec ons are allowed, but currentlyonly in frozen mode (such elements are serialized when stored)

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 10

Page 11: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Data ModelCollec ons

• ListSorted collec on of non-unique valuesList elements are ordered by their posi onsNot always recommended because of performance issues

– Internal read-before-write opera ons have to be executed

• SetSorted collec on of unique values

• MapSorted collec on of key-value pairsMap elements are ordered by their keysKeys must be unique

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 11

Page 12: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Sample DataTable of actors

id

trojanname

( Ivan, Trojan )year1964

movies{ samotari, medvidek }

machacek

name( Jiří, Macháček )

year1966

movies{ medvidek, vratnelahve, samotari }

schneiderovaname

( Jitka, Schneiderová )year1973

movies{ samotari }

sverakname

( Zdeněk, Svěrák )year1936

movies{ vratnelahve }

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 12

Page 13: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Sample DataTable ofmovies

id

samotaritle

Samotářiyear2000

actorsnull

genres[ comedy, drama ]

medvidek

tleMedvídek

director( Jan, Hřebejk )

year2007

actors{ trojan: Ivan, machacek: Jirka }

vratnelahvetle

Vratné lahveyear2006

actors{ machacek: Robert Landa }

zelarytle

Želaryyear2003

actors{}

genres[ romance, drama ]

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 13

Page 14: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Data ModelAddi onal data associated with…

the whole column in case of atomic values, orevery element of a collec on

• Time-to-live (TTL)A er a certain amount of me (number of seconds)a given value is automa cally deleted

• Timestamp (write me)Timestamp of the last value modifica onAssigned automa cally or manually as well

• Both the records can be queriedUnfortunately not in case of collec ons and their elements

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 14

Page 15: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Cassandra APICQLSH

• Interac ve command line shell• bin/cqlsh• Uses CQL (Cassandra Query Language)

Client drivers• Provided by the community• Available for various languages

Java, Python, Ruby, PHP, C++, Scala, Erlang, …

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 15

Page 16: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Query LanguageCQL = Cassandra Query Language

• Declara ve query languageInspired by SQL

• DDL statementsCREATE KEYSPACE – creates a new keyspaceCREATE TABLE – creates a new table…

• DML statementsSELECT – selects and projects rows from a single tableINSERT – inserts rows into a tableUPDATE – updates columns of rows in a tableDELETE – removes rows from a table…

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 16

Page 17: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

KeyspacesCREATE KEYSPACE

CREATECREATE KEYSPACEKEYSPACE

IFIF NOTNOT EXISTSEXISTS

keyspace namekeyspace name

WITHWITH option nameoption name == option valueoption value

• Creates a new keyspace• Replica on op on is mandatory

SimpleStrategy (one replica on factor)NetworkTopologyStrategy(individual replica on factor for each data center)

CREATE KEYSPACE moviedbWITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 17

Page 18: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

KeyspacesUSE

• Changes the current keyspace

USEUSE keyspace namekeyspace name

DROP KEYSPACE• Removes a keyspace, all its tables, data etc.

DROPDROP KEYSPACEKEYSPACE keyspace namekeyspace name

IFIF EXISTSEXISTS

ALTER KEYSPACE• Modifies op ons of an exis ng keyspace

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 18

Page 19: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TablesCREATE TABLE

• Creates a new table within the current keyspace• Each table must have exactly one primary key specified

CREATECREATE TABLETABLE

IFIF NOTNOT EXISTSEXISTS

table nametable name ((

column namecolumn name typetype

PRIMARYPRIMARY KEYKEY

,,

,, primary keyprimary key

))

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 19

Page 20: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TablesExamples – tables for actors and movies

CREATE TABLE actors (id TEXT PRIMARY KEY,name TUPLE<TEXT, TEXT>,year SMALLINT,movies SET<TEXT>

)

CREATE TABLE movies (id TEXT,title TEXT,director TUPLE<TEXT, TEXT>,year SMALLINT,actors MAP<TEXT, TEXT>,genres LIST<TEXT>,countries SET<TEXT>,PRIMARY KEY (id)

)

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 20

Page 21: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TablesPrimary key has two parts:

• Compulsory par on keySingle column or mul ple columnsDescribes how table rows are distributed among par ons

• Op onal clustering columnsDefines the clustering order,i.e. how table rows are locally stored within a par on

PRIMARYPRIMARY KEYKEY ((

column namecolumn name

(( column namecolumn name

,,

)),, column namecolumn name

))

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 21

Page 22: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TablesDROP TABLE

• Removes a table together with all data it contains

DROPDROP TABLETABLE table nametable name

IFIF EXISTSEXISTS

TRUNCATE TABLE• Preserves a table but removes all data it contains

TRUNCATETRUNCATE

TABLETABLE

table nametable name

ALTER TABLE• Allows to alter, add or drop table columns

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 22

Page 23: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TypesTypes of columns

• Na ve types• Tuples• Collec on types: lists, sets, andmaps• User-defined types

native typenative type

TUPLETUPLE << typetype

,,

>>

LISTLIST << typetype >>

SETSET << typetype >>

MAPMAP << typetype ,, typetype >>

......

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 23

Page 24: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TypesNa ve types

• tinyint, smallint, int, bigintSigned integers (1B, 2B, 4B, 8B)

• varintArbitrary-precision integer

• decimalVariable-precision decimal

• float, doubleFloa ng point numbers (4B, 8B)

• booleanBoolean values true and false

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 24

Page 25: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TypesNa ve types

• text, varcharUTF8 encoded stringEnclosed in single quotes (not double quotes)

– Escaping sequence: ''

• asciiASCII encoded string

• date, time, timestampDates, mes and mestampsE.g. '2016-12-05', '2016-12-05 09:15:00', 1480929300

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 25

Page 26: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

TypesNa ve types

• counter – 8B signed integerOnly 2 opera ons supported: incremen ng and decremen ng

– I.e. value of a counter cannot be set to a par cular numberRestric ons in usage

– Counters cannot be a part of a primary key– Either all table columns (outside the primary key) are counters,

or none of them– TTL is not supported– …

• blob – arbitrary bytes• inet – IP address (both IPv4 and IPv6)• …

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 26

Page 27: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

LiteralsTuple and collec on literals

• Literals for tuples, lists, sets, andmaps

((

termterm

,,

))

[[

termterm

,,

]]

{{

termterm

,,

}}

{{

termterm :: termterm

,,

}}

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 27

Page 28: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onSELECT statement

• Selects matching rows from a single table

SELECT clauseSELECT clause FROM clauseFROM clause

WHERE clauseWHERE clause

GROUP BY clauseGROUP BY clause ORDER BY clauseORDER BY clause LIMIT clauseLIMIT clause

ALLOWALLOW FILTERINGFILTERING

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 28

Page 29: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onClauses of SELECT statements

• SELECT – columns or values to appear in the result• FROM – single table to be queried• WHERE – filtering condi ons to be applied on table rows• GROUP BY – columns used for grouping of rows• ORDER BY – criteria defining the order of rows in the result• LIMIT – number of rows to be included in the result

ExampleSELECT id, title, actorsFROM moviesWHERE year = 2000 AND genres CONTAINS 'comedy'

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 29

Page 30: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onFROM clause

• Defines a single table to be queriedFrom the current / specified keyspace

• I.e. joining of mul ple tables is not possible

FROMFROM

keyspace namekeyspace name ..

table nametable name

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 30

Page 31: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onWHERE clause

• One or more rela ons a row must sa sfyin order to be included in the query result

WHEREWHERE relationrelation

ANDAND

• Only simple condi ons can be expressed andnot all rela ons are allowed, e.g.:

only primary key columns can be involvedunless secondary index structures existnon-equal rela ons on par on keys are not supported…

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 31

Page 32: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onWHERE clause: rela ons

column namecolumn name

(( column namecolumn name

,,

))

==

!=!=

<<

<=<=

=>=>

>>

termterm

ININ (( termterm

,,

))

CONTAINSCONTAINS

CONTAINSCONTAINS KEYKEY

termterm

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 32

Page 33: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onWHERE clause: rela ons

• Comparisons=, !=, <, <=, =>, >

• INReturns true if the actual value is one of the enumerated

• CONTAINSMay only be used on collec ons (lists, sets, and maps)Returns true if a collec on contains a given element

• CONTAINS KEYMay only be used on mapsReturns true is a map contains a given key

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 33

Page 34: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onSELECT clause

• Defines columns or values to be included in the result* = all the table columnsAliases can be defined using AS

SELECTSELECT

DISTINCTDISTINCT

**

selectorselector

ASAS identifieridentifier

,,

• DISTINCT – duplicate rows are removed

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 34

Page 35: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onSELECT clause: selectors

column namecolumn name

termterm

COUNTCOUNT (( ** ))

WRITETIMEWRITETIME (( column namecolumn name ))

TTLTTL (( column namecolumn name ))

• COUNT(*)Number of all the rows in a group (see aggrega on)

• WRITETIME and TTLSelects mestamp / remaining me-to-live of a given columnCannot be used on collec ons and their elementsCannot be used in other clauses (e.g. WHERE)

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 35

Page 36: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onORDER BY clause

• Defines the order of rows returned in the query result• Only orderings induced by clustering columns are allowed!

ORDERORDER BYBY column namecolumn name

ASCASC

DESCDESC

,,

LIMIT clause• Limits the number of rows returned in the query result

LIMITLIMIT integerinteger

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 36

Page 37: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onGROUP BY clause

• Groups rows of a table according to certain columns• Only groupings induced by primary key columns are allowed!

GROUPGROUP BYBY column namecolumn name

,,

• When a non-grouping column is selected without anaggregate func on, the first value encounter is alwaysreturned

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 37

Page 38: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onGROUP BY clause: aggregates

• Na ve aggregatesCOUNT(column)

– Number of all the values in a given column– null values are ignored

MIN(column), MAX(column)– Minimal / maximal value in a given column

SUM(column)– Sum of all the values of a given column

AVG(column)– Average of all the values of a given column

• User-defined aggregates

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 38

Page 39: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Selec onALLOW FILTERINGmodifier

• By default, only non-filtering queries are allowedI.e. queries wherethe number of rows read∼ the number of rows returnedSuch queries have predictable performance

– They will execute in a me that is propor onalto the amount of data returned

• ALLOW FILTERING enables (some) filtering queries

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 39

Page 40: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Inser onsINSERT statement

• Inserts a new row into a given tableWhen a row with a given primary key already exists,it is updated

• At least primary key columns must be specified… and they have to always be explicitly enumerated

INSERTINSERT INTOINTO

keyspace namekeyspace name ..

table nametable name

(( column namecolumn name

,,

)) VALUESVALUES (( termterm

,,

))

IFIF NOTNOT EXISTSEXISTS USINGUSING update parametersupdate parameters

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 40

Page 41: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Inser onsExample

INSERT INTO movies (id, title, director, year, actors, genres)VALUES (

'stesti','Štěstí',('Bohdan', 'Sláma'),2005,{ 'vilhelmova': 'Monika', 'liska': 'Toník' },[ 'comedy', 'drama' ]

)USING TTL 86400

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 41

Page 42: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Inser ons and UpdatesUpdate parameters

• TTL: me-to-live0 or null or simply missing for persistent values

• TIMESTAMP: write me

TIMESTAMPTIMESTAMP

TTLTTL

integerinteger

ANDAND

• Only newly inserted / updated values are really affected

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 42

Page 43: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

UpdatesUPDATE statement

• Updates exis ng rows within a given tableWhen a row with a given primary key does not yet exist,it is inserted

• All primary key columns must be specified in theWHERE clause

UPDATEUPDATE

keyspace namekeyspace name ..

table nametable name

USINGUSING update parametersupdate parameters

SETSET assignmentassignment

,,

WHERE clauseWHERE clause

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 43

Page 44: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

UpdatesUPDATE statement: assignments

• Describe modifica ons to be applied• Allowed assignments:

Value of a whole column is replacedValue of a list or map element is replacedValue of an UDT field is replaced

column namecolumn name

column namecolumn name [[ termterm ]]

column namecolumn name .. field namefield name

== termterm

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 44

Page 45: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

UpdatesExamples

UPDATE moviesSET

year = 2006,director = ('Jan', 'Svěrák'),actors = { 'machacek': 'Robert Landa', 'sverak': 'Josef Tkaloun' },genres = [ 'comedy' ],countries = { 'CZ' }

WHERE id = 'vratnelahve'

UPDATE moviesSET

actors = actors + { 'vilhelmova': 'Helenka' },genres = [ 'drama' ] + genres,countries = countries + { 'SK' }

WHERE id = 'vratnelahve'

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 45

Page 46: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

UpdatesExamples

UPDATE moviesSET

actors = actors - { 'vilhelmova', 'landovsky' },genres = genres - [ 'drama', 'sci-fi' ],countries = countries - { 'SK' }

WHERE id = 'vratnelahve'

UPDATE moviesSET

actors['vilhelmova'] = 'Helenka',genres[1] = 'comedy'

WHERE id = 'vratnelahve'

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 46

Page 47: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Dele onsDELETE statement

• Removes exis ng rows / columns / collec on elementsfrom a given table

DELETEDELETE

column namecolumn name

column namecolumn name [[ termterm ]]

column namecolumn name .. field namefield name

,,

FROM clauseFROM clause WHERE clauseWHERE clause

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 47

Page 48: Version 4svoboda/courses/2016... · DataModel Columnvalues • Emptyvalue null • Atomicvalue Navedatatypes suchastexts,integers,dates,… Tuples – Tupleofanonymousfields,eachofanytype(evendifferent)

Lecture ConclusionCassandra

• Column-family storeCassandra query language

• DDL statements• DML statements

SELECT, INSERT, UPDATE, DELETE

B4M36DS2: Database Systems 2 | Lecture 10: Column-Family Stores: Cassandra | 5. 12. 2016 48