On Cassandra Development: Past, Present and Future

Sylvain Lebresne

©2012 DataStax

About:me

•Sylvain Lebresne

•@pcmanus

•[email protected]

•Be sure to check out DataStax Enterprise 2.0!

Apache Cassandra

•Apache Top-Level project since February 2010

•∼200 contributors (15 official committers)

Releases

•Current tentative release cycle: 4 months (including 1 month freeze)

•Minor releases “when appropriate”

Version   Release date
0.5       24 Jan 2010
0.6       13 Apr 2010
0.7       9 Jan 2011
0.8       2 Jun 2011
1.0       18 Oct 2011
1.1       Soon

Past: Cassandra 1.0

Cassandra 1.0

•Released October 18, 2011

•Current stable version

•Last minor revision: 1.0.8 (February 27, 2012)

Cassandra 1.0 features

•SSTables Compression (Snappy, Deflate)

•SSTables Checksumming

•Leveled Compaction

•Improved memory management:

 • Simpler memtable_total_space_in_mb and commitlog_total_space_in_mb settings

•Arena allocations for memtables

•Off-heap row cache by default

•Read optimisations

Cassandra 1.0 features cont’d

•More reliable hinted handoffs

•Faster disk space reclamation

•Single-pass streaming

•Repair improvements (nodetool repair -pr)

•Multi-threaded compactions

•...

Present: Cassandra 1.1

Cassandra 1.1

•Beta 2 released yesterday (March 27, 2012)

•Final release slated for ... soon

Global caches

•Prior to 1.1, one key cache and one row cache per column family

•Now: one global key cache and one global row cache

•Motivations:

• Simpler configuration ({key,row}_cache_size_in_mb)

• Better use of the LRU list

•Per-CF option reduced to:

 • caching=ALL|KEYS_ONLY|ROWS_ONLY|NONE
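
The "better use of the LRU list" motivation can be illustrated with a toy simulation. This is only a sketch, not Cassandra's cache implementation: `LRUCache`, `run_per_cf` and `run_global` are hypothetical names, and capacity is counted in entries rather than megabytes. It compares a fixed budget split evenly across column families with the same budget shared by one global cache.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache bounded by entry count (stand-in for a size-in-MB bound)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return
        self.entries[key] = True               # miss: load the entry
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used

def run_per_cf(workload, capacity_per_cf):
    """One cache per column family, each with its own fixed capacity."""
    caches = {}
    for cf, key in workload:
        caches.setdefault(cf, LRUCache(capacity_per_cf)).access(key)
    return sum(cache.hits for cache in caches.values())

def run_global(workload, total_capacity):
    """A single cache shared by all column families."""
    cache = LRUCache(total_capacity)
    for cf, key in workload:
        cache.access((cf, key))
    return cache.hits

# A "hot" CF cycling over 6 keys and a "cold" CF touching a single key twice.
workload = [("hot", k) for _ in range(10) for k in range(6)] + [("cold", 0)] * 2
per_cf_hits = run_per_cf(workload, capacity_per_cf=4)  # 8 entries, split 4 + 4
global_hits = run_global(workload, total_capacity=8)   # 8 entries, shared
print(per_cf_hits, global_hits)
```

With the split budget the hot column family thrashes its small cache while the cold one leaves its share mostly unused; the shared cache lets the hot keys occupy the space the cold column family does not need.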

Row-level isolation

•Batched writes have been atomic (within a row) since day 1

•Batched writes within a row are now isolated

•When doing

    UPDATE Users
    SET login = 'tom', password = '1234'
    WHERE id = '550e8400-e29b-41d4-a716-446655440'

followed by

    UPDATE Users
    SET login = 't0m', password = 'abcd'
    WHERE id = '550e8400-e29b-41d4-a716-446655440'

⇒ guarantees that no reader can see (tom, abcd) or (t0m, 1234)
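
One way to obtain this kind of guarantee is copy-on-write: a writer builds a new version of the row and publishes it with a single reference swap, so a reader sees either the old row or the new one. The sketch below is an illustration of that idea under stated assumptions, not a description of Cassandra's internals; `IsolatedRow` and its methods are hypothetical names, and it relies on reference assignment being atomic in CPython.

```python
import threading

class IsolatedRow:
    """Toy row whose multi-column updates become visible all at once."""
    def __init__(self, columns):
        self._columns = dict(columns)        # current snapshot, never mutated in place
        self._write_lock = threading.Lock()  # serialises writers only

    def batch_update(self, updates):
        with self._write_lock:
            new_columns = dict(self._columns)  # copy the current snapshot
            new_columns.update(updates)        # apply every column of the batch
            self._columns = new_columns        # one reference swap publishes the batch

    def read(self):
        snapshot = self._columns               # grab one consistent snapshot
        return snapshot["login"], snapshot["password"]

row = IsolatedRow({"login": "tom", "password": "1234"})
seen = set()

def writer():
    for i in range(2000):
        if i % 2 == 0:
            row.batch_update({"login": "t0m", "password": "abcd"})
        else:
            row.batch_update({"login": "tom", "password": "1234"})

def reader():
    for _ in range(2000):
        seen.add(row.read())

threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If `batch_update` instead mutated the shared dictionary one column at a time, a reader could observe the mixed pairs that isolation rules out.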

CQL 3.0

•Motivation:

• Better wide row syntax

•Native syntax for composite types

•Not backward compatible with CQL 2.0

•Beta only for now; final version slated for Cassandra 1.2

    CREATE TABLE timeline (
        user_id uuid,
        posted_at timestamp,
        posted_by uuid,
        content text,
        device int,
        PRIMARY KEY (user_id, posted_at)
    )

    SELECT * FROM timeline
    WHERE user_id = <some_user>
    AND posted_at > <some_date>
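
To see why this syntax suits wide rows, it may help to model how such a table can be laid out: the first primary key component (user_id) selects one storage row, and each remaining column is stored under a composite name that starts with posted_at. The sketch below is a simplified model under that assumption; `TimelineStore` and its methods are hypothetical names, and plain integers stand in for timestamps.

```python
class TimelineStore:
    """Toy model: one wide row per user_id, cells keyed by (posted_at, column name)."""
    def __init__(self):
        self.rows = {}  # user_id -> {(posted_at, column_name): value}

    def insert(self, user_id, posted_at, **columns):
        cells = self.rows.setdefault(user_id, {})
        for name, value in columns.items():
            cells[(posted_at, name)] = value   # composite cell name

    def select_after(self, user_id, some_date):
        """Rough equivalent of: WHERE user_id = ... AND posted_at > ..."""
        cells = self.rows.get(user_id, {})
        result = {}
        # Walk the cells in composite-name order and regroup them into CQL-style rows.
        for (posted_at, name), value in sorted(cells.items(), key=lambda kv: kv[0]):
            if posted_at > some_date:
                row = result.setdefault(
                    posted_at, {"user_id": user_id, "posted_at": posted_at}
                )
                row[name] = value
        return list(result.values())

store = TimelineStore()
store.insert("u1", 10, posted_by="u2", content="a", device=1)
store.insert("u1", 20, posted_by="u3", content="b", device=2)
store.insert("u9", 30, posted_by="u2", content="c", device=1)
rows = store.select_after("u1", 10)
```

Because all of a user's posts share one storage row sorted by posted_at, the range condition on posted_at becomes a slice of that row rather than a scan over many rows.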

Fine-grained storage control

•Old:

• /var/lib/cassandra/ks/cf-hc-1-Data.db

•New:

• /var/lib/cassandra/ks/cf/ks-cf-hc-1-Data.db

•Allows putting different CFs on different devices
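
The two layouts can be written as small path-building helpers. This is a sketch that only reproduces the example paths shown above; `old_data_path` and `new_data_path` are hypothetical helpers, not Cassandra functions, and the "hc" version tag and generation number are taken from the slide's example.

```python
import posixpath

def old_data_path(data_dir, ks, cf, generation, version="hc"):
    """Pre-1.1 layout: every column family's files sit in the keyspace directory."""
    return posixpath.join(data_dir, ks, f"{cf}-{version}-{generation}-Data.db")

def new_data_path(data_dir, ks, cf, generation, version="hc"):
    """1.1 layout: one directory per column family, keyspace name in the file name."""
    return posixpath.join(data_dir, ks, cf, f"{ks}-{cf}-{version}-{generation}-Data.db")

print(old_data_path("/var/lib/cassandra", "ks", "cf", 1))
print(new_data_path("/var/lib/cassandra", "ks", "cf", 1))
```

Since each column family now has its own directory, that directory can be a mount point or a symbolic link to another device, for example a faster disk for a frequently read column family.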

Concurrent Schema changes

•Fixes http://wiki.apache.org/cassandra/FAQ#schema_disagreement

•Reuses the Cassandra data model to store the schema (simpler integration of ‘describe schema’ for CQL)

•Speeds up adding new nodes (the schema size is now proportional to the number of keyspaces and column families, no longer to the number of schema operations)

Hadoop improvements

•Secondary index support

•Wide row support

•New (faster) BulkOutputFormat (compatible with old ColumnFamilyOutputFormat)

 • job.setOutputFormatClass(BulkOutputFormat.class)

Other features

•Off-heap cache on Windows (no more JNA)

•Write survey mode

•Commit log segment pre-allocation/recycling

•Abortable compactions

•Multi-threaded streaming

•...

What might come next (subjective!)

•Wide rows speed improvements (#2319)

•Smarter compaction of expired tombstones (#3442)

•CQL3 improvements/finalization (custom protocol)

•Row cache for wide rows (#1956)

•Remove super columns internally (#3237)

•Query tracing (#1123)

•Big cluster improvements

•...

Questions?
