On Cassandra Development: Past, Present and Future

Upload: pcmanus

Post on 26-Jan-2015

DESCRIPTION

Slides from my talk at Cassandra Europe

TRANSCRIPT

Page 1: On Cassandra Development: Past, Present and Future

On Cassandra Development: Past, Present and Future

Sylvain Lebresne

Page 2: On Cassandra Development: Past, Present and Future

©2012 DataStax

About:me

•Sylvain Lebresne

•@pcmanus

[email protected]

•Be sure to check DataStax Enterprise 2.0!

Page 3: On Cassandra Development: Past, Present and Future

Apache Cassandra

•Apache Top-Level project since February 2010

•∼200 contributors (15 official committers)

Page 4: On Cassandra Development: Past, Present and Future

Releases

•Current tentative release cycle: 4 months (including 1 month freeze)

•Minor releases “when appropriate”

Version   Release date
0.5       24 Jan 2010
0.6       13 Apr 2010
0.7       9 Jan 2011
0.8       2 Jun 2011
1.0       18 Oct 2011
1.1       Soon

Page 5: On Cassandra Development: Past, Present and Future

Past: Cassandra 1.0

Page 6: On Cassandra Development: Past, Present and Future

Cassandra 1.0

•Released October 18, 2011

•Current stable version

•Last minor revision: 1.0.8 (February 27, 2012)

Page 7: On Cassandra Development: Past, Present and Future

Cassandra 1.0 features

•SSTables Compression (Snappy, Deflate)

•SSTables Checksumming

•Leveled Compaction

•Improved memory management:

• Simpler memtable_total_space_in_mb and commitlog_total_space_in_mb settings

•Arena allocations for memtables

•Off-heap row cache by default

•Read optimisations

Page 8: On Cassandra Development: Past, Present and Future

Cassandra 1.0 features cont’d

•More reliable hinted handoffs

•Faster disk space reclamation

•Single-pass streaming

•Repair improvements (nodetool repair -pr)

•Multi-threaded compactions

•...

Page 9: On Cassandra Development: Past, Present and Future

Present: Cassandra 1.1

Page 10: On Cassandra Development: Past, Present and Future

Cassandra 1.1

•Beta 2 released yesterday (March 27, 2012)

•Final release slated for ... soon

Page 11: On Cassandra Development: Past, Present and Future

Global caches

•Prior to 1.1, one key cache and one row cache per column family

•Now: one global key cache and one global row cache

•Motivations:

• Simpler configuration ({key,row}_cache_size_in_mb)

• Better use of the LRU list

•Per-CF option reduced to:

• caching=ALL|KEYS_ONLY|ROWS_ONLY|NONE
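The “better use of the LRU list” motivation can be illustrated with a minimal Python sketch. It is not Cassandra's cache code: it assumes a single LRU list keyed by (column family, key) and a capacity counted in entries rather than megabytes, and the class name GlobalLRUCache and its methods are hypothetical.

```python
from collections import OrderedDict


class GlobalLRUCache:
    """Toy cache shared by all column families (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed to be a number of entries
        self._entries = OrderedDict()     # oldest entry first, newest last

    def get(self, column_family, key):
        cache_key = (column_family, key)
        if cache_key not in self._entries:
            return None
        # A hit makes the entry the most recently used one.
        self._entries.move_to_end(cache_key)
        return self._entries[cache_key]

    def put(self, column_family, key, value):
        cache_key = (column_family, key)
        self._entries[cache_key] = value
        self._entries.move_to_end(cache_key)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry, whatever its column family.
            self._entries.popitem(last=False)


cache = GlobalLRUCache(capacity=2)
cache.put("Users", "a", 1)
cache.put("Timeline", "b", 2)
cache.get("Users", "a")        # "Users" is the hot column family
cache.put("Users", "c", 3)     # evicts the idle ("Timeline", "b") entry
```

With one shared list, a busy column family can take over space that an idle one would otherwise keep locked in its own private cache.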

Page 12: On Cassandra Development: Past, Present and Future

Row level isolation

•Batched writes are atomic (for a row) since day 1

•Batched writes within a row are now isolated

•When doing

UPDATE Users SET login='tom', password='1234' WHERE id='550e8400-e29b-41d4-a716-446655440'

followed by

UPDATE Users SET login='t0m', password='abcd' WHERE id='550e8400-e29b-41d4-a716-446655440'

⇒ guarantees that no reader can see (tom, abcd) or (t0m, 1234)
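The guarantee can be sketched in Python as a copy-on-write row: a batch is applied to a private copy and published with a single reference swap, so a reader gets either the old columns or the new ones, never a mix. This only illustrates the semantics; the Row class below is a hypothetical in-memory toy and makes no claim about how Cassandra implements isolation.

```python
import threading


class Row:
    """Toy row whose columns are always replaced as one snapshot (hypothetical)."""

    def __init__(self, columns):
        self._columns = dict(columns)
        self._write_lock = threading.Lock()   # serialises writers only

    def batch_update(self, changes):
        with self._write_lock:
            updated = dict(self._columns)     # private copy, invisible to readers
            updated.update(changes)           # apply the whole batch to the copy
            self._columns = updated           # publish with one reference swap

    def read(self):
        # The reference is read once, so the result is one complete snapshot.
        return dict(self._columns)


row = Row({"login": "tom", "password": "1234"})
row.batch_update({"login": "t0m", "password": "abcd"})
```

Readers take no lock here: published snapshots are never modified in place, which is what makes the single swap sufficient.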

Page 13: On Cassandra Development: Past, Present and Future

CQL 3.0

•Motivation:

• Better wide row syntax

•Native syntax for composite types

•Not backward compatible with CQL 2.0

•Only beta, final slated for Cassandra 1.2

CREATE TABLE timeline (
    user_id uuid,
    posted_at timestamp,
    posted_by uuid,
    content text,
    device int,
    PRIMARY KEY (user_id, posted_at)
)

SELECT * FROM timeline
WHERE user_id = <some_user>
AND posted_at > <some_date>
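As a rough illustration of how this syntax relates to wide rows, the Python sketch below assumes the timeline table is stored as one wide row per user_id, with posted_at as the first component of each composite column name. The helper names to_cells and slice_after are hypothetical, and plain strings and integers stand in for uuids and timestamps.

```python
def to_cells(cql_row):
    """Flatten one CQL row into (row_key, cells) using composite column names."""
    row_key = cql_row["user_id"]
    posted_at = cql_row["posted_at"]
    cells = {
        (posted_at, name): value              # composite name: (posted_at, column)
        for name, value in cql_row.items()
        if name not in ("user_id", "posted_at")
    }
    return row_key, cells


def slice_after(wide_row, some_date):
    """Return, in column order, the cells whose posted_at is greater than some_date."""
    return {
        name: value
        for name, value in sorted(wide_row.items())
        if name[0] > some_date
    }


# Two posts by the same user land in the same wide row.
storage = {}
for post in (
    {"user_id": "u1", "posted_at": 10, "posted_by": "u2", "content": "hi", "device": 1},
    {"user_id": "u1", "posted_at": 20, "posted_by": "u3", "content": "yo", "device": 2},
):
    key, cells = to_cells(post)
    storage.setdefault(key, {}).update(cells)

# Plays the role of: WHERE user_id = 'u1' AND posted_at > 10
recent = slice_after(storage["u1"], 10)
```

Under this assumption, the posted_at condition becomes a contiguous slice of a single row, which is the access pattern wide rows are good at.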

Page 14: On Cassandra Development: Past, Present and Future

Fine-grained storage control

•Old:

• /var/lib/cassandra/ks/cf-hc-1-Data.db

•New:

• /var/lib/cassandra/ks/cf/ks-cf-hc-1-Data.db

•Allows putting different CFs on different devices
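The two layouts can be compared with a short Python sketch. The function names are hypothetical, and reading “hc” as the SSTable format version and “1” as the generation number is an assumption made for the example.

```python
import posixpath


def old_sstable_path(data_dir, keyspace, cf, version, generation, component="Data.db"):
    """Old layout: all column families of a keyspace share one directory."""
    filename = f"{cf}-{version}-{generation}-{component}"
    return posixpath.join(data_dir, keyspace, filename)


def new_sstable_path(data_dir, keyspace, cf, version, generation, component="Data.db"):
    """New layout: one directory per column family, keyspace repeated in the file name."""
    filename = f"{keyspace}-{cf}-{version}-{generation}-{component}"
    return posixpath.join(data_dir, keyspace, cf, filename)


old_path = old_sstable_path("/var/lib/cassandra", "ks", "cf", "hc", 1)
new_path = new_sstable_path("/var/lib/cassandra", "ks", "cf", "hc", 1)
```

Because each column family now has its own directory, that directory can presumably be placed on a separate device, for example through a mount point or a symbolic link.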

Page 15: On Cassandra Development: Past, Present and Future

Concurrent Schema changes

•Fixes http://wiki.apache.org/cassandra/FAQ#schema_disagreement

•Reuses the Cassandra data model to store the schema (simpler integration of ‘describe schema’ for CQL)

•Speeds up adding new nodes (the schema size is now proportional to the number of keyspaces and column families, not to the number of schema operations)

Page 16: On Cassandra Development: Past, Present and Future

Hadoop improvements

•Secondary index support

•Wide row support

•New (faster) BulkOutputFormat (compatible with old ColumnFamilyOutputFormat)

• job.setOutputFormatClass(BulkOutputFormat.class)

Page 17: On Cassandra Development: Past, Present and Future

Other features

•Off-heap cache on Windows (no more JNA)

•Write survey mode

•Commit log segment pre-allocation/recycling

•Abortable compactions

•Multi-threaded streaming

•...

Page 18: On Cassandra Development: Past, Present and Future

Future

Page 19: On Cassandra Development: Past, Present and Future

What’s maybe next (subjective!)

•Wide rows speed improvements (#2319)

•Smarter compaction of expired tombstones (#3442)

•CQL3 improvements/finalization (custom protocol)

•Row cache for wide rows (#1956)

•Remove super columns internally (#3237)

•Query tracing (#1123)

•Big cluster improvements

•...

Page 20: On Cassandra Development: Past, Present and Future

Questions?
