cassandra 1.1

Post on 15-Jan-2015

©2012 DataStax

Apache Cassandra 1.1

Jonathan Ellis / @spyced

New features in 1.1

• CQL3
• Global row + key caches
• Fine-grained data storage control
• Row level isolation
• Concurrent schema changes
• Off-heap cache works on Windows
• "Write survey mode"
• Hadoop improvements
• Stress tool

Modern Cassandra, briefly

• 0.7
    • CREATE COLUMN FAMILY
    • TTL
    • Secondary (column) indexes
• 0.8
    • Counters
    • Automatic memtable tuning
• 1.0
    • Compression
    • Leveled compaction

Global row + key caches

• cassandra.yaml
    • key_cache_size_in_mb (default 2)
    • row_cache_size_in_mb (default 0)
    • Also save periods
• Per-CF: caching=ALL|KEYS_ONLY*|ROWS_ONLY|NONE
• Old CF-level options are ignored
    • row_cache_size, key_cache_size
    • save periods

Data storage

• Old:
    • /var/lib/cassandra/data/Keyspace1/Standard1-hc-1-Data.db
• New:
    • /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-1-Data.db
• (Includes KS in filename for easier bulk loading)
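
To make the directory change concrete, here is a minimal sketch that builds both paths from their parts. The class and method names (SSTablePaths, oldLayout, newLayout) are hypothetical helpers for illustration only and are not part of Cassandra; the path pattern is inferred from the two example paths on this slide.

```java
public class SSTablePaths {
    // Pre-1.1 layout inferred from the slide:
    // <data_dir>/<keyspace>/<cf>-<version>-<generation>-Data.db
    public static String oldLayout(String dataDir, String keyspace, String cf,
                                   String version, int generation) {
        return dataDir + "/" + keyspace + "/"
                + cf + "-" + version + "-" + generation + "-Data.db";
    }

    // 1.1 layout inferred from the slide: one directory per column family,
    // and the keyspace name is repeated in the file name:
    // <data_dir>/<keyspace>/<cf>/<keyspace>-<cf>-<version>-<generation>-Data.db
    public static String newLayout(String dataDir, String keyspace, String cf,
                                   String version, int generation) {
        String fileName = keyspace + "-" + cf + "-" + version + "-" + generation + "-Data.db";
        return dataDir + "/" + keyspace + "/" + cf + "/" + fileName;
    }

    public static void main(String[] args) {
        String dataDir = "/var/lib/cassandra/data";
        System.out.println(oldLayout(dataDir, "Keyspace1", "Standard1", "hc", 1));
        System.out.println(newLayout(dataDir, "Keyspace1", "Standard1", "hc", 1));
    }
}
```

Because the new file name carries both keyspace and column family, a data file can be identified without looking at its parent directories, which is what the bulk-loading remark refers to.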

Row-level isolation

• Never see partial updates to a row
• We now have AID from ACID
    • C in ACID != C in CAP

Concurrent schema changes

• Fixes http://wiki.apache.org/cassandra/FAQ#schema_disagreement
• Can still have temporary disagreements if you use a new CF before all nodes have it
• Also speeds up adding new nodes

Off-heap cache on Windows

• SerializingCacheProvider (SCP) no longer requires JNA
• SCP has been the default since 1.0, but versions before 1.1 fall back to ConcurrentLinkedHashCacheProvider (CLHCP) if JNA is not present

Write survey mode

• bin/cassandra -Dcassandra.write_survey=true
• Allows experimenting w/ compaction, compression, new versions*
    • isolate node to test reads

Abortable compactions

• nodetool stop <type>

CQL3

• (CQL2 is still default)
• Composite PK support
    • .. slice syntax removed
• ORDER BY syntax conforms to SQL

A simple example

CREATE TABLE tweets (
    tweet_id uuid PRIMARY KEY,
    author varchar,
    body varchar
);

Tweets

tweet_id | author | body
1790 | gwashington | To be prepared for war is one of the most effectual means of preserving peace
1787 | jmadison | All men having power ought to be distrusted to a certain degree
1778 | gmason | Those gentlemen, who will be elected senators, will fix themselves in the federal town, and become citizens of that town more than of your state

With clustering

CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id)
);

In the PRIMARY KEY, user_id is the partition key and tweet_id is clustered.

Timeline

user_id | tweet_id | author | body
jadams | 1787 | jmadison | All men ...
jadams | 1790 | gwashington | To be prepared ...
ahamilton | 1778 | gmason | Those gentlemen ...
ahamilton | 1790 | gwashington | To be prepared ...

user_id: not clustered
tweet_id: clustered (within partition key)

Timeline, physical layout

jadams:
    (1787, author): jmadison
    (1787, body): All men ...
    (1790, author): gwashington
    (1790, body): To be prepared ...
ahamilton:
    (1778, author): gmason
    (1778, body): Those gentlemen ...
    (1790, author): gwashington
    (1790, body): To be prepared ...

Non-PK columns contain string literal of column name
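
The layout above can be imitated with ordinary sorted maps. The following is only a rough sketch of the idea, not Cassandra's storage engine: SparseLayoutSketch and CellName are hypothetical names, and tweet_id is modeled as an int (the slide's 1787, 1790, ...) rather than a uuid to keep the example short. Each partition key maps to one sorted map whose keys are composite names of the form (tweet_id, column name).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class SparseLayoutSketch {
    /** Composite cell name: (clustering value, literal CQL column name). */
    public static final class CellName implements Comparable<CellName> {
        final int tweetId;
        final String column;

        CellName(int tweetId, String column) {
            this.tweetId = tweetId;
            this.column = column;
        }

        @Override
        public int compareTo(CellName other) {
            // Sort by clustering value first, then by column name.
            int byTweet = Integer.compare(tweetId, other.tweetId);
            return byTweet != 0 ? byTweet : column.compareTo(other.column);
        }

        @Override
        public String toString() {
            return "(" + tweetId + ", " + column + ")";
        }
    }

    // partition key (user_id) -> cells sorted by composite name
    public final Map<String, TreeMap<CellName, String>> partitions = new LinkedHashMap<>();

    /** One CQL row becomes one cell per non-PK column. */
    public void insert(String userId, int tweetId, String author, String body) {
        TreeMap<CellName, String> cells = partitions.computeIfAbsent(userId, k -> new TreeMap<>());
        cells.put(new CellName(tweetId, "author"), author);
        cells.put(new CellName(tweetId, "body"), body);
    }

    public static void main(String[] args) {
        SparseLayoutSketch timeline = new SparseLayoutSketch();
        timeline.insert("jadams", 1790, "gwashington", "To be prepared ...");
        timeline.insert("jadams", 1787, "jmadison", "All men ...");
        timeline.insert("ahamilton", 1778, "gmason", "Those gentlemen ...");
        timeline.insert("ahamilton", 1790, "gwashington", "To be prepared ...");

        timeline.partitions.forEach((user, cells) -> {
            System.out.println(user);
            cells.forEach((name, value) -> System.out.println("    " + name + ": " + value));
        });
    }
}
```

Inserting the rows out of order still yields cells sorted by tweet_id inside each partition, which is the sense in which tweet_id is clustered within the partition key.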

WITH COMPACT STORAGE

CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id, author)
) WITH COMPACT STORAGE;

• For backwards compatibility
• The PRIMARY KEY covers all but one column

jadams:
    (1787, jmadison): All men ...
    (1790, gwashington): To be prepared ...
ahamilton:
    (1778, gmason): Those gentlemen ...
    (1790, gwashington): To be prepared ...

no “body” literal
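
As a rough illustration of the compact layout, the sketch below stores one cell per CQL row: the cell name is built from the clustered primary-key columns (tweet_id, author) and the cell value is the single remaining column, body. This is a toy model under the same simplifications as the slide (int tweet ids); CompactLayoutSketch and CellName are hypothetical names, not Cassandra classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class CompactLayoutSketch {
    /** Composite cell name made only of primary-key values: (tweet_id, author). */
    public static final class CellName implements Comparable<CellName> {
        final int tweetId;
        final String author;

        CellName(int tweetId, String author) {
            this.tweetId = tweetId;
            this.author = author;
        }

        @Override
        public int compareTo(CellName other) {
            int byTweet = Integer.compare(tweetId, other.tweetId);
            return byTweet != 0 ? byTweet : author.compareTo(other.author);
        }

        @Override
        public String toString() {
            return "(" + tweetId + ", " + author + ")";
        }
    }

    // partition key (user_id) -> cells sorted by composite name
    public final Map<String, TreeMap<CellName, String>> partitions = new LinkedHashMap<>();

    /** One CQL row becomes exactly one cell; no column-name literal is stored. */
    public void insert(String userId, int tweetId, String author, String body) {
        partitions.computeIfAbsent(userId, k -> new TreeMap<>())
                  .put(new CellName(tweetId, author), body);
    }

    public static void main(String[] args) {
        CompactLayoutSketch timeline = new CompactLayoutSketch();
        timeline.insert("jadams", 1790, "gwashington", "To be prepared ...");
        timeline.insert("jadams", 1787, "jmadison", "All men ...");

        timeline.partitions.forEach((user, cells) -> {
            System.out.println(user);
            cells.forEach((name, value) -> System.out.println("    " + name + ": " + value));
        });
    }
}
```

Since only one non-key column remains, its name never needs to appear in the cell name, which is why the layout has no "body" literal.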

Earlier changes

• (1.0.6) Allow CF names to be qualified by keyspace for INSERT, ALTER, DELETE, TRUNCATE
    • INSERT INTO ks.cf (...) VALUES (...)
    • (SELECT was done in 1.0.1)
• (1.0.4) ALTER CF attributes

cqlsh

• SOURCE and CAPTURE commands
• (1.0.8) DESCRIBE COLUMNFAMILIES

The future is CQL (based)

• cqlsh
• performance
    • prepared statements
    • netty-based transport (CASSANDRA-2478)
• What does this mean for pycassa, Hector, et al?

Hadoop Integration

• Secondary index (2I) support*
• Wide row support*
• BulkOutputFormat
• (*Covered in updated WordCount)

Secondary Index support

IndexExpression expr = new IndexExpression(
    ByteBufferUtil.bytes("int4"),
    IndexOperator.EQ,
    ByteBufferUtil.bytes(0));

ConfigHelper.setInputRange(
    job.getConfiguration(),

Wide row support

ConfigHelper.setInputColumnFamily(
    job.getConfiguration(),
    KEYSPACE,
    COLUMN_FAMILY,
    true);

Also: PIG_WIDEROW_INPUT

BulkOutputFormat

job.setOutputFormatClass(BulkOutputFormat.class);

• Compatible w/ CFOF + extra options
    • OUTPUT_LOCATION
    • BUFFER_SIZE_IN_MB
    • STREAM_THROTTLE_MBITS
    • (system default, 64, unlimited)
• Limitation: can’t stream to dead nodes (fix in 1.1.1?)

Stress tool

• tools/bin/stress*
• Insert, read, seq scan, indexed scan, multiget, counter add/get
• CQL

Bonus: What’s new in C* 1.1.1

• Incremental repair by token range
• Support for commitlog archiving and PITR
• Identify and blacklist corrupted SSTables from future compactions
• Open 1 sstableScanner per level for leveled compaction
• More CQL3 improvements (e.g. reversed clustering)
• Fix re-creating Keyspaces/ColumnFamilies with the same name as dropped ones

DataStax Community, with OpsCenter
