cassandra prophecy
Introduction to the Apache Cassandra distributed database
Cassandra Prophecy
E-mail: [email protected] Khotin
Background
● 11+ years in the IT industry
● 6+ years with Java
● Flexible design promoter
● Agile junkie
highly scalable, eventually consistent, distributed, structured key-value store
Decentralized
● P2P
● No SPOF
● No network bottlenecks
Fault Tolerant
● High availability
● Replication and redundancy
● Node replacement with no downtime
● Multiple racks and datacenters
Elastic Scalability
● Scales up and down
● Just add or remove nodes
● Linear scalability
● Low maintenance cost
Tunable consistency
● Different consistency levels
● Consistency vs. latency
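A rough sketch of the consistency vs. latency trade-off, assuming the usual quorum rule from the Dynamo paper: with N replicas, a read that waits for R of them is guaranteed to see a write acknowledged by W of them when R + W > N. The function names are made up for this illustration and are not part of any Cassandra client.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority quorum for the given number of replicas."""
    return replication_factor // 2 + 1


def overlaps(n: int, r: int, w: int) -> bool:
    """True when every read set of size r intersects every write set of size w."""
    return r + w > n


N = 3  # replication factor assumed for this example
print("quorum size:", quorum(N))
print("ONE / ONE overlap:", overlaps(N, r=1, w=1))
print("QUORUM / QUORUM overlap:", overlaps(N, r=quorum(N), w=quorum(N)))
print("ONE read / ALL write overlap:", overlaps(N, r=1, w=N))
```

Lower R and W mean fewer replicas to wait for, so lower latency, at the price of possibly stale reads.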
Rich Data Model
● Goes beyond simple key-value
● Values can be indexed
● Flexible schema
Scale up problem
Sharding doesn't solve it
Google File System & Google BigTable
Amazon Dynamo
Cassandra by Avinash Lakshman and Prashant Malik
Cassandra used in Inbox Search
Open sourced in July 2008
March 2009: Accepted to Apache Incubator
February 2010: Top-Level Apache Project
Late 2010... Cassandra abandoned
Messaging moved to HBase
October 2011: Release 1.0
November 30, 2011: Release 1.0.5
(current stable)
Moving forward fast...
Brewer's CAP Theorem
Data Model
Column Family
Column sorting
● ASCII
● UTF8
● Bytes
● Long
● LexicalUUID
● TimeUUID
● Custom
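Why the comparator matters: columns inside a row are kept sorted by column name, so the chosen type decides the order that slices come back in. A small sketch using plain Python sorting as a stand-in for the UTF8 and Long comparators (the column names are invented for the example):

```python
# Column names as they might arrive from a client, in insertion order.
names = ["10", "9", "100"]

# Text comparison (what a UTF8/ASCII style comparator would do).
as_text = sorted(names)

# Numeric comparison (what a Long style comparator would do).
as_long = sorted(names, key=int)

print("text order:", as_text)
print("long order:", as_long)
```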
Design decision
Denormalization
Design for queries
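One way to picture "design for queries": keep one column family per query and write the same data into each, instead of joining at read time. The sketch below uses plain dictionaries as stand-ins for column families; `users`, `users_by_city` and `add_user` are hypothetical names for this example only.

```python
from collections import defaultdict

users = {}                         # row key: user id
users_by_city = defaultdict(dict)  # row key: city, column name: user id


def add_user(user_id, name, city):
    """Write the user twice: once per query we want to serve."""
    users[user_id] = {"name": name, "city": city}
    users_by_city[city][user_id] = name  # duplicated on purpose


add_user("u1", "Alice", "Paris")
add_user("u2", "Bob", "Paris")
add_user("u3", "Carol", "Kyiv")

# "Who lives in Paris?" is now a single row lookup, no join needed.
print(users_by_city["Paris"])
```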
Keyring
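A minimal sketch of the ring idea, assuming MD5-based tokens (loosely modeled on random partitioning) and the simplest possible placement: the node owning the first token at or after the key's token, plus the next nodes clockwise. The `Ring` class, `token` function and node names are illustrative, not Cassandra's actual code.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space assumed for this sketch


def token(key: str) -> int:
    """Map a key (or node name) to a position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


class Ring:
    def __init__(self, nodes):
        # Each node owns exactly one token, derived from its name.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key, replication_factor=3):
        """Primary node first, then the following nodes clockwise."""
        start = bisect.bisect_left(self._tokens, token(key)) % len(self._entries)
        count = min(replication_factor, len(self._entries))
        return [self._entries[(start + i) % len(self._entries)][1]
                for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("Paris"))
```

Adding a node only takes over part of one neighbor's range; keys owned by other nodes stay where they are.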
Gossip
Optimized for writes
● No reads
● No seeks
● No b-trees
● Fast
● Row-atomic
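A toy sketch of a log-structured write path of the kind these bullets describe: append to a commit log, update an in-memory table, and flush it as an immutable sorted file when it grows. `TinyStore` and its tiny flush threshold are invented for illustration; the real storage engine is far more involved.

```python
class TinyStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only, written sequentially (no seeks)
        self.memtable = {}     # in-memory rows: row key -> {column: value}
        self.sstables = []     # flushed tables, immutable, newest last
        self.memtable_limit = memtable_limit

    def write(self, row_key, column, value):
        # A write never reads existing data first.
        self.commit_log.append((row_key, column, value))
        self.memtable.setdefault(row_key, {})[column] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Dump rows sorted by key; the flushed table is never modified again.
        rows = sorted(self.memtable.items(), key=lambda kv: kv[0])
        self.sstables.append(rows)
        self.memtable = {}
        # This sketch keeps the whole commit log; a real system would recycle it.


store = TinyStore()
store.write("Paris", "population", "2.2M")
store.write("Kyiv", "population", "2.8M")  # second row triggers a flush
print(store.sstables)
```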
Tunable Consistency
Tombstone
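A small sketch of why deletes are written as markers rather than removing data in place: with timestamped cells and last-write-wins, a tombstone can beat an older value that arrives late from another replica. The helper names are hypothetical.

```python
TOMBSTONE = object()  # marker value meaning "this column was deleted"


def put(row, column, value, ts):
    """Keep the cell with the highest timestamp (last write wins)."""
    current = row.get(column)
    if current is None or ts >= current[1]:
        row[column] = (value, ts)


def delete(row, column, ts):
    """A delete is just a write of the tombstone marker."""
    put(row, column, TOMBSTONE, ts)


def get(row, column):
    cell = row.get(column)
    if cell is None or cell[0] is TOMBSTONE:
        return None
    return cell[0]


row = {}
put(row, "population", "2.2M", ts=1)
delete(row, "population", ts=2)
put(row, "population", "2.2M", ts=1)  # stale value replayed after the delete
print(get(row, "population"))          # the tombstone still hides it
```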
Low Level Clients
● Thrift
  ● IDL and binary communication protocol
  ● Multiple languages support
  ● Really sucks
● Avro
  ● Better than Thrift, but sucks anyway
High Level Clients
● Feature-rich
  ● Connection pool
  ● Load-balancing
  ● Fail-over
● Hector, Pelops... (Java)
● Pycassa... (Python)
● Fauna (Ruby)
● ...
CQL
● SQL for NoSQL
● CREATE KEYSPACE, CREATE COLUMNFAMILY, CREATE INDEX
● USE, SELECT, UPDATE, DELETE...

SELECT population FROM city
WHERE KEY = 'Paris'
USING CONSISTENCY QUORUM
Understand your problem
Find appropriate solution
Don't let default solutions be imposed on you
Hard to choose?
Leaders will emerge
Resources
● http://cassandra.apache.org
● Dynamo: Amazon’s Highly Available Key-value Store
● Cassandra - A Decentralized Structured Storage System
● Bigtable: A Distributed Storage System for Structured Data
Questions?