svccg nosql 2011_sri-cassandra
DESCRIPTION
Silicon Valley Cloud Computing Group, Netflix, Cassandra talk
TRANSCRIPT
HowStuffWorks version
Cassandra
SriSatish Ambati
engineer, DataStax
@srisatish
Bigtable, 2006
Dynamo, 2007
OSS, 2008
Incubator, 2009
TLP, 2010
Digital Reasoning: NLP + entity analytics
OpenWave: enterprise messaging
OpenX: largest publisher-side ad network in the world
Cloudkick: performance data & aggregation
SimpleGEO: location-as-API
Ooyala: video analytics and business intelligence
ngmoco: massively multiplayer game worlds
Cassandra in production
• Furiously fast writes
• Append-only writes
• Sequential disk access
• No locks in critical path
• Key-based atomicity
[Diagram: write path. Client issues write -> partitioner -> find node -> replicas n1, n2, n3; on each replica: commit log -> apply to memory]
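The write path in the diagram can be sketched roughly as follows. This is a toy model, not Cassandra's code: the Node class, the MD5-based token_for partitioner, the evenly spaced tokens, and the replica walk are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


class Node:
    """Hypothetical storage node: an append-only commit log plus an in-memory table."""

    def __init__(self, name):
        self.name = name
        self.commit_log = []  # append-only, so disk access stays sequential
        self.memtable = {}    # in-memory view of recent writes

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the commit log
        self.memtable[key] = value            # 2. apply to memory


def token_for(key):
    """Partitioner: hash the key to a position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, ring, n):
    """Find the first node at or after the key's token, then walk the ring for n replicas."""
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, token_for(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(n)]


def client_write(key, value, ring, n=3):
    """Client issues a write: locate the replicas, then write to each one."""
    nodes = replicas_for(key, ring, n)
    for node in nodes:
        node.write(key, value)
    return [node.name for node in nodes]


# Three nodes at evenly spaced tokens on a 2**128 ring, sorted by token.
ring = [(i * (2 ** 128 // 3), Node("n%d" % (i + 1))) for i in range(3)]
written = client_write("user:42", "hello", ring)
```

No lock is taken anywhere on this path; each write is an append followed by a dictionary update, which is the point the "furiously fast writes" slide makes.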
Tuneable reads
Read Internals
@r39132 - #netflixcloud
• A feather in the CAP
• Eventual Consistency
• R + W > N
  o N is RF
  o T is total nodes
• ex: rdbms with backup
• R=1, W=2, N=2, T=2
Read Performance
• R=1, 100s of nodes
• R=1, W=N (consistency)
Write Performance
• W=1, R=N
• Quorum (fast writes!)
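The R + W > N rule can be checked with a few lines of arithmetic. A minimal sketch; the function names are made up for illustration.

```python
def is_strongly_consistent(r, w, n):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


# The slide's "rdbms with backup" example: R=1, W=2, N=2.
rdbms_with_backup = is_strongly_consistent(1, 2, 2)   # True

# R=1, W=1 with N=3 is fast but only eventually consistent.
fast_but_eventual = is_strongly_consistent(1, 1, 3)   # False

# Quorum reads plus quorum writes always overlap.
quorum_both = is_strongly_consistent(quorum(3), quorum(3), 3)  # True
```

With N=3, a quorum is 2, so a quorum write still succeeds with one replica down, and 2 + 2 > 3 keeps reads consistent.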
Client Marshal Arts
Roll your own, C
Thrift
pycassa, phpcassa
Ruby, Scala
Ready made, Java: Hector, Pelops
Common Patterns of Doom:
• Death by a million gets
• Turn off Nagle
• Manage your connections
Adding Nodes
• New nodes add themselves to the busiest node, then split its range
• Busy node starts transmitting to the new node
• Bootstrap logic can be initiated from any node, CLI, or web
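The range split can be sketched like this. It is a toy model under stated assumptions: a 2**127 token space, a ring held as a sorted list of (token, name) pairs, and a load map that says which node is busiest. None of these names come from Cassandra itself.

```python
RING_SIZE = 2 ** 127  # assumed token space


def bootstrap_token(ring, load):
    """Pick the token that halves the busiest node's range.

    ring: sorted list of (token, name); load: name -> amount of data held.
    A node owns the range (predecessor token, own token].
    """
    busiest = max(load, key=load.get)
    idx = [name for _, name in ring].index(busiest)
    hi = ring[idx][0]
    lo = ring[idx - 1][0]          # predecessor; index -1 wraps around the ring
    span = (hi - lo) % RING_SIZE   # width of the range, allowing for wrap-around
    if span == 0:                  # single-node ring owns everything
        span = RING_SIZE
    return (lo + span // 2) % RING_SIZE


def add_node(ring, load, new_name):
    """Insert the new node at the midpoint of the busiest node's range."""
    token = bootstrap_token(ring, load)
    return sorted(ring + [(token, new_name)]), token


ring = [(0, "a"), (40, "b"), (100, "c")]
load = {"a": 10, "b": 90, "c": 20}
new_ring, new_token = add_node(ring, load, "d")
# "b" owned (0, 40]; "d" now owns (0, 20] and "b" keeps (20, 40].
```

After the split, the busy node streams the keys in the lower half of its old range to the new node.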
Cassandra on EC2 cloud
*Corey Hulen, EC2
Inter-node comm: Gossip Protocol
It’s exponential (epidemic algorithm)
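Why "exponential": if every node that knows a piece of state tells one random peer per round, the informed set can roughly double each round. A small simulation, assuming simple push-style gossip (the real protocol exchanges digests and is more involved):

```python
import random


def gossip_rounds(num_nodes, seed=0):
    """Rounds until all nodes are informed, when each informed node
    pushes the rumor to one randomly chosen node per round."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts the rumor
    rounds = 0
    while len(informed) < num_nodes:
        # Each node informed at the start of the round contacts one random node.
        for _ in range(len(informed)):
            informed.add(rng.randrange(num_nodes))
        rounds += 1
    return rounds


rounds_small = gossip_rounds(16)
rounds_large = gossip_rounds(1024)
```

Because the informed set at most doubles per round, 1024 nodes need at least 10 rounds; in practice the count stays close to that, growing with the logarithm of the cluster size rather than with the size itself.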
Failure Detector: accrual rate phi
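An accrual detector outputs a suspicion level phi instead of a yes/no verdict. A minimal sketch, assuming heartbeat gaps follow an exponential distribution so that phi = elapsed / (mean gap * ln 10); the class and method names are hypothetical.

```python
import math


class PhiAccrualDetector:
    """Toy accrual failure detector for one monitored peer."""

    def __init__(self):
        self.intervals = []         # observed gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """-log10 of the probability that a heartbeat is merely late,
        under the exponential assumption stated above."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):   # steady one-second heartbeats
    detector.heartbeat(t)

phi_on_time = detector.phi(3.0)
phi_silent = detector.phi(13.0)  # ten seconds of silence
```

The caller picks a threshold: a higher phi threshold means fewer false suspicions but slower detection.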
Anti-Entropy: bringing replicas up to date
UDP for control messages, TCP for request routing
Compactions
Input data files (each sorted):
  K1 <serialized data>, K2 <serialized data>, K3 <serialized data>, ...
  K2 <serialized data>, K10 <serialized data>, K30 <serialized data>, ...
  K4 <serialized data>, K5 <serialized data>, K10 <serialized data>, ...
MERGE SORT
Data File (sorted):
  K1 <serialized data>, K2 <serialized data>, K3 <serialized data>, K4 <serialized data>, K5 <serialized data>, K10 <serialized data>, K30 <serialized data>
Index File (loaded in memory):
  K1 offset, K5 offset, K30 offset
  Bloom filter
DELETED (input files, after the merge)
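The merge step can be sketched with a k-way merge over already-sorted inputs. This is an illustration only: the (key, timestamp, value) tuple layout and the newest-timestamp-wins rule are assumptions, and the keys mirror the K1..K30 example.

```python
import heapq


def compact(sstables):
    """Merge sorted runs of (key, timestamp, value) into one sorted run,
    keeping only the newest version of each key."""
    merged = []
    # heapq.merge streams the inputs in (key, timestamp) order without
    # loading everything into memory at once.
    for key, ts, value in heapq.merge(*sstables):
        if merged and merged[-1][0] == key:
            if ts >= merged[-1][1]:
                merged[-1] = (key, ts, value)  # newer version replaces older
        else:
            merged.append((key, ts, value))
    return merged


file_a = [(1, 1, "a1"), (2, 1, "a2"), (3, 1, "a3")]
file_b = [(2, 2, "b2"), (10, 2, "b10"), (30, 2, "b30")]
file_c = [(4, 3, "c4"), (5, 3, "c5"), (10, 3, "c10")]

data_file = compact([file_a, file_b, file_c])
keys = [key for key, _, _ in data_file]   # [1, 2, 3, 4, 5, 10, 30]
```

Keys 2 and 10 appear in two inputs each; only the later write survives in the output, which is why the inputs can be deleted afterwards.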
Availability in Action
[Ring diagram: nodes A, L, T, W, F, P, Y, U; key “C”]
Availability in Action
[Ring diagram: nodes A, L, T, W, F, P, Y, U; key “C”; one node marked X (down), with a hint stored for it]
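The "hint" in the second diagram can be sketched as follows: a write still succeeds while a replica is down, and a hint is kept so the missed write can be replayed later. A toy model; the Replica class, the function names, and the choice of F, L, P as the replicas for key "C" are assumptions.

```python
class Replica:
    """Hypothetical replica that can be up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target name, key, value) kept for peers that were down


def write_with_hints(key, value, replicas, coordinator):
    """Write to every live replica; keep a hint for each one that is down."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
            acks += 1
        else:
            coordinator.hints.append((replica.name, key, value))
    return acks


def replay_hints(coordinator, nodes_by_name):
    """Deliver stored hints to targets that have come back; keep the rest."""
    remaining = []
    for target, key, value in coordinator.hints:
        node = nodes_by_name[target]
        if node.up:
            node.data[key] = value
        else:
            remaining.append((target, key, value))
    coordinator.hints = remaining


f, l, p = Replica("F"), Replica("L"), Replica("P")
l.up = False                                   # the node marked X
acks = write_with_hints("C", "v1", [f, l, p], coordinator=f)

l.up = True                                    # node recovers
replay_hints(f, {"F": f, "L": l, "P": p})
```

Here the write collects two acknowledgements while L is down, and L receives the value once the hint is replayed.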
JMX
OpsCenter