Outside The Box With Apache Cassandra
DESCRIPTION
Cassandra presentation given at the 3rd annual Palmetto Open Source Software Conference (POSSCON 2010).
TRANSCRIPT
Outside The Box With Apache Cassandra
Eric Evans
[email protected]
@jericevans
Palmetto Open Source Software Conference
April 16, 2010
Cassandra is...
A massively scalable, decentralized, structured data store (aka database).
Outline
1 Background
2 Project History
3 Description
4 Case Studies
5 Roadmap
The Digital Universe
Consolidation
Old Guard
Vertical Scaling Sucks
CAP Theorem (aka Brewer’s Theorem)
Distributed systems cannot provide all three of:
• Consistency
• Availability
• Partition Tolerance
Influential Papers
Dynamo: Amazon’s Highly Available Key-value Store 1
• Voldemort
• Riak
Bigtable: A Distributed Storage System for Structured Data 2
• Hypertable
• HBase
1 http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
2 http://labs.google.com/papers/bigtable-osdi06.pdf
2 Project History
• 7 new committers added
• Dozens of contributors
• 200+ (!) people on IRC
• Hundreds of closed issues (bugs, features, etc)
• 4 major releases; a number of stable point releases
• Graduation to TLP
3 Description
Cassandra is...
• O(1) DHT
• Eventual consistency
• Tunable trade-offs, consistency vs. availability
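As a rough illustration of the one-hop (O(1)) DHT idea, the sketch below places nodes on a token ring and finds the replicas for a key with a single local lookup, with no multi-hop routing between nodes. The MD5-based `token` function, the node names, and the `replicas_for` helper are assumptions made for this example, not Cassandra's actual code.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string onto the ring (assumption: MD5 digest read as an integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Every node knows the full, sorted token list, so a lookup is local.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas_for(self, key: str, replication_factor: int = 3):
        """Return the nodes responsible for key, walking clockwise on the ring."""
        start = bisect.bisect_left(self._tokens, token(key))
        count = min(replication_factor, len(self._entries))
        return [self._entries[(start + i) % len(self._entries)][1]
                for i in range(count)]


# Hypothetical four-node cluster.
ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas_for("user:42", replication_factor=3)
```

Because the token list is known to every node, any node can compute `owners` itself and forward the request directly.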
But...
• Values are structured, indexed
• Columns / column families
• Slicing w/ predicates (queries)
Column families
Supercolumn families
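One way to picture the two layouts is as nested maps whose columns are kept sorted by name. The sketch below uses plain Python dictionaries; the row keys, column names, values, and timestamps are made up for illustration.

```python
# Column family: row key -> {column name -> (value, timestamp)}
users = {
    "alice": {
        "name": ("Alice", 1),
        "city": ("Columbia", 1),
    },
}

# Super column family: one extra level of nesting,
# row key -> {super column name -> {column name -> (value, timestamp)}}
user_feed = {
    "alice": {
        "2010-04-16": {
            "topic": ("Cassandra", 1),
            "talk": ("POSSCON", 1),
        },
    },
}


def columns(row):
    """Return a row's (name, cell) pairs ordered by column name."""
    return sorted(row.items())


name_value, _timestamp = users["alice"]["name"]
first_column = columns(users["alice"])[0][0]
sub_column_names = [name for name, _ in columns(user_feed["alice"]["2010-04-16"])]
```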
Client API
• Thrift (12 different languages!) 3
• High-level client libraries
  • Ruby
  • Perl
  • Python (Twisted too)
  • Scala
  • Java
  • PHP
  • Grails
  • C++
3 http://incubator.apache.org/thrift
Querying
• get(): retrieve by column name
• multiget(): by column name for a set of keys
• get_slice(): by column name, or a range of names
  • returning columns
  • returning super columns
• multiget_slice(): a subset of columns for a set of keys
• get_count(): number of columns or sub-columns
• get_range_slice(): subset of columns for a range of keys
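To make the semantics of these calls concrete, here is a toy in-memory model. The functions are named after the calls listed above, but their Python signatures, the `ROWS` data, and the convention that an empty `start` or `finish` means "unbounded" are assumptions of this sketch, not the Thrift interface.

```python
# Toy column family: row key -> {column name -> value}. Data is made up.
ROWS = {
    "alice": {"age": "30", "city": "Columbia", "email": "a@example.org"},
    "bob": {"age": "41", "city": "Austin"},
}


def get(key, column):
    """Retrieve a single column by name."""
    return ROWS[key][column]


def get_slice(key, names=None, start="", finish="", count=100):
    """Retrieve columns either by explicit names or by a range of names."""
    row = ROWS.get(key, {})
    if names is not None:
        return [(n, row[n]) for n in names if n in row]
    selected = [(n, v) for n, v in sorted(row.items())
                if n >= start and (finish == "" or n <= finish)]
    return selected[:count]


def multiget_slice(keys, **predicate):
    """Apply the same slice predicate to several row keys."""
    return {k: get_slice(k, **predicate) for k in keys}


def get_count(key):
    """Number of columns in a row."""
    return len(ROWS.get(key, {}))


one_city = get_slice("alice", start="b", finish="d")
ages = multiget_slice(["alice", "bob"], names=["age"])
```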
Updating
• insert(): add/update column (by key)
• batch_insert(): add/update multiple columns (by key)
• remove(): remove a column
• batch_mutate(): like batch_insert() but can also delete (new for 0.6, deprecates batch_insert())
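The difference between the single-column calls and the batch call can be sketched with a toy store. The tuple-based mutation format and the Python signatures below are invented for this example; only the behaviour (a batch that mixes inserts and deletes) comes from the list above.

```python
from collections import defaultdict

# Toy store: row key -> {column name -> value}.
store = defaultdict(dict)


def insert(key, column, value):
    """Add or overwrite one column of a row."""
    store[key][column] = value


def remove(key, column):
    """Delete one column; removing a missing column is a no-op in this sketch."""
    store[key].pop(column, None)


def batch_mutate(mutations):
    """Apply inserts and deletions for many rows in one call.

    `mutations` maps row key -> list of ("insert", column, value)
    or ("delete", column) tuples (a made-up format).
    """
    for key, ops in mutations.items():
        for op in ops:
            if op[0] == "insert":
                insert(key, op[1], op[2])
            elif op[0] == "delete":
                remove(key, op[1])
            else:
                raise ValueError(f"unknown mutation: {op[0]!r}")


insert("alice", "city", "Columbia")
insert("alice", "age", "30")
batch_mutate({
    "alice": [("delete", "age"), ("insert", "email", "a@example.org")],
    "bob": [("insert", "city", "Austin")],
})
```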
Column comparators
• TimeUUID
• LexicalUUID
• UTF8
• Long
• Bytes
• ...
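The comparator decides how column names are ordered within a row, which in turn decides what a range slice returns. A small sketch of why the choice matters, assuming text names are compared byte by byte and Long names are stored as 8-byte big-endian integers (the sample numbers are arbitrary):

```python
import struct

numbers = [10, 9, 100]

# UTF8/Bytes-style: names stored as text sort lexicographically.
as_text = sorted(str(n).encode("utf-8") for n in numbers)

# Long-style: names stored as 8-byte big-endian integers sort numerically.
packed = [struct.pack(">q", n) for n in numbers]
as_long = sorted(packed, key=lambda raw: struct.unpack(">q", raw)[0])
decoded = [struct.unpack(">q", raw)[0] for raw in as_long]
```

Here `as_text` orders the names as 10, 100, 9, while `decoded` comes out as 9, 10, 100.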
Consistency
CAP Theorem: choose any two of Consistency, Availability, or Partition tolerance.
• Zero
• One
• Quorum ((N / 2) + 1)
• All
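The quorum formula uses integer division, so with N replicas a quorum is always a strict majority. The sketch below computes it and also checks the usual overlap condition R + W > N, under which every read set shares at least one replica with every write set. The helper names are made up for this example.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at Quorum: (N / 2) + 1 with integer division."""
    return replication_factor // 2 + 1


def overlaps(n: int, r: int, w: int) -> bool:
    """True when every read set intersects every write set (R + W > N)."""
    return r + w > n


n = 3
quorum_reads_and_writes = overlaps(n, quorum(n), quorum(n))
one_and_one = overlaps(n, 1, 1)
```

With N = 3, reading and writing at Quorum overlap (2 + 2 > 3), whereas reading and writing at One do not, which is the consistency versus availability trade-off being tuned.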
About writes...
• Atomic within a column family
• Any node
• Always writeable (hinted hand-off)
• Fast
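Hinted hand-off can be pictured roughly as follows: when a replica is down, a live node accepts the write on its behalf and replays it once the replica returns. This is a simplified sketch under that assumption; the `Node` class, `write`, and `deliver_hints` are invented names, and where hints are stored is a simplification.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target node, key, value) held for down replicas


def write(replicas, key, value):
    """Write to live replicas; leave hints for the ones that are down."""
    live = [n for n in replicas if n.up]
    if not live:
        raise RuntimeError("no live replica to accept the write")
    for node in live:
        node.data[key] = value
    for node in replicas:
        if not node.up:
            # Simplification: the first live replica holds the hint.
            live[0].hints.append((node, key, value))
    return len(live)


def deliver_hints(holder):
    """Replay stored hints to targets that have come back up."""
    remaining = []
    for target, key, value in holder.hints:
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining


a, b, c = Node("a"), Node("b"), Node("c")
c.up = False
acked = write([a, b, c], "k", "v")   # succeeds although c is down
c.up = True
deliver_hints(a)                     # c catches up
```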
Writes
About reads...
• Any node
• Read repair
• Key cache
• Record cache
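Read repair can be sketched as: ask the replicas, return the newest version, and push that version back to any replica that was stale or missing it. The sketch assumes each stored value carries a timestamp and that the highest timestamp wins; the function name and the dict-based replicas are illustrative only.

```python
def read_with_repair(replicas, key):
    """Read key from every replica, return the newest value, fix stale copies.

    Each replica is a dict mapping key -> (value, timestamp).
    """
    answers = [(replica.get(key), replica) for replica in replicas]
    present = [cell for cell, _ in answers if cell is not None]
    if not present:
        return None
    newest = max(present, key=lambda cell: cell[1])
    for cell, replica in answers:
        if cell != newest:
            replica[key] = newest  # push the winning version back
    return newest[0]


r1 = {"k": ("new", 2)}
r2 = {"k": ("old", 1)}
r3 = {}
value = read_with_repair([r1, r2, r3], "k")
```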
Reads
4 Case Studies
Case 1: Digg
Digg is a social news site that allows people to discover and share content from anywhere on the Internet by submitting stories and links, and voting and commenting on submitted stories and links.
Ranked 98th by Alexa.com.
Digg
Problem
• Terabytes of data; high transaction rate (reads dominated)
• Multiple clusters; heavily sharded
• Management nightmare (high effort, error prone)
• Unsatisfied availability requirements (geographic isolation)
Solution
• Currently in production on “Green Badges”
• Cassandra as primary data store RSN
• Datacenter and rack-aware replication
Case 2: Twitter
Twitter is a social networking and microblogging service that enables its users to send and read tweets, text-based posts of up to 140 characters.
Ranked 12th by Alexa.com.
MySQL
• Terabytes of data, ~1,000,000 ops/s
• Calls for heavy sharding, light replication
• Schema changes are very difficult (if possible at all)
• Manual sharding is very high effort
• Automated sharding and replication is Hard
Case 3: Facebook
Facebook is a social networking site where users can create a profile, add friends, and send them messages. Users can also join groups organized by location or other points of common interest.
Ranked #2 by Alexa.com.
Inbox Search
• 100 TB
• 160 nodes
• 1/2 billion writes per day (2yr old number?)
5 Roadmap
0.6
• batch_mutate command
• authentication (basic)
• new consistency level, ANY
• fat client
• mmapped I/O reads (default on 64-bit JVM)
• improved write concurrency (HH)
• networking optimizations
• row caching
• improved management tools
• per-keyspace replication factor
0.7
• more efficient compactions (row sizes bigger than memory)
• easier (dynamic?) column family changes
• SSTable versioning
• SSTable compression
• support for column family truncation
• improved configuration handling
• remove key range command
• even more improved management tools
• vector clocks w/ server-side conflict resolution
Questions?