![Page 1: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/1.jpg)
Copyright © 2011 Constant Contact Inc. 1
Constant Contact
March 2011
Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation
Cassandra & Puppet: Scaling data at $15/month
![Page 2: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/2.jpg)
Constant Contact
2000 – 2010
Market leader for Small Businesses
• Email, Event & Survey
• Over 400k paying customers
• No. 134 on the Deloitte Technology Fast 500 listing
Business model
• Many customers pay as little as $15 a month
• ~2 million database transactions per minute
![Page 3: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/3.jpg)
Constant Contact
The business problem
![Page 4: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/4.jpg)
Constant Contact
Small Businesses are looking to us for help with Social Media marketing
• Social media means 10-100 times more data
• Challenge with our business model
![Page 5: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/5.jpg)
The Key Challenge
Integrate social media data
• Solution = NoSQL
• Cost = Low
• Time to market = ?
![Page 6: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/6.jpg)
Implementation
Ops and Dev both face issues
• Data model
• Monitoring
• Authentication
• Logging
• Risk profile
• Roles & Responsibilities
Implementing NoSQL
![Page 7: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/7.jpg)
Dev
Ops
![Page 8: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/8.jpg)
Apache Cassandra
• Developed at Facebook
• Open sourced in 2008
• Incubated at Apache
• Became an Apache top-level project in 2010
• http://cassandra.apache.org
• In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …
• Largest production cluster has over 100 TB of data in over 150 machines
![Page 9: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/9.jpg)
What is Cassandra?
• Implemented in Java
• Fault Tolerant
• Elastic
• Durable
• Rich data model
• Replicated data
• Consistency options
![Page 10: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/10.jpg)
Replication
[Diagram: value X stored on three nodes]
How many copies of each piece of data do we want?
N=3
![Page 11: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/11.jpg)
[Diagram: a Writer and a Reader accessing replicas that hold values X and Y]
Consistency Level ONE
![Page 12: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/12.jpg)
[Diagram: a Writer and a Reader accessing replicas that hold values X and Y]
Consistency Level Quorum
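The arithmetic behind the two consistency slides can be sketched in a few lines. This is only an illustration, not code from the talk: the function names are made up here, and the rule it encodes is the usual one that a read is guaranteed to see the latest write when the replicas contacted for the read and for the write must overlap (R + W > N).

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority of `replicas`."""
    return replicas // 2 + 1


def read_sees_write(replicas: int, write_acks: int, read_acks: int) -> bool:
    """True when any read set must share a replica with any write set (R + W > N)."""
    return write_acks + read_acks > replicas


N = 3  # replication factor from the Replication slide

# Consistency level ONE on both sides: the reader may hit a stale replica.
print(read_sees_write(N, write_acks=1, read_acks=1))

# QUORUM on both sides: 2 + 2 > 3, so the reader always overlaps the writer.
print(quorum(N), read_sees_write(N, quorum(N), quorum(N)))
```

With N=3 a quorum is two replicas, which is why one node can be down without blocking quorum reads or writes.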
![Page 13: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/13.jpg)
Risks and Mitigation
Risks
• Moving target
• Developer unfamiliarity
• Operational procedures
• Reliability concerns
Mitigation
• Deployment automation
• Community involvement
• Training/Consulting
• Application selection
• Lots of monitoring
• Phased rollout
![Page 14: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/14.jpg)
Development Challenges
• Understanding the data model
• Choosing a client
■ Clients available for Java, Python, .NET, Ruby, PHP
■ Don’t use Thrift
• Moving target
![Page 15: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/15.jpg)
• Not “one neck to wring”
• Paid support and training are available: http://datastax.com
• Community
■ Mailing lists
■ IRC #cassandra at freenode
• Contribute
Open Source
![Page 16: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/16.jpg)
• Switchable modes
• Mirroring
• Dial-able traffic
Phased Rollout
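One way to picture the three rollout controls is a small routing function. The sketch below is hypothetical and is not the code used at Constant Contact: the mode names, the `write_targets` helper, and the hash-based percentage dial are all assumptions chosen to illustrate switchable modes, mirrored writes, and a traffic dial.

```python
import zlib
from enum import Enum
from typing import List


class Mode(Enum):
    LEGACY_ONLY = "legacy_only"  # all traffic stays on the existing store
    MIRROR = "mirror"            # writes also go to Cassandra for dialed-in keys
    NEW_ONLY = "new_only"        # all traffic goes to Cassandra


def in_dial(key: str, percent: int) -> bool:
    """Deterministically place a key in one of 100 buckets and compare to the dial."""
    bucket = zlib.crc32(key.encode("utf-8")) % 100
    return bucket < percent


def write_targets(mode: Mode, key: str, percent: int) -> List[str]:
    """Return the (hypothetical) stores that should receive a write for this key."""
    if mode is Mode.LEGACY_ONLY:
        return ["legacy"]
    if mode is Mode.NEW_ONLY:
        return ["cassandra"]
    # MIRROR: the legacy store stays authoritative; the dial controls extra load.
    if in_dial(key, percent):
        return ["legacy", "cassandra"]
    return ["legacy"]


print(write_targets(Mode.MIRROR, "customer-42", percent=100))
```

Hashing the key rather than drawing a random number keeps a given customer on the same side of the dial between requests, which makes mirrored data easier to compare.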
![Page 17: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/17.jpg)
• Big, complex project
• Close collaboration
• Flexible roles
• Ability to iterate
Collaboration
![Page 18: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/18.jpg)
Dev
Ops
![Page 19: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/19.jpg)
“Are you sure you really want that?”
• 3 500G disks
• 1 250G disk
• No SWAP
• RAID Zero Root Partition and Data Storage
• 32G Memory
![Page 20: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/20.jpg)
We will need how many servers?
![Page 21: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/21.jpg)
• Quorum = 3
• Multiple Datacenters = 2
• Use only half the available disk = 2
• 12 Servers = ~1 TB of Data Storage
• ~6 TB of Data Storage
3 x 2 = 6
6 x 2 = 12
12 x 6 = 72
How many nodes?
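The sizing on this slide is a chain of multiplications, which can be written out as a small calculation. This is a sketch under stated assumptions: the function and parameter names are invented here, and "Quorum = 3" is read as three copies of each row, matching N=3 on the Replication slide.

```python
import math


def nodes_needed(copies: int, datacenters: int, disk_headroom: int,
                 data_tb: float, tb_per_group: float) -> int:
    """Multiply out the sizing factors from the slide.

    copies        -- copies of each row kept per datacenter (3 on the slide)
    datacenters   -- number of datacenters holding the data (2)
    disk_headroom -- 2 means only half of each disk is filled
    data_tb       -- total data to store (~6 TB)
    tb_per_group  -- data one group of servers holds (~1 TB per 12 servers)
    """
    servers_per_group = copies * datacenters * disk_headroom  # 3 x 2 x 2 = 12
    groups = math.ceil(data_tb / tb_per_group)                # 6 / 1 = 6
    return servers_per_group * groups


print(nodes_needed(copies=3, datacenters=2, disk_headroom=2,
                   data_tb=6, tb_per_group=1))  # 72
```

Changing any single factor, for example filling disks beyond half, scales the node count linearly, which is why each assumption on the slide matters to the budget.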
![Page 22: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/22.jpg)
Random Partitioner
![Page 23: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/23.jpg)
Tool Chain
![Page 24: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/24.jpg)
• Puppet is the shared framework between Operations and Development
• Versioning of Puppet code allows for adoption of development best practices
• Leverage domain-specific knowledge and skill
DevOps with Puppet
![Page 25: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/25.jpg)
Always Move Forward
![Page 26: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/26.jpg)
Operational Efficiencies
• Remote logging is a requirement
• Cassandra uses log4j natively
• Resources not available for remote log4j development
• Scribed with Puppet provides the solution
![Page 27: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/27.jpg)
• Munin
• JMX trending
• Identify critical data points
• Rapid development of graphs
• Puppet Definitions are used for rapid deployment
Development takes the Operational Lead
![Page 28: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/28.jpg)
Sample Munin Graph
![Page 29: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/29.jpg)
Puppet Code

    define munin::cassandracolumnfamily ( ) {
      include cassandravirtual
      # Realize the virtual "jmxbin" file resource
      File <| title == "jmxbin" |>

      $confdir   = "/opt/cassandra-munin-plugins"
      $plugindir = "/etc/munin/plugins"
      $target    = "/opt/cassandra-munin-plugins/jmx_"
      # Match 3 strings separated by periods
      $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
      $keyspace     = regsubst($name, $pattern, '\1')
      $columnfamily = regsubst($name, $pattern, '\2')
      $file         = regsubst($name, $pattern, '\3')

      # Plugin configuration rendered from a per-attribute template
      file {"${keyspace}_${columnfamily}_${file}.conf":
        owner   => 'root',
        ensure  => 'file',
        group   => 'root',
        type    => 'file',
        path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
        mode    => '644',
        content => template("munin/attribute_${file}.conf.erb"),
        require => [
          Package['munin-node'],
          File['/opt/cassandra-munin-plugins'],
          File['jmxquery'],
        ],
      }

      # Symlink in the Munin plugin directory pointing at the shared jmx_ plugin
      file {"$plugindir/${keyspace}_${columnfamily}_${file}":
        ensure  => 'link',
        owner   => 'root',
        group   => 'root',
        mode    => '511',
        type    => 'link',
        target  => "$target",
        require => [
          File['/opt/cassandra-munin-plugins'],
          File["${keyspace}_${columnfamily}_${file}.conf"],
          File['jmxquery'],
          Package['munin-node'],
        ],
      }
    }

Example: Munin Puppet Code
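The define above derives the keyspace, column family, and plugin file from a single resource title of the form `keyspace.columnfamily.file`. The Python sketch below reuses the same regular expression to show how such a title splits; the sample title and the `split_title` helper are made up for illustration, and raising on a malformed title is a choice of this sketch rather than something the Puppet code does.

```python
import re

# Same pattern as in the Puppet define: three strings separated by periods.
PATTERN = re.compile(r"^([^.]*)[.]([^.]*)[.]([^.]*)$")


def split_title(title: str) -> tuple:
    """Split 'keyspace.columnfamily.file' into its three parts."""
    match = PATTERN.match(title)
    if match is None:
        raise ValueError(f"expected keyspace.columnfamily.file, got {title!r}")
    return match.groups()


# Hypothetical resource title, not taken from the talk.
keyspace, columnfamily, plugin_file = split_title("Keyspace1.Standard1.readlatency")
print(f"{keyspace}_{columnfamily}_{plugin_file}.conf")
```

Packing all three values into the title means one line per graph in a node's manifest, at the cost of forbidding periods inside keyspace and column family names.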
![Page 30: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/30.jpg)
Conclusion
• Cassandra as an appliance
• Development Best Practices with Life Cycle Management
• Traditional vs. Today
■ Infrastructure: 4 weeks vs. 4 hours to build 72 nodes
■ Development to Deployment: 9 months vs. 3 months
■ Cost: Millions vs. 150k
![Page 31: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocuments.us/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/31.jpg)
Q&A
Thank You!