Zero to Mongo in 60 Hours
Ryan Angilly
MyPunchbowl.com
@angilly
Wednesday, April 28, 2010
I’m a pretty awesome dude
• Electrical Engineer by education
• Ex-EMC’er
• Founded MessageSling.com
• Entered the deadpool in January
• Senior developer @ MyPunchbowl.com
• Built several web apps
• All Ruby. Mostly Rails
• All SQL
• No experience w/ NoSQL/Mongo as of September ’09
• Want to provide a beginner’s perspective
MyPunchbowl is a pretty awesome company
• Leader in start to finish party planning
• Connecting party planners to party vendors -- subject of this talk
• Currently employs 11 people
• Funded over 2 rounds by Intel & Contour
• Strategic partnership w/ OTC among others
• The amount of traffic, users, events, invites, vendors, etc. makes our data sets large enough to be interesting :)
• We’re talking many millions of ‘things’
Our engineers play with everything
Search by category and location
Search by business name and location
Track searches
Track vendor impressions
MongoDB + MyPunchbowl
A tale in conversations
Tracking requirement a good excuse to finally use MongoDB
Me: Hey Blake I’m gonna use MongoDB to keep track of all this search stuff.
Blake: Cool.
Matt (CEO): What’s that?
Blake & Me: Something cool.
Matt (CEO): K
6 reasons to use MongoDB
• Easy to get running (~5 minutes on OSX)
• Open Source
• Support in multiple (computer) languages
• Prototype in Ruby, move to Java if needed
• Very active development
• Full featured
• Great ecosystem
MongoDB feels right
Me: MongoDB gives me the warm fuzzies that Rails did.
Darren: Just like that Nunemaker post.
Me: ? *runs to google*
Support is INSANE
11:35pm. Wednesday. Founder. 60-second response time.
2.5 days? Really?
Yes*
2.5 days to MongoDB
Timeline: 0 Days, 1 Day, 2 Days ... 200+ Days and on
• Decision to use MongoDB
• Play around with available OSS
• Figure out document ‘schema’
• Write some stuff on top of mongo-ruby-driver
• Build test rig
• Set up configuration management
• Get production ready
• Deploy
A stroll through Mongo’s OSS ecosystem
• mongodb (C++)
• mongo shell (JS via SpiderMonkey)
• mongo-ruby-driver
• mongo-java-driver
• mongo_record (Ruby)
• mongo_mapper (Ruby)
• Integration w/ Rails
Document ‘schema’
• What do documents look like?
• Example Document
{
  city: "Boston",
  state: "MA",
  date: 1272153600,
  occurrences: 1,
  category_id: 1
}
• How do you update the documents?
• Mongo shell command
coll.update({'city': 'Boston',               // document to match
             'state': 'MA',
             'date': 1272153600,
             'category_id': 1},
            {'$inc': {'occurrences': 1}},    // operation to perform
            true)                            // 'upsert': update if it's there, insert if it's not
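The upsert-plus-$inc pattern above can be sketched in plain Ruby with an in-memory array standing in for the collection. This is only an illustration of the semantics, not the MongoDB driver: `day_bucket` and `upsert_inc` are hypothetical helpers, and the assumption that `date` is a midnight-UTC day bucket comes from the example value 1272153600, which is 00:00 UTC on April 25, 2010.

```ruby
# In-memory stand-in for the upsert + $inc pattern (not the MongoDB driver).
SECONDS_PER_DAY = 86_400

# Truncate a Unix timestamp to midnight UTC of the same day.
def day_bucket(epoch_seconds)
  epoch_seconds - (epoch_seconds % SECONDS_PER_DAY)
end

# Hypothetical helper: increment `field` on the document matching `selector`,
# creating the document first if none matches (the "upsert" part).
def upsert_inc(collection, selector, field, amount = 1)
  doc = collection.find { |d| selector.all? { |k, v| d[k] == v } }
  if doc.nil?
    doc = selector.dup
    collection << doc
  end
  doc[field] = doc.fetch(field, 0) + amount
  doc
end

searches = []
selector = { city: 'Boston', state: 'MA', category_id: 1,
             date: day_bucket(1_272_153_600 + 3_661) }

upsert_inc(searches, selector, :occurrences) # no match yet: inserts the document
upsert_inc(searches, selector, :occurrences) # match found: increments in place

puts searches.inspect
```

Two searches on the same day for the same city, state and category end up as one document with a counter of 2, which is why a single update call is enough to record each search.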
Cuhhraazy Indexes
• Original Document
{
  city: "Boston",
  state: "MA",
  date: 1272153600,
  occurrences: 1,
  category_id: 1
}
• 40k cities, 200 days, 60 categories
• 480M potential documents
• Composite indexes
• state_1_city_1
• state_1_date_1_city_1
• category_id_1_date_1
• date_1
• Flexibility & performance in querying and aggregating
• More Complex Document w/ Embedded Document
{
  city: "Boston",
  state: "MA",
  date: 1272153600,
  occurrences: 9581555,
  category_id: 1,
  mobile_sources: {
    browsers: {
      windows_mobile: 1,
      palm_os: 0,
      iphone_4g: 9481554
    },
    zip_codes: ["01518"]
  }
}
• Additional, complex data down the road?
• Add deep embedded indexes!
• 'mobile_sources.zip_codes'
• 'mobile_sources.browsers.iphone_4g'
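The numbers and index names on this slide can be checked with a few lines of plain Ruby. This is a sketch, not driver code: `default_index_name` is a hypothetical helper that mimics MongoDB's default naming (each key followed by its direction, joined with underscores), and the key specs are written as arrays of pairs purely for illustration.

```ruby
# Worst-case document count if every city/day/category combination is seen.
cities = 40_000
days = 200
categories = 60
potential_documents = cities * days * categories

# Mimic MongoDB's default index name: each key followed by its direction,
# joined with underscores.
def default_index_name(key_spec)
  key_spec.map { |field, direction| "#{field}_#{direction}" }.join('_')
end

index_specs = [
  [['state', 1], ['city', 1]],
  [['state', 1], ['date', 1], ['city', 1]],
  [['category_id', 1], ['date', 1]],
  [['date', 1]],
  [['mobile_sources.zip_codes', 1]] # dotted path reaches into the embedded document
]

puts potential_documents
index_specs.each { |spec| puts default_index_name(spec) }
```

The order of keys in a composite index is part of its name and of its behavior, which is why state_1_city_1 and state_1_date_1_city_1 are listed as separate indexes.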
Let’s write some Ruby
class SearchStats
  # Class-level accessors (cattr_accessor comes from ActiveSupport)
  cattr_accessor :db, :collection
  self.collection = $mongo_db.collection 'searches'

  # Bump today's counter for this vendor/city/state/category combination
  def self.record_search(city, state, category_id, vendor_id)
    collection.update({:vendor_id => vendor_id,
                       :city => city,
                       :state => state,
                       :category_id => category_id,
                       :date => Time.now.utc.beginning_of_day.to_i},  # one document per day
                      {'$inc' => {:impression => 1}},
                      {:upsert => true})  # insert the document if it doesn't exist yet
  end
end
irb> SearchStats.record_search('Boston', 'MA', 9, 42)
mongo> db.searches.findOne()
{ "city" : "Boston",
  "date" : 1272326400,
  "impression" : 1,
  "category_id" : 9,
  "state" : "MA",
  "vendor_id" : 42 }
TESTING
• Pretty much the same as anything else
• test/unit/search_stats_test.rb
• config/environments/test.rb
• $mongo_db = Mongo::Connection.new.db 'mongo-sf-db'
• SearchStatsTest#setup & #teardown
• No transactions
• ActiveSupport spoils us
• Gotta clean up after yourself
Other developers gotta use it too
• Rails’ database.yml takes care of MySQL
• Gotta switch up MongoDBs on your own
• Simple config system file inspired by another Nunemaker post.
• No database migrations!
• Must write rake tasks to create indexes unless your ORM takes care of it for you
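A per-environment config in the spirit of database.yml might look roughly like the sketch below. Everything specific here is an assumption for illustration: the YAML layout, the database and host names, and the `mongo_settings` helper are hypothetical, and the actual configuration used at MyPunchbowl is in the gist linked at the end of the talk.

```ruby
require 'yaml'

# Hypothetical contents of a per-environment MongoDB config file.
MONGO_CONFIG_YAML = <<~YAML
  development:
    host: localhost
    port: 27017
    database: myapp_development
  test:
    host: localhost
    port: 27017
    database: myapp_test
  production:
    host: mongo.example.internal
    port: 27017
    database: myapp_production
YAML

# Look up the settings for one environment; raises KeyError if it is missing.
def mongo_settings(environment, yaml = MONGO_CONFIG_YAML)
  YAML.safe_load(yaml).fetch(environment)
end

test_settings = mongo_settings('test')
puts "#{test_settings['host']}:#{test_settings['port']}/#{test_settings['database']}"
```

In a Rails app the environment name would presumably come from RAILS_ENV, and the resulting host, port and database name would be handed to the driver when setting up $mongo_db.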
Making production ready
• Replication was iffy in 1.0.0 (solid now)
• Nightly dumps w/ mongoexport
• Upload to S3
• Monitor process w/ monit
Deploy
• Original deployment uneventful
• Just worked
• Future deployments that required a database upgrade required downtime (bummer)
• We wanted to play with the geospatial search
Day 200
• One production process
• One database
• Daily dumps using mongoexport + upload to S3
• Replication not running
• Several collections
• 100k to >10M documents
• Scores of deep, composite indexes
What tripped me up?
Simple Test
should "get impressions_by_date" do
  record1 = {:date => 100, :impression => 10}
  record2 = {:date => 90, :impression => 8}
  Mongo::Vendor.collection.insert record1
  Mongo::Vendor.collection.insert record2
  assert_same_elements [record1, record2],
                       Mongo::Vendor.impressions_by_date
end
Crazy Failure
1) Failure:
test: Mongo::Vendor should get
impressions_by_date. (Mongo::VendorTest)
...
-<{{"date"=>100.0, "csum"=>10.0}=>1,
{"date"=>90.0, "csum"=>8.0}=>1}>
+<{{"date"=>90.0, "csum"=>8.0}=>1,
{"date"=>100.0, "csum"=>10.0}=>1}>
Mongo’s BSON OrderedHash doesn’t behave like ActiveSupport::OrderedHash
irb(main):001:0> require 'active_support'
irb(main):002:0> include ActiveSupport
irb(main):003:0> oh = OrderedHash.new
irb(main):004:0> oh[:b] = 2
irb(main):005:0> oh[:c] = 3
irb(main):006:0> oh[:a] = 1
irb(main):007:0> a = {:a => 1, :b => 2, :c => 3}
irb(main):008:0> a == oh
=> true
irb(main):009:0> oh == a
=> true

irb(main):001:0> require 'mongo'
irb(main):002:0> oh = OrderedHash.new
irb(main):003:0> oh[:b] = 2
irb(main):004:0> oh[:c] = 3
irb(main):005:0> oh[:a] = 1
irb(main):006:0> a = {:a => 1, :b => 2, :c => 3}
irb(main):007:0> a == oh
=> true
irb(main):008:0> oh == a
=> false
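One plausible way this asymmetry can arise, sketched in plain Ruby without either library: if an ordered hash overrides == to compare key order while a plain Hash ignores order, the answer depends on which object is on the left. `OrderAwareHash` is a hypothetical class written for this illustration and is not the driver's actual implementation.

```ruby
# Hypothetical hash subclass whose == also requires keys in the same order.
class OrderAwareHash < Hash
  def ==(other)
    return false unless other.is_a?(Hash)
    keys == other.keys && values == other.values
  end
end

oh = OrderAwareHash.new
oh[:b] = 2
oh[:c] = 3
oh[:a] = 1

a = { :a => 1, :b => 2, :c => 3 }

plain_on_left = (a == oh)   # Hash#== compares contents only, ignoring order
ordered_on_left = (oh == a) # the override also compares key order

puts "a == oh: #{plain_on_left}"
puts "oh == a: #{ordered_on_left}"
```

In a test, that means an assertion can pass or fail depending on which side of the comparison the driver's result lands on, so converting results to plain hashes before comparing is one way to sidestep it.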
And it turns out...
MongoDB will not add indexes to your collections.
Unless you tell it to.
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
add indexes
But it’ll help pick up the pieces
Before index
> db.vendors.findOne({'date': {'$gt': 1271895200}})
{ ... }
Sun Apr 25 01:19:51 query v_production.vendors ntoreturn:1 reslen:114 nscanned:~10M { date: { $gt: 1271895200.0 } } nreturned:1 3492ms
Full collection scan: ~3.5s
During index
> db.vendors.ensureIndex({'date': 1})
Sun Apr 25 01:20:17 building new index on { date: 1.0 } for v_production.vendors
Sun Apr 25 01:20:17 Buildindex v_production.vendors idxNo:2 { _id: ObjId(4bd398d1dbf523027368d566), ns: "v_production.vendors", key: { date: 1.0 }, name: "date_1" }
...
Sun Apr 25 01:21:35 done building bottom layer, going to commit
Sun Apr 25 01:21:35 done for ~10M records 78.56secs
Sun Apr 25 01:21:35 insert v_production.system.indexes 78560ms
Index added: ~79s. No locks.
After index
> db.vendors.find({'date': 1272153600}).explain()
{
  ...
  "nscanned" : 11220,
  "millis" : 12,
  ...
}
12ms!
During index creation, system completely CRUD responsive. No impact to MyPunchbowl.com
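Why nscanned drops so sharply can be shown with a toy model in plain Ruby. This is not how MongoDB's indexes are implemented internally; it only contrasts examining every document with looking up a sorted list of [date, position] pairs, and the data set (100,000 documents over 200 days) is made up for the illustration.

```ruby
# Toy model: 100,000 documents with dates spread over 200 distinct days.
DAY = 86_400
docs = Array.new(100_000) { |i| { id: i, date: 1_272_153_600 - (i % 200) * DAY } }
target = 1_272_153_600

# Without an index: every document is examined.
scanned_without_index = 0
full_scan_hits = docs.select do |d|
  scanned_without_index += 1
  d[:date] == target
end

# "Index": [date, position] pairs sorted by date, built once up front.
date_index = docs.each_with_index.map { |d, pos| [d[:date], pos] }.sort

# With the index: binary-search to the first matching entry, then walk forward.
start = date_index.bsearch_index { |(date, _pos)| date >= target }
scanned_with_index = 0
index_hits = []
unless start.nil?
  date_index[start..-1].each do |date, pos|
    break if date != target
    scanned_with_index += 1
    index_hits << docs[pos]
  end
end

puts "scanned without index: #{scanned_without_index}"
puts "scanned with index:    #{scanned_with_index}"
```

Building the sorted list is the expensive one-time step, comparable in spirit to the ~79 seconds spent on ensureIndex above; after that, each lookup only touches the entries that match.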
Up next: Top Secret Project
• Project involving
• SCREAMS for a document-based solution
• mongosphinx
• MongoDB’s ad-hoc full text search is OK, but lacks infix matching and bulk index building
• Geo-spatial search not yet ellipsoidal (or even spherical)
• paperclip+mongomapper
• MyPunchbowl user/event/invite data mining/visualization
• Sharding + MapReduce = Nerdstorm
• Building tools to speed up ETL out of MySQL and into MongoDB
The end
Questions?
Contact Me
http://www.mypunchbowl.com
http://ryanangilly.com
@angilly
Links
Slide 3/4 logos, wget’d from all over the Internets.
Dunce Cap http://steynian.files.wordpress.com/2009/09/dunce-cap.jpg
http://railstips.org/blog/archives/2009/12/18/why-i-think-mongo-is-to-databases-what-rails-was-to-frameworks/
http://railstips.org/blog/archives/2009/11/10/config-so-simple-your-mama-could-use-it/
Vendors configuration http://gist.github.com/380213