MongoDB at Gilt Groupe
February 2013
Sean Sullivan
About me
• software engineer
• ~2 years at Gilt
• work in Gilt’s Portland office
• back office applications
Agenda
• Gilt Groupe
• Gilt’s technology stack
• MongoDB at Gilt
• Q&A
Gilt Groupe
http://www.gilt.com
flash sales
Every day at 12 noon
what does Gilt sell?
Apparel
Kids merchandise
Home furnishings
Food
Local deals
Travel
Gilt technology stack
Gilt architecture
2007: monolithic application
2013: service-oriented architecture
gilt.com
service A
service B
service C
service D
service E
legacy web app
Data storage @ Gilt
Why MongoDB?
• Ease of use
• Horizontal scaling
• High availability
• Automatic failover
Why MongoDB?
• Stability
• Support
• Drivers
• MongoDB 2.0
• sharded and non-sharded data
• Solid State Drives
• MMS for monitoring
Use case #1: user profiles
Challenges
• user data in Postgres
• legacy Rails app expects to find user data in Postgres
• we wanted Gilt’s customer-facing applications to retrieve user data from MongoDB
Solution
• keep user data in both MongoDB and Postgres
• replicate from MongoDB to Postgres
Replicating user data
user service
legacy web app
replication service
Replication service
• listens for RabbitMQ messages
‣ UserCreated message
‣ UserUpdated message
• retrieves data using REST API
• writes data to Postgres using JDBC
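The flow on this slide can be sketched in plain Java. This is a minimal illustration, not Gilt's code: the `UserApi` and `PostgresWriter` interfaces are hypothetical stand-ins for the REST client and the JDBC writer, and it assumes each message carries only a message type and a user id.

```java
import java.util.Map;

// Hypothetical sketch of the replication flow; the interfaces stand in for
// the REST client and the JDBC writer, and are not real Gilt APIs.
class ReplicationSketch {
    // REST lookup against the user service (assumed shape).
    interface UserApi {
        Map<String, String> fetchUser(String userId);
    }

    // JDBC write into Postgres (assumed shape).
    interface PostgresWriter {
        void upsert(String userId, Map<String, String> user);
    }

    static class Replicator {
        private final UserApi api;
        private final PostgresWriter writer;

        Replicator(UserApi api, PostgresWriter writer) {
            this.api = api;
            this.writer = writer;
        }

        // Called once per queue message; returns true if a user was replicated.
        boolean onMessage(String messageType, String userId) {
            boolean userEvent = "UserCreated".equals(messageType)
                    || "UserUpdated".equals(messageType);
            if (!userEvent) {
                return false; // not a user event, ignore it
            }
            Map<String, String> user = api.fetchUser(userId);
            if (user == null) {
                return false; // nothing to copy
            }
            writer.upsert(userId, user);
            return true;
        }
    }
}
```

Fetching the current state over REST, rather than trusting a payload in the message, keeps the handler idempotent: replaying a message simply rewrites the same row.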
Use case #2: feature configuration
Goal
manage the release of new features on gilt.com
How
feature configuration persisted in MongoDB
Rolling out a new feature
1. deploy new application code to production
2. enable feature for Gilt Tech employees
3. ... then enable for all Gilt employees
4. ... then enable for a subset of users
5. gradually ramp up to 100% of users
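The staged rollout above could be evaluated per request with a check like the one below. This is a sketch under assumptions, not Gilt's implementation: the `FeatureGate` class, the `Stage` and `UserKind` enums, and the CRC32-based bucketing are all hypothetical. Only the idea of a stored release percentage comes from the slides.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical staged feature gate; all names are illustrative only.
class FeatureGate {
    // Rollout stages from the slide, in widening order.
    enum Stage { OFF, TECH_EMPLOYEES, ALL_EMPLOYEES, PERCENTAGE_OF_USERS }

    enum UserKind { TECH_EMPLOYEE, EMPLOYEE, CUSTOMER }

    // Maps (feature, user) to a stable bucket in [0, 100).
    static int bucket(String featureKey, long userId) {
        CRC32 crc = new CRC32();
        crc.update((featureKey + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    static boolean isEnabled(String featureKey, Stage stage, int releaseToPercentage,
                             long userId, UserKind kind) {
        switch (stage) {
            case TECH_EMPLOYEES:
                return kind == UserKind.TECH_EMPLOYEE;
            case ALL_EMPLOYEES:
                return kind != UserKind.CUSTOMER;
            case PERCENTAGE_OF_USERS:
                // Employees keep access; customers are admitted by bucket.
                return kind != UserKind.CUSTOMER
                        || bucket(featureKey, userId) < releaseToPercentage;
            default:
                return false; // OFF
        }
    }
}
```

Because the bucket depends only on the feature key and the user id, a user who sees the feature at 30% still sees it at 60%, so ramping up never takes the feature away from anyone.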
Feature config service
Feature configuration
gilt.com
Use case #3: favorite brands
user preference service
AJAX
Favorite brands
Application development
mongo-java-driver
Morphia
Casbah
MongoDB Java driver
• main class: com.mongodb.Mongo
• Mongo object maintains a pool of connections
• Latest version: 2.10.1
MongoDB Java driver connection pool tuning
MongoOptions opts = new MongoOptions();
// example values
opts.connectionsPerHost = 50;
opts.threadsAllowedToBlockForConnectionMultiplier = 5;
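The two settings combine: the multiplier is applied to `connectionsPerHost` to cap how many threads may wait for a free connection, and threads beyond that cap get an exception instead of waiting. The arithmetic for the example values is sketched below. `PoolSizing` is a hypothetical helper and does not call the driver.

```java
// Illustrative arithmetic only; this class does not use the MongoDB driver.
class PoolSizing {
    // Maximum number of threads allowed to wait for a connection to one host.
    static int maxWaitingThreads(int connectionsPerHost, int multiplier) {
        return connectionsPerHost * multiplier;
    }

    public static void main(String[] args) {
        // Example values from the slide: 50 connections, multiplier of 5.
        System.out.println(maxWaitingThreads(50, 5)); // prints 250
    }
}
```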
com.mongodb.WriteConcern
• NORMAL
• SAFE
• REPLICAS_SAFE
• MAJORITY
• FSYNC_SAFE
• JOURNAL_SAFE
Morphia library
Morphia
• object-document mapper for Java
• built on top of mongo-java-driver
• map fields using Java annotations
// Morphia example
import org.bson.types.ObjectId;
import com.google.code.morphia.annotations.*;

@Entity(value="features", noClassnameStored = true)
public class Feature {
    @Id
    ObjectId featureId;

    @Property("feature_key") @Indexed(unique=true)
    String featureKey;

    @Property("release_to_percentage")
    int releaseToPercentage;
}
http://code.google.com/p/morphia/
Casbah library
Casbah
• Scala toolkit for MongoDB
• built on top of the mongo-java-driver
• current version: 2.5.0
Casbah
• Scala idioms
• Scala collections
• fluid query syntax
https://twitter.com/max4f/status/230503836958199808
Gilt Scala code
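import com.mongodb.casbah.MongoDB

trait MongoWriteConcern {
  // Runs f between requestStart() and requestDone() and returns its result
  def withSession[T](f: => T)(implicit mongoDb: MongoDB): T = {
    mongoDb.requestStart()
    mongoDb.requestEnsureConnection()
    try {
      f
    } finally {
      mongoDb.requestDone()
    }
  }
}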
Best Practices
Best practices
Connection tuning
explicitly configure Mongo Java Driver connection pool size
Best practices
Use caution when creating new indexes
“creating a new index on a production mongo server can basically cause it to stop working while it builds the index”
(source: Gilt production incident report)
http://docs.mongodb.org/manual/administration/indexes/#index-building-replica-sets
Minimizing impact of building a new index
• remove one secondary from replica set
• create/rebuild index on this instance
• rejoin replica set
• repeat on all remaining secondaries
• run rs.stepDown() on primary member, then build the index on the former primary
Best practices
use short names for fields to avoid wasted space
http://christophermaier.name/blog/2011/05/22/MongoDB-key-names
{
  city: "Portland",
  state: "Oregon",
  country: "US"
}
vs
{
  ci: "Portland",
  st: "Oregon",
  co: "US"
}
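The saving can be estimated directly, because BSON stores every field name inside every document as UTF-8 bytes plus a terminating zero byte. The sketch below counts only those name bytes for the two documents above. `KeyNameOverhead` is a hypothetical helper, and the collection size of 100 million documents is an assumed figure for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that counts the bytes BSON spends on field names alone.
class KeyNameOverhead {
    // Each name is stored as its UTF-8 bytes plus one trailing 0x00 byte.
    static int nameBytes(String... fieldNames) {
        int total = 0;
        for (String name : fieldNames) {
            total += name.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return total;
    }

    public static void main(String[] args) {
        int longNames = nameBytes("city", "state", "country"); // 5 + 6 + 8 = 19
        int shortNames = nameBytes("ci", "st", "co");           // 3 + 3 + 3 = 9
        long documents = 100_000_000L; // assumed collection size
        // 10 bytes saved per document, about 1 GB across the collection.
        System.out.println((longNames - shortNames) * documents); // prints 1000000000
    }
}
```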
Best practices
use explain() during development
db.collection.find(query).explain()
Best practices
use caution when choosing a shard key
“It is generally not a good idea to use the default ObjectId as the shard key”
source: http://stackoverflow.com/questions/9164356/sharding-by-objectid-is-it-the-right-way
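The reason is that an ObjectId starts with a 4-byte timestamp, so new values always sort after older ones and consecutive inserts fall into the same range of the key space. The simulation below is a rough sketch, not MongoDB code: the four fixed chunks, the `route` function, and the `scatter` mixing function are all assumptions chosen to show how an ever-increasing key sends every insert to one shard while a well-distributed key spreads them.

```java
import java.util.TreeMap;

// Hypothetical simulation of range-based chunk routing; not MongoDB code.
class ShardKeySketch {
    // Routes a key to the shard owning the chunk with the greatest lower bound <= key.
    static int route(TreeMap<Long, Integer> chunkLowerBounds, long key) {
        return chunkLowerBounds.floorEntry(key).getValue();
    }

    // Four equal chunks over the non-negative key space, one per shard.
    static TreeMap<Long, Integer> chunks() {
        TreeMap<Long, Integer> bounds = new TreeMap<Long, Integer>();
        long quarter = Long.MAX_VALUE / 4;
        for (int shard = 0; shard < 4; shard++) {
            bounds.put(shard * quarter, shard);
        }
        return bounds;
    }

    // Stand-in for a well-distributed key: multiplicative mixing, forced non-negative.
    static long scatter(long t) {
        return (t * 0x9E3779B97F4A7C15L) & Long.MAX_VALUE;
    }

    // Counts inserts per shard for 1000 consecutive timestamp-like keys.
    static int[] insertCounts(boolean scatterKeys) {
        TreeMap<Long, Integer> bounds = chunks();
        int[] counts = new int[4];
        long start = 1_360_000_000L; // seconds since epoch, roughly February 2013
        for (long t = start; t < start + 1000; t++) {
            long key = scatterKeys ? scatter(t) : t;
            counts[route(bounds, key)]++;
        }
        return counts;
    }
}
```

With `insertCounts(false)` all 1000 inserts hit a single shard. With `insertCounts(true)` every shard receives a share of the writes.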
Future
More Gilt data in MongoDB
• shipping addresses
• discounts
• SKUs and brands
Upgrade to MongoDB 2.2
• tag-aware sharding
Gilt Groupe is hiring!
http://techjobs.gilt.com
Questions?
[email protected]
@tinyrobots
The end
Bonus slides
Gilt tech talks
• Apache Camel and MongoDB @ Gilt http://bit.ly/OYO37K
• Deploying new features @ Gilt http://slidesha.re/OoCYfd
• Voldemort @ Gilt http://bit.ly/b9Qhib
https://twitter.com/stripe/status/298858032421535744