Java Web Development with MongoDB (presented at Devoxx 2010)
DESCRIPTION
In this presentation, we will try to answer the following:
- What is a document and a document database?
- How does replication and sharding enable me to scale my application?
- How does Java web development change when using MongoDB?
- How do I deploy my application with MongoDB?
TRANSCRIPT
Alvin Richards [email protected]
Topics
- Overview
- Data modeling
- Replication & Sharding
- Developing with Java
- Deployment
Drinking from the fire hose
Part One
MongoDB Overview
Strong adoption of MongoDB
90,000 Database downloads per month
Over 1,000 Production Deployments
Web 2.0 companies started out using this, but now:
- enterprises
- financial industries
3 Reasons:
- Performance: large number of readers / writers
- Large data volume
- Agility (ease of development)
NoSQL Really Means:
non-relational, next-generation operational datastores and databases
Past: one size fits all
- RDBMS (Oracle, MySQL)

Present: business intelligence and analytics is now its own segment
- RDBMS (Oracle, MySQL)
- New-generation OLAP (Vertica, Aster, Greenplum)

Future: non-relational operational stores ("NoSQL") join them; we claim the NoSQL segment will be:
- large
- not fragmented
- 'platformitize-able'
Philosophy: maximize features - up to the "knee" in the curve, then stop
[Chart: scalability & performance vs. depth of functionality, plotting memcached, key/value stores, and RDBMS]
No joins, no complex transactions enable:
- Horizontally Scalable Architectures
- New Data Models, improved ways to develop
Platform and Language support
MongoDB is implemented in C++ for best performance

Platforms (32/64-bit):
- Windows
- Linux, Mac OS X, FreeBSD, Solaris

Language drivers for:
- Java
- Ruby / Ruby on Rails
- C#
- C / C++
- Erlang
- Python, Perl, JavaScript
- Scala
- others...

Ease of development is a surprisingly big benefit: faster to code, faster to change, avoid upgrades and scheduled downtime. More predictable performance. Fast single-server performance means the developer spends less time manually coding around the database. Bottom line: usually, developers like it much better after trying.
Part Two
Data Modeling in MongoDB
So why model data?
A brief history of normalization
- 1970: E. F. Codd introduces 1st Normal Form (1NF)
- 1971: E. F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
- 1974: Codd & Boyce define Boyce/Codd Normal Form (BCNF)
- 2002: Date, Darwen, Lorentzos define 6th Normal Form (6NF)
Goals:
- Avoid anomalies when inserting, updating or deleting
- Minimize redesign when extending the schema
- Make the model informative to users
- Avoid bias towards a particular style of query
* source : wikipedia
The real benefit of relational
• Before relational• Data and Logic combined
• After relational• Separation of concerns• Data modeled independent of logic• Logic freed from concerns of data design
• MongoDB continues this separation
Relational made normalized data look like this
Document databases make normalized data look like this
Terminology
RDBMS          MongoDB
Table          Collection
Row(s)         JSON Document
Index          Index
Join           Embedding & Linking
Partition      Shard
Partition Key  Shard Key
DB Considerations
How can we manipulate this data?
• Dynamic Queries
• Secondary Indexes
• Atomic Updates
• Map Reduce
Considerations
- No joins
- Document writes are atomic
Access Patterns ?
• Read / Write Ratio
• Types of updates
• Types of queries
• Data life-cycle
So today’s example will use...
Design Session
Design documents that simply map to your application

post = {author: "Hergé",
        date: new Date(),
        text: "Destination Moon",
        tags: ["comic", "adventure"]}
>db.posts.save(post)
>db.posts.find()
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Destination Moon",
  tags : [ "comic", "adventure" ] }

Notes:
- _id must be unique, but can be anything you'd like
- MongoDB will generate a default _id if one is not supplied
Find the document
Secondary index for “author”
// 1 means ascending, -1 means descending
>db.posts.ensureIndex({author: 1})
>db.posts.find({author: 'Hergé'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Hergé", ... }
Add an index, find via index
Verifying indexes exist
>db.system.indexes.find()
// Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } }
// Index on author { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"), ns : "test.posts", key : { "author" : 1 }, name : "author_1" }
Query operators

Conditional operators:
$lt, $lte, $gt, $gte, $ne, $in, $nin, $mod, $all, $size, $exists, $type, ...

// find posts with any tags
>db.posts.find({tags: {$exists: true}})

Regular expressions:
// posts where author starts with h
>db.posts.find({author: /^h/i })

Counting:
// posts written by Hergé
>db.posts.find({author: "Hergé"}).count()
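These dynamic queries are evaluated against each document by the server. As a rough mental model (not the server's implementation - the `matches` function and the operator subset here are ours, for illustration only), a matcher over plain dictionaries looks like this:

```python
import re

def matches(doc, query):
    """Check a document against a tiny subset of MongoDB-style query
    operators. Illustrative only -- the real server supports far more."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):          # operator form, e.g. {"$exists": True}
            for op, arg in cond.items():
                if op == "$exists":
                    if (field in doc) != arg:
                        return False
                elif op == "$gt":
                    if value is None or not value > arg:
                        return False
                elif op == "$in":
                    if value not in arg:
                        return False
                else:
                    raise ValueError("unsupported operator: " + op)
        elif isinstance(cond, re.Pattern):   # regex form, e.g. /^h/i
            if value is None or not cond.search(value):
                return False
        elif isinstance(value, list):        # array fields match on any element
            if cond not in value:
                return False
        elif value != cond:
            return False
    return True

posts = [
    {"author": "Hergé", "text": "Destination Moon", "tags": ["comic", "adventure"]},
    {"author": "Kyle", "text": "great book"},
]
with_tags = [p for p in posts if matches(p, {"tags": {"$exists": True}})]
by_h = [p for p in posts if matches(p, {"author": re.compile("^h", re.I)})]
```

Note how the regex case-insensitivity flag plays the role of the `/.../i` suffix in the shell.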
Extending the Schema

new_comment = {author: "Kyle",
               date: new Date(),
               text: "great book"}

>db.posts.update({_id: "..." },
    {$push: {comments: new_comment},
     $inc: {comments_count: 1}})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments_count: 1,
  comments : [
    { author : "Kyle",
      date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
      text : "great book" } ]}
Extending the Schema

// create index on nested documents:
>db.posts.ensureIndex({"comments.author": 1})
>db.posts.find({"comments.author": "Kyle"})

// find last 5 posts:
>db.posts.find().sort({date: -1}).limit(5)

// most commented post:
>db.posts.find().sort({comments_count: -1}).limit(1)
When sorting, check if you need an index
Explain a query plan

> db.blogs.find({author: 'Hergé'}).explain()
{
  "cursor" : "BtreeCursor author_1",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 5,
  "indexBounds" : { "author" : [ [ "Hergé", "Hergé" ] ] }
}
Watch for full table scans
> db.blogs.find({text: 'Destination Moon'}).explain()
{
  "cursor" : "BasicCursor",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 0,
  "indexBounds" : { }
}
Map Reduce
Map reduce: count tags

mapFunc = function() {
  this.tags.forEach(function(z) { emit(z, {count: 1}); });
}

reduceFunc = function(k, v) {
  var total = 0;
  for (var i = 0; i < v.length; i++) {
    total += v[i].count;
  }
  return {count: total};
}
res = db.posts.mapReduce(mapFunc, reduceFunc)
>db[res.result].find() { _id : "comic", value : { count : 1 } } { _id : "adventure", value : { count : 1 } }
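The contract between mapFunc and reduceFunc is worth making explicit: map emits (key, value) pairs per document, the server groups the values by key, and reduce folds each group. A minimal in-memory Python sketch of that contract (the `map_reduce` helper and the second sample post are ours, for illustration):

```python
from collections import defaultdict

def map_reduce(docs, map_func, reduce_func):
    """Minimal in-memory model of MongoDB's mapReduce: run map over every
    document, group the emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_func(doc):
            groups[key].append(value)
    return {k: reduce_func(k, vs) for k, vs in groups.items()}

# mirrors mapFunc: emit(tag, {count: 1}) for every tag on the post
def map_tags(post):
    return [(tag, {"count": 1}) for tag in post.get("tags", [])]

# mirrors reduceFunc: sum the partial counts for one tag
def reduce_counts(tag, values):
    return {"count": sum(v["count"] for v in values)}

posts = [
    {"text": "Destination Moon", "tags": ["comic", "adventure"]},
    {"text": "Explorers on the Moon", "tags": ["comic"]},
]
result = map_reduce(posts, map_tags, reduce_counts)
# result maps each tag to its total, e.g. "comic" -> {"count": 2}
```

Because reduce may be re-applied to partial results on the server, its output must have the same shape as map's emitted values - which is why reduceFunc returns `{count: total}` rather than a bare number.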
Group
- Equivalent to a GROUP BY in SQL
- Specify the attributes to group the data by
- Process the results in a reduce function
Group

cmd = { key: { "author": true },
        initial: { count: 0 },
        reduce: function(obj, prev) { prev.count++; } };
result = db.posts.group(cmd);

[ { "author" : "Hergé", "count" : 1 },
  { "author" : "Kyle", "count" : 3 } ]
Review
So far:
- Started out with a simple schema
- Queried data
- Evolved the schema
- Queried / updated the data some more
Single Table Inheritance
>db.shapes.find()
{ _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1 }
{ _id: ObjectId("..."), type: "square", area: 4, d: 2 }
{ _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2 }
// find shapes where radius > 0 >db.shapes.find({radius: {$gt: 0}})
// create index >db.shapes.ensureIndex({radius: 1})
One to Many
- Embedded Array / Array Keys
  - $slice operator to return a subset of the array
  - some queries hard, e.g. find the latest comments across all documents
- Embedded tree
  - Single document
  - Natural
  - Hard to query
- Normalized (2 collections)
  - most flexible
  - more queries
Many to Many
Example:
- A product can be in many categories
- A category can have many products

Relational approach:
- Products: product_id
- Category: category_id
- Product_Categories: product_id, category_id
Many to Many

products: { _id: ObjectId("4c4ca23933fb5941681b912e"),
            name: "Destination Moon",
            category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                            ObjectId("4c4ca25433fb5941681b92af") ]}

categories: { _id: ObjectId("4c4ca25433fb5941681b912f"),
              name: "Adventure",
              product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                             ObjectId("4c4ca30433fb5941681b9130"),
                             ObjectId("4c4ca30433fb5941681b913a") ]}

// All categories for a given product
>db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
Alternative

products: { _id: ObjectId("4c4ca23933fb5941681b912e"),
            name: "Destination Moon",
            category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                            ObjectId("4c4ca25433fb5941681b92af") ]}

categories: { _id: ObjectId("4c4ca25433fb5941681b912f"),
              name: "Adventure"}

// All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})

// All categories for a given product
>product = db.products.findOne({_id: some_id})
>db.categories.find({_id: {$in: product.category_ids}})
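Both modeling styles boil down to array-membership lookups. A small in-memory sketch of the two queries (the short string ids here are stand-ins for ObjectIds, and the helper names are ours):

```python
# Hypothetical in-memory stand-ins for the products/categories collections.
products = [
    {"_id": "p1", "name": "Destination Moon", "category_ids": ["c1", "c2"]},
    {"_id": "p2", "name": "Explorers on the Moon", "category_ids": ["c1"]},
]
categories = [
    {"_id": "c1", "name": "Adventure"},
    {"_id": "c2", "name": "Comic"},
]

def products_in_category(category_id):
    # db.products.find({category_ids: <id>})
    # array fields match when ANY element equals the queried value
    return [p for p in products if category_id in p["category_ids"]]

def categories_of_product(product_id):
    # the "Alternative" model takes two steps: fetch the product, then
    # db.categories.find({_id: {$in: product.category_ids}})
    product = next(p for p in products if p["_id"] == product_id)
    return [c for c in categories if c["_id"] in product["category_ids"]]
```

The one-sided model trades a second query in one direction for not having to keep two arrays in sync on every change.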
Trees
Full Tree in Document

{ comments: [
    { author: "Kyle", text: "...",
      replies: [ { author: "Fred", text: "...", replies: [] } ]}
]}

Pros: single document, performance, intuitive
Cons: hard to search, partial results, 4MB document limit
Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent

Child Links
- Each node contains the ids of its children
- Can support graphs (multiple parents per child)
Array of Ancestors
- Store the ancestors of each node

{ _id: "a" }
{ _id: "b", ancestors: [ "a" ], parent: "a" }
{ _id: "c", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "d", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "e", ancestors: [ "a" ], parent: "a" }
{ _id: "f", ancestors: [ "a", "e" ], parent: "e" }
{ _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }

// find all descendants of b:
>db.tree2.find({ancestors: 'b'})

// find all ancestors of f:
>ancestors = db.tree2.findOne({_id: 'f'}).ancestors
>db.tree2.find({_id: {$in: ancestors}})
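To see why the ancestor-array pattern is attractive, note that both tree queries become a single pass over indexed fields - no recursion. A Python sketch using the slide's tree (the helper names are ours):

```python
# The "array of ancestors" tree from the slide, as plain dicts.
tree = [
    {"_id": "a", "ancestors": []},
    {"_id": "b", "ancestors": ["a"], "parent": "a"},
    {"_id": "c", "ancestors": ["a", "b"], "parent": "b"},
    {"_id": "d", "ancestors": ["a", "b"], "parent": "b"},
    {"_id": "e", "ancestors": ["a"], "parent": "a"},
    {"_id": "f", "ancestors": ["a", "e"], "parent": "e"},
    {"_id": "g", "ancestors": ["a", "b", "d"], "parent": "d"},
]

def descendants(node_id):
    # db.tree2.find({ancestors: node_id}): every node that lists node_id
    # among its ancestors is a descendant -- one indexable query.
    return [n["_id"] for n in tree if node_id in n["ancestors"]]

def ancestors(node_id):
    # findOne the node; its ancestors array already holds the full
    # root-to-parent path, in order.
    node = next(n for n in tree if n["_id"] == node_id)
    return node["ancestors"]
```

The cost is on writes: moving a subtree means rewriting the ancestors array of every node under it.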
findAndModify
Queue example

// Example: find the highest-priority job and mark it in progress
job = db.jobs.findAndModify({
  query: {inprogress: false},
  sort: {priority: -1},
  update: {$set: {inprogress: true, started: new Date()}},
  new: true})
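What findAndModify buys you here is that the select-sort-update happens as one atomic step on the server, so two workers can never claim the same job. This single-threaded Python sketch models only the selection/update logic, not the atomicity (the jobs data and function name are ours):

```python
from datetime import datetime

jobs = [
    {"_id": 1, "priority": 3, "inprogress": False},
    {"_id": 2, "priority": 7, "inprogress": False},
]

def find_and_modify_job(jobs):
    """Model of the findAndModify queue pattern: pick the highest-priority
    job that is not in progress and mark it claimed in one step."""
    candidates = sorted(
        (j for j in jobs if not j["inprogress"]),
        key=lambda j: j["priority"], reverse=True)   # sort: {priority: -1}
    if not candidates:
        return None
    job = candidates[0]
    job["inprogress"] = True                         # the $set update
    job["started"] = datetime.now()
    return job                                       # new: true -> updated doc

job = find_and_modify_job(jobs)   # claims the priority-7 job first
```

Each call claims the next-highest-priority unclaimed job; once the queue is drained it returns None, mirroring findAndModify matching no document.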
Part Three
Replication & Sharding
Scaling
- Data size only goes up
- Operations/sec only go up
- Vertical scaling is limited
- Hard to scale vertically in the cloud
- Can scale wider than higher

What is scaling? Well - hopefully a concern for everyone here.
Traditional Horizontal Scaling
• read only slaves• caching• custom partitioning code
scaling isn’t newsharding isn’tmanual re-balancing is painful at best
New methods of Scaling
• relational database clustering• consistent hashing (Dynamo)• range based partitioning (BigTable/PNUTS)
Read Scalability: Replication

[Diagram: writes go to the Primary of Replica Set 1; reads can go to the Primary or either Secondary]
Basics
- MongoDB replication is a bit like MySQL replication: asynchronous master/slave at its core
- Variations:
  - Master / Slave
  - Replica Pairs (deprecated - use Replica Sets)
  - Replica Sets

Replica Sets
- A cluster of N servers
- Any (one) node can be primary
- Consensus election of primary
- Automatic failover
- Automatic recovery
- All writes go to the primary
- Reads can go to the primary (default) or a secondary
Replica Sets - Design Concepts

1. A write is durable once available on a majority of members
2. Writes may be visible before a cluster-wide commit has been completed
3. On a failover, if data has not been replicated from the primary, that data is dropped (see #1)
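Concepts #1 and #3 are two sides of the same rule: a majority-acknowledged write survives any failover, and writes that never reached the surviving members are rolled back. A tiny Python sketch of both (the function names and log lists are ours, for illustration):

```python
def durable_on_majority(acked_members, total_members):
    """Design concept #1: a write counts as durable once a strict majority
    of replica-set members have it, because any majority that elects the
    next primary must overlap with (and so retain) that write."""
    return acked_members > total_members // 2

def surviving_writes(old_primary_log, survivor_log):
    """Design concept #3: after a failover, writes that never replicated
    to the surviving members are dropped; only the shared prefix wins."""
    return [w for w in old_primary_log if w in survivor_log]

# With a 3-member set: one ack is not durable, two acks are.
```

This is exactly why the Java driver's WriteConcern (shown later) lets you wait for a specific number of servers before acknowledging a write.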
Replica Set: Establishing
[Members 1, 2 and 3 come up]

Replica Set: Electing primary
[Member 2 elected PRIMARY]

Replica Set: Failure of master
[Member 2 DOWN; Members 1 and 3 negotiate a new master; Member 3 becomes PRIMARY]

Replica Set: Reconfiguring
[Member 2 still DOWN; Member 3 PRIMARY]

Replica Set: Member recovers
[Member 2 RECOVERING; Member 3 PRIMARY]

Replica Set: Active
[All members up; Member 3 PRIMARY]
Set Member Types
- Normal (priority == 1)
- Passive (priority == 0)
- Arbiter (no data, but can vote)
Write Scalability: Sharding

[Diagram: reads and writes are routed across three replica sets - Replica Set 1 (key range 0..30), Replica Set 2 (key range 31..60), Replica Set 3 (key range 61..100) - each with a Primary and two Secondaries]
Sharding
- Scale horizontally for data size, index size, write and consistent-read scaling
- Distribute databases, collections, or objects within a collection
- Auto-balancing, migrations and management happen with no downtime
- Replica Sets for inconsistent (eventually consistent) read scaling
Sharding
- Choose how you partition data
- Can convert from a single master to a sharded system with no downtime
- Same features as a non-sharded single master
- Fully consistent
Range Based
- A collection is broken into chunks by range
- Chunks default to 200MB or 100,000 objects
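Range partitioning means that routing an operation by shard key is just a lookup in a sorted chunk table. A minimal Python sketch (the chunk boundaries mirror the earlier diagram; the table layout and function name are ours):

```python
import bisect

# Hypothetical chunk table: each chunk owns shard keys in [low, high).
chunks = [
    {"low": float("-inf"), "high": 31, "shard": "rs1"},   # keys 0..30
    {"low": 31, "high": 61, "shard": "rs2"},              # keys 31..60
    {"low": 61, "high": float("inf"), "shard": "rs3"},    # keys 61..100
]

def shard_for_key(key):
    """Range-based partitioning: binary-search for the chunk whose
    [low, high) range contains the shard key."""
    highs = [c["high"] for c in chunks]
    return chunks[bisect.bisect_right(highs, key)]["shard"]
```

Because neighbouring keys land in the same chunk, range queries on the shard key touch few shards - the property behind "routed" queries below.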
Architecture

[Diagram: clients connect to one or more mongos routers; each mongos routes operations to the shards (each shard is one or more mongod processes) and consults the config server mongod processes]
Config Servers
- Hold the metadata of where chunks are located
- 1 or 3 of them (3 for availability)
- Changes are made with a two-phase commit
- If a majority are down, the metadata goes read-only
- The system stays online as long as 1 of the 3 is up
Shards
- Hold the actual data
- Can be master, master/slave, or replica sets
- Replica sets give sharding + full auto-failover
- Regular mongod processes
mongos
- Sharding router (or switch)
- Acts just like a mongod to clients
- Can have 1 or as many as you want
- Can run on the app server, so no extra network traffic
Writes
- Inserts: require the shard key, routed
- Removes: routed and/or scattered
- Updates: routed or scattered
Queries
- By shard key: routed
- Sorted by shard key: routed in order
- By non-shard key: scatter/gather
- Sorted by non-shard key: distributed merge sort
Operations
- split: breaking a chunk into 2
- migrate: moving a chunk from one shard to another
- balancing: moving chunks automatically to keep the system in balance
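split and balancing can be sketched as simple functions. This is an illustration of the idea only - the median-cut and one-chunk-at-a-time policies here are simplifications we chose, not MongoDB's exact algorithms:

```python
def split_chunk(chunk, docs, max_docs=100_000):
    """split: when a chunk exceeds its limit, cut it at the median key
    so each half owns a contiguous sub-range."""
    if len(docs) <= max_docs:
        return [chunk]
    keys = sorted(d["key"] for d in docs)
    mid = keys[len(keys) // 2]
    return [
        {"low": chunk["low"], "high": mid, "shard": chunk["shard"]},
        {"low": mid, "high": chunk["high"], "shard": chunk["shard"]},
    ]

def balance(chunk_counts):
    """balancing: migrate one chunk at a time from the most-loaded shard
    to the least-loaded until the spread is at most one chunk."""
    counts = dict(chunk_counts)
    while max(counts.values()) - min(counts.values()) > 1:
        src = max(counts, key=counts.get)
        dst = min(counts, key=counts.get)
        counts[src] -= 1          # one "migrate" operation
        counts[dst] += 1
    return counts

halves = split_chunk({"low": 0, "high": 100, "shard": "rs1"},
                     [{"key": i} for i in range(10)], max_docs=4)
balanced = balance({"rs1": 10, "rs2": 2, "rs3": 3})
```

Note that splitting never moves data (both halves stay on the same shard); only migrate does, which is why balancing is throttled to one chunk at a time.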
Part Four
Java Development
Library Choices
- Raw MongoDB Driver
  - Map<String, Object> view of objects
  - Rough but dynamic
- Morphia (type-safe mapper)
  - POJOs
  - Annotation based (similar to JPA)
  - Syntactic sugar and helpers
- Others
  - Code generators, other JVM languages
MongoDB Java Driver
- BSON Package
  - Types
  - Encode/Decode
  - DBObject (Map<String, Object>)
  - Nested Maps
  - Directly encoded to the binary format (BSON)
- MongoDB Package
  - MongoDBObject (BasicDBObject/Builder)
  - DB/DBCollection
  - DBQuery/DBCursor
BSON Package - Types
- int and long
- Array/ArrayList
- String
- byte[] - binData
- Double (IEEE 754 FP)
- Date (milliseconds since epoch)
- Null
- Boolean
- JavaScript String
- Regex
MongoDB Package
- Mongo
  - Connection, thread-safe
  - WriteConcern*
- DB
  - Auth, Collections
  - getLastError()
  - command(), eval()
  - requestStart/Done
- DBCollection
  - insert/save/find/remove/update/findAndModify
  - ensureIndex
Simple Example

DB db = new Mongo().getDB("blogdb");
DBCollection coll = db.getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

coll.save(BasicDBObjectBuilder.start("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .append("tags", tags)
    .get());
Simple Example, Again

DB db = new Mongo().getDB("blogdb");
DBCollection coll = db.getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

Map<String, Object> fields = new HashMap<String, Object>();
fields.put("author", "Hergé");
fields.put("text", "Destination Moon");
fields.put("date", new Date());
fields.put("tags", tags);

coll.insert(new BasicDBObject(fields));
DBObject <-> (B/J)SON

{ author: "Hergé", text: "Destination Moon", date: ... }

DBObject dbObj = BasicDBObjectBuilder.start()
    .append("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .get();

String text = (String) dbObj.get("text");
JSON.parse(...)

DBObject dbObj = (DBObject) JSON.parse(
    "{'author': 'Hergé', " +
    " 'text': 'Destination Moon', " +
    " 'date': 'Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)'}");
Lists

List<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");
dbObj.put("tags", tags);

{..., tags: ['comic', 'adventure']}
Maps of Maps
- Can represent an object graph/tree
- Always keyed off a String (field)
Morphia: MongoDB Mapper
- Maps POJOs
- Type-safe
- Access patterns: DAO/Datastore/???
- Data types
- JPA-like
- Many concepts came from Objectify (GAE)
Annotations
- @Entity("collectionName")
- @Id
- @Transient (not transient)
- @Indexed(...)
- @Property("fieldAlias")
- @AlsoLoad({aliases})
- @Reference
- @Serialized
- [@Embedded]
Lifecycle Events
- @PrePersist
- @PreSave
- @PostPersist
- @PreLoad
- @PostLoad
- EntityListeners
- EntityInterceptor
Basic POJO

@Entity
class Blog {
  @Id String author;
  @Indexed Date date;
  String text;
}
Datastore Basics
- get(class, id)
- find(class, [...])
- save(entity, [...])
- delete(query)
- getCount(query)
- update/First(query, upOps)
- findAndModify/Delete(query, upOps)
Add, Get, Delete

Blog entry = new Blog("Hergé", new Date(), "Destination Moon");

Datastore ds = new Morphia().createDatastore();

ds.save(entry);

Blog foundEntry = ds.get(Blog.class, "Hergé");

ds.delete(entry);
Queries

Datastore ds = ...
Query<Blog> q = ds.createQuery(Blog.class);

q.field("author").equal("Hergé").limit(5);

for (Blog e : q.fetch()) print(e);

Blog entry = q.field("author").startsWith("H").get();
Update

Datastore ds = ...
Query<Blog> q = ds.find(Blog.class, "author", "Hergé");
UpdateOperations<Blog> uo = ds.createUpdateOperations(Blog.class);

uo.inc("views", 1).set("lastUpdated", new Date());

UpdateResults res = ds.update(q, uo);
if (res.getUpdatedCount() > 0) // do something?
Update Operations
- set(field, val) / unset(field)
- inc(field, [val]) / dec(field)
- add(field, val) / addAll(field, vals)
- removeFirst/Last(field) / removeAll(field, vals)
Relationships
- [@Embedded]
  - Loaded/saved with the entity
- @Reference
  - Stored as DBRef(s)
  - Loaded with the entity
  - Not automatically saved
- Key<T> (DBRef)
  - Stored as DBRef(s)
  - Just a link, but resolvable by Datastore/Query
MongoDB features in Java
- Durability
- Replication
- Sharding
- Connection options
Durability
What failures do you need to recover from?
- Loss of a single database node?
- Loss of a group of nodes?
Durability - Master only
• Write acknowledged when in memory on master only
Durability - Master + Slaves
• Write acknowledged when in memory on master + slave
• Will survive failure of a single node
Durability - Master + Slaves + fsync
- Write acknowledged when in memory on master + slaves
- Pick a "majority" of nodes
- fsync in batches (since it is blocking)
Setting default error checking

// Do not check or report errors on write
com.mongodb.WriteConcern.NONE;

// Use default level of error checking. Do not send
// a getLastError(), but raise exceptions on errors
com.mongodb.WriteConcern.NORMAL;

// Send getLastError() after each write. Raise an
// exception on error
com.mongodb.WriteConcern.STRICT;
// Set the concerndb.setWriteConcern(concern);
Customized WriteConcern

// Wait for three servers to acknowledge the write
WriteConcern concern = new WriteConcern(3);

// Wait for three servers, with a 1000ms timeout
WriteConcern concern = new WriteConcern(3, 1000);

// Wait for three servers, 1000ms timeout, and fsync
// data to disk
WriteConcern concern = new WriteConcern(3, 1000, true);

// Set the concern
db.setWriteConcern(concern);
Using Replication from Java
slaveOk()
- tells the driver it may send read requests to Secondaries
- the driver will always send writes to the Primary

Can be set on:
- DB.slaveOk()
- Collection.slaveOk()
- find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
Using sharding from Java
Before sharding
coll.save(BasicDBObjectBuilder.start("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date())
    .get());
Query q = ds.find(Blog.class, “author”, “Hergé”);
After sharding
No code change required!
Connection options
MongoOptions mo = new MongoOptions();
// Restrict number of connectionsmo.connectionsPerHost = MAX_THREADS + 5;
// Auto reconnection on connection failuremo.autoConnectRetry = true;
Part Five
Deploying MongoDB
- Performance tuning
- Sizing
- O/S tuning / file system layout
- Backup
Backup
- Typically backups are driven from a slave
- Eliminates impact to client/application traffic on the master

Two strategies:
- mongodump / mongorestore
- fsync + lock
mongodump
- binary, compact object dump
- each object written is individually consistent
- not necessarily consistent from start to finish
fsync + lock
- fsync flushes buffers to disk
- lock blocks writes

>db.runCommand({fsync: 1, lock: 1})

- Use a file-system / LVM / storage snapshot
- Unlock:
>db.$cmd.sys.unlock.findOne();
Slave delay
- Protection against app faults
- Protection against administration mistakes
O/S Config
- RAM: lots of it
- Filesystem: EXT4 / XFS (better file allocation & performance)
- I/O: the more disks the better; consider RAID10 or other RAID configs
Monitoring
• Munin, Cacti, Nagios
Primary functions:
- Measure stats over time
- Tell you what is going on with your system
- Alert when a threshold is reached
Remember me?
Summary
MongoDB makes building Java web applications simple.
You can focus on what the app needs to do.
MongoDB has built-in:
- Horizontal scaling (reads and writes)
- Simplified schema evolution
- Simplified deployment and operations
- Best match for development tools and agile processes