MongoDB 101 DC Group
DESCRIPTION
An introduction to MongoDB. You can exercise the examples against 10gen's zips.json file.
TRANSCRIPT
MongoDB 101
Agenda
• Document Structures and Corollaries
• Getting Started
• CRUD Operations
• Indexing
Agenda
• Aggregation
• Joins and Transactions
• Replica Sets
• Sharding
Document JSON Example
{
  "_id" : ObjectId("72f494c1c3df14726f1403b3"),
  "city" : "Vienna",
  "zipcode" : "22180",
  "pop" : 20795,
  "surveyDate" : new Date()
}
Embedded Example
{
  "name" : {
    "first" : "Joe",
    "last" : "Schmoe"
  },
  "address" : {
    "street" : "123 Maple Avenue",
    "city" : "Ashburn",
    "state" : "VA",
    "zipcode" : "20148"
  },
  "age" : 45
}
Another Example
{
  "title" : "MongoDB 101",
  "author" : "John Ragan",
  "content" : "My thoughts on MongoDB",
  "comments" : [
    { "name" : "Jake the Troll", "comment" : "My trollish comments" },
    { "name" : "Dwight Merriman", "comment" : "Insightful comments from Dwight" },
    { "name" : "Jake the Troll", "comment" : "My even more trollish comments" }
  ],
  "tags" : [ "mongodb", "101" ]
}
Documents
• Analogous to a database row
• Schema-free
• Keys and Values
  – Strings
  – Value types
• Special key: _id
  – unique
MongoDB and Relational Corollaries
Relational Database    MongoDB
Database instance      Mongo instance
Database(s)            Database(s)
Table(s)               Collection(s)
Row(s)                 Document(s)
Values                 Keys and Values
Easy to Get Started
Insert
zip = {
  "city" : "Ashburn",
  "loc" : [ -77.480612, 39.039918 ],
  "pop" : 19416,
  "state" : "VA",
  "_id" : "20148"
}
db.census.insert(zip)
Insert
• JSON converted to BSON
• Must be no larger than 16 MB (in BSON)
• Adds _id unless already specified
Find
find() findOne()
db.census.findOne({"_id" : "22180"})
Projection
> db.census.find({}, {"city" : 1, "state" : 1})
{ "_id" : "22180", "city" : "VIENNA", "state" : "VA" }
db.census.find({}, {"loc" : 0})
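The two projection styles above (include with 1, exclude with 0) can be sketched in plain JavaScript. This is an in-memory illustration only, not how MongoDB implements projection; project() is a hypothetical helper, and the sample document is the Ashburn zip from the Insert slide with the city upper-cased to match zips.json.

```javascript
// Hypothetical in-memory helper that mimics find()'s second argument.
// Assumption: the spec is either all-include (1) or all-exclude (0),
// with _id as the only field allowed to differ.
function project(doc, spec) {
  const keys = Object.keys(spec).filter((k) => k !== "_id");
  const including = keys.some((k) => spec[k] === 1);
  const out = {};
  for (const [field, value] of Object.entries(doc)) {
    if (field === "_id") {
      // _id comes back unless it is explicitly excluded.
      if (spec._id !== 0) out._id = value;
    } else if (including ? spec[field] === 1 : spec[field] !== 0) {
      out[field] = value;
    }
  }
  return out;
}

const zip = {
  _id: "20148",
  city: "ASHBURN",
  loc: [-77.480612, 39.039918],
  pop: 19416,
  state: "VA",
};

const included = project(zip, { city: 1, state: 1 }); // _id, city, state
const excluded = project(zip, { loc: 0 });            // everything but loc
console.log(included);
console.log(excluded);
```

Note that _id is returned in include mode even though it was not requested, which matches the shell output on the slide.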
Query Conditionals
• $gt
• $gte
• $lt
• $lte
• $ne
• $not
db.census.find({"_id" : {"$gte" : "70300", "$lte" : "70399"}})
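Because the _id values in this collection are strings, the range query above compares strings, not numbers. The following is a rough in-memory sketch of that behavior, assuming fixed-width five-digit zip codes; matches() and the operators table are hypothetical names, and the same-type check only approximates how MongoDB keeps comparisons within one value type.

```javascript
// Hypothetical in-memory versions of a few query conditionals.
const sameType = (a, b) => typeof a === typeof b;
const operators = {
  $gt: (a, b) => sameType(a, b) && a > b,
  $gte: (a, b) => sameType(a, b) && a >= b,
  $lt: (a, b) => sameType(a, b) && a < b,
  $lte: (a, b) => sameType(a, b) && a <= b,
  $ne: (a, b) => a !== b,
};

// True when doc[field] satisfies every condition, e.g. {$gte: x, $lte: y}.
function matches(doc, field, conditions) {
  return Object.entries(conditions).every(([op, operand]) =>
    operators[op](doc[field], operand)
  );
}

const docs = [{ _id: "22180" }, { _id: "70301" }, { _id: "70399" }, { _id: "70400" }];
const hits = docs
  .filter((d) => matches(d, "_id", { $gte: "70300", $lte: "70399" }))
  .map((d) => d._id);
console.log(hits); // [ '70301', '70399' ]

// A numeric _id does not satisfy a string range in this sketch.
console.log(matches({ _id: 70350 }, "_id", { $gte: "70300" })); // false
```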
In and Not In
• $in
• $nin
db.census.find({"_id" : {"$in" : ["22180", "70301", "22030"]}})
OR Queries
db.census.find({"$or" : [
  {"_id" : {"$in" : ["22180", "90210"]}},
  {"city" : "ASHBURN"}
]})
Regular Expressions
• Perl Compatible Regular Expression (PCRE)
db.census.find({"city" : /^ASHBU?/i})
Limits, Skips and Sorts
db.census.find({"city" : "CHICAGO"}).skip(3).limit(4).sort({"zipcode" : -1})
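Although sort() is chained last in the query above, the server sorts first, then skips, then limits. A minimal in-memory sketch of that order follows; runCursor() is a hypothetical helper that handles a single sort key, and the Chicago documents are illustrative rather than taken from zips.json.

```javascript
// Hypothetical helper: applies sort, then skip, then limit,
// regardless of the order the options were specified in.
function runCursor(docs, { sort, skip = 0, limit = Infinity }) {
  const [[field, direction]] = Object.entries(sort); // single sort key only
  const sorted = [...docs].sort(
    (a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0) * direction
  );
  return sorted.slice(skip, skip + limit);
}

// Eight illustrative documents with zip codes 60601 through 60608.
const chicago = Array.from({ length: 8 }, (_, i) => ({
  city: "CHICAGO",
  zipcode: String(60601 + i),
}));

const page = runCursor(chicago, { skip: 3, limit: 4, sort: { zipcode: -1 } });
console.log(page.map((d) => d.zipcode)); // [ '60605', '60604', '60603', '60602' ]
```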
Update
update( <criteria>, <new doc> )
db.census.update({_id : "22011"}, {city : "BROADLANDS"})
Update - $set and $unset
db.users.update({"name" : "joe"},{"$set" : {"favorite book" : "harry potter"}})
db.users.update({"name" : "joe"}, {"$unset" : {"favorite book" : 1}})
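One thing worth spelling out: an update whose second argument has no $ operators replaces the whole document (keeping only _id), while $set and $unset touch individual fields. The sketch below imitates that difference in memory; applyUpdate() is a hypothetical helper covering only these two operators, and the starting document's values are made up.

```javascript
// Hypothetical in-memory imitation of update semantics.
function applyUpdate(doc, update) {
  const hasOperators = Object.keys(update).some((k) => k.startsWith("$"));
  if (!hasOperators) {
    // Plain document: everything except _id is replaced.
    return { _id: doc._id, ...update };
  }
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field];
  }
  return result;
}

// Made-up starting document for zip 22011.
const before = { _id: "22011", city: "OLD NAME", state: "VA", pop: 100 };

const replaced = applyUpdate(before, { city: "BROADLANDS" });
const patched = applyUpdate(before, { $set: { city: "BROADLANDS" } });
const trimmed = applyUpdate(before, { $unset: { pop: 1 } });

console.log(replaced); // state and pop are gone
console.log(patched);  // only city changed
console.log(trimmed);  // only pop removed
```

If the intent on the Update slide is to rename the city and keep the other fields, the $set form is the safer choice.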
Delete
db.census.remove( <criteria> )
db.census.remove({city : "NORTH POLE"})
db.census.remove({})
db.census.drop()
Indexes
db.census.ensureIndex({"city" : 1})
FAST: db.census.find().sort({"city" : 1})
SLOW: db.census.find().sort({"pop" : 1, "city" : 1})
db.census.ensureIndex({"pop" : 1, "city" : 1})
Index Ordering
db.census.ensureIndex({"pop" : 1, "city" : 1})
Fast or Slow?
db.census.find().sort({"pop" : -1, "city" : 1})
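The answer to "Fast or Slow?" is slow: a compound index can serve a sort only when the sort keys are a prefix of the index keys and the directions either all match or are all reversed. A small sketch of that rule, assuming a query with no filter; sortCanUseIndex() is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical check: can this sort be served by walking the index
// forwards or backwards?
function sortCanUseIndex(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  // Sort fields must be a prefix of the index fields, in the same order.
  const samePrefix = sortKeys.every(([field], i) => field === indexKeys[i][0]);
  if (!samePrefix) return false;

  // Directions must all match (forward scan) or all be flipped (reverse scan).
  const forward = sortKeys.every(([, dir], i) => dir === indexKeys[i][1]);
  const backward = sortKeys.every(([, dir], i) => dir === -indexKeys[i][1]);
  return forward || backward;
}

const index = { pop: 1, city: 1 };
console.log(sortCanUseIndex(index, { pop: 1, city: 1 }));   // true
console.log(sortCanUseIndex(index, { pop: -1, city: -1 })); // true
console.log(sortCanUseIndex(index, { pop: -1, city: 1 }));  // false
console.log(sortCanUseIndex(index, { city: 1 }));           // false
```

To make the mixed-direction sort fast, the index itself would need mixed directions, such as {"pop" : -1, "city" : 1}.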
Other Index Options
db.census.ensureIndex({"city" : 1}, {
  "name" : "myIndex",
  "unique" : true,
  "dropDups" : true
})
Explain
• explain will return information
  – indexes used for the query (if any)
  – stats about timing
  – the number of documents scanned
db.census.find({city:"CHICAGO"}).explain()
Aggregation Framework
• Largest and smallest cities in Virginia, California and Louisiana
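The slide names the question but not the steps. One plausible shape for the computation is: filter to the states of interest, sum population per city (a city can span several zip codes), then pick the largest and smallest city per state. The sketch below does this in plain JavaScript rather than with the aggregation framework itself; cityExtremesByState() is a hypothetical helper, and every row except the two marked as coming from the slides is made up.

```javascript
// Hypothetical in-memory version of a match -> group -> group computation.
function cityExtremesByState(zips, states) {
  // Step 1 and 2: keep the requested states and total population per city.
  const cityPop = new Map(); // "STATE|CITY" -> summed population
  for (const { state, city, pop } of zips) {
    if (!states.includes(state)) continue;
    const key = `${state}|${city}`;
    cityPop.set(key, (cityPop.get(key) || 0) + pop);
  }

  // Step 3: track the largest and smallest city per state.
  const result = {};
  for (const [key, pop] of cityPop) {
    const [state, city] = key.split("|");
    const entry =
      result[state] ||
      (result[state] = { largest: { city, pop }, smallest: { city, pop } });
    if (pop > entry.largest.pop) entry.largest = { city, pop };
    if (pop < entry.smallest.pop) entry.smallest = { city, pop };
  }
  return result;
}

const sampleZips = [
  { state: "VA", city: "VIENNA", pop: 20795 },     // from the slides
  { state: "VA", city: "ASHBURN", pop: 19416 },    // from the slides
  { state: "VA", city: "ASHBURN", pop: 1000 },     // made-up second zip code
  { state: "CA", city: "EXAMPLEVILLE", pop: 500 }, // made-up
  { state: "CA", city: "SAMPLETON", pop: 700 },    // made-up
  { state: "NY", city: "IGNORED", pop: 1 },        // made-up, filtered out
];

const extremes = cityExtremesByState(sampleZips, ["VA", "CA", "LA"]);
console.log(JSON.stringify(extremes, null, 2));
```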
MongoDB
• Relational Databases are Dead
  – Of course that is not true!
  – Right Tool for the Right Job
Why MongoDB?
• Schema flexibility
• Developer speed
• Horizontal scalability
Developer Flexibility
“An elephant should not always have to sit on your data before you persist it”
Increasing Horizontal Scalability
• No joins
  – Thus, no distributed joins
• No transactions
  – Thus, no distributed transactions
Life Without Joins
• Already denormalized, or reference IDs
• One to One relationships
• One to Many relationships
• Many to Many references
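The two options in the first bullet can be sketched side by side: either the related data is embedded in the document, or the document stores an ID and the application does a second lookup. This is an in-memory illustration; authorId, the "a1" key, and resolveAuthor() are hypothetical names, and the Map stands in for a second collection.

```javascript
// Stand-in for an "authors" collection, keyed by _id.
const authors = new Map([["a1", { _id: "a1", name: "John Ragan" }]]);

// Option 1: denormalized, the author is embedded in the post.
const embeddedPost = { title: "MongoDB 101", author: { name: "John Ragan" } };

// Option 2: the post holds a reference ID instead.
const referencingPost = { title: "MongoDB 101", authorId: "a1" };

// The application, not the database, resolves the reference.
function resolveAuthor(post, authorsById) {
  if (post.author) return post.author;            // no second lookup needed
  return authorsById.get(post.authorId) || null;  // second lookup by ID
}

console.log(resolveAuthor(embeddedPost, authors).name);    // John Ragan
console.log(resolveAuthor(referencingPost, authors).name); // John Ragan
```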
Life Without Transactions
• Document-level transaction boundaries
• Nesting within documents
• Two-phase commit
Update - $inc
{"url" : "www.example.com","pageviews" : 52}
db.analytics.update({"url" : "www.example.com"}, {"$inc" : {"pageviews" : 1}})
{"url" : "www.example.com","pageviews" : 53}
Replica Sets
• Primary-Secondary cluster
  – Automatic failover
  – Primary elected by cluster
• One Primary, many Secondaries
  – Others
• Fully automatic
  – It handles voting, etc.
• Three-node viable minimum
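The three-node minimum follows from the election rule: a primary needs votes from a majority of the set's members. A short arithmetic sketch, assuming every member has one vote; majority() and toleratedFailures() are hypothetical helper names.

```javascript
// Votes needed to elect a primary in a set of n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Members that can be lost while a primary can still be elected.
function toleratedFailures(n) {
  return n - majority(n);
}

for (const n of [2, 3, 5]) {
  console.log(`${n} nodes: majority ${majority(n)}, tolerates ${toleratedFailures(n)} down`);
}
// 2 nodes: majority 2, tolerates 0 down
// 3 nodes: majority 2, tolerates 1 down
// 5 nodes: majority 3, tolerates 2 down
```

A two-node set cannot elect a primary once either node is lost, so three is the smallest set that survives a single failure.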
Demo Replica Set Failover
Sharding
• The process of splitting data up and storing different portions of the data on different machines
• Automatic vs. manual
• Chunks
  – Shard Key
[Diagram: Client connects to Mongos, which sits in front of three Mongod shards]
Sharding
• Server types:
  – Shard
    • Holds a subset of a collection's data
      – Single mongod server
      – Replica set
  – Mongos
    • Router process; aggregates responses
    • Does not store anything
  – Config server
    • Stores cluster configuration: which data is on which shard
• Start these in reverse order (config server first)
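The roles above can be tied together with a small routing sketch: the config server's chunk map says which shard-key range lives on which shard, and the router uses it to pick targets. This is an in-memory illustration only; the chunk boundaries, the shard names, and the shardFor() and shardsForRange() helpers are all hypothetical, with null standing in for an open-ended range.

```javascript
// Hypothetical chunk map on the zip-code shard key: [min, max) per chunk.
const chunkMap = [
  { min: null, max: "30000", shard: "shard0" },
  { min: "30000", max: "70000", shard: "shard1" },
  { min: "70000", max: null, shard: "shard2" },
];

const inChunk = (key, c) =>
  (c.min === null || key >= c.min) && (c.max === null || key < c.max);

// A query on one shard-key value goes to exactly one shard.
function shardFor(key, chunks) {
  return chunks.find((c) => inChunk(key, c)).shard;
}

// A range query goes to every shard whose chunks overlap [low, high].
function shardsForRange(low, high, chunks) {
  return chunks
    .filter((c) => (c.max === null || low < c.max) && (c.min === null || high >= c.min))
    .map((c) => c.shard);
}

console.log(shardFor("22180", chunkMap));                // shard0
console.log(shardFor("70301", chunkMap));                // shard2
console.log(shardsForRange("70300", "70399", chunkMap)); // [ 'shard2' ]
console.log(shardsForRange("22180", "70301", chunkMap)); // all three shards
```

Queries that do not include the shard key cannot be narrowed this way and have to be sent to every shard, which is where the router's response aggregation comes in.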
Summary
• Document Structures and Corollaries
• Getting Started
• CRUD Operations
• Indexing
Summary
• Aggregation
• Joins and Transactions
• Replica Sets
• Sharding