mongouk 2011 schema design
DESCRIPTION
Schema design... how to I represent 1:1, 1:M and trees in JSON? In this talk we aim to explain schema design in MongoDB.TRANSCRIPT
![Page 2: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/2.jpg)
Topics
Introduction• Basic Data Modeling• Evolving a schema
Common patterns• Single table inheritance• One-to-Many & Many-to-Many• Trees• Queues
![Page 3: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/3.jpg)
So why model data?
http://www.flickr.com/photos/42304632@N00/493639870/
![Page 4: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/4.jpg)
A brief history of normalization• 1970 E.F.Codd introduces 1st Normal Form (1NF)• 1971 E.F.Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)• 1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)• 2002 Date, Darween, Lorentzos define 6th Normal Form (6NF)
Goals:• Avoid anomalies when inserting, updating or deleting• Minimize redesign when extending the schema• Make the model informative to users• Avoid bias towards a particular style of query
* source : wikipedia
![Page 5: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/5.jpg)
The real benefit of relational
• Before relational• Data and Logic combined
• After relational• Separation of concerns• Data modeled independent of logic• Logic freed from concerns of data design
• MongoDB continues this separation
![Page 6: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/6.jpg)
Relational made normalized data look like this
![Page 7: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/7.jpg)
Document databases make normalized data look like this
![Page 8: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/8.jpg)
Terminology
RDBMS MongoDB
Table Collection
Row(s) JSON Document
Index Index
Join Embedding & Linking
Partition Shard
Partition Key Shard Key
![Page 9: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/9.jpg)
So today’s example will use...
![Page 10: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/10.jpg)
Design Session
Design documents that simply map to your application> post = {author: "Hergé", date: ISODate("2011-‐09-‐18T09:56:06.298Z"), text: "Destination Moon", tags: ["comic", "adventure"]}
> db.posts.save(post)
![Page 11: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/11.jpg)
> db.posts.find()
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Hergé", date: ISODate("2011-‐09-‐18T09:56:06.298Z"), text: "Destination Moon", tags: [ "comic", "adventure" ] } Notes:• ID must be unique, but can be anything you’d like• MongoDB will generate a default ID if one is not supplied
Find the document
![Page 12: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/12.jpg)
Secondary index for “author”
// 1 means ascending, -‐1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'Hergé'}) { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), date: ISODate("2011-‐09-‐18T09:56:06.298Z"), author: "Hergé", ... }
Add and index, find via Index
![Page 13: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/13.jpg)
Examine the query plan> db.blogs.find({author: 'Hergé'}).explain(){ "cursor" : "BtreeCursor author_1", "nscanned" : 1, "nscannedObjects" : 1, "n" : 1, "millis" : 5, "indexBounds" : { "author" : [ [ "Hergé", "Hergé" ] ] }}
![Page 14: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/14.jpg)
Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,
// find posts with any tags> db.posts.find({tags: {$exists: true}})
![Page 15: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/15.jpg)
Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,
// find posts with any tags> db.posts.find({tags: {$exists: true}})
Regular expressions:// posts where author starts with h> db.posts.find({author: /^h/i })
![Page 16: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/16.jpg)
Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,
// find posts with any tags> db.posts.find({tags: {$exists: true}})
Regular expressions:// posts where author starts with h> db.posts.find({author: /^h/i })
Counting: // number of posts written by Hergé> db.posts.find({author: “Hergé”}).count()
![Page 17: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/17.jpg)
Extending the Schema new_comment = {author: “Kyle”, date: new Date(), text: “great book”}
> db.posts.update( {text: “Destination Moon” }, { ‘$push’: {comments: new_comment}, ‘$inc’: {comments_count: 1}})
![Page 18: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/18.jpg)
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Hergé", date : ISODate("2011-‐09-‐18T09:56:06.298Z"), text : "Destination Moon", tags : [ "comic", "adventure" ], comments : [ { author : "Kyle", date : ISODate("2011-‐09-‐19T09:56:06.298Z"), text : "great book" } ], comments_count: 1 }
Extending the Schema
![Page 19: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/19.jpg)
// create index on nested documents:> db.posts.ensureIndex({"comments.author": 1})
> db.posts.find({comments.author:”Kyle”})
Extending the Schema
![Page 20: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/20.jpg)
// create index on nested documents:> db.posts.ensureIndex({"comments.author": 1})
> db.posts.find({comments.author:”Kyle”})
// find last 5 posts:> db.posts.find().sort({date:-‐1}).limit(5)
Extending the Schema
![Page 21: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/21.jpg)
// create index on nested documents:> db.posts.ensureIndex({"comments.author": 1})
> db.posts.find({comments.author:”Kyle”})
// find last 5 posts:> db.posts.find().sort({date:-‐1}).limit(5)
// most commented post:> db.posts.find().sort({comments_count:-‐1}).limit(1)
When sorting, check if you need an index
Extending the Schema
![Page 22: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/22.jpg)
Watch for full table scans
> db.blogs.find({text: 'Destination Moon'}).explain() { "cursor" : "BasicCursor", "nscanned" : 1, "nscannedObjects" : 1, "n" : 1, "millis" : 0, "indexBounds" : { }}
![Page 23: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/23.jpg)
Group
• Equivalent to a Group By in SQL
• Specific the attributes to group the data
• Process the results in a Reduce function
![Page 24: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/24.jpg)
Group - Count post by Authorcmd = { key: { "author":true }, initial: {count: 0}, reduce: function(obj, prev) { prev.count++; }, };result = db.posts.group(cmd);
[ { "author" : "Hergé", "count" : 1 }, { "author" : "Kyle", "count" : 3 }]
![Page 25: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/25.jpg)
Common Patterns
![Page 26: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/26.jpg)
Inheritance
![Page 27: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/27.jpg)
Single Table Inheritance - RDBMS
shapes tableid type area radius d length width
1 circle 3.14 1
2 square 4 2
3 rect 10 5 2
![Page 28: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/28.jpg)
Single Table Inheritance - MongoDB> db.shapes.find() { _id: "1", type: "circle",area: 3.14, radius: 1} { _id: "2", type: "square",area: 4, d: 2} { _id: "3", type: "rect", area: 10, length: 5, width: 2}
![Page 29: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/29.jpg)
Single Table Inheritance - MongoDB> db.shapes.find() { _id: "1", type: "circle",area: 3.14, radius: 1} { _id: "2", type: "square",area: 4, d: 2} { _id: "3", type: "rect", area: 10, length: 5, width: 2}
// find shapes where radius > 0 > db.shapes.find({radius: {$gt: 0}})
![Page 30: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/30.jpg)
Single Table Inheritance - MongoDB> db.shapes.find() { _id: "1", type: "circle",area: 3.14, radius: 1} { _id: "2", type: "square",area: 4, d: 2} { _id: "3", type: "rect", area: 10, length: 5, width: 2}
// find shapes where radius > 0 > db.shapes.find({radius: {$gt: 0}})
// create index> db.shapes.ensureIndex({radius: 1})
![Page 31: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/31.jpg)
One to ManyOne to Many relationships can specify• degree of association between objects• containment• life-cycle
![Page 32: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/32.jpg)
One to Many- Embedded Array / Array Keys - slice operator to return subset of array - some queries harder e.g find latest comments across all documents
blogs: { author : "Hergé", date : ISODate("2011-‐09-‐18T09:56:06.298Z"), comments : [ { author : "Kyle", date : ISODate("2011-‐09-‐19T09:56:06.298Z"), text : "great book" } ]}
![Page 33: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/33.jpg)
One to Many- Embedded tree - Single document - Natural - Hard to query
blogs: { author : "Hergé", date : ISODate("2011-‐09-‐18T09:56:06.298Z"), comments : [ { author : "Kyle", date : ISODate("2011-‐09-‐19T09:56:06.298Z"), text : "great book", replies: [ { author : “James”, ...} ] } ]}
![Page 34: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/34.jpg)
One to Many- Normalized (2 collections) - most flexible - more queries
blogs: { author : "Hergé", date : ISODate("2011-‐09-‐18T09:56:06.298Z"), comments : [ {comment : ObjectId(“1”)} ]}
comments : { _id : “1”, author : "James", date : ISODate("2011-‐09-‐19T09:56:06.298Z")}
![Page 35: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/35.jpg)
One to Many - patterns
- Embedded Array / Array Keys
- Embedded Array / Array Keys- Embedded tree- Normalized
![Page 36: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/36.jpg)
Many - ManyExample: - Product can be in many categories- Category can have many products
![Page 37: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/37.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]}
Many - Many
![Page 38: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/38.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]} categories: { _id: ObjectId("20"), name: "adventure", product_ids: [ ObjectId("10"), ObjectId("11"), ObjectId("12"]}
Many - Many
![Page 39: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/39.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]} categories: { _id: ObjectId("20"), name: "adventure", product_ids: [ ObjectId("10"), ObjectId("11"), ObjectId("12"]}
//All categories for a given product> db.categories.find({product_ids: ObjectId("10")})
Many - Many
![Page 40: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/40.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]} categories: { _id: ObjectId("20"), name: "adventure"}
Alternative
![Page 41: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/41.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]} categories: { _id: ObjectId("20"), name: "adventure"}
// All products for a given category> db.products.find({category_ids: ObjectId("20")})
Alternative
![Page 42: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/42.jpg)
products: { _id: ObjectId("10"), name: "Destination Moon", category_ids: [ ObjectId("20"), ObjectId("30”]} categories: { _id: ObjectId("20"), name: "adventure"}
// All products for a given category> db.products.find({category_ids: ObjectId("20")})
// All categories for a given productproduct = db.products.find(_id : some_id)> db.categories.find({_id : {$in : product.category_ids}})
Alternative
![Page 43: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/43.jpg)
TreesFull Tree in Document
{ comments: [ { author: “Kyle”, text: “...”, replies: [ {author: “James”, text: “...”, replies: []} ]} ]}
Pros: Single Document, Performance, Intuitive
Cons: Hard to search, Partial Results, 16MB limit
![Page 44: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/44.jpg)
TreesParent Links- Each node is stored as a document- Contains the id of the parent
Child Links- Each node contains the id’s of the children- Can support graphs (multiple parents / child)
![Page 45: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/45.jpg)
Array of Ancestors- Store all Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
![Page 46: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/46.jpg)
Array of Ancestors- Store all Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
//find all descendants of b:
> db.tree2.find({ancestors: ‘b’})
//find all direct descendants of b:
> db.tree2.find({parent: ‘b’})
![Page 47: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/47.jpg)
Array of Ancestors- Store all Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
//find all descendants of b:
> db.tree2.find({ancestors: ‘b’})
//find all direct descendants of b:
> db.tree2.find({parent: ‘b’})
//find all ancestors of f:> ancestors = db.tree2.findOne({_id:’f’}).ancestors> db.tree2.find({_id: { $in : ancestors})
![Page 48: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/48.jpg)
Trees as PathsStore hierarchy as a path expression- Separate each node by a delimiter, e.g. “/”- Use text search for find parts of a tree
{ comments: [ { author: “Kyle”, text: “initial post”, path: “/” }, { author: “Jim”, text: “jim’s comment”, path: “/jim” }, { author: “Kyle”, text: “Kyle’s reply to Jim”, path : “/jim/kyle”} ] }
// Find the conversations Jim was part of > db.posts.find({path: /^jim/i})
![Page 49: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/49.jpg)
Queue• Need to maintain order and state• Ensure that updates to the queue are atomic
{ inprogress: false, priority: 1, ... }
![Page 50: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/50.jpg)
Queue• Need to maintain order and state• Ensure that updates to the queue are atomic
{ inprogress: false, priority: 1, ... }
// find highest priority job and mark as in-‐progressjob = db.jobs.findAndModify({ query: {inprogress: false}, sort: {priority: -‐1}, update: {$set: {inprogress: true, started: new Date()}}, new: true})
![Page 51: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/51.jpg)
Summary
Schema design is different in MongoDB
Basic data design principals stay the same
Focus on how the apps manipulates data
Rapidly evolve schema to meet your requirements
Enjoy your new freedom, use it wisely :-)
![Page 52: MongoUK 2011 Schema Design](https://reader033.vdocuments.us/reader033/viewer/2022042623/54f4ea3b4a7959b53d8b48eb/html5/thumbnails/52.jpg)
@mongodb
conferences, appearances, and meetupshttp://www.10gen.com/events
http://bit.ly/mongo1 Facebook | Twitter | LinkedIn
http://linkd.in/joinmongo
download at mongodb.org