the fine art of schema design in mongodb: dos and don'ts

56
The Fine Art of Schema Design: Dos and Don'ts Senior Solutions Architect, MongoDB Inc. Matias Cascallares [email protected]

Upload: matias-cascallares

Post on 02-Jul-2015

1.974 views

Category:

Software


1 download

DESCRIPTION

Schema design in MongoDB can be an art. Different trade offs should be considered when designing how to store your data. In this presentation we are going to cover some common scenarios, recommended practices and don'ts to avoid based on previous experiences

TRANSCRIPT

Page 1: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

The Fine Art of Schema Design: Dos and Don'ts

Senior Solutions Architect, MongoDB Inc.

Matias Cascallares

[email protected]

Page 2: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Who am I?• Originally from Buenos Aires,

Argentina

• Solutions Architect at MongoDB Inc based in Singapore

• Software Engineer, most of my experience in web environments

• In my toolbox I have Java, Python and Node.js

Page 3: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RDBMs• Relational databases are made up of tables

• Tables are made up of rows:

• All rows have identical structure

• Each row has the same number of columns

• Every cell in a column stores the same type of data

Page 4: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

MONGODB IS A

DOCUMENT ORIENTED

DATABASE

Page 5: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Show me a document{ "name" : "Matias Cascallares", "title" : "Senior Solutions Architect", "email" : "[email protected]", "birth_year" : 1981, "location" : [ "Singapore", "Asia"], "phone" : { "type" : "mobile", "number" : "+65 8591 3870" } }

Page 6: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Document Model• MongoDB is made up of collections

• Collections are composed of documents

• Each document is a set of key-value pairs

• No predefined schema

• Keys are always strings

• Values can be any (supported) data type

• Values can also be an array

• Values can also be a document

Page 7: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Benefits of document model ..?

Page 8: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Flexibility• Each document can have different fields

• No need of long migrations, easier to be agile

• Common structure enforced at application level

Page 9: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Arrays • Documents can have field with array values

• Ability to query and index array elements

• We can model relationships with no need of different tables or collections

Page 10: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Embedded documents • Documents can have field with document values

• Ability to query and index nested documents

• Semantic closer to Object Oriented Programming

Page 11: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Indexing an array of documents

Page 12: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Relational Schema Design

Document Schema Design

Page 13: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Relational Schema Design

Focus on data

storage

Document Schema Design

Focus on data

usage

Page 14: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

SCHEMA DESIGN IS

AN ARThttps://www.flickr.com/photos/76377775@N05/11098637655/

Page 15: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

ImplementingRelations

https://www.flickr.com/photos/ravages/2831688538

Page 16: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

A task tracking app

Page 17: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Requirement #1"We need to store user information like name, email and their addresses… yes they can have more than one.”

— Bill, a project manager, contemporary

Page 18: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Relational

id name email title

1 Kate Powell

[email protected]

Regional Manager

id street city user_id

1 123 Sesame Street Boston 12 123 Evergreen Street New York 1

Page 19: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Let’s use the document model> db.user.findOne( { email: "[email protected]"} ) { _id: 1, name: "Kate Powell", email: "[email protected]", title: "Regional Manager", addresses: [ { street: "123 Sesame St", city: "Boston" }, { street: "123 Evergreen St", city: "New York" } ] }

Page 20: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Requirement #2"We have to be able to store tasks, assign them to users and track their progress…"

— Bill, a project manager, contemporary

Page 21: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Embedding tasks> db.user.findOne( { email: "[email protected]"} ) { name: "Kate Powell", // ... previous fields tasks: [ { summary: "Contact sellers", description: "Contact agents to specify our needs and time constraints", due_date: ISODate("2014-08-25T08:37:50.465Z"), status: "NOT_STARTED" }, { // another task } ] }

Page 22: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Embedding tasks• Tasks are unbounded items: initially we do not know

how many tasks we are going to have

• A user along time can end with thousands of tasks

• Maximum document size in MongoDB: 16 MB !

• It is harder to access task information without a user context

Page 23: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Referencing tasks> db.user.findOne({_id: 1}) { _id: 1, name: "Kate Powell", email: "kate.powell@...", title: "Regional Manager", addresses: [ { // address 1 }, { // address 2 } ] }

> db.task.findOne({user_id: 1}) { _id: 5, summary: "Contact sellers", description: "Contact agents to specify our ...", due_date: ISODate(), status: "NOT_STARTED", user_id: 1 }

Page 24: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Referencing tasks• Tasks are unbounded items and our schema supports

that

• Application level joins

• Remember to create proper indexes (e.g. user_id)

Page 25: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Embedding vs

Referencing

Page 26: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

One-to-many relations• Embed when you have a few number of items on ‘many'

side

• Embed when you have some level of control on the number of items on ‘many' side

• Reference when you cannot control the number of items on the 'many' side

• Reference when you need to access to ‘many' side items without parent entity scope

Page 27: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Many-to-many relations• These can be implemented with two one-to-many

relations with the same considerations

Page 28: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #1USE EMBEDDING

FOR ONE-TO-FEW RELATIONS

Page 29: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #2USE REFERENCING

FOR ONE-TO-MANY RELATIONS

Page 30: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Working witharrays

https://www.flickr.com/photos/kishjar/10747531785

Page 31: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Arrays are great!

Page 32: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

List of sorted elements> db.numbers.insert({ _id: "even", values: [0, 2, 4, 6, 8] });

> db.numbers.insert({ _id: "odd", values: [1, 3, 5, 7, 9] });

Page 33: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Access based on positiondb.numbers.find({_id: "even"}, {values: {$slice: [2, 3]}}) { _id: "even", values: [4, 6, 8] }

db.numbers.find({_id: "odd"}, {values: {$slice: -2}}) { _id: "odd", values: [7, 9] }

Page 34: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Access based on values// is number 2 even or odd? > db.numbers.find( { values : 2 } ) { _id: "even", values: [0, 2, 4, 6, 8] }

Page 35: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Like sorted sets> db.numbers.find( { _id: "even" } ) { _id: "even", values: [0, 2, 4, 6, 8] }

> db.numbers.update( { _id: "even"}, { $addToSet: { values: 10 } } );

Several times…!

> db.numbers.find( { _id: "even" } ) { _id: "even", values: [0, 2, 4, 6, 8, 10] }

Page 36: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Array update operators• pop

• push

• pull

• pullAll

Page 37: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

But…

Page 38: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Storage

{ _id: 1, name: "Nike Pump Air 180", tags: ["sports", "running"] }

db.inventory.update( { _id: 1}, { $push: { tags: "shoes" } } )

DocA DocCDocB

Page 39: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Empty

Storage

DocA DocCDocB

IDX IDX IDX

Page 40: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Empty

Storage

DocA DocC DocB

IDX IDX IDX

Page 41: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Why is expensive to move a doc?1. We need to write the document in another location ($$)

2. We need to mark the original position as free for new documents ($)

3. We need to update all those index entries pointing to the moved document to the new location ($$$)

Page 42: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Considerations with arrays• Limited number of items

• Avoid document movements

• Document movements can be delayed with padding factor

• Document movements can be mitigated with pre-allocation

Page 43: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #3AVOID EMBEDDING

LARGE ARRAYS

Page 44: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #4USE DATA MODELS THAT MINIMIZE THE

NEED FOR DOCUMENT

GROWTH

Page 45: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Denormalization

https://www.flickr.com/photos/ross_strachan/5146307757

Page 46: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Denormalization"…is the process of attempting to optimise the read performance of a database by adding redundant data …”

— Wikipedia

Page 47: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Products and comments> db.product.find( { _id: 1 } ) { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", "running"] }

> db.comment.find( { product_id: 1 } ) { score: 5, user: "user1", text: "Awesome shoes" } { score: 2, user: "user2", text: "Not for me.." }

Page 48: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Denormalizing> db.product.find({_id: 1}) { _id: 1, name: "Nike Pump Air Force 180", tags: ["sports", “running"], comments: [ { user: "user1", text: "Awesome shoes" }, { user: "user2", text: "Not for me.." } ] }

> db.comment.find({product_id: 1}) { score: 5, user: "user1", text: "Awesome shoes" } { score: 2, user: "user2", text: "Not for me.."}

Page 49: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #5DENORMALIZE

TO AVOID APP-LEVEL JOINS

Page 50: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #6DENORMALIZE ONLY WHEN YOU HAVE A

HIGH READ TO WRITE RATIO

Page 51: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Bucketing

https://www.flickr.com/photos/97608671@N02/13558864555/

Page 52: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

What’s the idea?• Reduce number of documents to be retrieved

• Less documents to retrieve means less disk seeks

• Using arrays we can store more than one entity per document

• We group things that are accessed together

Page 53: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

An exampleComments are showed in buckets of 2 comments

A ‘read more’ button loads next 2 comments

Page 54: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

Bucketing comments> db.comments.find({post_id: 123}) .sort({sequence: -1}) .limit(1) { _id: 1, post_id: 123, sequence: 8, // this acts as a page number comments: [ {user: [email protected], text: "Awesome shoes.."}, {user: [email protected], text: "Not for me..”} ] // we store two comments per doc, fixed size bucket }

Page 55: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

RECIPE #7USE BUCKETING TO STORE THINGS THAT ARE GOING TO BE ACCESSED AS A

GROUP

Page 56: The Fine Art of Schema Design in MongoDB: Dos and Don'ts

谢谢