Webinar: MongoDB for Content Management

Posted on 09-Jul-2015

Category: Technology

DESCRIPTION

MongoDB's flexible schema makes it a good fit for content management applications: its data model makes it easy to catalog multiple content types with diverse metadata. This session reviews schema design for content management, using GridFS to store binary files, and using MongoDB's auto-sharding to partition content across multiple servers.

TRANSCRIPT

Consulting Engineer, 10gen

Bryan Reinero

https://twitter.com/mongodb

MongoDB for Content Management

Agenda

• Sample Content Management System (CMS) Application

• Schema Design Considerations

• Viewing the Final Product

• Building Feeds and Querying Data

• Replication, Failover, and Scaling

• Further Resources

Sample CMS Application

CMS Application Overview

• Business news service

• Hundreds of stories per day

• Millions of website visitors per month

• Comments

• Related stories

• Tags

Viewing Stories (Web Site)

Headline

Date, Byline

Copy

Comments

Tags

Related Stories

Viewing Categories/Tags (Web Site)

Headline

Date, Byline

Lead Text

Headline

Date, Byline

Lead Text

Sample Article

Headline

Byline, Date, Comments

Copy

Related Stories

Image

Schema Design Considerations

Sample Relational DB Structure

• story: id, headline, copy, authorid, slug

• author: id, first_name, last_name, title

• tag: id, name

• comment: id, storyid, name, email, comment_text

• related_story: id, storyid, related_storyid

• link_story_tag: id, storyid, tagid

Sample Relational DB Structure

• Number of queries per page load?

• Caching layers add complexity

• Tables may grow to millions of rows

• Joins become slower over time as the database grows

• Schema changes

• Scaling database to handle more reads

MongoDB Schema Design

• “Schemaless”, but schema design is still important

• JSON documents

• Design for the use case and work backwards

• Do not use a relational model in MongoDB

• No joins or multi-document transactions (as of this talk), so most related information should be contained in the same document

• Updates to a single document are atomic, the equivalent of a transaction

Sample MongoDB Schema

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  comments: [
    { name: "Frank", comment: "Great story!" }
  ]
}
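
The contrast between the relational tables and the single story document can be sketched in plain JavaScript. The helper `buildStoryDocument` is hypothetical (it is not part of MongoDB or of the original slides). It assumes rows shaped like the relational tables shown earlier and folds them into one document with the field names of the sample schema. It does not talk to a database.

```javascript
// Hypothetical helper: fold relational-style rows into a single story document.
// Row shapes follow the relational tables above; this is an illustration only.
function buildStoryDocument(story, author, tagRows, commentRows) {
  return {
    _id: story.id,
    headline: story.headline,
    slug: story.slug,
    // The author row is embedded rather than referenced through authorid.
    byline: {
      author: author.first_name + " " + author.last_name,
      title: author.title
    },
    copy: story.copy,
    // Tags become a plain array of strings instead of a link table.
    tags: tagRows.map(function (tag) { return tag.name; }),
    // Only the comments that belong to this story are embedded.
    comments: commentRows
      .filter(function (c) { return c.storyid === story.id; })
      .map(function (c) { return { name: c.name, comment: c.comment_text }; })
  };
}

const doc = buildStoryDocument(
  { id: 375, headline: "Apple Reports Second Quarter Earnings",
    slug: "apple-reports-second-quarter-earnings",
    copy: "Apple reported second quarter revenue today…", authorid: 1 },
  { id: 1, first_name: "Jason", last_name: "Zucchetto", title: "Lead Business Editor" },
  [{ id: 1, name: "AAPL" }, { id: 2, name: "Earnings" }],
  [{ id: 1891, storyid: 375, name: "Frank", comment_text: "Great story!" },
   { id: 1892, storyid: 376, name: "John", comment_text: "Different story" }]
);
console.log(JSON.stringify(doc, null, 2));
```

The page for one story can then be rendered from one document read instead of one query per table.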

Adding Fields Based on Story

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL"
}

High Comment Volume

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ]
}
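
The slides do not say how the Last25Comments array is kept at 25 entries. One option, sketched here as an assumption, is the `$push` operator with the `$each` and `$slice` modifiers, where a negative `$slice` keeps only the last N elements. The helper names `buildCappedCommentUpdate` and `applyCappedPush` are hypothetical; the second one only imitates the server's behavior in memory so the effect can be seen without a database.

```javascript
// Hypothetical helper: build an update document that appends one comment and
// trims the embedded array to the newest `limit` entries.
function buildCappedCommentUpdate(comment, limit) {
  return {
    $push: {
      Last25Comments: { $each: [comment], $slice: -limit }
    }
  };
}

// In-memory stand-in for what the server does with that update (illustration only).
function applyCappedPush(story, update) {
  const spec = update.$push.Last25Comments;
  const merged = (story.Last25Comments || []).concat(spec.$each);
  // spec.$slice is negative, so Array.prototype.slice keeps the last N elements.
  story.Last25Comments = merged.slice(spec.$slice);
  return story;
}

const story = { _id: 375, Last25Comments: [] };
for (let i = 1; i <= 30; i++) {
  const update = buildCappedCommentUpdate(
    { name: "Reader " + i, comment: "Comment " + i }, 25);
  applyCappedPush(story, update);
}
console.log(story.Last25Comments.length);     // 25
console.log(story.Last25Comments[0].comment); // "Comment 6"
```

In the shell, the same update document would be passed as the second argument of `db.cms.update( { _id : 375 }, ... )`, while the complete comment history lives in a separate comment collection.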

Managing Related Stories

{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}
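
The RelatedStories array holds a small copy of each related story (headline, date, slug) rather than a reference. A minimal sketch of producing those copies follows. `toRelatedSummary` and `withRelatedStories` are hypothetical helper names, and the input documents are assumed to look like the story documents on the slides.

```javascript
// Hypothetical helper: reduce a full story document to the summary that is
// embedded in other stories' RelatedStories arrays.
function toRelatedSummary(story) {
  return { headline: story.headline, date: story.date, slug: story.slug };
}

// Hypothetical helper: return a copy of `story` with related summaries attached.
function withRelatedStories(story, relatedStories) {
  return Object.assign({}, story, {
    RelatedStories: relatedStories.map(toRelatedSummary)
  });
}

const apple = {
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  slug: "apple-reports-second-quarter-earnings"
};
const google = {
  _id: 376,
  headline: "Google Reports on Revenue",
  date: new Date("2012-07-15T01:00:00+01:00"),
  slug: "goog-revenue-third-quarter",
  copy: "Full article text that is not copied into the summary"
};

const appleWithRelated = withRelatedStories(apple, [google]);
console.log(appleWithRelated.RelatedStories);
```

The trade-off of this design is that a headline change on the related story also has to be written to every document that embeds its summary.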

Full Sample Schema

// Story collection (sample document)
{
  _id: 375,
  headline: "Apple Reports Second Quarter Earnings",
  date: ISODate("2012-07-14T01:00:00+01:00"),
  slug: "apple-reports-second-quarter-earnings",
  byline: {
    author: "Jason Zucchetto",
    title: "Lead Business Editor"
  },
  copy: "Apple reported second quarter revenue today…",
  tags: [
    "AAPL",
    "Earnings"
  ],
  Last25Comments: [
    { name: "Frank", comment: "Great story!" },
    { name: "John", comment: "This is interesting" }
  ],
  image: "/images/aapl/tim-cook.jpg",
  ticker: "AAPL",
  RelatedStories: [
    {
      headline: "Google Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "goog-revenue-third-quarter"
    }, {
      headline: "Yahoo Reports on Revenue",
      date: ISODate("2012-07-15T01:00:00+01:00"),
      slug: "yhoo-revenue-third-quarter"
    }
  ]
}

Full Sample Schema (Contd.)

// Comment collection (sample document)
{
  _id: 1891, storyid: 375, name: "Frank", comment: "Great story!"
}

Querying and Indexing

Inserting and Updating Stories

// Inserting a new story is easy: just submit a JSON document
db.cms.insert( { headline: "Apple Reports Revenue"... });

// Adding story tags
db.cms.update( { _id : 375 }, { $addToSet : { tags : "AAPL" } } )

// Adding a comment (if embedding comments in story)
db.cms.update( { _id : 375 }, { $addToSet : { comments: { name: "Jason", comment: "Great Story" } } } )
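
`$addToSet` appends a value only if the array does not already contain it, so repeating the tag update leaves a single "AAPL" entry. The same rule applies to comments: an identical comment document would not be added a second time, whereas `$push` always appends. The plain JavaScript imitation below is an illustration only, not MongoDB code; `addToSet` is a hypothetical function and `JSON.stringify` is a simple stand-in for the server's value comparison.

```javascript
// In-memory imitation of $addToSet semantics (illustration only).
function addToSet(array, value) {
  const key = JSON.stringify(value);
  const exists = array.some(function (item) {
    return JSON.stringify(item) === key;
  });
  // Return the array unchanged when an equal value is already present.
  return exists ? array : array.concat([value]);
}

let tags = ["Earnings"];
tags = addToSet(tags, "AAPL");
tags = addToSet(tags, "AAPL"); // second add changes nothing

let comments = [];
comments = addToSet(comments, { name: "Jason", comment: "Great Story" });
comments = addToSet(comments, { name: "Jason", comment: "Great Story" }); // dropped as a duplicate

console.log(tags.length, comments.length); // 2 1
```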

MongoDB Indexes for CMS

// ensureIndex is the older helper name; since MongoDB 3.0 the helper is createIndex

// Index on story slug
db.cms.ensureIndex( { slug : 1 });

// Index on story tags
db.cms.ensureIndex( { tags: 1 });
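
Because tags is an array, the index on it holds one entry per array element, which is what lets a query for a single tag find every story carrying it. The sketch below imitates that idea with a `Map` in plain JavaScript. It is an illustration under that assumption, not how the server stores its indexes, and `buildTagIndex` is a hypothetical name.

```javascript
// Illustration only: map each tag to the _ids of the stories that carry it,
// one index entry per array element.
function buildTagIndex(stories) {
  const index = new Map();
  stories.forEach(function (story) {
    (story.tags || []).forEach(function (tag) {
      if (!index.has(tag)) {
        index.set(tag, []);
      }
      index.get(tag).push(story._id);
    });
  });
  return index;
}

const tagIndex = buildTagIndex([
  { _id: 375, tags: ["AAPL", "Earnings"] },
  { _id: 376, tags: ["GOOG", "Earnings"] },
  { _id: 377 } // a story without tags contributes no entries
]);
console.log(tagIndex.get("Earnings")); // [ 375, 376 ]
```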

Querying MongoDB

// All story information
db.cms.find( { slug : "apple-reports-second-quarter-earnings" });

// All stories for a given tag
db.cms.find( { tags: "Earnings" });

Building Custom RSS Feeds

Query Tags and Sort by Date

// Very simple to gather specific information for a feed
db.cms.find( { tags: { $in : ["Earnings", "AAPL"] } }).sort({ date : -1 });
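
The slides stop at the query. Turning its results into feed entries could look like the sketch below, which assumes the stories arrive as an array of documents already sorted by date. `escapeXml` and `toRssItems` are hypothetical helpers, and the base URL and the /stories/ path are invented for the example.

```javascript
// Escape the characters that are special in XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: render stories as RSS <item> elements.
// baseUrl and the /stories/<slug> path are assumptions, not from the slides.
function toRssItems(stories, baseUrl) {
  return stories.map(function (story) {
    return [
      "<item>",
      "  <title>" + escapeXml(story.headline) + "</title>",
      "  <link>" + escapeXml(baseUrl + "/stories/" + story.slug) + "</link>",
      "  <pubDate>" + new Date(story.date).toUTCString() + "</pubDate>",
      "</item>"
    ].join("\n");
  }).join("\n");
}

const feedItems = toRssItems([
  { headline: "Apple Reports Second Quarter Earnings",
    date: new Date("2012-07-14T01:00:00+01:00"),
    slug: "apple-reports-second-quarter-earnings" },
  { headline: "Q&A: What <investors> Ask",
    date: new Date("2012-07-13T01:00:00+01:00"),
    slug: "investor-questions" }
], "http://news.example.com");
console.log(feedItems);
```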

Replication, Failover, and Scaling

Replication

• Extremely easy to set up

• A secondary (replica) node trails the primary node and maintains a copy of the primary's data

• Useful for disaster recovery, failover, backups, and specific workloads such as analytics

• When Primary goes down, a Secondary will automatically become the new Primary

Reading from Secondaries (Delayed Consistency)
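
How an application opts in to reading from secondaries is not shown on the slides. One common way, sketched here as an assumption, is the `readPreference` option of the MongoDB connection string together with `replicaSet`. The helper `buildReplicaSetUri`, the host names, and the replica set name are made up for the example.

```javascript
// Hypothetical helper: build a connection string that allows reads from secondaries.
// replicaSet and readPreference are standard connection string options;
// the hosts and the set name passed in below are invented.
function buildReplicaSetUri(hosts, database, replicaSet, readPreference) {
  const query = "replicaSet=" + encodeURIComponent(replicaSet) +
    "&readPreference=" + encodeURIComponent(readPreference);
  return "mongodb://" + hosts.join(",") + "/" + database + "?" + query;
}

const uri = buildReplicaSetUri(
  ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"],
  "cms",
  "rs0",
  "secondaryPreferred"
);
console.log(uri);
```

Since a secondary trails the primary, a read served by it may not yet include the latest writes, which is the delayed consistency named in the slide title.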

Scaling Horizontally

• It is important to keep the working data set in RAM

• When the working data set exceeds RAM, add machines and partition the data across them (sharding)

Sharding with MongoDB
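
The idea of segmenting data across machines can be illustrated with a simplified range lookup: each shard owns ranges of shard key values, and a document goes to the shard whose range covers its key. This is a sketch of the concept only, not MongoDB's routing or balancing code. It assumes `_id` as the shard key, and the range boundaries, shard names, and the function `shardFor` are invented.

```javascript
// Illustration only: ordered, non-overlapping key ranges and their owning shards.
// Boundaries and shard names are made up for this example.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" }
];

// Find the shard whose range covers the key (min inclusive, max exclusive).
function shardFor(keyValue, chunkList) {
  const chunk = chunkList.find(function (c) {
    return keyValue >= c.min && keyValue < c.max;
  });
  if (!chunk) {
    throw new Error("no chunk covers key " + keyValue);
  }
  return chunk.shard;
}

console.log(shardFor(375, chunks));  // shard0
console.log(shardFor(1000, chunks)); // shard1
console.log(shardFor(1891, chunks)); // shard1
```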

Additional Resources

• Use Case Tutorials: http://docs.mongodb.org/manual/use-cases/

• What others are doing: http://www.10gen.com/use-case/content-management

• This presentation & video recording: https://www.10gen.com/presentations/webinar

Thank You
