Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB, Nathan Smith, BranchOut.com, Jan. 22, 2013

Upload: nathan-smith

Post on 15-Jan-2015


DESCRIPTION

Slides from the MongoDB MeetUp "IRC Bots and Activity Feeds with MongoDB - At BranchOut", presented by the San Francisco MongoDB User Group and 10gen. http://www.meetup.com/San-Francisco-MongoDB-User-Group/events/95713262/ Over the past year, we've used MongoDB to power more and more of BranchOut's functionality, including some cool social features such as a Facebook-like activity feed. In this talk, I discuss the design decisions that went into developing these features and outline how Mongo is used under the hood. I discuss not only what makes Mongo a good technology choice, but also list a few things about Mongo that need to be worked around. If you have any questions regarding these slides, feel free to reach out to me on Twitter: @nate510. Thanks!

TRANSCRIPT

Page 1: Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB

Nathan Smith
BranchOut.com
Jan. 22, 2013

Tuesday, January 22, 13

Page 2: Creating social features at BranchOut using MongoDB

BranchOut: A more social professional network

• Connect with your colleagues (follow)

• Activity feed of their professional activity

• Timeline of an individual’s posts

Page 3: Creating social features at BranchOut using MongoDB

BranchOut: A more social professional network

• 30MM installed users

• 750MM total user records

• Average 300 connections per installed user

Page 7: Creating social features at BranchOut using MongoDB

MongoDB @ BranchOut

• 100% MySQL until ~July 2012

• Much of our data fits well into a document model

• Our data design avoids RDBMS features

Page 12: Creating social features at BranchOut using MongoDB

Follow System: Business logic

• Limit of 2000 followees (people you follow)

• Unlimited followers

• Both lists reflect updates in near-real time

Page 15: Creating social features at BranchOut using MongoDB

Follow System: Traditional RDBMS (e.g. MySQL)

follower_uid  followee_uid  follow_time
123           456           2013-01-22 15:43:00
456           123           2013-01-22 15:52:00

Advantage: Easy inserts, deletes

Disadvantage: Poor data locality, large index size

Page 18: Creating social features at BranchOut using MongoDB

Follow System: MongoDB (first pass)

followee: { _id: 123, uids: [456, 567, 678] }

Advantage: Compact data, read locality

Disadvantage: Can’t display a user’s followers

Page 20: Creating social features at BranchOut using MongoDB

Follow System: Can’t display a user’s followers (easily)

followee: { _id: 123, uids: [456, 567, 678] }

...with multi-key index on uids

db.follow.find({uids: 456}, {_id: 1});

Expensive! Also, no guarantee of order.
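To make the cost of that reverse lookup concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB driver involved). The sample documents and the helpers `findFollowers` and `multiKeyIndexEntries` are hypothetical; they only mimic the semantics of the query above and the fact that a multi-key index holds one entry per array element. How expensive the real query was at BranchOut is not something the slides quantify.

```javascript
// Hypothetical sample data in the first-pass shape: _id is a user,
// uids lists the users that user follows.
const follow = [
  { _id: 123, uids: [456, 567, 678] },
  { _id: 234, uids: [123, 456] },
  { _id: 345, uids: [567] },
];

// Mimics db.follow.find({uids: uid}, {_id: 1}): keep documents whose
// uids array contains uid, and project only _id.
function findFollowers(docs, uid) {
  return docs
    .filter((doc) => doc.uids.includes(uid))
    .map((doc) => ({ _id: doc._id }));
}

// A multi-key index stores one entry per array element, so its size
// grows with the total number of follow relationships.
function multiKeyIndexEntries(docs) {
  return docs.reduce((total, doc) => total + doc.uids.length, 0);
}

console.log(findFollowers(follow, 456)); // [ { _id: 123 }, { _id: 234 } ]
console.log(multiKeyIndexEntries(follow)); // 6
```

The result order here is simply the storage order of the documents, which says nothing about when each follow happened; that matches the slide's "no guarantee of order" remark.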

Page 23: Creating social features at BranchOut using MongoDB

Follow System: MongoDB (second pass)

followee: { _id: 1, uids: [2, 3] }
followee: { _id: 2, uids: [1, 3] }

follower: { _id: 1, uids: [2] }
follower: { _id: 2, uids: [1] }
follower: { _id: 3, uids: [1, 2] }

Advantages: Local data, fast selects

Disadvantages: Follower doc size
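The second pass implies that every follow action writes to two documents. A minimal sketch of that dual write, using in-memory Maps as stand-ins for the two collections, might look like the following. The helper names `addTo` and `follow` are hypothetical, and the slides do not show the real write path; in MongoDB itself this would presumably be two upserts using `$addToSet`, which is an assumption.

```javascript
const MAX_FOLLOWEES = 2000; // limit stated in the business logic slide

// In-memory stand-ins for the two collections (assumption: keyed by _id).
const followee = new Map(); // user -> document listing the uids that user follows
const follower = new Map(); // user -> document listing the uids following that user

// Add uid to the uids array of document id, creating the document if needed.
function addTo(collection, id, uid) {
  const doc = collection.get(id) ?? { _id: id, uids: [] };
  if (!doc.uids.includes(uid)) doc.uids.push(uid);
  collection.set(id, doc);
}

// User a starts following user b: one write per collection.
function follow(a, b) {
  const current = followee.get(a)?.uids ?? [];
  if (current.length >= MAX_FOLLOWEES && !current.includes(b)) {
    throw new Error("followee limit reached");
  }
  addTo(followee, a, b);
  addTo(follower, b, a);
}

// Reproduces the documents shown on the slide.
follow(1, 2);
follow(1, 3);
follow(2, 1);
follow(2, 3);

console.log(followee.get(1)); // { _id: 1, uids: [ 2, 3 ] }
console.log(follower.get(3)); // { _id: 3, uids: [ 1, 2 ] }
```

Reading either list is then a single lookup by `_id`, which is the "local data, fast selects" advantage; the price is that the two documents can briefly disagree if one write fails.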

Page 28: Creating social features at BranchOut using MongoDB

Follow System: Follower document size

• Max Mongo doc size: 16MB

• Number of people who follow our community manager: 30MM

• 30MM uids × 8 bytes/uid = 240MB

• Max followers per doc: ~2MM
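The arithmetic behind those bullets can be checked in a few lines. This sketch takes the slide's figures at face value (8 bytes per uid, 16MB read as 16 × 1024 × 1024 bytes); the variable names are illustrative only. A real BSON array also stores a type byte and an index key per element, so the practical ceiling is lower than the number computed here.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // 16MB document limit
const BYTES_PER_UID = 8; // figure used on the slide
const FOLLOWERS = 30 * 1000 * 1000; // followers of the community manager

// Size of one document holding every follower uid.
const bytesNeeded = FOLLOWERS * BYTES_PER_UID;

// How many uids fit in one document, ignoring BSON overhead.
const maxUidsPerDoc = Math.floor(MAX_DOC_BYTES / BYTES_PER_UID);

// Minimum number of documents needed to hold all followers.
const docsNeeded = Math.ceil(FOLLOWERS / maxUidsPerDoc);

console.log(bytesNeeded); // 240000000 (240MB)
console.log(maxUidsPerDoc); // 2097152 (~2MM)
console.log(docsNeeded); // 15
```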

Page 31: Creating social features at BranchOut using MongoDB

Follow System: MongoDB (final pass)

Asynchronous thread manages follower documents

follower: { _id: "1", uids: [2,3,4,...], count: 20001, next_page: 2 }
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }

followee: { _id: 1, uids: [2, 3] }
followee: { _id: 2, uids: [1, 3] }

follower: { _id: "1", uids: [2,3,4,...], count: 10001, next_page: 3 }
follower: { _id: "1_p2", uids: [23,24,25,...], count: 10000 }
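The slide does not spell out what the asynchronous thread does, so the following is only one plausible reading, sketched in plain JavaScript over an in-memory Map. It assumes that `next_page` points from a document to the next-older page, that the head document is split once its count exceeds 20000, and that 10000 uids are moved per split; both thresholds are inferred from the counts on the slide. The functions `splitHead` and `allFollowers` and the `"1_p3"` document are hypothetical.

```javascript
// Assumed thresholds, inferred from the counts on the slide.
const HEAD_LIMIT = 20000;
const PAGE_SIZE = 10000;

// docs: Map from _id to follower document. Moves the first PAGE_SIZE
// uids of an over-full head document into a new page document.
function splitHead(docs, userId) {
  const head = docs.get(String(userId));
  if (!head || head.count <= HEAD_LIMIT) return false;

  const pageNo = (head.next_page ?? 1) + 1;
  const moved = head.uids.splice(0, PAGE_SIZE); // assumes oldest uids come first
  const page = { _id: `${userId}_p${pageNo}`, uids: moved, count: moved.length };
  if (head.next_page !== undefined) page.next_page = head.next_page;
  docs.set(page._id, page);

  head.count = head.uids.length;
  head.next_page = pageNo;
  return true;
}

// Walk head -> next_page -> ... and collect every follower uid.
function allFollowers(docs, userId) {
  const out = [];
  let doc = docs.get(String(userId));
  while (doc) {
    for (const uid of doc.uids) out.push(uid);
    doc = doc.next_page === undefined ? undefined : docs.get(`${userId}_p${doc.next_page}`);
  }
  return out;
}

// Starting state modelled on the first pair of follower documents above.
const docs = new Map([
  ["1", { _id: "1", uids: Array.from({ length: 20001 }, (_, i) => i + 1), count: 20001, next_page: 2 }],
  ["1_p2", { _id: "1_p2", uids: Array.from({ length: 10000 }, (_, i) => 100001 + i), count: 10000 }],
]);

splitHead(docs, 1);
console.log(docs.get("1").count, docs.get("1").next_page); // 10001 3
console.log(allFollowers(docs, 1).length); // 30001
```

Under these assumptions the head document ends up with `count: 10001` and `next_page: 3`, matching the second pair of follower documents on the slide, and no single document comes near the 16MB limit.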

Page 33: Creating social features at BranchOut using MongoDB

Activity Feed: Push vs Pull architecture

Page 41: Creating social features at BranchOut using MongoDB

Activity Feed: Business logic

• All connections and followees appear in your feed

• Reverse-chronological sort order (but should support other rankings)

• Support for an evolving set of feed event types

• Tagging creates multiple feed events for the same underlying object

• Feed events are not ephemeral: they also make up the Timeline

Page 44: Creating social features at BranchOut using MongoDB

Activity Feed: Traditional RDBMS (e.g. MySQL)

activity_id  uid  event_time           type    oid1    oid2
1            123  2013-01-22 15:43:00  photo   123abc  789ghi
2            345  2013-01-22 15:52:00  status  456def  foobar

Advantage: Easy inserts

Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale

Page 45: Creating social features at BranchOut using MongoDB

Activity Feed: MongoDB

user_feed_card (ufc):
{
  _id: 123, // UID
  total_events: 18,
  2013_01_total: 4,
  2012_12_total: 8,
  2012_11_total: 6,
  ...other counts...
}

user_feed_month (ufm):
{
  _id: "123_2013_01",
  events: [
    {
      uid: 123,
      type: "photo_upload",
      content_id: "abcd9876",
      timestamp: 1358824502,
      ...more metadata...
    },
    ...more events...
  ]
}

Page 52: Creating social features at BranchOut using MongoDB

Activity Feed: Algorithm

1. Load user_feed_cards for all connections

2. Calculate which user_feed_months to load

3. Load user_feed_months

4. Aggregate events that refer to the same story

5. Sort (reverse chronological)

6. Load content, comments, etc. and build stories
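Steps 1 to 5 can be sketched over in-memory Maps standing in for the user_feed_card and user_feed_month collections. This is an illustration under stated assumptions, not BranchOut's implementation: `buildFeed` and its parameters are hypothetical, events are assumed to belong to the same story when they share a `content_id`, and month selection simply walks months newest first until the per-month counts reach the requested feed length. Step 6 (loading content and comments) is left out.

```javascript
// cards: Map uid -> user_feed_card; months: Map "<uid>_<yyyy>_<mm>" -> user_feed_month.
// monthKeys lists candidate months newest first, e.g. ["2013_01", "2012_12"].
function buildFeed(connectionUids, cards, months, monthKeys, limit) {
  // 1. Load user_feed_cards for all connections.
  const loaded = connectionUids.map((uid) => cards.get(uid)).filter(Boolean);

  // 2. Use the per-month counts to decide which user_feed_months to load.
  const wanted = [];
  let available = 0;
  for (const month of monthKeys) {
    if (available >= limit) break;
    for (const card of loaded) {
      const n = card[`${month}_total`] || 0;
      if (n > 0) {
        wanted.push(`${card._id}_${month}`);
        available += n;
      }
    }
  }

  // 3. Load the chosen user_feed_months and flatten their events.
  const events = wanted.flatMap((id) => (months.get(id) || { events: [] }).events);

  // 4. Aggregate events that refer to the same story (assumed key: content_id).
  const stories = new Map();
  for (const e of events) {
    const story = stories.get(e.content_id) || { content_id: e.content_id, timestamp: 0, events: [] };
    story.events.push(e);
    story.timestamp = Math.max(story.timestamp, e.timestamp);
    stories.set(e.content_id, story);
  }

  // 5. Sort newest first and cut to the requested length.
  return [...stories.values()].sort((a, b) => b.timestamp - a.timestamp).slice(0, limit);
}

// Hypothetical sample data in the shape of the documents shown earlier.
const cards = new Map([
  [123, { _id: 123, "2013_01_total": 2, "2012_12_total": 1 }],
  [345, { _id: 345, "2013_01_total": 1 }],
]);
const months = new Map([
  ["123_2013_01", { _id: "123_2013_01", events: [
    { uid: 123, type: "photo_upload", content_id: "a", timestamp: 100 },
    { uid: 123, type: "status", content_id: "b", timestamp: 300 },
  ] }],
  ["345_2013_01", { _id: "345_2013_01", events: [
    { uid: 345, type: "photo_tag", content_id: "a", timestamp: 200 },
  ] }],
  ["123_2012_12", { _id: "123_2012_12", events: [
    { uid: 123, type: "status", content_id: "c", timestamp: 50 },
  ] }],
]);

const feed = buildFeed([123, 345], cards, months, ["2013_01", "2012_12"], 2);
console.log(feed.map((s) => s.content_id)); // [ 'b', 'a' ]
```

The cards are what keep the read cheap in this sketch: the December document is never fetched because the January counts already cover the requested two stories. Because aggregation can merge several events into one story, a real implementation would likely need to load more months when the merged list comes up short.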

Page 56: Creating social features at BranchOut using MongoDB

Activity Feed: Performance

• Response times average under 500 ms (98th percentile under 1 sec)

• Design expected to scale well horizontally

• Need to continue to optimize

Page 57: Creating social features at BranchOut using MongoDB

Building Social Features with MongoDB

Nathan Smith

BrO: http://branchout.com/nate

FB: http://facebook.com/neocortica

Twitter: @nate510

Email: [email protected]

Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack

Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture

Good Quora questions on activity feeds:

http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed

http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed