Chris Harris
Email : [email protected]
Twitter : cj_harris5
DeNormalised London: Aggregation Framework Overview
Wednesday, 21 March 12
Terminology
| RDBMS | MongoDB |
|---|---|
| Table | Collection |
| Row(s) | JSON Document |
| Index | Index |
| Join | Embedding & Linking |
| Partition | Shard |
| Partition Key | Shard Key |
Here is a “simple” SQL Model
mysql> select * from book;
+----+----------------------------------------------------------+
| id | title                                                    |
+----+----------------------------------------------------------+
|  1 | The Demon-Haunted World: Science as a Candle in the Dark |
|  2 | Cosmos                                                   |
|  3 | Programming in Scala                                     |
+----+----------------------------------------------------------+
3 rows in set (0.00 sec)
mysql> select * from bookauthor;
+---------+-----------+
| book_id | author_id |
+---------+-----------+
|       1 |         1 |
|       2 |         1 |
|       3 |         2 |
|       3 |         3 |
|       3 |         4 |
+---------+-----------+
5 rows in set (0.00 sec)
mysql> select * from author;
+----+-----------+------------+-------------+-------------+---------------+
| id | last_name | first_name | middle_name | nationality | year_of_birth |
+----+-----------+------------+-------------+-------------+---------------+
|  1 | Sagan     | Carl       | Edward      | NULL        |          1934 |
|  2 | Odersky   | Martin     | NULL        | DE          |          1958 |
|  3 | Spoon     | Lex        | NULL        | NULL        |          NULL |
|  4 | Venners   | Bill       | NULL        | NULL        |          NULL |
+----+-----------+------------+-------------+-------------+---------------+
4 rows in set (0.00 sec)
The Same Data in MongoDB
{
  "_id" : ObjectId("4dfa6baa9c65dae09a4bbda5"),
  "title" : "Programming in Scala",
  "author" : [
    {
      "first_name" : "Martin",
      "last_name" : "Odersky",
      "nationality" : "DE",
      "year_of_birth" : 1958
    },
    { "first_name" : "Lex", "last_name" : "Spoon" },
    { "first_name" : "Bill", "last_name" : "Venners" }
  ]
}
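As a rough illustration of how the three SQL tables collapse into embedded documents, here is a minimal sketch in plain JavaScript. It is not a migration tool; the variable names and the helper `withoutNulls` are assumptions made up for this example, and the `_id` field is left out because the database would normally generate it.

```javascript
// Rows copied from the SQL tables shown earlier in the deck.
const books = [
  { id: 1, title: "The Demon-Haunted World: Science as a Candle in the Dark" },
  { id: 2, title: "Cosmos" },
  { id: 3, title: "Programming in Scala" },
];
const bookAuthors = [
  { book_id: 1, author_id: 1 },
  { book_id: 2, author_id: 1 },
  { book_id: 3, author_id: 2 },
  { book_id: 3, author_id: 3 },
  { book_id: 3, author_id: 4 },
];
const authors = [
  { id: 1, last_name: "Sagan", first_name: "Carl", middle_name: "Edward", nationality: null, year_of_birth: 1934 },
  { id: 2, last_name: "Odersky", first_name: "Martin", middle_name: null, nationality: "DE", year_of_birth: 1958 },
  { id: 3, last_name: "Spoon", first_name: "Lex", middle_name: null, nationality: null, year_of_birth: null },
  { id: 4, last_name: "Venners", first_name: "Bill", middle_name: null, nationality: null, year_of_birth: null },
];

// Hypothetical helper: drop the surrogate key and any NULL columns,
// since a document simply omits fields that have no value.
function withoutNulls(row) {
  return Object.fromEntries(
    Object.entries(row).filter(([key, value]) => key !== "id" && value !== null)
  );
}

// Replace the join table with an embedded "author" array on each book.
const documents = books.map((book) => ({
  title: book.title,
  author: bookAuthors
    .filter((link) => link.book_id === book.id)
    .map((link) => withoutNulls(authors.find((a) => a.id === link.author_id))),
}));

console.log(JSON.stringify(documents[2], null, 2));
```

The third document has the same title and author array as the "Programming in Scala" document on the slide.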
What problem are we solving?
• Map/Reduce can be used for aggregation…
  • Currently being used for totaling, averaging, etc.
• Map/Reduce is a big hammer
  • Simpler tasks should be easier
• Shouldn’t need to write JavaScript
• Avoid the overhead of the JavaScript engine
• We’re seeing requests for help in handling complex documents
  • Select only matching subdocuments or arrays
How will we solve the problem?
• New aggregation framework
  • Declarative framework (no JavaScript)
  • Describe a chain of operations to apply
  • Expression evaluation
  • Return computed values
  • Framework: new operations added easily
  • C++ implementation
Aggregation - Pipelines
• Aggregation requests specify a pipeline
• A pipeline is a series of operations
• Members of a collection are passed through a pipeline to produce a result
• Compare a Unix shell pipeline: ps -ef | grep -i mongod
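The shell analogy can be sketched in plain JavaScript. This is only an illustration of the idea, not how the server implements pipelines; `runPipeline`, `onlyMongod`, and `pidsOnly` are hypothetical names invented for this example.

```javascript
// Hypothetical sketch: a pipeline is an ordered list of stages, and each
// stage is a function from an array of documents to an array of documents.
function runPipeline(docs, stages) {
  // Feed the output of each stage into the next, like `a | b` in a shell.
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two illustrative stages (assumed names, not MongoDB operators).
const onlyMongod = (docs) => docs.filter((d) => /mongod/i.test(d.cmd));
const pidsOnly = (docs) => docs.map((d) => ({ pid: d.pid }));

// Made-up process list standing in for the output of `ps -ef`.
const processes = [
  { pid: 101, cmd: "/usr/bin/mongod --dbpath /data" },
  { pid: 202, cmd: "bash" },
  { pid: 303, cmd: "MONGOD --fork" },
];

const result = runPipeline(processes, [onlyMongod, pidsOnly]);
console.log(result);
```

Each stage sees only what the previous stage emitted, which is why the order of stages matters.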
Example - twitter
{
  "_id" : ObjectId("4f47b268fb1c80e141e9888c"),
  "user" : {
    "friends_count" : 73,
    "location" : "Brazil",
    "screen_name" : "Bia_cunha1",
    "name" : "Beatriz Helena Cunha",
    "followers_count" : 102
  }
}
• Find the number of followers and the number of friends by location
Example - twitter
db.tweets.aggregate(
  {$match: {"user.friends_count": {$gt: 0},
            "user.followers_count": {$gt: 0}}},
  {$project: {location: "$user.location",
              friends: "$user.friends_count",
              followers: "$user.followers_count"}},
  {$group: {_id: "$location",
            friends: {$sum: "$friends"},
            followers: {$sum: "$followers"}}}
);
Example - twitter: what each stage does
• $match: the predicate
• $project: the parts of the document you want to project
• $group: the function to apply to the result set
Example - twitter
{
  "result" : [
    { "_id" : "Far Far Away", "friends" : 344, "followers" : 789 },
    ...
  ],
  "ok" : 1
}
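To see how the three stages of the twitter example produce one document per location, the pipeline can be traced by hand in plain JavaScript. This is a sketch that needs no database; the sample tweets and their counts are made up for illustration, and the code mimics the stages rather than calling MongoDB.

```javascript
// Made-up sample tweets with the same shape as the document in the deck.
const tweets = [
  { user: { location: "Brazil", friends_count: 73, followers_count: 102 } },
  { user: { location: "Brazil", friends_count: 10, followers_count: 5 } },
  { user: { location: "London", friends_count: 0, followers_count: 40 } },
  { user: { location: "London", friends_count: 7, followers_count: 9 } },
];

// $match: keep tweets whose user has at least one friend and one follower.
const matched = tweets.filter(
  (t) => t.user.friends_count > 0 && t.user.followers_count > 0
);

// $project: pull the nested fields up to the top level under new names.
const projected = matched.map((t) => ({
  location: t.user.location,
  friends: t.user.friends_count,
  followers: t.user.followers_count,
}));

// $group: one output document per location, summing friends and followers.
const groups = new Map();
for (const p of projected) {
  const g = groups.get(p.location) || { _id: p.location, friends: 0, followers: 0 };
  g.friends += p.friends;
  g.followers += p.followers;
  groups.set(p.location, g);
}
const grouped = [...groups.values()];
console.log(grouped);
```

The London tweet with zero friends is dropped by the match step, so it contributes nothing to the London totals. The output order here follows insertion order; it should not be read as a promise about the order in which the server returns groups.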
Demo
Demo files are at https://gist.github.com/2036709
Projections
• $project can reshape results
  • Include or exclude fields
  • Computed fields
  • Arithmetic expressions
  • Pull fields from nested documents to the top
  • Push fields from the top down into new virtual documents
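The reshaping options listed above can be sketched as follows. The first constant is a `$project` stage written as plain data, and the function after it is a hand-written equivalent for a single document. The output field names (`location`, `stats`, `total`) and the function `reshape` are assumptions chosen for this example, and the sample document has no `_id`, so how `_id` is handled is not shown.

```javascript
// A $project stage written as plain data (output field names are hypothetical).
const projectStage = {
  $project: {
    // Pull a nested field up to the top level.
    location: "$user.location",
    // Push fields down into a new sub-document.
    stats: {
      friends: "$user.friends_count",
      followers: "$user.followers_count",
    },
    // Computed field built from an arithmetic expression.
    total: { $add: ["$user.friends_count", "$user.followers_count"] },
  },
};

// Hand-written equivalent of that reshaping for one document.
function reshape(doc) {
  return {
    location: doc.user.location,
    stats: {
      friends: doc.user.friends_count,
      followers: doc.user.followers_count,
    },
    total: doc.user.friends_count + doc.user.followers_count,
  };
}

const tweet = {
  user: { location: "Brazil", friends_count: 73, followers_count: 102 },
};
const reshaped = reshape(tweet);
console.log(reshaped);
```

Fields not named in the projection, such as the rest of the `user` sub-document, do not appear in the reshaped result.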
Unwinding
• $unwind can “stream” arrays
  • Array values are doled out one at a time in the context of their surrounding documents
  • Makes it possible to filter out elements before returning
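A minimal plain-JavaScript sketch of what unwinding does, using the book document from earlier in the deck. The function `unwind` is a hypothetical stand-in written for this example; it only handles fields that are arrays (or missing) and emits nothing for a missing or empty array.

```javascript
// Hypothetical stand-in for $unwind: emit one document per array element,
// each carrying the other fields of its surrounding document.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((element) => ({ ...doc, [field]: element }))
  );
}

const books = [
  {
    title: "Programming in Scala",
    author: [
      { first_name: "Martin", last_name: "Odersky" },
      { first_name: "Lex", last_name: "Spoon" },
      { first_name: "Bill", last_name: "Venners" },
    ],
  },
];

const unwound = unwind(books, "author");

// Once unwound, individual array elements can be filtered like documents.
const withoutSpoon = unwound.filter((d) => d.author.last_name !== "Spoon");
console.log(withoutSpoon.map((d) => d.author.last_name));
```

After unwinding, each author is its own document with the book title alongside it, so a later match or group stage can treat authors individually.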
Grouping
• $group aggregation expressions
  • Define a grouping key as the _id of the result
  • Total grouped column values: $sum
  • Average grouped column values: $avg
  • Collect grouped column values in an array or set: $push, $addToSet
  • Other functions
    • $min, $max, $first, $last
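The accumulators listed above can be pictured with a plain-JavaScript sketch. Each comment names the accumulator that the line imitates. The function `groupByAuthor`, the output field names, and the `pages` values are all invented for illustration.

```javascript
// Made-up input rows; the page counts are invented for this example.
const rows = [
  { author: "Sagan", title: "Cosmos", pages: 300 },
  { author: "Sagan", title: "The Demon-Haunted World", pages: 500 },
  { author: "Odersky", title: "Programming in Scala", pages: 700 },
];

// Hypothetical helper imitating a $group stage keyed on "author".
function groupByAuthor(docs) {
  const buckets = new Map();
  for (const d of docs) {
    if (!buckets.has(d.author)) buckets.set(d.author, []);
    buckets.get(d.author).push(d);
  }
  return [...buckets.entries()].map(([key, members]) => {
    const pages = members.map((m) => m.pages);
    const total = pages.reduce((a, b) => a + b, 0);
    return {
      _id: key,                                      // grouping key
      totalPages: total,                             // like $sum
      avgPages: total / pages.length,                // like $avg
      titles: members.map((m) => m.title),           // like $push
      distinctPages: [...new Set(pages)],            // like $addToSet
      minPages: Math.min(...pages),                  // like $min
      maxPages: Math.max(...pages),                  // like $max
      firstTitle: members[0].title,                  // like $first
      lastTitle: members[members.length - 1].title,  // like $last
    };
  });
}

const byAuthor = groupByAuthor(rows);
console.log(byAuthor[0]);
```

The first and last values depend on the order in which documents reach the group step, so they are most meaningful after an explicit sort.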
Sorting
• $sort can sort documents
  • Sort specifications are the same as today, e.g., $sort:{ key1: 1, key2: -1, …}
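One way to read a sort specification is as a compound comparator: compare on the first key, and fall through to the next key only on a tie. The sketch below does this in plain JavaScript. `compareBySpec` is a hypothetical helper, the sample documents are made up, and the sketch assumes the compared values share a type (it does not model how the server orders values of different types).

```javascript
// Hypothetical helper: turn a spec such as { key1: 1, key2: -1 } into a
// comparator. 1 means ascending, -1 means descending.
function compareBySpec(spec) {
  const keys = Object.entries(spec);
  return (a, b) => {
    for (const [key, direction] of keys) {
      if (a[key] < b[key]) return -direction;
      if (a[key] > b[key]) return direction;
      // Equal on this key: fall through to the next key in the spec.
    }
    return 0;
  };
}

// Made-up documents for illustration.
const unsorted = [
  { location: "London", followers: 9 },
  { location: "Brazil", followers: 5 },
  { location: "London", followers: 40 },
  { location: "Brazil", followers: 102 },
];

// Ascending by location, then descending by followers within a location.
const sorted = [...unsorted].sort(compareBySpec({ location: 1, followers: -1 }));
console.log(sorted);
```

The order of keys in the specification matters: the first key is the primary sort key.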
Computed Expressions
• Available in $project operations
• Prefix expression language
  • $add:["$field1", "$field2"]
  • $ifNull:["$field1", "$field2"]
  • Nesting: $add:["$field1", {$ifNull:["$field2", "$field3"]}]
• Other functions…
  • $divide, $mod, $multiply
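To show what "prefix expression language" means in practice, here is a toy evaluator in plain JavaScript. It is a sketch under simplifying assumptions, not the server's evaluator: `evaluate` is a hypothetical function, it supports only the five operators named on this slide, it resolves only top-level field names (no dotted paths), and it treats a missing field as null.

```javascript
// Toy evaluator for prefix expressions of the form { $op: [arg, arg, ...] }.
function evaluate(expr, doc) {
  // A string starting with "$" is a reference to a field of the document.
  if (typeof expr === "string" && expr.startsWith("$")) {
    const value = doc[expr.slice(1)];
    return value === undefined ? null : value; // assumption: missing reads as null
  }
  // Anything that is not an object is a literal value.
  if (expr === null || typeof expr !== "object") return expr;

  // Otherwise the single key is the operator and its value the argument list.
  const [op, args] = Object.entries(expr)[0];
  const values = args.map((arg) => evaluate(arg, doc)); // arguments may nest
  switch (op) {
    case "$add": return values.reduce((a, b) => a + b, 0);
    case "$multiply": return values.reduce((a, b) => a * b, 1);
    case "$divide": return values[0] / values[1];
    case "$mod": return values[0] % values[1];
    case "$ifNull": return values[0] !== null ? values[0] : values[1];
    default: throw new Error("unsupported operator: " + op);
  }
}

// field2 is deliberately missing, so $ifNull falls back to field3.
const sample = { field1: 10, field3: 5 };
const nested = { $add: ["$field1", { $ifNull: ["$field2", "$field3"] }] };
console.log(evaluate(nested, sample)); // 15
```

Because the operator comes first and its arguments are themselves expressions, nesting needs no precedence rules: inner expressions are evaluated before the operator that contains them.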
Computed Expressions
• String functions
  • $toUpper, $toLower, $substr
• Date field extraction
  • $year, $month, $dayOfMonth, $hour...
• Date arithmetic
• $ifNull
• Ternary conditional
  • Return one of two values based on a predicate
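For readers more familiar with JavaScript than with the expression operators, the sketch below pairs each category on this slide with a rough plain-JavaScript counterpart. The `event` document and the output field names are hypothetical, and the 18:30 start time is invented. Date parts are read in UTC, and the month is counted from 1.

```javascript
// Hypothetical document; the time of day is made up for illustration.
const event = {
  name: "DeNormalised London",
  when: new Date(Date.UTC(2012, 2, 21, 18, 30)), // 21 March 2012, 18:30 UTC
  seats: null,
};

const computed = {
  upper: event.name.toUpperCase(),                  // compare $toUpper
  lower: event.name.toLowerCase(),                  // compare $toLower
  prefix: event.name.substring(0, 12),              // compare $substr with start 0, length 12
  year: event.when.getUTCFullYear(),                // compare $year
  month: event.when.getUTCMonth() + 1,              // compare $month (JS months start at 0)
  hour: event.when.getUTCHours(),                   // compare $hour
  seats: event.seats !== null ? event.seats : 0,    // compare $ifNull
  size: event.name.length > 10 ? "long" : "short",  // ternary conditional
};
console.log(computed);
```

The last line is the ternary conditional from the slide: a predicate picks one of two values.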
conferences, appearances
http://www.10gen.com/events
and meetups
http://www.meetup.com/London-MongoDB-User-Group
download at mongodb.org
We’re Hiring!
Chris Harris
Email : [email protected]
Twitter : cj_harris5