mongo db mug_2012-02-07

37
O Where Clause, Where Clause! Wherefore art thou Where Clause? (a.k.a. Aggregation for Reporting)

Upload: will-button

Post on 26-Jun-2015

229 views

Category:

Documents


0 download

DESCRIPTION

Data extraction for reporting on Mongo using

TRANSCRIPT

Page 1: Mongo db mug_2012-02-07

O Where Clause, Where Clause! Wherefore art thou Where Clause?

(a.k.a. Aggregation for Reporting)

Page 2: Mongo db mug_2012-02-07

Overview

• Discuss and demonstrate aggregating data • Specifically addresses reporting needs• Example study: Aggregating Video Game Stats

Page 3: Mongo db mug_2012-02-07

Disclaimer!

Page 4: Mongo db mug_2012-02-07

Kills

Page 5: Mongo db mug_2012-02-07

Sales

Page 6: Mongo db mug_2012-02-07

Player

Page 7: Mongo db mug_2012-02-07

Product

Page 8: Mongo db mug_2012-02-07

Dataset{

"_id" : ObjectId("50fc77ee364c74eba1afe1e3"),"fragdate" : ISODate("2012-12-24T00:00:19.901Z"),"gameId" : 1221,"gameName" : "Christmas Blitz","kill" : {

"_id" : ObjectId("50acfd45712e8bc7832ea7cb"),"username" : "player002","avatar" : "avatar.com/player002.png","displayname" : "Sniper the Clown","rank" : "Sniper","motto" : "If you run, you'll just die tired."

},"player" : {

"userid" : 1,"username" : "ArmyD00d1221","avatar" : "avatar.com/armyd00d1221.png","displayname" : "Army Grunt"

},"server" : "app01.fragzilla.com"

}

Aggregate the number of times each

player was killed

Page 9: Mongo db mug_2012-02-07

Report Details{

"_id" : ObjectId("50fc77ee364c74eba1afe1e3"),"fragdate" : ISODate("2012-12-24T00:00:19.901Z"),"gameId" : 1221,"gameName" : "Christmas Blitz","kill" : {

"_id" : ObjectId("50acfd45712e8bc7832ea7cb"),"username" : "player002","avatar" : "avatar.com/player002.png","displayname" : "Sniper the Clown","rank" : "Sniper","motto" : "If you run, you'll just die tired."

},"player" : {

"userid" : 1,"username" : "ArmyD00d1221","avatar" : "avatar.com/armyd00d1221.png","displayname" : "Army Grunt"

},"server" : "app01.fragzilla.com"

}

Only aggregate kills on these three players:• Sniper the Clown• Kurious Killer• My L1ttl3 P0wn13

Only on Dec 23, 2012Between 2pm and 10pm

Page 10: Mongo db mug_2012-02-07

Relational DB

IdfragDategameIDgameNameserverfkKilledfkPlayer

Kills IdusernameavatardisplayNamerankmotto

Killed

IdusernameavatardisplayNamerankmotto

Player

Could be the same table

Page 11: Mongo db mug_2012-02-07

Relational DB

IdfragDategameIDgameNameserverfkKilledfkPlayer

Kills IdusernameavatardisplayNamerankmotto

Killed

IdusernameavatardisplayNamerankmotto

Player

Could be the same table

SELECT tk.fragDate, k.id, count(k.id) FROM test.kills tkJOIN players p ON tk.fkPlayer = p.id JOIN killed k ON tk.fkKilled = k.idWHERE k.id IN (1,2,3)GROUP BY fragDate, k.id;

Page 12: Mongo db mug_2012-02-07

Sidenote: Exploration

• Software Engineering tends to have more clearly defined goals

• Report Engineering tends to have more clearly defined questions

Page 13: Mongo db mug_2012-02-07

Query From The Shell

Page 14: Mongo db mug_2012-02-07

Output

Page 15: Mongo db mug_2012-02-07

Next Step: Delimited Output

Page 16: Mongo db mug_2012-02-07

Display in Excel, R, Processing, etc

Page 17: Mongo db mug_2012-02-07

Aggregation: Big Picture

• Somewhere between Mongo Queries and Map/Reduce implementations

• Best suited for totaling and averaging functions• Similar functionality to SQL Group By clause

Complexity

MongoQueries

MongoAggregation Framework

Map/ReduceImplementations

Page 18: Mongo db mug_2012-02-07

Anatomy of Aggregation Framework

db.collection.aggregate( [ {do something},

{do something else},{do even more stuff}

])

Aggregate command

Pipeline Operators

Page 19: Mongo db mug_2012-02-07

Pipeline Operators

• Pipelines: transforms documents from the collection as they pass through– grep e server.log | less

• Expressions: produce output documents based on calculations performed on input documents

Page 20: Mongo db mug_2012-02-07

Pipelines:

$project$match$limit$skip$unwind$group$sort

Page 21: Mongo db mug_2012-02-07

Expressions$group Operators:$addToSet$first$last$max$min$avg$sum

Boolean Operators:$and$or$not

Comparison Operators:$cmp$eq$gt$lt$ne

Arithmetic Operators:$add$subtract$multiply$divide

String Operators:$strcasecmp$substr$toLower$toUpper

Date Operators:$year$month$hour

See http://docs.mongodb.org/manual/reference/aggregation/#aggregation-expression-operatorsFor an exhaustive list

Page 22: Mongo db mug_2012-02-07

Our Aggregation Query

Page 23: Mongo db mug_2012-02-07

Our Aggregation Query

All the magic goes between the []

Page 24: Mongo db mug_2012-02-07

Our Aggregation Query

$match:Provides a query-like interface to filter

documents out of the aggregation pipeline. The $match drops

documents that do not match the condition from the aggregation

pipeline, and it passes documents that match along the pipeline unaltered.

Page 25: Mongo db mug_2012-02-07

Our Aggregation Query

$project: Reshapes a document stream by renaming, adding, or

removing fields. Also use $project to create computed values or sub-

objects

Page 26: Mongo db mug_2012-02-07

Our Aggregation Query

$groupGroups documents together for the

purpose of calculating aggregate values based on a collection of

documents. Practically, group often supports tasks such as average page

views for each page in a website on a daily basis.

Page 27: Mongo db mug_2012-02-07

Our Aggregation Query

“numKills”: { $sum: “$numKills” }

Page 28: Mongo db mug_2012-02-07

Our Aggregation Query$sort

The $sort pipeline operator sorts all input documents and returns them to

the pipeline in sorted order.

{ $sort : { <sort-key> } }

Page 29: Mongo db mug_2012-02-07

Aggregation Output

{"result" : [

{"_id" : {"displayname" : "My L1ttl3 P0wn13","eventhour" : 21},"numKills" : 133},{"_id" : {"displayname" : "Kurious Killer","eventhour" : 21},"numKills" : 130},

// ******* Omitted for brevity *******{"_id" : {"displayname" : "Sniper the Clown","eventhour" : 2},"numKills" : 6}

],"ok" : 1

}

Produces a document with two fields: result and ok

Page 30: Mongo db mug_2012-02-07

Aggregation Output

{"result" : [

{"_id" : {"displayname" : "My L1ttl3 P0wn13","eventhour" : 21},"numKills" : 133},{"_id" : {"displayname" : "Kurious Killer","eventhour" : 21},"numKills" : 130},

// ******* Omitted for brevity *******{"_id" : {"displayname" : "Sniper the Clown","eventhour" : 2},"numKills" : 6}

],"ok" : 1

}

Page 31: Mongo db mug_2012-02-07

Recap: Aggregation Framework

db.collection.aggregate( [ {do something},

{do something else},{do even more stuff}

])

Aggregate command

Pipeline Operators

Page 32: Mongo db mug_2012-02-07

We’re not quite done…

We can’t really give something like this to our customers:

Page 33: Mongo db mug_2012-02-07

But if we had…

A database config.

Page 34: Mongo db mug_2012-02-07

But if we had…

To run our aggregation.

Page 35: Mongo db mug_2012-02-07

But if we had…

Inside a node server.

Page 36: Mongo db mug_2012-02-07
Page 37: Mongo db mug_2012-02-07

Q/A/Comments

Will [email protected]@wfbutton