MongoDB: Queries and Aggregation Framework with NBA Game Data

Upload: valeri-karpov

Post on 26-Jun-2015

Page 1: MongoDB: Queries and Aggregation Framework with NBA Game Data

MongoDB Queries and Aggregation

Valeri Karpov
Kernel Tools Engineer, MongoDB

www.thecodebarbarian.com
github.com/vkarpov15

@code_barbarian

Introducing an Awesome Data Set

•Scraped basketball-reference.com

•Mad props to NPM module Cheerio

•Box scores for all 31,686 NBA games since 1985

•Download: http://bit.ly/1jlgs9u via S3

•Untar and run mongorestore

Data Set Structure

•Contains final score

•Contains box score for teams and players

Data Set Structure - High Level

•Contains _id, date

•Info on winning team and losing team

Data Set Structure - Box

•Box score contains detailed stats by team

•And also for individual players:
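
As a rough sketch of the structure described above, a single game document might look something like the following. Every field name here (`teams`, `box`, `won`, `team`, `players`, `player`, `pts`) is an assumption made for illustration and is not confirmed by the slides; the values are the final score and point total from Kobe Bryant's 81-point game on 22 January 2006.

```javascript
// Hypothetical shape of one game document. Field names are assumed.
// The `_id` assigned by MongoDB is omitted here.
const sampleGame = {
  date: new Date('2006-01-22T00:00:00Z'),
  // High-level info on the winning and losing team.
  teams: [
    { name: 'Toronto Raptors', score: 104, won: 0 },
    { name: 'Los Angeles Lakers', score: 122, won: 1 }
  ],
  // Box score: one entry per team, with team totals and per-player lines.
  box: [
    {
      won: 0,
      team: { pts: 104 },   // other team totals elided
      players: []           // per-player lines elided
    },
    {
      won: 1,
      team: { pts: 122 },
      players: [{ player: 'Kobe Bryant', pts: 81 }]  // other players elided
    }
  ]
};
```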

Queries and Aggregation

•MongoDB has a rich query framework

•The aggregation framework is like SQL’s GROUP BY

Query Basics - findOne()

•When was Kobe Bryant’s 81-point game?
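
One way this lookup could be written is sketched below. It assumes a collection named `games` and a nested path `box.players.pts` for a player's points, neither of which is confirmed by the slides. Because no other player in the period covered scored exactly 81, filtering on the point total alone is enough.

```javascript
// Assumed path: box[].players[].pts holds a player's points in that game.
// Dot notation reaches through both arrays and matches if any player has 81.
const kobeQuery = { 'box.players.pts': 81 };
const kobeProjection = { date: 1, _id: 0 };  // only return the game date

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  printjson(db.games.findOne(kobeQuery, kobeProjection));
}
```

`findOne()` returns a single document (or `null`), which suits a question with exactly one answer.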

Query Basics - find()

•Which teams have lost despite scoring more than 150 points?
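
A minimal sketch of such a query, assuming a `games` collection whose `teams` array holds one entry per team with `score` and `won` fields (assumed names, with `won` stored as 0 or 1):

```javascript
// $elemMatch requires both conditions to hold for the same team entry:
// that team lost AND that team scored more than 150.
const highScoringLoss = {
  teams: { $elemMatch: { won: 0, score: { $gt: 150 } } }
};

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games
    .find(highScoringLoss, { date: 1, teams: 1, _id: 0 })
    .forEach(printjson);
}
```

Unlike `findOne()`, `find()` returns a cursor over every matching game.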

Query Basics - count()

•How many games did the Lakers win in the 1999-2000 season?
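
A possible count query is sketched here. The collection name `games`, the `teams.name` and `teams.won` fields, and the August-to-August date window used to bracket the season are all assumptions; if the data set includes playoff games, this window would count those wins too.

```javascript
// Assumed season window: 1 Aug 1999 (inclusive) to 1 Aug 2000 (exclusive).
const lakersWins1999 = {
  date: {
    $gte: new Date('1999-08-01T00:00:00Z'),
    $lt: new Date('2000-08-01T00:00:00Z')
  },
  // The same team entry must be the Lakers and must have won.
  teams: { $elemMatch: { name: 'Los Angeles Lakers', won: 1 } }
};

// Runs only inside a MongoDB shell, where `db` is defined.
// Newer shells prefer countDocuments() over count().
if (typeof db !== 'undefined') {
  print(db.games.count(lakersWins1999));
}
```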

Query Basics - distinct()

•Which teams have lost a game despite having a player make at least 10 three-pointers?
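
The sketch below shows how `distinct()` takes a field path plus a query filter. It assumes a `games` collection, a `box` array with a `won` flag per team, and a per-player `fg3` field for three-pointers made; all of these names are guesses. One caveat: `distinct()` on `teams.name` collects the names of both teams in every matching game, so narrowing the result to only the losing side would need an aggregation instead.

```javascript
// Match games where the losing side's box score has a player with 10+ threes.
const tenThreesInLoss = {
  box: {
    $elemMatch: {
      won: 0,
      players: { $elemMatch: { fg3: { $gte: 10 } } }
    }
  }
};

// Runs only inside a MongoDB shell, where `db` is defined.
// Returns the unique team names appearing in the matching games
// (winners included, see the caveat above).
if (typeof db !== 'undefined') {
  printjson(db.games.distinct('teams.name', tenThreesInLoss));
}
```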

Query Basics - $elemMatch operator

•When did Michael Jordan score 60 points in a losing effort?
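
A sketch of how nested `$elemMatch` could express this, assuming a `games` collection with `box[].won` and `box[].players[]` entries carrying `player` and `pts` fields (assumed names). The question is read here as "60 or more points". Without `$elemMatch`, plain dot-notation conditions could be satisfied by different array elements, for example Jordan playing in a game where some other player scored 60.

```javascript
const jordanSixtyInLoss = {
  box: {
    // Outer $elemMatch: one team entry must satisfy everything below.
    $elemMatch: {
      won: 0,
      // Inner $elemMatch: one player line must be Jordan with 60+ points.
      players: {
        $elemMatch: { player: 'Michael Jordan', pts: { $gte: 60 } }
      }
    }
  }
};

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games.find(jordanSixtyInLoss, { date: 1, _id: 0 }).forEach(printjson);
}
```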

Query Basics - .sort() and .limit()

•What are the 5 highest point totals for a losing team?

Aggregation

•Similar to SQL’s GROUP BY

•Filters and transforms data in pipeline stages

•Stages are chainable

•Accessible via the .aggregate() function in shell

Aggregation - Lakers Season PPG

•How many points did the Lakers average in games they won in the 2008-2009 season?
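
One pipeline that could answer this is sketched below. It assumes a `games` collection with a `teams` array of `{ name, score, won }` entries and an August-to-August window for the season; these names and dates are assumptions, and playoff games would be included if the data set has them.

```javascript
const lakersWinPpg = [
  // Stage 1: keep only 2008-2009 games that the Lakers won.
  { $match: {
      date: {
        $gte: new Date('2008-08-01T00:00:00Z'),
        $lt: new Date('2009-08-01T00:00:00Z')
      },
      teams: { $elemMatch: { name: 'Los Angeles Lakers', won: 1 } }
  } },
  // Stage 2: one output document per team entry.
  { $unwind: '$teams' },
  // Stage 3: drop the opponent's entry.
  { $match: { 'teams.name': 'Los Angeles Lakers' } },
  // Stage 4: a single group that averages the Lakers' score.
  { $group: {
      _id: null,
      games: { $sum: 1 },
      avgPoints: { $avg: '$teams.score' }
  } }
];

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games.aggregate(lakersWinPpg).forEach(printjson);
}
```

Putting the `$match` first lets the later stages work on a much smaller set of documents.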

Aggregation - $sort and $limit

•Compute the teams with the 5 best records in the 1999-2000 season
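
A sketch of a pipeline for this, under the same assumptions as before: a `games` collection, a `teams` array with `name` and a numeric `won` flag (1 for a win), and an August-to-August season window. "Best record" is taken here to mean most wins, which is a simplification.

```javascript
const bestRecords1999 = [
  { $match: {
      date: {
        $gte: new Date('1999-08-01T00:00:00Z'),
        $lt: new Date('2000-08-01T00:00:00Z')
      }
  } },
  // One document per (game, team) pair.
  { $unwind: '$teams' },
  // Tally wins and losses per team name.
  { $group: {
      _id: '$teams.name',
      wins: { $sum: { $cond: [{ $eq: ['$teams.won', 1] }, 1, 0] } },
      losses: { $sum: { $cond: [{ $eq: ['$teams.won', 1] }, 0, 1] } }
  } },
  // Most wins first, then keep the top five.
  { $sort: { wins: -1 } },
  { $limit: 5 }
];

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games.aggregate(bestRecords1999).forEach(printjson);
}
```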

Aggregation - $unwind

•Random statistic: player with highest scoring average in games their team lost
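
The following sketch shows how two `$unwind` stages could flatten the nested arrays so that each player line becomes its own document. It assumes a `games` collection and the field names `box[].won`, `box[].players[].player`, and `box[].players[].pts`, none of which are confirmed by the slides. Grouping by name would also merge any two players who share a name.

```javascript
const topScorerInLosses = [
  // One document per team box score.
  { $unwind: '$box' },
  // Keep only the losing side of each game.
  { $match: { 'box.won': 0 } },
  // One document per player line on the losing side.
  { $unwind: '$box.players' },
  // Average points per player across all of their losses.
  { $group: {
      _id: '$box.players.player',
      games: { $sum: 1 },
      avgPoints: { $avg: '$box.players.pts' }
  } },
  { $sort: { avgPoints: -1 } },
  { $limit: 1 }
];

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games.aggregate(topScorerInLosses).forEach(printjson);
}
```

The `games` count is kept in the output so that an average built on a handful of games is easy to spot.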

Aggregation - Fun With Steals

•How often does a team win when they record more steals than the other team?
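
One way to compute this is sketched below; it is not necessarily the approach taken in the talk. It uses `$arrayElemAt`, which needs MongoDB 3.2 or later, and assumes a `games` collection in which `box` always has exactly two entries, each with a numeric `won` flag and a team steals total at `team.stl` (assumed names). Games with equal steals are excluded.

```javascript
const stealsWinRate = [
  // Pull the two team entries out of the box array.
  { $project: {
      a: { $arrayElemAt: ['$box', 0] },
      b: { $arrayElemAt: ['$box', 1] }
  } },
  // Positive stealDiff means team `a` had more steals.
  { $project: {
      stealDiff: { $subtract: ['$a.team.stl', '$b.team.stl'] },
      aWon: { $eq: ['$a.won', 1] }
  } },
  // Ignore games where both teams recorded the same number of steals.
  { $match: { stealDiff: { $ne: 0 } } },
  // 1 when the team with more steals won: either `a` led and won,
  // or `b` led and `a` lost.
  { $project: {
      moreStealsWon: {
        $cond: [{ $eq: [{ $gt: ['$stealDiff', 0] }, '$aWon'] }, 1, 0]
      }
  } },
  // The average of the 0/1 flag is the win rate.
  { $group: {
      _id: null,
      games: { $sum: 1 },
      winRate: { $avg: '$moreStealsWon' }
  } }
];

// Runs only inside a MongoDB shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.games.aggregate(stealsWinRate).forEach(printjson);
}
```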

Thanks for Listening!

Slides on Twitter, @code_barbarian