mongo db
DESCRIPTION
MongoDB is a popular NoSQL database. This presentation was delivered during a workshop. The first part covers NoSQL databases and the shift in their design paradigm, focuses on document-based NoSQL databases, and draws parallels with SQL databases. The second part is a hands-on session on MongoDB using the mongo shell; the slides offer little help for that part. The last part touches on advanced topics: data replication for disaster recovery, and handling big data using map-reduce and sharding.
TRANSCRIPT
![Page 1: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/1.jpg)
NoSQL Database
Akshay Mathur, Sarang Shravagi
@akshaymathu, @_sarangs
{name: 'mongo', type: 'db'}
![Page 2: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/2.jpg)
Who uses MongoDB
![Page 3: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/3.jpg)
Let’s Know Each Other
• Do you code?
• OS?
• Programming language?
• Why are you attending?
![Page 4: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/4.jpg)
Akshay Mathur
• Managed development, testing and release teams over the last 14+ years
  – Currently Principal Architect at ShopSocially
• Founding team member of
  – ShopSocially (enabling “social” for retailers)
  – AirTight Networks (global leader in WIPS)
![Page 5: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/5.jpg)
Sarang Shravagi
• 10gen Certified Developer and DBA
• CS graduate from PICT Pune
• 3+ years in the software product industry
• Currently Senior Full-stack Developer at ShopSocially
![Page 6: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/6.jpg)
How we use MongoDB
Python MongoDB
MongoEngine
![Page 7: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/7.jpg)
Where MongoDB Fits
![Page 8: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/8.jpg)
Program Outline: Understanding NoSQL
• Data Landscape
• Different Storage Needs
• Design Paradigm Shift from SQL to NoSQL
• Different Datastores
• Closer Look at Document Storage
• Drawing Parallels from RDBMS
![Page 9: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/9.jpg)
Program Outline: Hands-on Lab
• Installation and basic configuration
• Mongo Shell
• Creating and Changing Schema
• Create, Read, Update and Delete of Data
• Analyzing Performance
• Improving performance by creating Indices
• Assignment
• Problem solving for the assignment
![Page 10: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/10.jpg)
Program Outline: Advanced Topics
• Handling Big Data
  – Introduction to Map/Reduce
  – Introduction to Data Partitioning (Sharding)
• Disaster Recovery
  – Introduction to Replica Sets and High Availability
![Page 11: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/11.jpg)
Ground Rules
• Disturb Everyone
  – Not by phone rings
  – Not by side conversations
  – By more information and questions
![Page 12: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/12.jpg)
Data Patterns & Storage Needs
![Page 13: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/13.jpg)
Data at an Online Store
• Product Information
• User Information
• Purchase Information
• Product Reviews
• Site Interactions
• Social Graph
• Search Index
![Page 14: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/14.jpg)
SQL to NoSQL
Design Paradigm Shift
![Page 15: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/15.jpg)
SQL Storage
• Was designed when
  – Storage and data transfer were costly
  – Processing was slow
  – Applications were oriented more towards data collection
• Initial adopters were financial institutions
![Page 16: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/16.jpg)
SQL Storage
• Structured
  – Schema
• Relational
  – Foreign keys, constraints
• Transactional
  – Atomicity, Consistency, Isolation, Durability
• High Availability through robustness
  – Minimize failures
• Optimized for Writes
• Typically Scale Up
![Page 17: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/17.jpg)
NoSQL Storage
• Is designed for a time when
  – Storage is cheap
  – Data transfer is fast
  – Much more processing power is available
    • Clustering of machines is also possible
  – Applications are oriented towards consumption of user-generated content
  – Better on-screen user experience is in demand
![Page 18: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/18.jpg)
NoSQL Storage
• Semi-structured
  – Schemaless
• Consistency, Availability, Partition Tolerance
• High Availability through clustering
  – Expect failures
• Optimized for Reads
• Typically Scale Out
![Page 19: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/19.jpg)
Different Datastores
Half Level Deep
![Page 20: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/20.jpg)
SQL: RDBMS
• MySQL, PostgreSQL, Oracle etc.
• Stores data in tables having columns
  – Basic (number, text) data types
• Strong query language
• Transparent values
  – Query language can read and filter on them
  – Relationships between tables are based on values
• Suited for user info and transactions
![Page 21: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/21.jpg)
NoSQL: Key/Value
• Redis, DynamoDB etc.
• Stores a value against a key
  – Strings
• Values are opaque
  – Cannot be part of a query
• Suited for site interactions
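As a rough illustration of what "opaque values" means, the following pure-Python sketch stores serialized strings in a dict standing in for a key/value store. The names `store`, `put`, `get` and the key `session:42` are made up for this example and do not correspond to any particular product's API.

```python
import json

# A plain dict stands in for the key/value store (an assumption for illustration).
store = {}

def put(key, value):
    """Serialize the value to a string; the store never looks inside it."""
    store[key] = json.dumps(value)

def get(key):
    """The only access path is the key; the application decodes the value."""
    return json.loads(store[key])

put("session:42", {"user": "ann", "last_page": "/cart"})
print(get("session:42"))
```

Finding every session whose `last_page` is `/cart` would require fetching and decoding all values in the application, which is why such stores suit lookups by key rather than queries on content.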
![Page 22: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/22.jpg)
NoSQL: Key/Value
![Page 23: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/23.jpg)
NoSQL: Document
• MongoDB, CouchDB etc.
• Object-oriented data models
  – Stores data in document objects having fields
  – Basic and compound (list, dict) data types
• SQL-like queries
• Transparent values
  – Can be part of a query
• Suited for product info and its reviews
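To make compound data types and transparent values concrete, here is a minimal pure-Python sketch with no database involved. The field names (`name`, `price`, `reviews`, `rating`) and the helper `with_min_rating` are invented for illustration; in a document store the same idea would be expressed as a query on an embedded field.

```python
# Hypothetical product documents: basic fields plus compound (list, dict) values.
products = [
    {"name": "phone", "price": 300,
     "reviews": [{"user": "a", "rating": 5}, {"user": "b", "rating": 3}]},
    {"name": "cable", "price": 5,
     "reviews": [{"user": "c", "rating": 2}]},
]

def with_min_rating(docs, min_rating):
    """Return names of documents having at least one review rated >= min_rating."""
    return [doc["name"] for doc in docs
            if any(review["rating"] >= min_rating for review in doc["reviews"])]

print(with_min_rating(products, 4))
```

Because the reviews live inside the product document, the filter reaches into them directly, without a second table or a join.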
![Page 24: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/24.jpg)
NoSQL: Document
![Page 25: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/25.jpg)
NoSQL: Column Family
• Cassandra, Bigtable etc.
• Stores data in columns
• Transparent values
  – Can be part of a query
• SQL-like queries
• Suited for search
![Page 26: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/26.jpg)
NoSQL: Column Family
![Page 27: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/27.jpg)
NoSQL: Graph
• Neo4j
• Stores data in the form of nodes and relationships
• Query is in the form of a traversal
• In-memory
• Suited for social graph
![Page 28: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/28.jpg)
NoSQL: Graph
![Page 29: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/29.jpg)
![Page 30: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/30.jpg)
Document Storage: Closer Look
![Page 31: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/31.jpg)
MongoDB
• Document database
• Powerful query language
• Docs, sub-docs, indexes
• Map/reduce
• Replicas, shards, replicated shards
• SDKs/drivers for many languages
  – C, C++, C#, Python, Erlang, PHP, Java, JavaScript, Node.js, Perl, Ruby, Scala
![Page 32: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/32.jpg)
RDBMS: DB Design
![Page 33: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/33.jpg)
RDBMS: Query
![Page 34: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/34.jpg)
RDBMS and MongoDB

| RDBMS | MongoDB |
| --- | --- |
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| `Select c1, c2 from Table where c1 = 'v1' order by c2 limit n` | `Collection.objects(F1='v1').order_by('F2').limit(n)` (MongoEngine) |
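The correspondence between the SQL statement and the collection query above can be traced in plain Python. This sketch is not MongoEngine and needs no server; it only mimics, over a list of dicts, what the filter, order and limit steps do. The helper name `query` and the sample data are assumptions made for the example.

```python
def query(collection, where, order_by, limit):
    """Keep documents whose fields equal the given values, sort by one field,
    and return at most `limit` of them."""
    matches = [doc for doc in collection
               if all(doc.get(field) == value for field, value in where.items())]
    matches.sort(key=lambda doc: doc[order_by])
    return matches[:limit]

# Hypothetical collection: each dict plays the role of a row / document.
rows = [
    {"F1": "v1", "F2": 30},
    {"F1": "v2", "F2": 10},
    {"F1": "v1", "F2": 20},
    {"F1": "v1", "F2": 40},
]

result = query(rows, where={"F1": "v1"}, order_by="F2", limit=2)
print(result)
```

The `where` argument plays the role of the SQL `where` clause (or the keyword filter passed to `objects`), and the sort and slice correspond to `order by` and `limit`.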
![Page 35: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/35.jpg)
MongoDB: Design
![Page 36: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/36.jpg)
MongoDB: Query
• Movies.objects()
![Page 37: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/37.jpg)
![Page 38: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/38.jpg)
Have you Installed?
http://www.mongodb.org/downloads
![Page 39: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/39.jpg)
Hands-on
Dive-in with Sarang
![Page 40: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/40.jpg)
MongoDB: Core Binaries
• mongod
  – Database server
• mongo
  – Database client shell
• mongos
  – Router for sharding
![Page 41: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/41.jpg)
Getting Help
• For mongo shell
  – mongo --help
    • Shows options available for running the shell
• Inside mongo shell
  – Object.help(), for example db.help()
    • Shows commands available on the object
![Page 42: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/42.jpg)
Import Export Tools
• For objects
  – mongodump
  – mongorestore
  – bsondump
  – mongooplog
• For data items
  – mongoimport
  – mongoexport
![Page 43: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/43.jpg)
Database Operations
• Database creation
• Creating/changing collection
• Data insertion
• Data read
• Data update
• Creating indices
• Data deletion
• Dropping collection
![Page 44: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/44.jpg)
Diagnostic Tools
• mongostat
• mongoperf
• mongosniff
• mongotop
![Page 45: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/45.jpg)
![Page 46: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/46.jpg)
Assignment
• Go to http://www.velocitainc.com/mongo/
  – Tasks
    • assignments.txt
  – Data
    • students.json
![Page 47: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/47.jpg)
Disaster Recovery
Introduction to Replica Sets and High Availability
![Page 48: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/48.jpg)
Disasters
• Physical Failure
  – Hardware
  – Network
• Solution
  – Replica Sets
    • Provide redundant storage for high availability
      – Real-time data synchronization
    • Automatic failover for near-zero downtime
![Page 49: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/49.jpg)
Replication
![Page 50: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/50.jpg)
Multi Replication
• Data can be replicated to multiple places simultaneously
• A replica set should have an odd number of voting members, so that an election can always reach a majority
![Page 51: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/51.jpg)
Single Replication
• If you want only one secondary (or any odd number of secondaries), the set has an even number of members, so you need to set up an arbiter to break ties in elections
![Page 52: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/52.jpg)
Failover
• When the primary fails, the remaining machines vote to elect a new primary
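The voting rule behind failover, and the reason for odd member counts and arbiters, can be sketched with a little arithmetic. This is a simplified model that only checks for a strict majority; real elections take more factors into account. The function name `can_elect_primary` is hypothetical.

```python
def can_elect_primary(total_voting_members, reachable_members):
    """A new primary needs votes from a strict majority of all voting members,
    counted against the full set and not only the machines still running."""
    majority = total_voting_members // 2 + 1
    return reachable_members >= majority

# Primary + 2 secondaries: the primary fails, 2 of 3 members remain.
print(can_elect_primary(3, 2))

# Primary + 1 secondary: the primary fails, 1 of 2 members remains.
print(can_elect_primary(2, 1))

# Primary + 1 secondary + 1 arbiter: the primary fails, 2 of 3 voters remain.
print(can_elect_primary(3, 2))
```

In the two-member case the lone survivor cannot form a majority, which is the gap an arbiter fills: it votes but stores no data.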
![Page 53: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/53.jpg)
Handling Big Data
Introduction to Map/Reduce and Sharding
![Page 54: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/54.jpg)
Large Data Sets
• Problem 1
  – Performance
    • Queries become slow
• Solution
  – Map/Reduce
![Page 55: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/55.jpg)
Map Reduce
• A way to divide a large query computation into smaller chunks
• May run in multiple processes across multiple machines
• Think of it as SQL's GROUP BY
![Page 56: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/56.jpg)
Map/Reduce Example
• The map function scans each document and emits the required values
![Page 57: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/57.jpg)
Map/Reduce Example
• The reduce function takes the output of the map function and generates an aggregated value
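The two steps can be tried out without a database. The sketch below is a single-process, pure-Python imitation of map/reduce that totals order amounts per customer, much like a GROUP BY with SUM. The sample data and the names `map_order`, `reduce_amounts` and `map_reduce` are assumptions for illustration, not MongoDB's API.

```python
from collections import defaultdict

# Hypothetical collection of order documents.
orders = [
    {"customer": "ann", "amount": 20},
    {"customer": "bob", "amount": 5},
    {"customer": "ann", "amount": 15},
]

def map_order(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["customer"], doc["amount"]

def reduce_amounts(key, values):
    """Reduce step: fold all values emitted for one key into a single value."""
    return sum(values)

def map_reduce(docs, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

totals = map_reduce(orders, map_order, reduce_amounts)
print(totals)
```

Since each document is mapped independently and each key is reduced independently, both steps could in principle be spread over several processes or machines.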
![Page 58: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/58.jpg)
Large Data Sets
• Problem 2
  – Vertical Scaling of Hardware
    • Can’t increase machine size beyond a limit
• Solution
  – Sharding
![Page 59: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/59.jpg)
Sharding
• A method for storing data across multiple machines
• Data is partitioned using Shard Keys
![Page 60: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/60.jpg)
Data Partitioning: Range Based
• Each chunk holds a contiguous range of shard key values
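A minimal sketch of range-based placement, assuming numeric shard keys and made-up chunk boundaries at 100 and 200. The names `bounds` and `chunk_for_range` are hypothetical.

```python
import bisect

# Hypothetical chunk boundaries: chunk i holds keys in [bounds[i], bounds[i + 1]).
# The infinities close the key space at both ends.
bounds = [float("-inf"), 100, 200, float("inf")]

def chunk_for_range(shard_key):
    """Return the index of the chunk whose range contains the shard key."""
    return bisect.bisect_right(bounds, shard_key) - 1

for key in (50, 100, 199, 200):
    print(key, "-> chunk", chunk_for_range(key))
```

Neighbouring keys land in the same chunk, which keeps range queries local but can concentrate load when keys arrive in increasing order.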
![Page 61: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/61.jpg)
Data Partitioning: Hash Based
• A hash function on the shard key decides the chunk
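A minimal sketch of hash-based placement. Picking the chunk with a modulo over a fixed `NUM_CHUNKS` is a simplification chosen for this example, and the choice of MD5 and the name `chunk_for_hash` are assumptions as well; they are not a description of MongoDB's internals.

```python
import hashlib

NUM_CHUNKS = 4  # assumed fixed number of chunks, for illustration only

def chunk_for_hash(shard_key):
    """Hash the shard key and map the hash onto one of the chunks."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_CHUNKS

# Consecutive keys are placed by their hash, not by their order.
placement = {key: chunk_for_hash(key) for key in range(100, 108)}
print(placement)
```

The trade-off is the reverse of range-based partitioning: writes spread more evenly, but a query over a range of keys may have to visit every chunk.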
![Page 62: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/62.jpg)
Sharded Cluster
![Page 63: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/63.jpg)
Optimizing Shards: Splitting
• Within a shard, when a chunk grows beyond a size limit, it is divided into two
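Splitting can be sketched as follows. The example uses a document count (`max_docs`) as a stand-in for the real size threshold and splits at the median key; both choices, and the name `split_chunk`, are assumptions for illustration.

```python
def split_chunk(chunk, max_docs):
    """Split a sorted chunk of shard key values in two when it exceeds max_docs.

    Returns a list with one chunk (no split needed) or two chunks."""
    if len(chunk) <= max_docs:
        return [chunk]
    middle = len(chunk) // 2
    return [chunk[:middle], chunk[middle:]]

print(split_chunk([1, 2, 3, 4, 5, 6], max_docs=4))
print(split_chunk([1, 2], max_docs=4))
```

A split only changes chunk boundaries; the data stays on the same shard until balancing moves a chunk elsewhere.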
![Page 64: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/64.jpg)
Optimizing Shards: Balancing
• When a shard holds many more chunks than the others, a few chunks are migrated to another shard
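A toy version of balancing, assuming the goal is simply to even out chunk counts: chunks move one at a time from the fullest shard to the emptiest until the counts differ by at most one. The stopping rule, the shard names and the function `balance` are assumptions made for this sketch; a real balancer uses its own thresholds.

```python
def balance(shards):
    """Return a new shard -> chunks mapping with chunk counts evened out."""
    shards = {name: list(chunks) for name, chunks in shards.items()}
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) <= 1:
            return shards
        # Migrate one chunk from the fullest shard to the emptiest one.
        shards[emptiest].append(shards[fullest].pop())

before = {"shard_a": ["c1", "c2", "c3", "c4", "c5"], "shard_b": ["c6"]}
after = balance(before)
print({name: len(chunks) for name, chunks in after.items()})
```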
![Page 65: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/65.jpg)
Summary
• MongoDB is good
  – Stores objects the way we use them in a programming language
  – Flexible semi-structured design
  – Scales out to store big data
  – Embedded documents eliminate the need for joins
• MongoDB is bad
  – No multi-document query
  – De-normalized storage
  – No support for transactions
![Page 66: Mongo db](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c68b994a795997468b459c/html5/thumbnails/66.jpg)
Thanks
@akshaymathu @_sarangs