Mongo 101 - Basics · 2018. 4. 25.
Post on 30-Aug-2020
Mongo 101 - Basics
Who are we?
Adamo Tonete
● Senior Support Engineer
● Joined Percona in 2015
● 10+ years as a DBA
● 5+ years working with NoSQL products
Rick Golba
● Product Marketing Manager
● Joined Percona in 2014 as a Solutions Engineer
● 20+ years as a SQL trainer
● 3+ years working with NoSQL products
Agenda
● What is MongoDB?
● How different is MongoDB from MySQL?
● Common MongoDB topologies
● CRUD: data management
● Aggregations, Import/Export, and Backups
● Schema design patterns
● Replica-sets and Upgrades
● Securing your setup - Demo
● Common issues: how to detect, verify and address them using logs, Percona Toolkit, and Percona Monitoring and Management (PMM)
Requirements
● In order to install the software you should have a laptop with at least:
○ 2 cores
○ 4 GB RAM
○ 500 MB disk
○ Internet connection
○ Git client installed
We strongly suggest using Linux, but Windows machines will work as well
Installing MongoDB
Let's install the database
For MacOS and Linux
Go to mongodb.com/downloads
git clone https://github.com/adamobr/MongoDB3XLabs
cd MongoDB3XLabs
./run_single.sh
Installing MongoDB - Single instance
First commands:
show databases
use percona
db.collection.insert({today : new Date()})
db.collection.find()
Installing MongoDB - Replica-set
First let's clean up the previous instance
./reset_lab
./run_replicaset
./3.6/bin/mongo # to connect to the replica-set.
Let's run some commands to describe your environment:
rs.status()
rs.config()
db.serverStatus()
What is MongoDB?
● NoSQL
● Document-oriented database
● Built for fast delivery and development
● Easy-to-scale database
● ...and a LOT more!
How different is MongoDB from MySQL?
How different is MongoDB from MySQL/RDBS
● Some features we will compare
○ Normalization
○ Transactions
○ Query language
○ How data is stored
○ Special indexes
○ How to distribute and scale
How different is MongoDB from MySQL/RDBS
● NoSQL and SQL are not enemies; they are made to complement each other
● While MongoDB is a young NoSQL database, MySQL has been on the market for more than two decades as a mature relational database system
● In some cases, using MongoDB as the main database is not the best thing to do
● However, MongoDB lets an environment grow very quickly without too much effort
How different is MongoDB from MySQL/RDBS
● Comparing data distribution
○ MongoDB expects data to grow beyond the limits of a single machine
○ MySQL does have a few add-ons that allow data distribution among instances, but they were created later by a 3rd-party company
○ MySQL expects to work on a single machine with full ACID, while MongoDB doesn't expect ACID; as a distributed system, MongoDB is bound by the CAP theorem
What is ACID?
Atomicity: single document level & no snapshotting for reads
Consistency: primary = strong | secondaries = your choice
Isolation: not really, $isolated can help
Durability: configurable w:majority and/or j:true
How different is MongoDB from MySQL/RDBS
● The CAP theorem was proposed by Eric Brewer in 2000
● A distributed system can't provide all 3 guarantees at the same time. One of them must be sacrificed.
How different is MongoDB from MySQL/RDBS
● Consistency: anyone will get the same response; data is consistent among instances
● Availability: the system will always respond to requests, no downtime
● Partition tolerance: the system can handle errors (network, hardware failure)
How different is MongoDB from MySQL/RDBS
[Figure: CAP triangle (C, A, P) positioning relational databases (MySQL, PostgreSQL), Cassandra, Riak, MongoDB, and Redis]
How different is MongoDB from MySQL/RDBS
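In a relational table, at each intersection of a row and a column is a single scalar value
A MongoDB document, by contrast, can hold nested arrays and sub-documents: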
{
"_id" : ObjectId("507f1f77bcf86cd799439011"),
"studentID" : 100,
"firstName" : "Jonathan",
"middleName" : "Eli",
"lastName" : "Tobin",
"classes" : [
{ "courseID" : "PHY101",
"grade" : "B",
"courseName" : "Physics 101",
"credits" : 3 },
{ "courseID" : "BUS101",
"grade" : "B+",
"courseName" : "Business 101",
"credits" : 3 }
]
}
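To make the contrast concrete, the sketch below flattens the student document above into the kind of rows a relational schema would hold, one scalar per column. It is plain JavaScript (the language of the mongo shell) and runs without a server. The `_id` is written as a string because `ObjectId` exists only in the shell, and the row layout is an assumption made for illustration.

```javascript
// The student document from the slide, as a plain JavaScript object.
const student = {
  _id: "507f1f77bcf86cd799439011", // string stand-in for ObjectId(...)
  studentID: 100,
  firstName: "Jonathan",
  middleName: "Eli",
  lastName: "Tobin",
  classes: [
    { courseID: "PHY101", grade: "B", courseName: "Physics 101", credits: 3 },
    { courseID: "BUS101", grade: "B+", courseName: "Business 101", credits: 3 }
  ]
};

// Relational view (hypothetical layout): one row per student/class pair,
// every cell holding a single scalar value.
const enrollmentRows = student.classes.map(c => ({
  studentID: student.studentID,
  courseID: c.courseID,
  grade: c.grade
}));

console.log(enrollmentRows);
```

In MySQL these rows would live in a separate table and be joined back to the student; in MongoDB the array simply travels inside the document.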
How different is MongoDB from MySQL/RDBS
● Unlike MySQL, MongoDB doesn't have a predefined schema
● Documents can have different fields with different data types, for example
{x : 1, y : ['test']}
and
{x : 'percona', y : ISODate('2018-01-01')}
are both valid MongoDB documents
How different is MongoDB from MySQL/RDBS
● No joins
● Rich Geo Indexing
● Schema-free
How different is MongoDB from MySQL/RDBS
● MongoDB doesn't use third normal form (3NF) normalization
● Each document should hold as much information as necessary
○ Linked documents are acceptable but not recommended
How different is MongoDB from MySQL/RDBS
● High availability by default in MongoDB
● A replica set is the minimum suggested setup for production
● Shards can be used to increase read/write throughput - we will discuss that in Topology
How different is MongoDB from MySQL/RDBS
● Machine costs
● If we want to scale MongoDB, we can simply add more machines
○ This is not always true for MySQL
How Similar is MongoDB to MySQL?
● … but these databases are not completely different
● They share
○ Security
○ Indexing
○ Multi-user access - 3.6 (session)
○ Multi table
○ Concurrency
○ Several other database concepts
How Similar MongoDB is to MySQL
Database terms and concepts
MongoDB            MySQL
Database           Database
Collection         Table
Document           Row
Key : value pair   Field
● Indexes are the fast way to find a specific row or document; they are very similar in both databases
How Similar MongoDB is to MySQL
● Multi-user
● Concurrent operations
MongoDB Topologies
It is possible to deploy MongoDB using
● Single instance
● Replica set
● Sharded cluster
MongoDB Topologies - Single Instance
● Commonly used for testing purposes
● Percona doesn't recommend using single instances for production
MongoDB Topologies - Replica Set
● Very common for small/medium environments
● Asynchronous replication
● Easy-to-scale reads
● Doesn't scale writes
● Can have delayed members
● Relies on the oplog
MongoDB Topologies - Sharded Cluster
● Acts very similarly to a replica set, but is used to scale reads/writes among shards
● Data is divided into shards
● Data can migrate among shards
● Without the right shard key, it doesn't scale well
● Relies on oplogs + config servers
MongoDB Operations
● Creating and using a database
● CRUD Operations - Create, Read, Update, Delete
● Create - insert
• Collections
• Documents
● Read - find
• Using operators
● Update - update vs upsert
• Write concern considerations
● Delete - remove
Connecting to the Mongo Shell
● When connecting to a mongo instance, you connect to the test database by default
● You will likely use the --port option to connect to a specific mongod instance. In this case we are connecting via localhost.
● If connecting remotely, specify the --host option with the host where your mongod is running
Creating a Database
➔ show databases: this command shows which databases exist
➔ use percona: this command creates and connects to the percona database
➔ db: this command shows which db you are currently connected to
➔ Why is the database we created not showing?
Creating a Database
● The database isn't actually created until we write data to it
● If you disconnect from the shell after using a database without writing any data, it will be gone
● Let's write some data so it stays
Inserting Documents
● db.collection_name.insert() is the basic syntax
● In this case, we inserted a simple key : value pair
● We used JavaScript syntax to create a random number between 0 and 1 with the Math.random() function
● To see our insert, we use the find() function
- Because this collection only has 1 document inserted, we only see one result
Inserting Documents
● It is important to note that we did not explicitly create a collection with the previous command. The collection was created when we inserted a document into it.
● We can see the collections in the current database using the show collections command
● The system.indexes collection contains information about indexes in the database (percona)
Reading
● Before we explore find commands in the mongo shell, let's insert some test data into our new collection
● MongoDB allows us to write a short script to generate test data directly in the shell
● This script generates 25 random numbers from 0-1 and inserts them into our new_collection collection
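A minimal sketch of such a script, assuming the field is named `random_number` (the name used by the queries that follow). It collects the documents in a plain array so that it runs anywhere; the commented line shows the shell call that would write each one.

```javascript
// Build 25 test documents, each with a random number in [0, 1).
const testDocs = [];
for (let i = 0; i < 25; i++) {
  const doc = { random_number: Math.random() };
  testDocs.push(doc);
  // In the mongo shell, each document would be written with:
  // db.new_collection.insert(doc)
}

console.log(testDocs.length + " documents generated");
```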
Reading
● Our results show 26 documents in our new_collection
● The mongo shell shows at most 20 results at a time by default
- You can iterate more by typing it
● The _id field is auto-generated if not specified. It is unique, like a primary key in SQL.
Full Results
Reading
● This is a search specifically by _id
● An index on _id was created by default when our collection was created
Reading with Operators
● There are many operators to use when querying data in MongoDB, but we will focus on these
Reading with Operators
● How would we find numbers greater than 0.9 in the collection that we added random numbers to?
A. db.new_collection.find( "random_number" : $gt : .9 )
B. db.new_collection.find( { "random_number" : { $gt : .9 } } )
C. db.new_collection.find( { "random_number" > .9 } )
D. db.new_collection.find( { "random_number" : { $gt : .9 } )
Reading with Operators
● B is the correct answer
B. db.new_collection.find( { "random_number" : { $gt : .9 } } )
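The filter in answer B is an ordinary document: the field name maps to an object whose key is the operator. The sketch below, in plain JavaScript, mimics how such a filter selects documents. `matches` is a hypothetical toy helper that understands only `$gt` and `$lt` on top-level fields; it is not how the server evaluates queries, and the sample values are made up.

```javascript
// The filter from answer B, as a plain object.
const filter = { random_number: { $gt: 0.9 } };

// Toy matcher (hypothetical): supports only $gt and $lt on top-level fields.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) =>
    Object.entries(condition).every(([op, limit]) => {
      if (op === "$gt") return doc[field] > limit;
      if (op === "$lt") return doc[field] < limit;
      throw new Error("unsupported operator: " + op);
    })
  );
}

const docs = [
  { random_number: 0.95 },
  { random_number: 0.42 },
  { random_number: 0.91 }
];

// Equivalent in spirit to db.new_collection.find(filter)
const found = docs.filter(doc => matches(doc, filter));
console.log(found); // the 0.95 and 0.91 documents
```

The wrong answers fail for syntactic reasons: A and C are not valid object literals, and D is missing a closing brace.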
Reading with Operators
● Here's what it looks like in practice
● Now that we know how to find documents, let's update some
Updating
● The general syntax for an update is:
• db.collection_name.update( query , update , options )
● By default, update modifies only a single document. Setting the multi option to true allows modifying all documents matched by the query.
● Let's update our documents with a random number value greater than .9 to 1
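Following the general syntax above, that update could be written with the three arguments below: a query, a `$set` update, and the `multi` option. `applyUpdate` is a hypothetical in-memory stand-in that handles only `$gt` queries and `$set` updates, so the effect of `multi` can be seen without a server.

```javascript
// Arguments as they would be passed to
// db.new_collection.update(query, update, options)
const query = { random_number: { $gt: 0.9 } };
const update = { $set: { random_number: 1 } };
const options = { multi: true };

// Hypothetical in-memory stand-in: handles only $gt queries and $set updates.
function applyUpdate(docs, query, update, options) {
  let modified = 0;
  for (const doc of docs) {
    const hit = Object.entries(query).every(([field, cond]) => doc[field] > cond.$gt);
    if (!hit) continue;
    Object.assign(doc, update.$set);
    modified++;
    if (!options.multi) break; // default behaviour: stop after the first match
  }
  return modified;
}

const docs = [{ random_number: 0.95 }, { random_number: 0.3 }, { random_number: 0.99 }];
const modified = applyUpdate(docs, query, update, options);
console.log(modified, docs);
```

Without `multi: true`, only the first matching document would be changed.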
Updating - Write Concern
● Write concerns can be customized per operation if specified. Default behavior is used otherwise.
● w:0 = Fire and forget, no acknowledgement
● w:1 = Acknowledgement from primary only
● w: "majority" = Acknowledgement from majority of nodes with data
[Diagram: Application, Primary, and two Secondaries]
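As a rough illustration of what each setting waits for, the sketch below computes how many members must acknowledge a write. `acksRequired` is a hypothetical helper, and it assumes every member is a voting, data-bearing node. The write concern itself is just an options document such as `{ w: "majority", j: true }` passed along with the operation.

```javascript
// Hypothetical helper: number of acknowledgements a write waits for,
// assuming all members are voting, data-bearing nodes.
function acksRequired(w, memberCount) {
  if (w === "majority") return Math.floor(memberCount / 2) + 1;
  return w; // numeric w: 0 = fire and forget, 1 = primary only
}

const writeConcern = { w: "majority", j: true };
console.log(acksRequired(writeConcern.w, 3)); // 2 of 3 members
```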
Deleting
● Remove all documents in a collection: db.collection.remove({})
● For partial removal, you must specify your criteria as shown below
Deleting
What is the correct syntax for deleting all records in our collection with a value less than .2 ?
A. db.new_collection.remove( { "random_number" : { $lt : .2 } } )
B. db.remove( { "random_number" : { $lt : .2 } } )
C. db.new_collection.remove( { $lt : .2 } )
D. db.new_collection.remove( { "random_number" < .2 } )
Deleting
A is the correct answer
Deleting
● Dropping an entire collection can also be done with the drop command: db.collection_name.drop()
● Databases can also be dropped with the dropDatabase command: db.dropDatabase()
Aggregation, Export/Import and Backups
MongoDB Aggregations
MongoDB provides aggregations to help us write complex queries, since some calculations and groupings cannot be done with standard queries
So what is the aggregation framework?
It is similar to OLAP and mostly used for analytics
MongoDB Aggregations
The aggregation framework works as a pipeline, and the most common case is
Match > Project > Sort/Group > Final
Each stage works on the output of the previous stage
The output can be a cursor or a collection
MongoDB Aggregations
An aggregation example:
db.new_collection.aggregate([{$match : {random_number : {$gte : 0.5} }}, {$project : {_id : '$random_number'}}])
Result: the values greater than or equal to 0.5
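To see what the two stages do, the same pipeline can be imitated on an in-memory array with plain JavaScript: `filter` plays the role of `$match` and `map` the role of `$project`. This is only an analogy for the data flow, not how the aggregation framework is implemented, and the sample values are made up.

```javascript
const docs = [
  { _id: 1, random_number: 0.25 },
  { _id: 2, random_number: 0.5 },
  { _id: 3, random_number: 0.75 }
];

const result = docs
  // {$match : {random_number : {$gte : 0.5}}}
  .filter(doc => doc.random_number >= 0.5)
  // {$project : {_id : '$random_number'}}
  .map(doc => ({ _id: doc.random_number }));

console.log(result); // [ { _id: 0.5 }, { _id: 0.75 } ]
```

Each step receives only what the previous step produced, which is why placing the match first keeps later stages cheap.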
MongoDB import and export
Like other databases, MongoDB has tools (mongoexport and mongoimport) to help us export and import data
It is possible to export a single collection, with or without a filter
With some flags it is possible to generate a CSV file to load into a different database
MongoDB backup and restore
There are a couple of methods to back up and restore a MongoDB instance
● Disk snapshot
● Hot backup
But mongodump is the most common method to save a copy of the data
MongoDB mongodump
./mongodump -d percona -c test -o perconabackup
2018-04-22T20:26:59.665-0300 writing percona.test to
2018-04-22T20:26:59.666-0300 done dumping percona.test (1 document)
MongoDB mongorestore
MacPro13:bin adamo$ ./mongorestore -d perconabackup perconabackup/percona/
2018-04-22T20:28:12.109-0300 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-04-22T20:28:12.110-0300 building a list of collections to restore from perconabackup/percona dir
2018-04-22T20:28:12.111-0300 reading metadata for perconabackup.test from perconabackup/percona/test.metadata.json
2018-04-22T20:28:12.168-0300 restoring perconabackup.test from perconabackup/percona/test.bson
2018-04-22T20:28:12.261-0300 no indexes to restore
2018-04-22T20:28:12.261-0300 finished restoring perconabackup.test (1 document)
2018-04-22T20:28:12.261-0300 done
Schema design
Although MongoDB is a schema-free database, there are good practices that need to be followed
● Use descriptive collection names
● Avoid "joins"
● Keep documents simple
● Keep field names short (MMAPv1)
Schema design
● One to some (< 16MB)
{
_id : ObjectId('507f1f77bcf86cd799439011'),
text : 'this is a really simple blog post',
comments : [
{_id : 1, comment : "I really liked your post"},
{_id : 2, comment : "too simple!"}
]
}
Schema design
● One to Thousands (maybe > 16MB)
{
_id : ObjectId('507f1f77bcf86cd799439011'),
brand : 'Lemon',
parts : [
ObjectId(), ObjectId(), ObjectId(), ObjectId(), ObjectId()... ObjectId() …]
}
Schema design
Think about denormalization: how to transform a couple of tables into a single document
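As a sketch of that transformation, the example below takes two relational-style tables, posts and comments linked by a `post_id` column, and folds them into one document per post, in the one-to-some shape shown earlier. Both tables and their column names are made up for illustration.

```javascript
// Two relational-style tables (hypothetical columns).
const posts = [{ id: 1, text: "this is a really simple blog post" }];
const comments = [
  { id: 1, post_id: 1, comment: "I really liked your post" },
  { id: 2, post_id: 1, comment: "too simple!" }
];

// Denormalize: embed each post's comments inside the post document.
const documents = posts.map(post => ({
  _id: post.id,
  text: post.text,
  comments: comments
    .filter(c => c.post_id === post.id)
    .map(c => ({ _id: c.id, comment: c.comment }))
}));

console.log(JSON.stringify(documents[0], null, 2));
```

Reading a post with its comments then takes a single lookup instead of a join, at the cost of a document that grows with every comment.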
Replica-sets and Upgrades
● Upgrades can be done without downtime
● Drivers help; it doesn't depend only on the instances
● New versions usually can be a member of an old replica-set
Replica-sets and Upgrades
In order to upgrade a replica-set, we will take advantage of high availability:
● Remove a secondary, or set the instance as hidden, and upgrade it
● Drivers will then see this configuration
● Repeat the process on the remaining secondaries
● Step the primary down and replace the remaining old instance
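The order of operations can be sketched as a small planning function. `upgradePlan` and the member names are hypothetical; the function only lists the steps described above (secondaries first, primary last after stepping down) and does not talk to a real replica set.

```javascript
// Hypothetical planner: returns the rolling-upgrade steps in order.
function upgradePlan(members) {
  const secondaries = members.filter(m => m.state === "SECONDARY");
  const primary = members.find(m => m.state === "PRIMARY");
  const steps = secondaries.map(m => "upgrade secondary " + m.host);
  if (primary) {
    steps.push("step down primary " + primary.host); // rs.stepDown() in the shell
    steps.push("upgrade former primary " + primary.host);
  }
  return steps;
}

const plan = upgradePlan([
  { host: "localhost:27017", state: "PRIMARY" },
  { host: "localhost:27018", state: "SECONDARY" },
  { host: "localhost:27019", state: "SECONDARY" }
]);
plan.forEach(step => console.log(step));
```

Because only one member is out of service at a time, the set keeps a majority and stays available throughout.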
Let's talk about security
MongoDB Operations
● Default roles
● Enabling authentication and creating a root and a standard user
● Starting a replica-set environment with authentication
Default Roles
● All the roles listed below come by default in the MongoDB database server
https://docs.mongodb.com/manual/reference/built-in-roles/
read readWrite dbAdmin dbOwner userAdmin
clusterAdmin clusterManager clusterMonitor hostManager backup
restore readAnyDatabase readWriteAnyDatabase userAdminAnyDatabase
dbAdminAnyDatabase root __system
Enabling authentication
● Creating a root user and restarting the mongod process
mongo
use admin
> db.createUser({user : 'administrator', pwd : '123321', roles : ['root']})
Successfully added user: { "user" : "administrator", "roles" : [ "root" ] }
-- mongod.conf --
#security
security:
  authorization: enabled
-- service restart ---
./mongod --auth
Enabling authentication
● Checking access
use admin
db.auth('administrator','123321')
1
mongo -u administrator -p --authenticationDatabase admin
password:
> show dbs
local
percona
Creating a standard user
$ mongo
db.createUser({user: 'percona_user', pwd: '123', roles : [{ role :'read', db: 'percona'}]})
Successfully added user: {
"user" : "percona_user",
"roles" : [
{
"role" : "read",
"db" : "percona"
}
]
}
Starting a replica-set with keyfile
● Pre-existing data instance with users in the admin database
[Diagram: an empty Secondary and a Primary exchange "I know your secret..." / "Ok, I can trust you."; the synced member then receives data as __system; a client is also shown]
Demo
./reset_lab.sh
./get_mongod_downloads.sh
cd 3.6/bin
mkdir data1 data2 data3
./mongod --dbpath data1 --logpath data1/log.log --auth --fork --replSet myrs
./mongod --dbpath data2 --logpath data2/log.log --auth --fork --replSet myrs --port 27018
./mongod --dbpath data3 --logpath data3/log.log --auth --fork --replSet myrs --port 27019
./mongo
rs.initiate()
use admin
db.createUser({user : 'admin',pwd : '123', roles : ["root"]})
use admin
db.auth('admin','123')
rs.add({_id : 2, host : 'localhost:27018', hidden : true, priority : 0, votes : 0})
Demo
./reset_lab.sh
./get_mongod_downloads.sh
cd 3.6/bin
mkdir data1 data2 data3
openssl rand -base64 756 > mykey.key && chmod 600 mykey.key
./mongod --dbpath data1 --logpath data1/log.log --auth --fork --replSet myrs --keyFile mykey.key
./mongod --dbpath data2 --logpath data2/log.log --auth --fork --replSet myrs --port 27018 --keyFile mykey.key
./mongod --dbpath data3 --logpath data3/log.log --auth --fork --replSet myrs --port 27019 --keyFile mykey.key
./mongo
rs.initiate()
use admin
db.createUser({user : 'admin',pwd : '123', roles : ["root"]})
use admin
db.auth('admin','123')
rs.add({_id : 2, host : 'localhost:27018'})
Common issues
As with all other databases, we need to work proactively to keep the environment running as expected
The most commonly used commands to investigate the database health are
db.serverStatus()
db.currentOp()
rs.status()
rs.printSlaveReplicationInfo()
Rate Our Session
Thank You!