NDBI040: Big Data Management and NoSQL Databases
http://www.ksi.mff.cuni.cz/~svoboda/courses/171-NDBI040/

Practical Class 8
MongoDB

Martin Svoboda
[email protected]ff.cuni.cz

5. 12. 2017

Charles University in Prague, Faculty of Mathematics and Physics
Czech Technical University in Prague, Faculty of Electrical Engineering




Data Model
Database system structure

Instance → databases → collections → documents

• Database
• Collection
    Collection of documents, usually of a similar structure
• Document
    MongoDB document = one JSON object
    – I.e. even a complex JSON object with other recursively nested objects, arrays or values
    Unique immutable identifier _id
    Field name restrictions: _id is reserved for the identifier, names cannot start with $ or contain .
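The field name restrictions can be illustrated with a small check written in plain JavaScript. This is only a sketch: the helper findInvalidFieldNames is hypothetical and not part of MongoDB, the fields "box.office" and $rating are made up to trigger the check, and the server applies its own validation whose details depend on the version.

```javascript
// Hypothetical helper (not a MongoDB API): collects field names that
// start with "$" or contain ".", descending into nested objects and arrays.
function findInvalidFieldNames(value, path = "") {
  const problems = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      problems.push(...findInvalidFieldNames(item, `${path}[${i}]`));
    });
  } else if (value !== null && typeof value === "object") {
    for (const [name, nested] of Object.entries(value)) {
      const here = path ? `${path} > ${name}` : name;
      if (name.startsWith("$") || name.includes(".")) {
        problems.push(here);
      }
      problems.push(...findInvalidFieldNames(nested, here));
    }
  }
  return problems;
}

// Made-up document with two deliberately invalid field names.
const actor = {
  _id: "trojan",
  name: "Ivan Trojan",
  movies: [{ title: "samotari", "box.office": 1 }],
  $rating: 3
};

console.log(findInvalidFieldNames(actor));
```

The nested walk matters because a document may contain other objects and arrays at any depth, and the restrictions apply to their field names as well.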


CRUD Operations
Overview

• db.collection.insert()
    Inserts a new document into a collection
• db.collection.update()
    Modifies an existing document / documents or inserts a new one
• db.collection.remove()
    Deletes an existing document / documents
• db.collection.find()
    Finds documents based on filtering conditions
    Projection and / or sorting may be applied too


Mongo Shell
Connect to our NoSQL server
• SSH / PuTTY and SFTP / WinSCP
• nosql.ms.mff.cuni.cz:42222

Start mongo shell
• mongo

Try several basic commands
• help
    Displays a brief description of database commands
• exit
  quit()
    Closes the current client connection


Databases
Switch to your database
• use login
  db = db.getSiblingDB('login')
    Use your login name as a name for your database

List all the existing databases
• show databases
  show dbs
  db.adminCommand('listDatabases')
    Your database will be created later on implicitly


Collections
Create a new collection for actors
• db.createCollection("actors")
    Suitable when creating collections with specific options, since collections can also be created implicitly

List all collections in your database
• show collections
  db.getCollectionNames()


Insert Operation
Inserts a new document / documents into a given collection

db.collection.insert(document | [ document, … ], options)

• Parameters
    Document: one or more documents to be inserted
    Options


Insert Operation
Insert a few new documents into the collection of actors

db.actors.insert({ _id: "trojan", name: "Ivan Trojan" })

db.actors.insert({ _id: 2, name: "Jiri Machacek" })

db.actors.insert({ _id: ObjectId(), name: "Jitka Schneiderova" })

db.actors.insert({ name: "Zdenek Sverak" })

Retrieve all documents from the collection of actors

db.actors.find()


Update Operation
Modifies / replaces an existing document / documents

db.collection.update(query, update, options)

• Parameters
    Query: description of documents to be updated
    Update: modification actions to be applied
    Options

Update operators
• $set, $unset, $rename, $inc, $mul, $currentDate, $push, $addToSet, $pop, $pull, …


Update Operation
Update the document of actor Ivan Trojan

db.actors.update(
  { _id: "trojan" },
  { name: "Ivan Trojan", year: 1964 }
)

db.actors.update(
  { name: "Ivan Trojan", year: { $lt: 2000 } },
  { name: "Ivan Trojan", year: 1964 }
)

• At most one document is updated
• Its content is replaced with a new value

Check the current content of the document

db.actors.find({ _id: "trojan" })


Update Operation
Use update method to insert a new actor
• Inserts a new document when upsert behavior is enabled and no document could be updated

db.actors.update(
  { _id: "geislerova" },
  { name: "Anna Geislerova" },
  { upsert: true }
)


Update Operation
Try to modify the document identifier of an existing document
• Your request will be rejected since document identifiers are immutable

db.actors.update(
  { _id: "trojan" },
  { _id: 1, name: "Ivan Trojan", year: 1964 }
)


Update Operation
Update the document of actor Ivan Trojan

db.actors.update(
  { _id: "trojan" },
  {
    $set: { year: 1964, age: 52 },
    $inc: { rating: 1 },
    $push: { movies: { $each: [ "samotari", "medvidek" ] } }
  }
)

Update multiple documents at once

db.actors.update(
  { year: { $lt: 2000 } },
  { $set: { rating: 3 } },
  { multi: true }
)
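The difference between a replacement update and an operator update can be sketched on a plain in-memory object. The function applyUpdate below is hypothetical and only models the visible effect of a replacement document and of $set, $inc and $push with $each; it is not how the server implements updates, and the starting rating of 2 is an assumed value.

```javascript
// Hypothetical in-memory model of update(): NOT the MongoDB implementation.
// Supports replacement documents and the $set, $inc and $push operators only.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((k) => k.startsWith("$"));
  if (!usesOperators) {
    // Replacement: every field except _id is dropped and replaced.
    return { _id: doc._id, ...update };
  }
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    // A missing field is treated as 0 before incrementing.
    result[field] = (result[field] || 0) + amount;
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    const hasEach = value !== null && typeof value === "object" && "$each" in value;
    const items = hasEach ? value.$each : [value];
    result[field] = [...(result[field] || []), ...items];
  }
  return result;
}

// Assumed starting document; the rating value is made up for illustration.
const original = { _id: "trojan", name: "Ivan Trojan", rating: 2 };

const replaced = applyUpdate(original, { year: 1964 });
const modified = applyUpdate(original, {
  $set: { year: 1964, age: 52 },
  $inc: { rating: 1 },
  $push: { movies: { $each: ["samotari", "medvidek"] } }
});

console.log(replaced); // name and rating are gone, only _id and year remain
console.log(modified); // existing fields are kept, new ones are added
```

The replacement form silently discards fields that are not repeated in the update document, which is why operator updates are usually the safer choice when only part of a document should change.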


Save Operation
Replaces an existing / inserts a new document

db.collection.save(document, options)

• Parameters
    Document: document to be modified / inserted
    Options


Save Operation
Use save method to insert new actors
• The identifier (_id) must either be omitted from the document or not yet exist in the collection

db.actors.save({ name: "Tatiana Vilhelmova" })

db.actors.save({ _id: 6, name: "Sasa Rasilov" })

Use save method to update actor Ivan Trojan
• The identifier (_id) must be specified in the document and must exist in the collection

db.actors.save({ _id: "trojan", name: "Ivan Trojan", year: 1964 })
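The insert-or-replace behavior of save can be modeled with a Map keyed by _id. This is a sketch under stated assumptions: the function save and the generated identifiers below are hypothetical stand-ins (MongoDB would generate an ObjectId), and the model ignores options and error handling.

```javascript
// Hypothetical in-memory model of save(): NOT the MongoDB implementation.
// The collection is represented by a Map keyed by _id.
let nextGeneratedId = 1;

function save(collection, doc) {
  if (doc._id === undefined) {
    // No identifier given: insert under a generated one.
    const inserted = { _id: `generated-${nextGeneratedId++}`, ...doc };
    collection.set(inserted._id, inserted);
    return "inserted";
  }
  // Identifier given: replace when it exists, insert otherwise.
  const outcome = collection.has(doc._id) ? "replaced" : "inserted";
  collection.set(doc._id, { ...doc });
  return outcome;
}

const actors = new Map([["trojan", { _id: "trojan", name: "Ivan Trojan" }]]);

console.log(save(actors, { name: "Tatiana Vilhelmova" }));
console.log(save(actors, { _id: 6, name: "Sasa Rasilov" }));
console.log(save(actors, { _id: "trojan", name: "Ivan Trojan", year: 1964 }));
console.log(actors.size);
```

In this model the first two calls insert and the third replaces, so the collection ends up with three documents.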


Remove Operation
Removes a document / documents from a given collection

db.collection.remove(query, options)

• Parameters
    Query: description of documents to be removed
    Options


Remove Operation
Remove selected documents from the collection of actors

db.actors.remove({ _id: "geislerova" })

db.actors.remove(
  { year: { $lt: 2000 } },
  { justOne: true }
)

Remove all the documents from the collection of actors

db.actors.remove({ })


Sample Data
Insert the following actors into your emptied collection

{ _id: "trojan",
  name: "Ivan Trojan", year: 1964,
  movies: [ "samotari", "medvidek" ] }

{ _id: "machacek",
  name: "Jiri Machacek", year: 1966,
  movies: [ "medvidek", "vratnelahve", "samotari" ] }

{ _id: "schneiderova",
  name: "Jitka Schneiderova", year: 1973,
  movies: [ "samotari" ] }

{ _id: "sverak",
  name: "Zdenek Sverak", year: 1936,
  movies: [ "vratnelahve" ] }

{ _id: "geislerova",
  name: "Anna Geislerova", year: 1976 }


Find Operation
Selects documents from a given collection

db.collection.find(query, projection)

• Parameters
    Query: description of documents to be selected
    Projection: fields to be included / excluded in the result


Querying
Execute and explain the meaning of the following queries

db.actors.find()

db.actors.find({ })

db.actors.find({ _id: "trojan" })

db.actors.find({ name: "Ivan Trojan", year: 1964 })

db.actors.find({ year: { $gte: 1960, $lte: 1980 } })

db.actors.find({ movies: { $exists: true } })

db.actors.find({ movies: "medvidek" })

db.actors.find({ movies: { $in: [ "medvidek", "pelisky" ] } })

db.actors.find({ movies: { $all: [ "medvidek", "pelisky" ] } })
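The matching rules behind the last four queries can be sketched with plain JavaScript predicates over the sample actors. The predicate names are hypothetical, and the code only models how equality, $in, $all and $exists behave on the movies array; it does not reflect how the query engine is implemented.

```javascript
// Hypothetical in-memory predicates mirroring a few query forms on an
// array field; they model the matching rules, not MongoDB's query engine.
const actors = [
  { _id: "trojan", movies: ["samotari", "medvidek"] },
  { _id: "machacek", movies: ["medvidek", "vratnelahve", "samotari"] },
  { _id: "schneiderova", movies: ["samotari"] },
  { _id: "sverak", movies: ["vratnelahve"] },
  { _id: "geislerova" } // no movies field at all
];

// { movies: "medvidek" }: the array contains the value
const contains = (value) => (doc) =>
  Array.isArray(doc.movies) && doc.movies.includes(value);

// { movies: { $in: [...] } }: at least one listed value is present
const matchesIn = (values) => (doc) =>
  Array.isArray(doc.movies) && values.some((v) => doc.movies.includes(v));

// { movies: { $all: [...] } }: every listed value is present
const matchesAll = (values) => (doc) =>
  Array.isArray(doc.movies) && values.every((v) => doc.movies.includes(v));

// { movies: { $exists: true } }: the field is present
const exists = (doc) => "movies" in doc;

const ids = (predicate) => actors.filter(predicate).map((doc) => doc._id);

console.log(ids(contains("medvidek")));
console.log(ids(matchesIn(["medvidek", "pelisky"])));
console.log(ids(matchesAll(["medvidek", "pelisky"])));
console.log(ids(exists));
```

With this sample data the $in form matches the same actors as the plain equality test, while the $all form matches nobody because no actor lists "pelisky".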


Querying
Execute and explain the meaning of the following queries

db.actors.find({ $or: [ { year: 1964 }, { rating: { $gte: 3 } } ] })

db.actors.find({ rating: { $not: { $gte: 3 } } })

db.actors.find({ }, { name: 1, year: 1 })

db.actors.find({ }, { movies: 0, _id: 0 })

db.actors.find({ }, { name: 1, movies: { $slice: 2 }, _id: 0 })

db.actors.find().sort({ year: 1, name: -1 })

db.actors.find().sort({ name: 1 }).skip(1).limit(2)

db.actors.find().sort({ name: 1 }).limit(2).skip(1)
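The last two queries return the same documents, because the cursor modifiers are applied in a fixed order (sort, then skip, then limit) regardless of the order in which they are chained. The class SketchCursor below is a hypothetical in-memory model of that behavior; its sort takes a single field name for brevity rather than a { field: direction } document.

```javascript
// Hypothetical model of a cursor: the modifiers are only recorded when
// called and applied later in a fixed order (sort, then skip, then limit).
class SketchCursor {
  constructor(docs) {
    this.docs = docs;
    this.sortField = null;
    this.skipCount = 0;
    this.limitCount = Infinity;
  }
  sort(field) { this.sortField = field; return this; }
  skip(n) { this.skipCount = n; return this; }
  limit(n) { this.limitCount = n; return this; }
  toArray() {
    const result = [...this.docs];
    if (this.sortField !== null) {
      const f = this.sortField;
      result.sort((a, b) => (a[f] < b[f] ? -1 : a[f] > b[f] ? 1 : 0));
    }
    return result.slice(this.skipCount, this.skipCount + this.limitCount);
  }
}

const actors = [
  { name: "Ivan Trojan" },
  { name: "Jiri Machacek" },
  { name: "Jitka Schneiderova" },
  { name: "Zdenek Sverak" },
  { name: "Anna Geislerova" }
];

const skipThenLimit = new SketchCursor(actors).sort("name").skip(1).limit(2).toArray();
const limitThenSkip = new SketchCursor(actors).sort("name").limit(2).skip(1).toArray();

console.log(skipThenLimit.map((doc) => doc.name));
console.log(limitThenSkip.map((doc) => doc.name));
```

Both calls print the second and third actor in alphabetical order, which mirrors what the two shell queries return on the sample data.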


Index Structures
Motivation
• Full collection scan must be conducted when searching for documents unless an appropriate index exists

Primary index
• Unique index on values of the _id field
• Created automatically

Secondary indexes
• Created manually for values of a given key field / fields
• Always within just a single collection


Index Structures
Secondary index creation

db.collection.createIndex(keys, options)

Definition of keys (fields) to be involved

{ field: 1 | -1 | "text" | "hashed" | "2d" | "2dsphere", … }


Index Structures
Index types
• 1, -1 – standard ascending / descending value indexes
    Both scalar values and embedded documents can be indexed
• hashed – hash values of a single field are indexed
• text – basic full-text index
• 2d – points in planar geometry
• 2dsphere – points in spherical geometry


Index Structures
Index forms
• One key / multiple keys (compound index)
• Ordinary fields / array fields (multikey index)

Index properties
• Unique – duplicate values are rejected (cannot be inserted)
• Partial – only certain documents are indexed
• Sparse – documents without a given field are ignored
• TTL – documents are removed when a timeout elapses

Only some type / form / property combinations can be used!


Index Structures
Execute the following query and study its execution plan

db.actors.find({ movies: "medvidek" })

db.actors.find({ movies: "medvidek" }).explain()

Create a multikey index for movies of actors

db.actors.createIndex({ movies: 1 })

Examine the execution plan once again
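Why the index helps can be sketched in plain JavaScript: a multikey index holds one entry per array element, so a lookup by movie no longer needs to scan every document. The function buildMultikeyIndex is hypothetical, and a Map stands in for the ordered structure a real index uses; treat it as an illustration of the idea only.

```javascript
// Hypothetical sketch of the idea behind a multikey index: one index entry
// per array element, each pointing back to the documents that contain it.
// A Map is used only for brevity; it is not how MongoDB stores indexes.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) {
      // Simplification: documents without the field are left out entirely.
      if (value === undefined) continue;
      if (!index.has(value)) index.set(value, []);
      const ids = index.get(value);
      if (!ids.includes(doc._id)) ids.push(doc._id);
    }
  }
  return index;
}

const actors = [
  { _id: "trojan", movies: ["samotari", "medvidek"] },
  { _id: "machacek", movies: ["medvidek", "vratnelahve", "samotari"] },
  { _id: "schneiderova", movies: ["samotari"] },
  { _id: "sverak", movies: ["vratnelahve"] },
  { _id: "geislerova" }
];

const moviesIndex = buildMultikeyIndex(actors, "movies");

// Equivalent of find({ movies: "medvidek" }) answered from the index alone.
console.log(moviesIndex.get("medvidek"));
console.log([...moviesIndex.keys()]);
```

The lookup touches a single index entry instead of all five documents, which is the difference the two execution plans in the exercise should reveal.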


MapReduce
Executes a MapReduce job on a selected collection

db.collection.mapReduce(map function, reduce function, options)

• Parameters
    Map: JavaScript implementation of the Map function
    Reduce: JavaScript implementation of the Reduce function
    Options


MapReduce
Map function
• Current document is accessible via this
• emit(key, value) is used for emissions

Reduce function
• Intermediate key and values are provided as arguments
• Reduced value is published via return

Options
• query: only matching documents are considered
• sort: they are processed in a specific order
• limit: at most a given number of them is processed
• out: output is stored into a given collection


MapReduce: Example
Count the number of movies filmed in each year, starting in 2005

db.movies.mapReduce(
  function() {
    emit(this.year, 1);
  },
  function(key, values) {
    return Array.sum(values);
  },
  {
    query: { year: { $gte: 2005 } },
    sort: { year: 1 },
    out: "statistics"
  }
)
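What this job computes can be traced with a small in-memory runner in plain JavaScript. Everything here is an assumption for illustration: runMapReduce is a hypothetical helper, the movies array is made-up data, emit is passed to the map function as an argument instead of being a global, the query option is a predicate function, and Array.sum (a mongo shell helper) is replaced with a standard reduce call.

```javascript
// Hypothetical in-memory runner that mimics the map -> group -> reduce flow.
// It is not MongoDB's implementation; "movies" below is made-up sample data.
function runMapReduce(docs, map, reduce, options = {}) {
  const matches = options.query || (() => true);
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs.filter(matches)) {
    map.call(doc, emit); // the current document is exposed as `this`
  }
  const output = [];
  for (const [key, values] of groups) {
    // A key with a single value is passed through without calling reduce.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    output.push({ _id: key, value });
  }
  return output;
}

const movies = [
  { title: "movie-a", year: 2004 },
  { title: "movie-b", year: 2005 },
  { title: "movie-c", year: 2005 },
  { title: "movie-d", year: 2007 }
];

const statistics = runMapReduce(
  movies,
  function (emit) { emit(this.year, 1); },
  (key, values) => values.reduce((sum, v) => sum + v, 0),
  { query: (doc) => doc.year >= 2005 }
);

console.log(statistics);
```

The movie from 2004 is filtered out before the map phase, the two movies from 2005 are reduced to a count of 2, and the single movie from 2007 keeps its emitted value of 1.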


MapReduce
Implement and execute the following MapReduce jobs
• Find a list of actors (their names sorted alphabetically) for each year (they were born)
    values.sort()
    Use out: { inline: 1 } option
• Calculate the overall number of actors for each movie
    this.movies.forEach(function(m) { … })
    Array.sum(values)
    Use out: { inline: 1 } option once again


References
Documentation
• https://docs.mongodb.com/v3.2/
