back to basics webinar 1 - introduction to nosql


Upload: joe-drumgoole

Post on 06-Apr-2017


TRANSCRIPT

Page 1: Back to Basics Webinar 1 - Introduction to NoSQL
Page 2: Back to Basics Webinar 1 - Introduction to NoSQL

Code JoeD gets you a 25% discount off the list price

Early Bird Registration Ends May 13, 2016

Page 3: Back to Basics Webinar 1 - Introduction to NoSQL

Back to Basics 2016 : Webinar 1

Introduction to NoSQL

Joe Drumgoole

Director of Developer Advocacy, EMEA

MongoDB

@jdrumgoole

V1.0

Page 4: Back to Basics Webinar 1 - Introduction to NoSQL

Welcome!

Page 5: Back to Basics Webinar 1 - Introduction to NoSQL

Course Agenda

Date | Time | Webinar
05-May-2016 | 14:00 GMT | Introduction to NoSQL
24-May-2016 | 14:00 GMT | Your First MongoDB Application
14-Jun-2016 | 14:00 GMT | Schema Design – Thinking in Documents
05-Jul-2016 | 14:00 GMT | Advanced Indexing: Text and Geo-Spatial Indexes
14-Jul-2016 | 14:00 GMT | Introduction to the Aggregation Framework
11-Aug-2016 | 14:00 GMT | Production Deployment

Page 6: Back to Basics Webinar 1 - Introduction to NoSQL

Agenda for Today

• Why NoSQL
• The different types of NoSQL database
• Detailed overview of MongoDB
• MongoDB data durability – Replica Sets
• MongoDB scalability – Sharding
• Q&A

Page 7: Back to Basics Webinar 1 - Introduction to NoSQL

Relational

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 8: Back to Basics Webinar 1 - Introduction to NoSQL


The World Has Changed

Data Risk Time Cost

Page 9: Back to Basics Webinar 1 - Introduction to NoSQL

NoSQL

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 10: Back to Basics Webinar 1 - Introduction to NoSQL

Nexus Architecture

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 11: Back to Basics Webinar 1 - Introduction to NoSQL

Types of NoSQL Database

• Key/Value Stores
• Column Stores
• Graph Stores
• Multi-model Databases
• Document Stores

Page 12: Back to Basics Webinar 1 - Introduction to NoSQL

Key Value Stores

• An associative array
• Single key lookup
• Very fast single key lookup
• Not so hot for “reverse lookups”

Key | Value
12345 | 4567.3456787
12346 | { addr1 : “The Grange”, addr2: “Dublin” }
12347 | “top secret password”
12358 | “Shopping basket value : 24560”
12787 | 12345
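The difference between a key lookup and a "reverse lookup" can be sketched with a plain Python dictionary standing in for the store. This is only an illustration under that assumption: the function names `lookup` and `reverse_lookup` are hypothetical, and a real key/value store adds persistence and networking on top.

```python
# A Python dict stands in for the key/value store; keys and values mirror the slide.
store = {
    12345: 4567.3456787,
    12346: {"addr1": "The Grange", "addr2": "Dublin"},
    12347: "top secret password",
    12358: "Shopping basket value : 24560",
    12787: 12345,
}


def lookup(key):
    """Single key lookup: one hash probe, regardless of how many keys exist."""
    return store.get(key)


def reverse_lookup(value):
    """Reverse lookup: values are not indexed, so every entry has to be scanned."""
    return [key for key, stored in store.items() if stored == value]
```

Finding which key holds the value 12345 means walking the whole store, which is why reverse lookups are "not so hot" as the data grows.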

Page 13: Back to Basics Webinar 1 - Introduction to NoSQL

Revision : Row Stores (RDBMS)

• Store data aligned by rows (traditional RDBMS, e.g. MySQL)
• Reads retrieve a complete row every time
• Reads requiring only one or two columns are wasteful

ID | Name | Salary | Start Date
1 | Joe D | $24000 | 1/Jun/1970
2 | Peter J | $28000 | 1/Feb/1972
3 | Phil G | $23000 | 1/Jan/1973

Stored by row: 1 Joe D $24000 1/Jun/1970 | 2 Peter J $28000 1/Feb/1972 | 3 Phil G $23000 1/Jan/1973

Page 14: Back to Basics Webinar 1 - Introduction to NoSQL

How a Column Store Does it

ID | Name | Salary | Start Date
1 | Joe D | $24000 | 1/Jun/1970
2 | Peter J | $28000 | 1/Feb/1972
3 | Phil G | $23000 | 1/Jan/1973

Stored by column: 1 2 3 | Joe D Peter J Phil G | $24000 $28000 $23000 | 1/Jun/1970 1/Feb/1972 1/Jan/1973
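The two layouts can be compared with a small Python sketch that counts how many stored values must be read to total the Salary column. The data comes from the slides; the function names and the "values read" counter are assumptions made for illustration, not how any particular database measures I/O.

```python
# The three employee rows from the slides (salaries as plain integers).
rows = [
    (1, "Joe D", 24000, "1/Jun/1970"),
    (2, "Peter J", 28000, "1/Feb/1972"),
    (3, "Phil G", 23000, "1/Jan/1973"),
]

# Column layout: all values of one column are stored together.
column_names = ("id", "name", "salary", "start_date")
columns = {
    name: [row[position] for row in rows]
    for position, name in enumerate(column_names)
}


def total_salary_from_rows():
    """Row store: each read returns a complete row, so all four fields are touched."""
    total, values_read = 0, 0
    for row in rows:
        values_read += len(row)
        total += row[2]
    return total, values_read


def total_salary_from_columns():
    """Column store: only the salary column is read."""
    salaries = columns["salary"]
    return sum(salaries), len(salaries)
```

Both functions return the same total, but the row layout reads 12 values to get it while the column layout reads 3.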

Page 15: Back to Basics Webinar 1 - Introduction to NoSQL

Why is this Attractive?

• A series of consecutive seeks can retrieve a column efficiently
• Compressing similar data is super efficient
• So reads can grab more data off disk in a single seek
• How do I align my rows? By order or by inserting a row ID
• If you just need a small number of columns you don’t need to read all the rows
• But:
  – Updating and deleting by row is expensive
• Append only is preferred
• Better for OLAP than OLTP
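One way to see why a column of similar values compresses well is run-length encoding, sketched below. The slides do not say which compression scheme a given column store uses, so treat run-length encoding as an assumed example; the `department` column is hypothetical data.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of identical adjacent values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column: values of the same type, with long runs of repeats.
department = ["Sales"] * 4 + ["Engineering"] * 3 + ["Sales"]
encoded = rle_encode(department)
```

Eight stored values shrink to three pairs here. The same data stored by row would interleave names, salaries and dates, leaving no runs to collapse.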

Page 16: Back to Basics Webinar 1 - Introduction to NoSQL

Graph Stores

• Store graphs (edges and vertexes)
• E.g. social networks
• Designed to allow efficient traversal
• Optimised for representing connections
• Can be implemented as a key value store with the ability to store links
• If your use case is not a graph you don’t need a graph database
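The idea of a key value store with the ability to store links can be sketched as a dictionary whose values are lists of other keys, plus a breadth-first traversal. The tiny social network and the name `reachable` are hypothetical; dedicated graph databases use their own storage and query engines.

```python
from collections import deque

# Hypothetical social network: each key is a person, each value is the list of links.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": [],
}


def reachable(start, max_hops):
    """Breadth-first traversal: everyone within max_hops links of start."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for neighbour in follows.get(person, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops + 1))
    seen.discard(start)
    return seen
```

For example, `reachable("ann", 1)` gives ann's direct links and `reachable("ann", 2)` adds the friends of friends.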

Page 17: Back to Basics Webinar 1 - Introduction to NoSQL

Multi-Model Databases

• Combine multiple storage/access models
• Often Graph plus “something else”
• Fixes the “polyglot persistence” issue of keeping multiple independent databases consistent
• The “new new thing” in NoSQL Land
• Expect to hear more noise about these kinds of databases

Page 18: Back to Basics Webinar 1 - Introduction to NoSQL

Document Store

• Not PDFs, Microsoft Word or HTML
• Documents are nested structures created using JavaScript Object Notation (JSON)

{
    name : "Joe Drumgoole",
    title : "Director of Developer Advocacy",
    Address : {
        address1 : "Latin Hall",
        address2 : "Golden Lane",
        eircode : "D09 N623"
    },
    expertise : [ "MongoDB", "Python", "Javascript" ],
    employee_number : 320,
    location : [ 53.34, -6.26 ]
}

Page 19: Back to Basics Webinar 1 - Introduction to NoSQL

MongoDB Documents are Typed

{
    name : "Joe Drumgoole",
    title : "Director of Developer Advocacy",
    Address : {
        address1 : "Latin Hall",
        address2 : "Golden Lane",
        eircode : "D09 N623"
    },
    expertise : [ "MongoDB", "Python", "Javascript" ],
    employee_number : 320,
    location : [ 53.34, -6.26 ]
}

name, title : Strings
Address : Nested Document
expertise : Array
employee_number : Integer
location : Geo-spatial Coordinates
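The same document can be written as a Python dictionary to show that each field keeps its own type and that the nested structure survives a round trip through JSON. This sketch uses only the standard `json` module; it does not show MongoDB's BSON encoding, and the names `field_types` and `round_tripped` are just illustrative.

```python
import json

# The example document from the slides, written as a Python dict.
employee = {
    "name": "Joe Drumgoole",
    "title": "Director of Developer Advocacy",
    "Address": {
        "address1": "Latin Hall",
        "address2": "Golden Lane",
        "eircode": "D09 N623",
    },
    "expertise": ["MongoDB", "Python", "Javascript"],
    "employee_number": 320,
    "location": [53.34, -6.26],
}

# Each field carries its own type: strings, a nested document, arrays, an integer.
field_types = {field: type(value).__name__ for field, value in employee.items()}

# Serialising to JSON text and parsing it back preserves structure and types.
round_tripped = json.loads(json.dumps(employee))
```

Note that in plain JSON the location is just an array of two numbers; it only becomes geo-spatial data when the database is told to treat it that way.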

Page 20: Back to Basics Webinar 1 - Introduction to NoSQL

MongoDB Understands JSON Documents

• From the very first version it was a native JSON database
• Understands and can index the sub-structures
• Stores JSON as a binary format called BSON
• Efficient for encoding and decoding for network transmission
• MongoDB can create indexes on any document field
• (We will cover these areas in detail later on in the course)
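What it means to index a sub-structure can be sketched in plain Python: follow a dotted path such as `Address.eircode` into each document and map every value found to the documents that contain it. This is a toy model, not MongoDB's index implementation; `get_path`, `build_index` and the second sample document are assumptions made for the example.

```python
from collections import defaultdict


def get_path(document, dotted_path):
    """Follow a dotted path such as 'Address.eircode' into nested dicts."""
    value = document
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def build_index(documents, dotted_path):
    """Map each value at dotted_path to the positions of the documents holding it.

    Assumes the values are hashable scalars or lists of them; every element
    of a list gets its own index entry.
    """
    index = defaultdict(list)
    for position, document in enumerate(documents):
        value = get_path(document, dotted_path)
        values = value if isinstance(value, list) else [value]
        for item in values:
            index[item].append(position)
    return index


# Hypothetical collection: the first document follows the slide, the second is made up.
people = [
    {"name": "Joe", "Address": {"eircode": "D09 N623"}, "expertise": ["MongoDB", "Python"]},
    {"name": "Ann", "Address": {"eircode": "D02 X285"}, "expertise": ["Python"]},
]
```

With an index built on `Address.eircode` or `expertise`, a lookup becomes a single dictionary probe instead of a scan of every document.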

Page 21: Back to Basics Webinar 1 - Introduction to NoSQL

Why Documents?

• Dynamic Schema
• Elimination of Object/Relational Mapping Layer
• Implicit denormalisation of the data for performance


Page 23: Back to Basics Webinar 1 - Introduction to NoSQL

MongoDB is Full Featured

Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul’s car collection

Map Reduce
• What is the ownership pattern of colors by geography over time (is purple trending in China?)
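To make the rich query and aggregation examples concrete, here is what they compute, written in plain Python over a small hypothetical list of car documents. This deliberately avoids MongoDB's own query syntax, which the next webinar covers; the field names and data are invented for the sketch.

```python
# Hypothetical car documents; field names and values are made up for the example.
cars = [
    {"owner": "Paul", "city": "London", "year": 1974, "value": 12000},
    {"owner": "Paul", "city": "London", "year": 1999, "value": 8000},
    {"owner": "Mary", "city": "London", "year": 1978, "value": 15000},
    {"owner": "Sean", "city": "Dublin", "year": 1972, "value": 9000},
]

# Rich query: find Paul's cars.
pauls_cars = [car for car in cars if car["owner"] == "Paul"]

# Rich query: everybody in London with a car built between 1970 and 1980.
london_seventies_owners = sorted(
    {car["owner"] for car in cars
     if car["city"] == "London" and 1970 <= car["year"] <= 1980}
)

# Aggregation: the average value of Paul's car collection.
average_value = sum(car["value"] for car in pauls_cars) / len(pauls_cars)
```

In a database these filters and averages run on the server, close to the data and backed by indexes, rather than in application code.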

Page 24: Back to Basics Webinar 1 - Introduction to NoSQL

High Availability and Data Durability – Replica Sets

Diagram: one Primary and two Secondary nodes

Page 25: Back to Basics Webinar 1 - Introduction to NoSQL

Replica Set Creation

Diagram: one Primary and two Secondary nodes, labelled “Heartbeat”

Page 26: Back to Basics Webinar 1 - Introduction to NoSQL

Replica Set Node Failure

Diagram: one Primary and two Secondary nodes, labelled “No Heartbeat”

Page 27: Back to Basics Webinar 1 - Introduction to NoSQL

Replica Set Recovery

Diagram: two Secondary nodes, labelled “Heartbeat and Election”

Page 28: Back to Basics Webinar 1 - Introduction to NoSQL

New Replica Set – 2 Nodes

Diagram: one Secondary and one Primary, labelled “Heartbeat and New Primary”

Page 29: Back to Basics Webinar 1 - Introduction to NoSQL

Replica Set Repair

Diagram: one Primary and two Secondary nodes, labelled “Rejoin and resync”

Page 30: Back to Basics Webinar 1 - Introduction to NoSQL

Replica Set Stable

Diagram: one Primary and two Secondary nodes, labelled “Heartbeat”
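The failure, election and rejoin sequence on the preceding slides can be walked through with a toy model. It is a deliberately simplified sketch, not MongoDB's election protocol: the `ReplicaSet` class and its methods are hypothetical, and picking the first live node as the new primary is a placeholder rule. The one property it does model is that a primary exists only while a majority of the members can see each other.

```python
class ReplicaSet:
    """Toy model of a replica set; not MongoDB's real election protocol."""

    def __init__(self, names):
        self.alive = {name: True for name in names}
        self.primary = names[0]

    def fail(self, name):
        self.alive[name] = False

    def rejoin(self, name):
        # A recovered node comes back as a secondary and resyncs.
        self.alive[name] = True

    def heartbeat(self):
        """One heartbeat round: hold an election if the primary is unreachable."""
        live = [name for name, up in self.alive.items() if up]
        has_majority = len(live) > len(self.alive) // 2
        if not has_majority:
            self.primary = None      # no majority, so no node may act as primary
        elif self.primary not in live:
            self.primary = live[0]   # placeholder rule for choosing the winner
        return self.primary
```

Usage follows the slides: with nodes A, B and C, failing the primary A and running a heartbeat elects B; when A rejoins it stays a secondary and B remains primary.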

Page 31: Back to Basics Webinar 1 - Introduction to NoSQL

Scalability with Sharding

Diagram: Shard 1, Shard 2, Shard N

Page 32: Back to Basics Webinar 1 - Introduction to NoSQL

Scalability with Sharding

• Shard key partitions the content
• MongoDB automatically balances the cluster
• Shards can be added dynamically to a live system
• Rebalancing happens in the background
• Shard key is immutable
• Shard key can vector queries to a specific shard
• Queries without a shard key are sent to all shards
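How a shard key can vector a query to one shard, while a query without it must go to every shard, can be sketched as follows. The hash-and-modulo placement, the shard names and the `customer_id` shard key are all assumptions made for this example; MongoDB's own partitioning and balancing work differently in detail.

```python
import hashlib

# Hypothetical cluster of three shards.
SHARDS = ["shard1", "shard2", "shard3"]


def shard_for(key_value):
    """Map a shard key value to one shard using a stable hash (illustrative only)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def route(query, shard_key="customer_id"):
    """Return the shards that must receive the query."""
    if shard_key in query:
        # Targeted query: the shard key pins it to a single shard.
        return [shard_for(query[shard_key])]
    # No shard key in the query: it has to be sent to every shard.
    return list(SHARDS)
```

A query such as `{"customer_id": 42}` reaches exactly one shard, while `{"status": "open"}` fans out to all three, which is why the choice of shard key matters so much for query performance.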

Page 33: Back to Basics Webinar 1 - Introduction to NoSQL

Scalability with Sharding

Diagram: MongoS, MongoS, Shard 1, Shard 2, Shard N, Shard Key

Page 34: Back to Basics Webinar 1 - Introduction to NoSQL

Query Routing

• With a sharded cluster we use a routing layer to guide queries
• We use a daemon called MongoS (Mongo Shard Router)
• Daemon is stateless
• Can run as many as required
• Typically one per app server

Page 35: Back to Basics Webinar 1 - Introduction to NoSQL

Summary

• Why NoSQL exists
• The types of NoSQL database
• The key features of MongoDB
• Data durability in MongoDB
• Scalability in MongoDB

Page 36: Back to Basics Webinar 1 - Introduction to NoSQL

Next Webinar – Your First MongoDB Application

• 24th May 2016 – 14:00 GMT.
• Make sure to register if you haven’t already
• Learn how to build your first MongoDB application
• Create databases and collections
• Look at queries
• Build indexes
• Start to understand performance
• Register at: http://bit.ly/1UA4BGM
• Send feedback to [email protected]

Page 37: Back to Basics Webinar 1 - Introduction to NoSQL

Q&A

Page 38: Back to Basics Webinar 1 - Introduction to NoSQL