Back to Basics Webinar 1 - Introduction to NoSQL
TRANSCRIPT
Code JoeD gets you a 25% discount off the list price
Early Bird Registration Ends May 13, 2016
Back to Basics 2016 : Webinar 1
Introduction to NoSQL
Joe Drumgoole
Director of Developer Advocacy, EMEA
MongoDB
@jdrumgoole
V1.0
Welcome!
5
Course Agenda
Date         Time       Webinar
05-May-2016  14:00 GMT  Introduction to NoSQL
24-May-2016  14:00 GMT  Your First MongoDB Application
14-Jun-2016  14:00 GMT  Schema Design – Thinking in Documents
05-Jul-2016  14:00 GMT  Advanced Indexing : Text and Geo-Spatial Indexes
14-Jul-2016  14:00 GMT  Introduction to the Aggregation Framework
11-Aug-2016  14:00 GMT  Production Deployment
6
Agenda for Today
• Why NoSQL
• The different types of NoSQL database
• Detailed overview of MongoDB
• MongoDB data durability – Replica Sets
• MongoDB scalability – Sharding
• Q&A
7
Relational
Expressive Query Language & Secondary Indexes
Strong Consistency
Enterprise Management & Integrations
8
The World Has Changed
Data Risk Time Cost
9
NoSQL
Scalability & Performance
Always On, Global Deployments
Flexibility
Expressive Query Language & Secondary Indexes
Strong Consistency
Enterprise Management & Integrations
10
Nexus Architecture
Scalability & Performance
Always On, Global Deployments
Flexibility
Expressive Query Language & Secondary Indexes
Strong Consistency
Enterprise Management & Integrations
11
Types of NoSQL Database
• Key/Value Stores
• Column Stores
• Graph Stores
• Multi-model Databases
• Document Stores
12
Key Value Stores
• An associative array
• Single key lookup
• Very fast single key lookup
• Not so hot for “reverse lookups”
Key Value
12345 4567.3456787
12346 { addr1 : “The Grange”, addr2: “Dublin” }
12347 “top secret password”
12358 “Shopping basket value : 24560”
12787 12345
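The lookup behaviour described on this slide can be sketched with a plain Python dict standing in for a key/value store. This is only an illustration: a real store adds persistence and networking, and the function names `get` and `reverse_lookup` are assumptions made up for this example. The keys and values are the ones in the table above.

```python
# A plain dict standing in for a key/value store (illustrative only).
store = {
    12345: 4567.3456787,
    12346: {"addr1": "The Grange", "addr2": "Dublin"},
    12347: "top secret password",
    12358: "Shopping basket value : 24560",
    12787: 12345,
}


def get(key):
    """Single key lookup: one hash probe, regardless of how big the store is."""
    return store.get(key)


def reverse_lookup(value):
    """Reverse lookup: values are not indexed, so every entry has to be scanned."""
    return [key for key, stored in store.items() if stored == value]


print(get(12347))
print(reverse_lookup(12345))
```

The forward lookup touches one entry; the reverse lookup walks the whole store, which is why key/value stores are "not so hot" at it.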
13
Revision : Row Stores (RDBMS)
• Store data aligned by rows (traditional RDBMS, e.g. MySQL)
• Reads retrieve a complete row every time
• Reads requiring only one or two columns are wasteful
ID Name Salary Start Date
1 Joe D $24000 1/Jun/1970
2 Peter J $28000 1/Feb/1972
3 Phil G $23000 1/Jan/1973
1 Joe D $24000 1/Jun/1970 2 Peter J $28000 1/Feb/1972 3 Phil G $23000 1/Jan/1973
14
How a Column Store Does it
1 2 3
ID Name Salary Start Date
1 Joe D $24000 1/Jun/1970
2 Peter J $28000 1/Feb/1972
3 Phil G $23000 1/Jan/1973
Joe D Peter J Phil G $24000 $28000 $23000 1/Jun/1970 1/Feb/1972 1/Jan/1973
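The two layouts can be compared with a short Python sketch. It uses flat lists as a stand-in for bytes on disk, which is an assumption for illustration only; the table data is the same as on the slides, with salaries stored as plain integers.

```python
rows = [
    (1, "Joe D", 24000, "1/Jun/1970"),
    (2, "Peter J", 28000, "1/Feb/1972"),
    (3, "Phil G", 23000, "1/Jan/1973"),
]

# Row store: values are laid out one complete row after another.
row_layout = [value for row in rows for value in row]

# Column store: values are laid out one complete column after another.
columns = list(zip(*rows))
column_layout = [value for column in columns for value in column]

# Reading only the salaries is one contiguous slice of the column layout.
row_count = len(rows)
salary_column = 2  # position of Salary in (ID, Name, Salary, Start Date)
start = salary_column * row_count
salaries = column_layout[start:start + row_count]

print(row_layout)
print(column_layout)
print(salaries)
```

In the row layout the same three salaries are scattered, one per row, so a reader has to step over names and dates to collect them.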
15
Why is this Attractive?
• A series of consecutive seeks can retrieve a column efficiently
• Compressing similar data is super efficient
• So reads can grab more data off disk in a single seek
• How do I align my rows? By order or by inserting a row ID
• If you just need a small number of columns you don’t need to read complete rows
• But:
  – Updating and deleting by row is expensive
• Append only is preferred
• Better for OLAP than OLTP
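To see why similar data compresses well, here is a minimal run-length encoding sketch in Python. Run-length encoding is just one simple scheme chosen for illustration; the slides do not say which compression a given column store uses, and the `country` column is hypothetical sample data.

```python
from itertools import groupby


def run_length_encode(column):
    """Collapse each run of equal neighbouring values into (value, count)."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column: values of one type, with many repeats side by side.
country = ["IE", "IE", "IE", "UK", "UK", "IE"]

encoded = run_length_encode(country)
print(encoded)
print(run_length_decode(encoded) == country)
```

Six stored values shrink to three pairs here. A row layout interleaves names, salaries and dates, so runs like this rarely form.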
16
Graph Stores
• Store graphs (edges and vertices)
• E.g. social networks
• Designed to allow efficient traversal
• Optimised for representing connections
• Can be implemented as a key/value store with the ability to store links
• If your use case is not a graph you don’t need a graph database
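The "key/value store with links" idea can be sketched in Python as an adjacency list plus a breadth-first traversal. The tiny social network and the function name `reachable` are hypothetical, and a real graph store would add indexing and persistence on top of this.

```python
from collections import deque

# Each key is a vertex; each value is the list of vertices it links to.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": [],
}


def reachable(graph, start):
    """Breadth-first traversal: return the vertices in the order they are visited."""
    seen = [start]
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen


print(reachable(follows, "ann"))
print(reachable(follows, "cat"))
```

Each step of the traversal is a single key lookup followed by a walk over the stored links, which is the access pattern a graph store is optimised for.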
17
Multi-Model Databases
• Combine multiple storage/access models
• Often Graph plus “something else”
• Fixes the “polyglot persistence” issue of keeping multiple independent databases consistent
• The “new new thing” in NoSQL Land
• Expect to hear more noise about these kinds of databases
18
Document Store
• Not PDFs, Microsoft Word or HTML
• Documents are nested structures created using JavaScript Object Notation (JSON)
{
  name : "Joe Drumgoole",
  title : "Director of Developer Advocacy",
  Address : {
    address1 : "Latin Hall",
    address2 : "Golden Lane",
    eircode : "D09 N623",
  },
  expertise: [ "MongoDB", "Python", "Javascript" ],
  employee_number : 320,
  location : [ 53.34, -6.26 ]
}
19
MongoDB Documents are Typed
{
  name : "Joe Drumgoole",
  title : "Director of Developer Advocacy",
  Address : {
    address1 : "Latin Hall",
    address2 : "Golden Lane",
    eircode : "D09 N623",
  },
  expertise: [ "MongoDB", "Python", "Javascript" ],
  employee_number : 320,
  location : [ 53.34, -6.26 ]
}
Strings
Nested Document
Array
Integer
Geo-spatial Coordinates
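The same document can be written as a Python dict, which makes the type of each field easy to inspect. This is a sketch for illustration: strict JSON needs quoted keys (the slide uses the unquoted shell style), and the type names printed are Python's, not BSON's.

```python
import json

employee = {
    "name": "Joe Drumgoole",
    "title": "Director of Developer Advocacy",
    "Address": {
        "address1": "Latin Hall",
        "address2": "Golden Lane",
        "eircode": "D09 N623",
    },
    "expertise": ["MongoDB", "Python", "Javascript"],
    "employee_number": 320,
    "location": [53.34, -6.26],
}

# Each field carries its own type: strings, a nested document, arrays, an integer.
field_types = {field: type(value).__name__ for field, value in employee.items()}
print(field_types)

# The types survive a round trip through JSON text.
round_tripped = json.loads(json.dumps(employee))
print(round_tripped == employee)
```

The nested document shows up as a `dict` and the arrays as `list`, mirroring the labels on the slide.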
20
MongoDB Understands JSON Documents
• From the very first version it was a native JSON database
• Understands and can index the sub-structures
• Stores JSON as a binary format called BSON
• Efficient for encoding and decoding for network transmission
• MongoDB can create indexes on any document field
• (We will cover these areas in detail later on in the course)
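What "indexing the sub-structures" means can be sketched with a toy secondary index in Python. This is not how MongoDB implements its indexes; it only shows the idea of looking up documents by a nested field or by an element of an array. The helper names `get_path` and `build_index` and the sample people are hypothetical.

```python
from collections import defaultdict


def get_path(document, dotted_path):
    """Follow a dotted path such as "Address.eircode" into nested dicts."""
    value = document
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def build_index(documents, dotted_path):
    """Map each value found at dotted_path to the positions of the matching documents."""
    index = defaultdict(list)
    for position, document in enumerate(documents):
        value = get_path(document, dotted_path)
        # An array contributes one index entry per element.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if item is not None:
                index[item].append(position)
    return index


people = [
    {"name": "Joe", "Address": {"eircode": "D09 N623"}, "expertise": ["MongoDB", "Python"]},
    {"name": "Ann", "Address": {"eircode": "D02 X285"}, "expertise": ["Python"]},
]

by_eircode = build_index(people, "Address.eircode")
by_skill = build_index(people, "expertise")

print(by_eircode["D09 N623"])
print(by_skill["Python"])
```

Once the index exists, finding everyone with a given skill is a single lookup rather than a scan of every document.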
21
Why Documents?
• Dynamic Schema
• Elimination of Object/Relational Mapping Layer
• Implicit denormalisation of the data for performance
23
MongoDB is Full Featured
Rich Queries   • Find Paul’s cars
               • Find everybody in London with a car built between 1970 and 1980
Geospatial     • Find all of the car owners within 5km of Trafalgar Sq.
Text Search    • Find all the cars described as having leather seats
Aggregation    • Calculate the average value of Paul’s car collection
Map Reduce     • What is the ownership pattern of colors by geography over time (is purple trending in China?)
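Several of these questions can be written as MongoDB query documents, shown here as Python dicts. This is a sketch under assumptions: the collection layout (one document per person, with a `cars` array) and every field name are hypothetical, and the Trafalgar Square coordinates are approximate. The operators themselves (`$elemMatch`, `$gte`, `$lte`, `$near`, `$text`, `$match`, `$unwind`, `$group`, `$avg`) are standard MongoDB ones. Nothing is sent to a server here; with a driver the filters would go to `find()` and the pipeline to `aggregate()`.

```python
# Rich query: Paul's cars (filter plus a projection that keeps only the cars).
pauls_cars_filter = {"name": "Paul"}
pauls_cars_projection = {"cars": 1}

# Rich query: everybody in London with a car built between 1970 and 1980.
# $elemMatch makes both bounds apply to the same car in the array.
london_seventies = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# Geospatial: owners within 5 km of Trafalgar Square.
# GeoJSON points are [longitude, latitude]; $maxDistance is in metres.
# This kind of query needs a geospatial index on the location field.
near_trafalgar = {
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": [-0.128, 51.508]},
            "$maxDistance": 5000,
        }
    }
}

# Text search: the escaped quotes ask for the exact phrase.
# This kind of query needs a text index on the searched field.
leather_seats = {"$text": {"$search": "\"leather seats\""}}

# Aggregation: average value of Paul's car collection.
average_value_pipeline = [
    {"$match": {"name": "Paul"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$name", "average_value": {"$avg": "$cars.value"}}},
]

print(london_seventies)
print(average_value_pipeline)
```

Each query is itself a document, so it can be built, stored and passed around like any other data.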
24
High Availability and Data Durability – Replica Sets
Secondary Secondary
Primary
25
Replica Set Creation
Secondary Secondary
Primary
Heartbeat
26
Replica Set Node Failure
Secondary Secondary
Primary
No Heartbeat
27
Replica Set Recovery
Secondary Secondary
Heartbeat and Election
28
New Replica Set – 2 Nodes
Secondary Primary
Heartbeat and New Primary
29
Replica Set Repair
Secondary Primary
Secondary
Rejoin and resync
30
Replica Set Stable
Secondary Primary
Secondary
Heartbeat
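The failure and recovery sequence on the last few slides can be walked through with a toy Python model. It is a deliberately simplified sketch, not MongoDB's election protocol: here the new primary is simply the first surviving member by name, whereas real elections take other factors into account. The class, method and node names are all hypothetical.

```python
class ReplicaSet:
    """Toy model: tracks which members are up and which one is primary."""

    def __init__(self, members):
        self.up = {name: True for name in members}
        self.primary = members[0]

    def has_majority(self):
        """True when more than half of all members are reachable."""
        return sum(self.up.values()) > len(self.up) // 2

    def elect(self):
        """Choose a new primary, but only if a majority of members is up."""
        if self.primary is None and self.has_majority():
            self.primary = sorted(name for name, alive in self.up.items() if alive)[0]

    def fail(self, name):
        """No heartbeat from a member: mark it down and re-elect if needed."""
        self.up[name] = False
        # A failed primary, or one that has lost its majority, gives up the role.
        if name == self.primary or not self.has_majority():
            self.primary = None
        self.elect()

    def rejoin(self, name):
        """A repaired member resyncs and comes back as a secondary."""
        self.up[name] = True
        self.elect()

    def roles(self):
        """Report each member as primary, secondary or down."""
        return {
            name: "down" if not alive else "primary" if name == self.primary else "secondary"
            for name, alive in self.up.items()
        }


replica_set = ReplicaSet(["node1", "node2", "node3"])
replica_set.fail("node1")    # node failure: the primary stops sending heartbeats
print(replica_set.roles())   # recovery: the two survivors elect a new primary
replica_set.rejoin("node1")  # repair: the old primary rejoins and resyncs
print(replica_set.roles())   # stable: one primary, two secondaries
```

With only one of three members left there is no majority, so the model leaves the set without a primary rather than electing one.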
31
Scalability with Sharding
Shard 1 Shard 2 Shard N
32
Scalability with Sharding
• Shard key partitions the content
• MongoDB automatically balances the cluster
• Shards can be added dynamically to a live system
• Rebalancing happens in the background
• Shard key is immutable
• Shard key can vector queries to a specific shard
• Queries without a shard key are sent to all shards
33
Scalability with Sharding
MongoS MongoS
Shard 1 Shard 2 Shard N
Shard Key
34
Query Routing
• With a sharded cluster we use a routing layer to guide queries
• We use a daemon called MongoS (Mongo Shard Router)
• Daemon is stateless
• Can run as many as required
• Typically one per app server
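The routing idea can be sketched with a toy Python router that partitions documents by ranges of the shard key. This is an illustration under assumptions, not MongoS itself: the `Router` class, the split points and the sample documents are hypothetical, and the shards are plain lists. It shows the two cases from the sharding slides: a query on the shard key goes to one shard, any other query goes to all of them.

```python
from bisect import bisect_right


class Router:
    """Toy query router: knows the key ranges, holds no data of its own."""

    def __init__(self, shard_key, split_points, shards):
        # shards[i] holds keys below split_points[i] (and at or above the
        # previous split point); the last shard holds everything else.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.shard_key = shard_key
        self.split_points = split_points
        self.shards = shards

    def shard_for(self, key):
        """Pick the shard whose key range contains key."""
        return self.shards[bisect_right(self.split_points, key)]

    def insert(self, document):
        self.shard_for(document[self.shard_key]).append(document)

    def find(self, field, value):
        """Return (matching documents, number of shards that were queried)."""
        if field == self.shard_key:
            targets = [self.shard_for(value)]  # targeted: exactly one shard
        else:
            targets = self.shards              # no shard key: ask every shard
        hits = [doc for shard in targets for doc in shard if doc.get(field) == value]
        return hits, len(targets)


shards = [[], [], []]
router = Router("employee_number", [100, 200], shards)
for document in (
    {"employee_number": 42, "name": "Ann"},
    {"employee_number": 150, "name": "Bob"},
    {"employee_number": 320, "name": "Joe"},
):
    router.insert(document)

print(router.find("employee_number", 320))
print(router.find("name", "Bob"))
```

Because the router keeps only the range map, any number of identical routers can sit in front of the same shards, for example one per app server.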
35
Summary
• Why NoSQL exists
• The types of NoSQL database
• The key features of MongoDB
• Data durability in MongoDB
• Scalability in MongoDB
36
Next Webinar – Your First MongoDB Application
• 24th May 2016 – 14:00 GMT.
• Make sure to register if you haven’t already
• Learn how to build your first MongoDB application
• Create databases and collections
• Look at queries
• Build indexes
• Start to understand performance
• Register at: http://bit.ly/1UA4BGM
• Send feedback to [email protected]
Q&A