When to Use MongoDB...and When You Should Not


Post on 21-Jan-2017


TRANSCRIPT

Page 1: When to Use MongoDB...and When You Should Not
Page 2: When to Use MongoDB...and When You Should Not

When should you use MongoDB... and when you should not

Jake Angerman, Sr. Solutions Architect, MongoDB

Page 3: When to Use MongoDB...and When You Should Not

Agenda

• What is MongoDB?
• What is MongoDB for?
• What does MongoDB do very well... and less well
• What do customers do very well with MongoDB, and what they do not do
• Some unusual use cases
• When you should use MongoDB

Page 4: When to Use MongoDB...and When You Should Not

CREATE APPLICATIONS NEVER BEFORE POSSIBLE

AGILE SCALABLE

Page 5: When to Use MongoDB...and When You Should Not

Factors Driving Modern Applications

Data
• 90% of data created in the last 2 years
• 80% of enterprise data is unstructured
• Unstructured data growing at 2X the rate of structured data

Mobile
• 2 billion smartphones in 2015
• Mobile now >50% of internet use
• 26 billion devices on IoT by 2020

Social
• 72% of internet use is social media
• 2 billion active users monthly
• 93% of businesses use social media

Cloud
• Compute costs declining 33% YOY
• Storage costs declining 38% YOY
• Network costs declining 27% YOY

Page 6: When to Use MongoDB...and When You Should Not

Who is Generating Data?

1.7B Internet users in 2015

Page 7: When to Use MongoDB...and When You Should Not

What is Generating Data?

2B smart phones in 2015

Page 8: When to Use MongoDB...and When You Should Not

Internet in 1971

Page 9: When to Use MongoDB...and When You Should Not

Internet in 2015

Page 10: When to Use MongoDB...and When You Should Not

Systems of Engagement
• Fueled by mobile devices and sensors
• Focus on Communication, Collaboration, Contextual
• "The planet is wiring itself a new nervous system."
• Enterprise IT must embrace consumer technology, not the other way around
• Systems of record are no longer adequate

[Timeline images: 1991; 2011; IBM 3370 hard disk (571MB), 1979; Edgar Codd, 1971; Lotus 1-2-3, 1983]

Page 11: When to Use MongoDB...and When You Should Not

What is MongoDB for?

• The data store for all systems of engagement
  – Demanding, real-time SLAs
  – Diverse, mixed data sets
  – Massive concurrency
  – Globally deployed over multiple sites
  – No downtime tolerated
  – Able to grow with user needs
  – High uncertainty in sizing
  – Fast scaling needs
  – Delivers a seamless and consistent experience

Page 12: When to Use MongoDB...and When You Should Not

• Expressive Query Language
• Strong Consistency
• Secondary Indexes
• Flexibility
• Scalability
• Performance

Diagram label: Relational

Page 13: When to Use MongoDB...and When You Should Not

• Expressive Query Language
• Strong Consistency
• Secondary Indexes
• Flexibility
• Scalability
• Performance

Diagram label: NoSQL

Page 14: When to Use MongoDB...and When You Should Not

• Expressive Query Language
• Strong Consistency
• Secondary Indexes
• Flexibility
• Scalability
• Performance

Diagram labels: Relational, NoSQL, Relational + NoSQL

Page 15: When to Use MongoDB...and When You Should Not

• Expressive Query Language
• Strong Consistency
• Secondary Indexes
• Flexibility
• Scalability
• Performance

Diagram labels: Nexus Architecture, Relational + NoSQL

Page 16: When to Use MongoDB...and When You Should Not

What MongoDB is NOT

• An analytical suite
  – Not competing with SAS or SPSS
• A data warehouse technology
  – Not competing with Teradata, Netezza, Vertica
• A BI tool
  – Not competing with Tableau or QlikView
• Backoffice transaction processing
  – Not competing with IBM Mainframes
• Backend for a billing system or general ledger system
  – Not competing with Oracle RAC
• A search engine
  – Not competing with Elasticsearch or SOLR

Page 18: When to Use MongoDB...and When You Should Not

MongoDB and Enterprise IT Stack

OLTP OLAP

Page 19: When to Use MongoDB...and When You Should Not

MongoDB Strategic Advantages

• Horizontally Scalable - Sharding
• Agile, Flexible
• High Performance & Strong Consistency
• Highly Available - Replica Sets

Application

{
  author: "eliot",
  date: new Date(),
  text: "MongoDB",
  tags: ["database", "flexible", "JSON"]
}

Page 20: When to Use MongoDB...and When You Should Not

Document Data Model

[Side-by-side comparison: Relational | MongoDB]

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Page 21: When to Use MongoDB...and When You Should Not

Do More With Your Data

MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul's car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
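To make the first few questions concrete, here is a minimal sketch in plain Python that answers them over an in-memory copy of the slide's document. It is an illustration only, not how MongoDB executes queries. The names `people`, `FIND_PAUL`, `LONDON_1970S_CARS` and the three helper functions are assumptions introduced for this example; the two filter dictionaries show how the same questions would typically be phrased with MongoDB's `$elemMatch`, `$gte` and `$lte` operators.

```python
from statistics import mean

# Sample document from the slide; the elided "…" fields are left out.
paul = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": [45.123, 47.232],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}
people = [paul]  # hypothetical stand-in for a collection

# The same questions phrased as MongoDB filter documents (plain dicts).
FIND_PAUL = {"first_name": "Paul"}
LONDON_1970S_CARS = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}


def pauls_cars(docs):
    """Rich query: cars of every document whose first_name is Paul."""
    return [car for d in docs if d["first_name"] == "Paul" for car in d["cars"]]


def london_owners_1970s(docs):
    """Rich query: first names of Londoners with a car built 1970-1980."""
    return [
        d["first_name"]
        for d in docs
        if d["city"] == "London"
        and any(1970 <= car["year"] <= 1980 for car in d["cars"])
    ]


def average_car_value(doc):
    """Aggregation: average value of one person's car collection."""
    return mean(car["value"] for car in doc["cars"])
```

With the two cars shown on the slide, `average_car_value(paul)` works out to 215000, and Paul matches the second query because of the 1973 Bentley.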

Page 22: When to Use MongoDB...and When You Should Not

How Databases Stack Up

Requirement         RDBMS   Key/value   Wide column   MongoDB
Hierarchical data   Poor    Poor        Good          Great
Dynamic schema      Poor    Poor        Poor          Great
Native OOP lang     Poor    Great       Great         Great
Software cost       Poor    Great       Great         Great
Performance         Poor    Great       Great         Great
Scale               Poor    Great       Great         Great
Data consistency    Great   Poor        Poor          Great
Rich querying       Great   Poor        Poor          Great
Ease of use         Good    Good        Poor          Great

Page 24: When to Use MongoDB...and When You Should Not

How Databases Stack Up

[Same comparison table as on Page 22, with two callouts overlaid: VALUE OF NOSQL and VALUE OF MONGODB]

Page 25: When to Use MongoDB...and When You Should Not

MongoDB does well

As a database, where does MongoDB shine?

• Straightforward replication: easy to initiate
• High performance on mixed workloads of reads, inserts, and updates: all reads, mixed, and mostly writes
• Scaling on demand: no expensive overprovisioning
• Location based deployment: one cluster can span the globe
• Geospatial queries: easy to build relevant mobile apps
• High availability and auto failover: low stress operations
• Flexible schema & secondary indexing: no need for complex data modeling
• Agile development in most programming languages: no need to give up your favorite development language
• Commodity infrastructure: no vendor lock-in through hardware
• Real time analytics: get value from data right away!
• Text indexing: basic search feature
• Data consistency: simpler app design
• Compression: with new version 3.0

Page 26: When to Use MongoDB...and When You Should Not

MongoDB does less well

As a database, where does MongoDB shine?

• Resource management *: needs to be done at infrastructure level
• Collection scanning under load *: concurrent scans can disrupt the working set
• Absolute write availability: consistency vs availability
• Faceted search: core value of search engines
• Joins across collections: doc model mitigates need for this
• SQL *: some partial solutions (ODBC)
• Transactions over multiple docs: pushed to application level. Rarely needed with good schema design

Page 27: When to Use MongoDB...and When You Should Not

MongoDB Use Cases

• Single View
• Internet of Things
• Mobile
• Real-Time Analytics
• Catalog
• Personalization
• Content Management

Page 28: When to Use MongoDB...and When You Should Not

MongoDB is good for

Use Cases where MongoDB shines

• Single View
• Internet of Things – sensor data
• Mobile apps – geospatial
• Real-time analytics
• Catalog
• Personalization
• Content management
• Inventory management
• Personalization engines
• Shopping cart
• Dependent datamarts
• Archiving for fast lookup
• Collaboration tools
• Messaging applications
• Log file aggregation
• Caching
• Adserving
• …

Side notes on the slide:
• Mixture of analytics and archiving
• Build information from data as it comes in
• Extract from DW for analysis
• Large volume, targeted queries
• Sharing in near real time
• Twitter-like apps
• E.g., SPLUNK
• Enable massive reads on consolidated data

Page 29: When to Use MongoDB...and When You Should Not

MongoDB is less good for

Use cases where MongoDB shines

• Search engine: text indexing only for elementary uses
• Slicing and dicing of data requiring joins and full collection scans: working set should fit in RAM
• Nanosecond latency writing (real time tick data): specialty DBs like Kdb are built for this
• Uptime beyond 99.999%, instant failover: MongoDB needs a few seconds for a failover
• Batch processing: that's what Hadoop is for...

Note: transaction processing does not require database transactions. Moving money from account A to account B is never instantaneous and requires actual processing, usually in batch.
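The uptime bullet is easier to judge with the arithmetic spelled out. The sketch below converts an availability target into a yearly downtime budget and counts how many failovers of a given length would fit into it. The 10-second failover duration is a hypothetical figure chosen for illustration (the slide only says "a few seconds"), and the function names are assumptions of this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year


def downtime_budget_seconds(availability_percent):
    """Allowed downtime per year, in seconds, for an availability target."""
    return SECONDS_PER_YEAR * (100 - availability_percent) / 100


def failovers_within_budget(availability_percent, failover_seconds):
    """How many failovers of the given duration fit in the yearly budget."""
    budget = downtime_budget_seconds(availability_percent)
    return int(budget // failover_seconds)


# "Five nines" leaves roughly 315 seconds (a bit over 5 minutes) per year.
five_nines_budget = downtime_budget_seconds(99.999)

# Hypothetical 10-second failover: about 31 such events per year.
events = failovers_within_budget(99.999, failover_seconds=10)
```

Under these assumptions a failover that takes a few seconds is compatible with 99.999%, but any target beyond that shrinks the budget quickly, which is the point the slide makes.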

Page 30: When to Use MongoDB...and When You Should Not

Operational Data Hub

Diagram labels:
• Data sources: Cards (Data Source 1), Loans (Data Source 2), Deposits (Data Source n)
• Data Consolidation (real-time or batch)
• Engagement Application (shown twice)
• Operational Reporting
• Data Warehouse
• Strategic Reporting

Operational Data Hub Benefits
• Real-time
• Complete details
• Agile
• Higher customer retention
• Increase wallet share
• Proactive exception handling

Page 32: When to Use MongoDB...and When You Should Not

Data Hub for Large Investment Bank

Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions

[Diagram: Source Master Data (RDBMS), seven links labeled "Batch", Destination Data (RDBMS)]

Each represents
• People $
• Hardware $
• License $
• Reg penalty $
• & other downstream problems

• Delays up to 36 hours in distributing data by batch
• Charged multiple times globally for same data
• Incurring regulatory penalties from missing SLAs
• Had to manage 20 distributed systems with same data

Page 34: When to Use MongoDB...and When You Should Not

Data Hub for Large Investment Bank

Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions

[Diagram: MongoDB Primary, seven links labeled "Real-time", MongoDB Secondaries]

Each represents
• No people $
• Less hardware $
• Less license $
• No penalty $
• & far fewer problems

• Will save about $40,000,000 in costs and penalties over 5 years
• Only charged once for data
• Data in sync globally and read locally
• Capacity to move to one global shared data service

Page 35: When to Use MongoDB...and When You Should Not

IoT: Large Industrial Vehicle Manufacturer

[Diagram: one Central Hub and three Regional Hubs]
• Central Hub: Shard 1 Secondary, Shard 2 Secondary, Shard 3 Secondary
• Each Regional Hub: Shard 1 Primary, Shard 1 Secondary (labels as printed on the slide)

Page 36: When to Use MongoDB...and When You Should Not

Molecular Similarity Database

• Store chemical compound fingerprints
• Find compounds which are "close" to a given compound
• Tanimoto association coefficient compares two compounds based on their common fingerprints
• Aggregation framework $setIntersection

Source: Chemical Similarity Search in MongoDB by Matt Swain

01001011 → [2, 5, 7, 8, …]
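The Tanimoto coefficient mentioned above is the number of fingerprint bits two compounds share divided by the number of bits set in either one. The sketch below computes it in plain Python on fingerprints stored as lists of set-bit positions, matching the slide's example where `01001011` becomes `[2, 5, 7, 8]`. It is an illustration of the formula only; in the cited approach the intersection would be computed inside MongoDB with `$setIntersection`. The function names and the sample compounds are hypothetical.

```python
def bits_to_positions(bits):
    """Convert a bit string to the 1-based positions of its set bits."""
    return [i for i, bit in enumerate(bits, start=1) if bit == "1"]


def tanimoto(a, b):
    """Shared bits divided by bits set in either fingerprint."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(set_a & set_b) / len(union)


def most_similar(query, compounds, threshold):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in compounds.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical compounds keyed by name, fingerprints as set-bit positions.
compounds = {"a": [2, 5, 7, 8], "b": [2, 5, 9], "c": [1, 3]}
query = bits_to_positions("01001011")
hits = most_similar(query, compounds, threshold=0.3)
```

Here compound "b" shares bits 2 and 5 with the query out of five distinct bits overall, so its score is 2/5.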

Page 37: When to Use MongoDB...and When You Should Not

Equity Price Database
• Equity prices: 77M float64 equals 600MB
• 3.5M rows/sec Python, 15M rows/sec Java
  - versus 15 to 40 seconds for proprietary tick database
• MongoDB throughput doubles as worker threads double

Source: James Blackburn
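A quick back-of-the-envelope check helps relate the throughput figures to the "15 to 40 seconds" quoted for the proprietary tick database. The sketch below assumes, purely for illustration, that each row carries one 8-byte float64 value and that the whole 77M-value data set is read once; the slide does not state either, and the function names are hypothetical.

```python
BYTES_PER_FLOAT64 = 8


def dataset_bytes(n_values):
    """Raw size of n float64 values, ignoring any storage overhead."""
    return n_values * BYTES_PER_FLOAT64


def seconds_to_read(n_rows, rows_per_second):
    """Time to read n_rows at a steady throughput."""
    return n_rows / rows_per_second


N = 77_000_000  # values quoted on the slide

size_mb = dataset_bytes(N) / 1_000_000        # 616 MB, i.e. "about 600MB"
python_secs = seconds_to_read(N, 3_500_000)   # 22 seconds
java_secs = seconds_to_read(N, 15_000_000)    # a little over 5 seconds
```

Under those assumptions the Python figure lands inside the 15 to 40 second range and the Java figure well below it.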

Page 38: When to Use MongoDB...and When You Should Not

Seismic Modeling
• 2000 x 2000 x 2000 cubic data set
• 8 billion floats
• Relational model can take several minutes for some calculations
• MongoDB query performs in ~1 second

{
  "_id": ObjectId("55e7358e1a317d0fb177b31e"),
  "x": 100,
  "y": 25,
  "z": [0.8506244646719524, 0.18891124618195854, 0.14090160846138955, ...]
}
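One plausible reading of the document above is that each (x, y) position is stored as a single document whose `z` array holds the whole vertical column of samples. The sketch below works through the counts under that assumption; the slide elides the array contents, so the column length of 2000 and the helper names are assumptions of this example, and random values stand in for real seismic samples.

```python
import random

NX = NY = NZ = 2000  # cube dimensions from the slide


def make_column(x, y, nz=NZ, rng=random.random):
    """Build one document: all z samples for a single (x, y) position."""
    return {"x": x, "y": y, "z": [rng() for _ in range(nz)]}


def sample_at(doc, z_index):
    """Read one sample out of a column document."""
    return doc["z"][z_index]


documents_needed = NX * NY            # 4,000,000 column documents
total_floats = documents_needed * NZ  # 8,000,000,000 values, as on the slide

# A small column for demonstration (5 samples instead of 2000).
demo = make_column(100, 25, nz=5)
first = sample_at(demo, 0)
```

With this layout a lookup needs only the (x, y) pair to fetch an entire column in one document read, rather than joining billions of individual rows.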

Page 39: When to Use MongoDB...and When You Should Not

GridFS
• Store files larger than 16MB, e.g. video, images
• Atomically sync files with their metadata
• Shard and distribute around the cluster

[Diagram labels: doc.jpg, doc.jpg (metadata), doc.jpg (1), GridFS API, fs.files, fs.chunks, Drive]
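The idea behind GridFS can be sketched without a database: a file is cut into fixed-size chunks, one metadata record describes the file, and each chunk record points back to it. The code below is a simplified illustration in plain Python, not the driver's implementation. The field names follow the `fs.files` and `fs.chunks` collections, 255 KiB is the default GridFS chunk size, and the function names and the use of a string as file id are assumptions of this example.

```python
CHUNK_SIZE = 255 * 1024  # default GridFS chunk size in bytes


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return an fs.files-style record and a list of fs.chunks-style records."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunks = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# A 600,000-byte placeholder payload standing in for doc.jpg.
payload = bytes(600_000)
files_doc, chunks = split_into_chunks("doc.jpg", payload)
```

Because every chunk is an ordinary small document, the pieces stay under the 16MB document limit and can be distributed across shards like any other collection.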

Page 40: When to Use MongoDB...and When You Should Not

What database do you need for your business?

Page 41: When to Use MongoDB...and When You Should Not

What vehicle do you want for a race?

Page 42: When to Use MongoDB...and When You Should Not

WHAT ARE YOU TRYING TO ACHIEVE?

Page 43: When to Use MongoDB...and When You Should Not

The important aspect of MongoDB

• MongoDB was not designed for niche use cases
• MongoDB strives to have excellent characteristics applicable to a very broad range of use cases

MongoDB is the most balanced database for Enterprise applications and performance

Page 45: When to Use MongoDB...and When You Should Not

Technical: Why MongoDB

• High performance (1000's to millions of queries/sec), reads & writes
• Need flexible schema, rich querying with any number of secondary indexes
• Need for replication across multiple data centers, even globally
• Need to deploy rapidly and scale on demand (start small and fast, grow easily)
• 99.999% availability
• Real time analysis in the database, under load
• Geospatial querying
• Processing in real time, not in batch
• Need to promote agile coding methodologies
• Deploy over commodity computing and storage architectures
• Point in Time recovery
• Need strong data consistency
• Advanced security

If 3 or more apply to you... you should consider MongoDB

Page 47: When to Use MongoDB...and When You Should Not

Business: Why MongoDB

• Management tooling and services
• Ease of hiring
• Commercial license
• Ease of developer adoption
• Global Support
• Global Professional Services
• IT ecosystem integration
• Company stability
• De facto standard for next generation database

If 2 or more are relevant to you... you should consider MongoDB

Page 48: When to Use MongoDB...and When You Should Not

Summary

• MongoDB is for Systems of Engagement
• Complements search engines, Hadoop and Data Warehouses
  – Does not replace these technologies
• Wide range of use cases, and that's the core point!
  – Excellent across many possible use cases, not just a few
• Recognized by Gartner and Forrester
• De facto standard for next generation database
• Enterprise maturity and integration

Page 49: When to Use MongoDB...and When You Should Not

We Can Help

• MongoDB Enterprise Advanced: The best way to run MongoDB in your data center
• MongoDB Cloud Manager: The easiest way to run MongoDB in the cloud
• Production Support: In production and under control
• Development Support: Let's get you running
• Consulting: We solve problems
• Training: Get your teams up to speed

Page 50: When to Use MongoDB...and When You Should Not