mongodb capacity planning

Post on 15-Jul-2015

MongoDB Capacity Planning

Norberto Leite

Technical Evangelist

norberto@mongodb.com

@nleite

Capacity Planning

• What is Capacity Planning?

• Why is it important?

• Which resources are affected?

• How to do it?

What is Capacity Planning?

Fine Art of …

Requirements

Resources

Preparing for Launch

• Developers are about to finish final Sprint

• Code is good (so they say)

• You are feeling comfortable about launching soon

• How to deploy?

Requirements

• Availability

– Uptime requirements: RPO and RTO

• Throughput

– Average read/writes/users

– Peak throughput

– Operations per second? Per day? Per month?

• Responsiveness

– What's the acceptable latency?

• Higher during peak time?

RTO = recovery time objective; RPO = recovery point objective
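The throughput questions above (per second, per day, peak) can be turned into comparable numbers with a little arithmetic. The sketch below is only an illustration: the function names, the peak factor, and the idea of deriving a yearly downtime budget from an uptime percentage are assumptions added here, not something stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60


def average_ops_per_second(ops_per_day: float) -> float:
    """Convert a daily operation count into an average per-second rate."""
    return ops_per_day / SECONDS_PER_DAY


def peak_ops_per_second(ops_per_day: float, peak_factor: float) -> float:
    """Scale the average rate by an assumed peak-to-average ratio."""
    return average_ops_per_second(ops_per_day) * peak_factor


def allowed_downtime_minutes_per_year(uptime_percent: float) -> float:
    """Yearly downtime budget implied by an uptime target (assumed metric)."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    daily_ops = 86_400_000  # hypothetical requirement
    print("average ops/s:", average_ops_per_second(daily_ops))
    print("peak ops/s:", peak_ops_per_second(daily_ops, peak_factor=3))
    print("downtime budget (min/year):", allowed_downtime_minutes_per_year(99.9))
```

Sizing for the peak rate rather than the average is the conservative choice; the peak factor itself has to come from measurement.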

Resources

• CPU

• Storage

• Memory

• Network

Requirements vs Resources

Throughput

Availability

Responsiveness

Resource Usage

• Storage

– IOPS

– Size

– Data & Loading Patterns

• CPU

– Speed

– Cores

• Memory

– Working Set

• Network

– Latency

– Throughput

Why is that Important?

Why

• Once we launch, we don't want avoidable downtime due to poorly selected hardware

• As our success grows, we want to stay in front of the demand curve

• We want to meet business and user expectations

• We want to keep our jobs!

• Don't be the "goat"

Under allocation

Over Capacity

Over spending

Important Aspects

• Capacity

– Under

– Over

– Just Right?

• Prediction Models

– User/Load

– OPS/Request

– System Behavior (stress testing anyone?)

• Change Velocity

– Data / Resource-Allocation / Provisioning

– Minimum Viable Product?

– Future Releases / Roadmap
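A user/load prediction model can start as simple compound growth. The sketch below assumes a constant monthly growth rate and a single capacity figure; both are simplifications, and the function name and parameters are hypothetical.

```python
from typing import Optional


def months_until_capacity(
    current_load: float,
    capacity: float,
    monthly_growth_rate: float,
    horizon_months: int = 120,
) -> Optional[int]:
    """Return the first month in which projected load reaches capacity.

    Returns 0 if the load is already at or above capacity, and None if
    capacity is not reached within the planning horizon.
    """
    load = current_load
    for month in range(horizon_months + 1):
        if load >= capacity:
            return month
        load *= 1 + monthly_growth_rate  # compound growth, assumed constant
    return None


if __name__ == "__main__":
    # Hypothetical numbers: 1000 ops/s today, hardware good for 2000 ops/s.
    print(months_until_capacity(1000, 2000, 0.10))
```

The result is only as good as the growth assumption, which is one reason the planning has to be repeated as real measurements come in.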

Important Aspects

• When?

– Not too early

– Before it is too late!

– Iterative Process

Launch Version 2

Which resources are affected?

CPU

• Non-indexed Data

• Sorting

• Aggregation

– Map/Reduce

– Aggregation Framework

• Data

– Fields

– Nesting

– Arrays/Embedded-Docs

Network

• Latency

– WriteConcern

– ReadPreference

– Batching

• Throughput

– Update/Write Patterns

– Reads/Queries

Network

• Latency

– W:?

– Nearest

– Bulk Write Operations

• Throughput

– Use $set operator

– Filtering fields on queries
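The two throughput tips can be illustrated without a running server by building the documents that would be sent and returned. This is a rough sketch: the sample document is made up, compact JSON length is used only as a proxy for BSON wire size, and `apply_projection` is a hypothetical helper that mimics what an inclusion projection returns.

```python
import json


def payload_bytes(document: dict) -> int:
    """Rough proxy for wire size: compact JSON length (real BSON differs)."""
    return len(json.dumps(document, separators=(",", ":")).encode("utf-8"))


def apply_projection(document: dict, projection: dict) -> dict:
    """Mimic an inclusion projection: keep _id plus the requested fields."""
    return {k: v for k, v in document.items() if k == "_id" or projection.get(k)}


# Hypothetical stored document with one large field.
user = {"_id": 42, "name": "Ada", "bio": "x" * 2000, "logins": 17}

# Replacing the whole document resends every field, including the large one.
full_replacement = dict(user, logins=18)

# A $set update only sends the field that changed.
targeted_update = {"$set": {"logins": 18}}

# A projection limits which fields a query returns.
query_filter = {"_id": 42}
projection = {"name": 1, "logins": 1}

if __name__ == "__main__":
    print("full replacement bytes:", payload_bytes(full_replacement))
    print("$set update bytes:", payload_bytes(targeted_update))
    print("projected result:", apply_projection(user, projection))
```

With PyMongo, and assuming a `collection` object already exists, the same documents would be passed as `collection.update_one(query_filter, targeted_update)` and `collection.find(query_filter, projection)`.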

Storage

• Active

• Archival

• Loading Patterns

• Integration (BI/DW)

Storage Capability

Type                                  IOPS
7200 rpm SATA                         ~75 – 100
15000 rpm SAS                         ~175 – 210
SSD Intel X25-E (SLC)                 ~5000
SSD Intel X25-M G2 (MLC)              ~8000
Amazon EBS                            ~100
Amazon EBS Provisioned                Up to 2000
Amazon EBS Provisioned IOPS (SSD)     ~3000
FusionIO                              ~135 000
Violin Memory 6000                    ~1 000 000

Source: http://en.wikipedia.org/wiki/IOPS

The higher the IOPS, the higher the cost!
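To relate an IOPS requirement to the device figures above, a first-pass estimate is to divide the requirement by the per-device IOPS. The sketch below uses the lower bound of each range from the table and ignores RAID overhead, caching, and read/write mix, so treat it as an assumption-laden starting point rather than a sizing rule.

```python
import math

# Approximate per-device IOPS from the table above (lower bound of each range).
DEVICE_IOPS = {
    "7200 rpm SATA": 75,
    "15000 rpm SAS": 175,
    "SSD Intel X25-E (SLC)": 5000,
    "Amazon EBS": 100,
}


def devices_needed(required_iops: int, device: str) -> int:
    """Number of devices needed if IOPS added up linearly (a simplification)."""
    return math.ceil(required_iops / DEVICE_IOPS[device])


if __name__ == "__main__":
    target = 3000  # hypothetical peak IOPS requirement
    for name in DEVICE_IOPS:
        print(f"{name}: {devices_needed(target, name)} device(s)")
```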

Storage Considerations

• Work out how much data you need to write per unit of time!

• Databases will use storage to persist data

– More data = Bigger indexes = More Storage

• MongoDB Stores Information into Documents

• BSON Format

– http://bsonspec.org/
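Working out how much data is written per unit of time leads directly to a storage estimate. The sketch below assumes a steady write rate, a known average document size, and a flat index overhead ratio; all three inputs, and the function names, are illustrative assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def write_bytes_per_second(docs_per_second: float, avg_doc_bytes: float) -> float:
    """Raw data written per second, before indexes and storage overhead."""
    return docs_per_second * avg_doc_bytes


def storage_bytes(
    docs_per_second: float,
    avg_doc_bytes: float,
    retention_days: int,
    index_overhead: float = 0.2,
) -> float:
    """Estimated storage for the retention period, with indexes as a ratio."""
    raw = write_bytes_per_second(docs_per_second, avg_doc_bytes)
    raw *= SECONDS_PER_DAY * retention_days
    return raw * (1 + index_overhead)


if __name__ == "__main__":
    # Hypothetical workload: 100 inserts/s of 1 KiB documents kept for 30 days.
    total = storage_bytes(100, 1024, 30, index_overhead=0.25)
    print(f"{total / 1024**3:.1f} GiB")
```

The average document size is best measured from real data, since BSON stores field names in every document.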

Memory

• Working Set

– Active Data in Memory

– Measured Over Periods

• And other operations

– Sorting

– Aggregation

– Connections

• MongoDB Storage Engine

– MMAPv1

– Memory Mapped Files

Memory Mapped Files

Memory Usage

• Data & Indexes memory mapped into virtual address space

• Data access is paged into RAM

• OS evicts using LRU

• More frequently used pages stay in RAM
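The general mechanism of memory mapped files can be seen with Python's standard `mmap` module. This is an illustration of the operating system feature only, not of MongoDB's own code; the helper names and the temporary data file are hypothetical.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # size of one virtual memory page on this system


def make_data_file(num_pages: int) -> str:
    """Create a temporary file where page N is filled with the byte value N."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for page in range(num_pages):
            f.write(bytes([page % 256]) * PAGE)
    return path


def read_byte_via_mmap(path: str, offset: int) -> int:
    """Map the file into the address space and read one byte.

    No explicit read() call is issued: the OS pages the touched
    region into RAM on first access.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset]


if __name__ == "__main__":
    data_path = make_data_file(4)
    try:
        print(read_byte_via_mmap(data_path, 2 * PAGE))
    finally:
        os.remove(data_path)
```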

How to do it!

Basic Rules

• Determine your Working Set

• Use good Measuring and Monitoring practices

• Plan ahead but be flexible!

• Iterate

– Review Requirements

– Review Capacity

Working Set

Number of Active Users on the system at any one time

Number of distinct pages accessed per second

Working Set

[Diagram: pages held in RAM and on Disk]

• 4 distinct pages per second

– Worst case 4 disk accesses

• 6 distinct pages per second

– Worst case disk access on every op
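The difference between the two cases can be reproduced with a small LRU simulation. The sketch assumes RAM holds exactly 4 pages and that pages are accessed in a repeating cycle; both are assumptions chosen to match the worst case, and `count_page_faults` is a hypothetical helper.

```python
from collections import OrderedDict
from typing import Iterable


def count_page_faults(accesses: Iterable[int], ram_pages: int) -> int:
    """Count disk accesses for a page access sequence under LRU eviction."""
    ram: "OrderedDict[int, bool]" = OrderedDict()
    faults = 0
    for page in accesses:
        if page in ram:
            ram.move_to_end(page)  # mark as most recently used
        else:
            faults += 1  # page must be read from disk
            ram[page] = True
            if len(ram) > ram_pages:
                ram.popitem(last=False)  # evict the least recently used page
    return faults


if __name__ == "__main__":
    seconds = 10
    fits = [0, 1, 2, 3] * seconds            # 4 distinct pages per second
    too_big = [0, 1, 2, 3, 4, 5] * seconds   # 6 distinct pages per second
    print("4 pages, 4 in RAM:", count_page_faults(fits, ram_pages=4))
    print("6 pages, 4 in RAM:", count_page_faults(too_big, ram_pages=4))
```

When the working set fits, only the first touch of each page goes to disk. When it is slightly too large and accessed cyclically, LRU always evicts the page that is needed next, so every operation hits the disk.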

Memory & Storage

MOPs

PFs

Working Set

• Capacity sizing to hold working set + indexes

• Allow room to grow

• If working set is larger than RAM and you can't reasonably add more resources

– Shard!

– Lots of little instances vs few big instances

• Think about architecture

– Local disk vs central storage

– How many copies of data do I need for availability reasons
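The sizing points above can be combined into a back-of-the-envelope calculation: RAM for working set plus indexes, headroom for growth, a shard count if one node is not enough, and a node count once copies are included. The headroom ratio, the per-node RAM, and the function names are illustrative assumptions.

```python
import math

GIB = 1024 ** 3


def required_ram_bytes(
    working_set_bytes: float, index_bytes: float, growth_headroom: float = 0.3
) -> float:
    """RAM needed to hold working set and indexes, with room to grow."""
    return (working_set_bytes + index_bytes) * (1 + growth_headroom)


def shards_needed(required_bytes: float, ram_per_node_bytes: float) -> int:
    """Shards needed so each shard's share of the data fits in one node's RAM."""
    return max(1, math.ceil(required_bytes / ram_per_node_bytes))


def total_nodes(shards: int, copies_per_shard: int) -> int:
    """Data-bearing nodes once every shard keeps several copies of its data."""
    return shards * copies_per_shard


if __name__ == "__main__":
    # Hypothetical: 80 GiB working set, 20 GiB indexes, 50% headroom, 64 GiB nodes.
    need = required_ram_bytes(80 * GIB, 20 * GIB, growth_headroom=0.5)
    shards = shards_needed(need, 64 * GIB)
    print(f"RAM needed: {need / GIB:.0f} GiB")
    print("shards:", shards, "nodes with 3 copies:", total_nodes(shards, 3))
```

Changing the per-node RAM in this calculation is a quick way to compare many small instances against a few large ones.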

Measuring & Monitoring

• What to measure

– IOPS

– Page Faults

– Resident Memory (Working Set)

– Connections

– Lock %

• How to measure and monitor

– iostat

– vmstat

– mongostat

– mongoperf

– MMS

iostat

vmstat

mongostat

• Quick overview of the status of mongodb nodes

mongoperf

• Utility to check disk I/O performance

mongotop

• Utility to track the time spent reading and writing per namespace

MMS

MMS

• Comprehensive Tool

– Monitoring

– Backup

– Deployment

Monitoring

• Key Metrics

– Storage

– Memory

– CPU

– Network

– Application Metrics

Models

• Load / Users

– Response Time / TTFB

• System Performance

– Peak Usage

– Min Usage
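Peak and minimum usage fall out of any series of metric samples, whatever tool collected them. The sketch below summarises a list of numbers (for example response times or operations per second); the 95th percentile is an addition not mentioned on the slide, and `summarize` is a hypothetical helper.

```python
import statistics
from typing import Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Return min, peak, mean, and nearest-rank 95th percentile of samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Nearest-rank percentile using integer arithmetic: ceil(0.95 * n) - 1.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "min": ordered[0],
        "peak": ordered[-1],
        "mean": statistics.mean(ordered),
        "p95": ordered[p95_index],
    }


if __name__ == "__main__":
    response_times_ms = [12, 15, 11, 90, 14, 13, 16, 12, 250, 14]  # made-up data
    print(summarize(response_times_ms))
```

Comparing the peak with the mean shows how much headroom a system sized for average load would be missing.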

Velocity of Change

• Limitations -> takes time

– Data Movement

– Allocation / Provisioning (servers/mem/disk)

• Improvement

– Limit Size of Change

– Increase Frequency

– MEASURE its effect

– Practice

Long story short …

Capacity Planning is …

• Needed

– Involves resource allocation

– Hardware specification and sizing

– Cost!

• Vital

– Translate Requirements and Expectations into Experience and Functionality

• And meeting those

• Requires understanding your application

– Measuring resource needs

– Monitoring

– Iterating

– Repeating process

For More Information

Resource                  Location
Case Studies              mongodb.com/customers
Presentations             mongodb.com/presentations
Free Online Training      education.mongodb.com
Webinars and Events       mongodb.com/events
Documentation             docs.mongodb.org
MongoDB Downloads         mongodb.com/download
Additional Info           info@mongodb.com

Questions?

@nleite

norberto@mongodb.com