MongoDB Capacity Planning
Capacity Planning
• What is Capacity Planning?
• Why is it important?
• Which resources are affected?
• How to do it?
What is Capacity Planning?
Fine Art of …
Requirements
Resources
Preparing for Launch
• Developers are about to finish final Sprint
• Code is good (so they say)
• You feel comfortable launching soon
• How to deploy?
Requirements
• Availability
– Uptime requirements: RPO and RTO
• Throughput
– Average reads/writes/users
– Peak throughput
– Operations per second? Per day? Per month?
• Responsiveness
– What's the acceptable latency?
• Higher during peak time?
RTO=recovery time objective RPO=recovery point objective
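A throughput requirement stated per day can be turned into per-second figures with a little arithmetic. The sketch below is only illustrative: the `peak_factor` multiplier and the daily volume are assumptions, not numbers from the talk.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def average_ops_per_second(ops_per_day: float) -> float:
    """Average rate if the load were spread evenly over the day."""
    return ops_per_day / SECONDS_PER_DAY


def peak_ops_per_second(ops_per_day: float, peak_factor: float) -> int:
    """Peak rate, assuming peak load is `peak_factor` times the average."""
    return math.ceil(average_ops_per_second(ops_per_day) * peak_factor)


daily_ops = 8_640_000  # hypothetical requirement
print(average_ops_per_second(daily_ops))                # 100.0
print(peak_ops_per_second(daily_ops, peak_factor=3.0))  # 300
```

Sizing for the average alone would leave this hypothetical system short by a factor of three at peak time.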
Resources
• CPU
• Storage
• Memory
• Network
Requirements vs Resources
Throughput
Availability
Responsiveness
Resource Usage
• Storage
– IOPS
– Size
– Data & Loading Patterns
• CPU
– Speed
– Cores
• Memory
– Working Set
• Network
– Latency
– Throughput
Why is that Important?
Why
• Once we launch, we don't want avoidable downtime due to poorly selected hardware
• As our success grows, we want to stay in front of the demand curve
• We want to meet business and user expectations
• We want to keep our jobs!
• Don't be the "goat"
Under allocation
Over Capacity
Over spending
Important Aspects
• Capacity
– Under
– Over
– Just Right?
• Prediction Models
– User/Load
– OPS/Request
– System Behavior (stress testing anyone?)
• Change Velocity
– Data / Resource-Allocation / Provisioning
– Minimum Viable Product?
– Future Releases / Roadmap
Important Aspects
• When?
– Not too early
– Before it is too late!
– Iterative Process
Launch Version 2
Which resources are affected?
CPU
• Non-indexed Data
• Sorting
• Aggregation
– Map/Reduce
– Aggregation Framework
• Data
– Fields
– Nesting
– Arrays/Embedded-Docs
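Why non-indexed data costs CPU can be seen in a toy model: without an index every document is examined, with one only the matching entries are. This is a simplified sketch, not MongoDB's implementation; the sorted list stands in for a B-tree index, and the collection and the `user_id` field are made up.

```python
import bisect


def collection_scan(docs, field, value):
    """Examine every document; returns (matches, documents_examined)."""
    matches = [d for d in docs if d.get(field) == value]
    return matches, len(docs)


def build_index(docs, field):
    """Sorted (value, position) pairs, standing in for a B-tree index."""
    return sorted((d[field], pos) for pos, d in enumerate(docs))


def index_lookup(docs, index, value):
    """Binary-search the index; only matching documents are examined."""
    lo = bisect.bisect_left(index, (value, -1))
    hi = bisect.bisect_right(index, (value, len(docs)))
    matches = [docs[pos] for _, pos in index[lo:hi]]
    return matches, len(matches)


# Hypothetical collection: 100,000 documents, 1,000 distinct user_id values.
docs = [{"_id": i, "user_id": i % 1000} for i in range(100_000)]
index = build_index(docs, "user_id")

scan_matches, scan_examined = collection_scan(docs, "user_id", 42)
idx_matches, idx_examined = index_lookup(docs, index, 42)
print(len(scan_matches), scan_examined)  # 100 100000
print(len(idx_matches), idx_examined)    # 100 100
```

Both paths return the same 100 documents, but the scan touches a thousand times more data to find them.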
Network
• Latency
– WriteConcern
– ReadPreference
– Batching
• Throughput
– Update/Write Patterns
– Reads/Queries
Network
• Latency
– w: ?
– Nearest
– Bulk Write Operations
• Throughput
– Use $set operator
– Filtering fields on queries
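The throughput benefit of `$set` comes from sending only the changed field instead of the whole document. The sketch below compares the two payloads using compact JSON length as a rough stand-in for BSON wire size; the `filter`/`replacement`/`update` envelope and the sample document are hypothetical.

```python
import json


def payload_bytes(message: dict) -> int:
    """Rough size proxy: compact JSON length (real traffic is BSON)."""
    return len(json.dumps(message, separators=(",", ":")).encode("utf-8"))


# Hypothetical user document with one large field.
user = {
    "_id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "bio": "x" * 2000,
    "logins": 41,
}

# Option 1: resend the entire document to bump one counter.
full_replace = {"filter": {"_id": 1}, "replacement": {**user, "logins": 42}}

# Option 2: send only the field that changed.
targeted_set = {"filter": {"_id": 1}, "update": {"$set": {"logins": 42}}}

print(payload_bytes(full_replace) > payload_bytes(targeted_set))  # True
```

Filtering fields on queries applies the same idea in the other direction: return only the fields the application needs.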
Storage
• Active
• Archival
• Loading Patterns
• Integration (BI/DW)
Storage Capability
Type                                IOPS
7200 rpm SATA                       ~ 75 – 100
15000 rpm SAS                       ~ 175 – 210
SSD Intel X25-E (SLC)               ~ 5000
SSD Intel X25-M G2 (MLC)            ~ 8000
Amazon EBS                          ~ 100
Amazon EBS Provisioned              Up to 2000
Amazon EBS Provisioned IOPS (SSD)   ~ 3000
FusionIO                            ~ 135 000
Violin Memory 6000                  ~ 1 000 000
http://en.wikipedia.org/wiki/IOPS
The higher the IOPS, the higher the cost!
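The IOPS figures above give a first estimate of how many devices a target load needs. This is a back-of-the-envelope sketch: it uses the low end of each range, the 2000 IOPS target is an assumption, and it ignores RAID overhead, caching, and read/write mix.

```python
import math

# Approximate per-device IOPS, low end of the ranges quoted above.
DEVICE_IOPS = {
    "7200 rpm SATA": 75,
    "15000 rpm SAS": 175,
    "SSD Intel X25-E (SLC)": 5000,
}


def devices_needed(target_iops: int, device: str) -> int:
    """Minimum number of devices whose combined IOPS reach the target."""
    return math.ceil(target_iops / DEVICE_IOPS[device])


target = 2000  # hypothetical peak IOPS requirement
for device in DEVICE_IOPS:
    print(device, devices_needed(target, device))
# 7200 rpm SATA 27
# 15000 rpm SAS 12
# SSD Intel X25-E (SLC) 1
```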
Storage Considerations
• Work out how much data you need to write per unit of time!
• Databases will use storage to persist data
– More data = Bigger indexes = More Storage
• MongoDB stores information in documents
• BSON Format
– http://bsonspec.org/
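"How much data per unit of time" can be projected forward to a storage figure. The sketch below assumes a constant write rate and average document size, and models indexes as a flat fraction of data size; the 20% default and all input numbers are assumptions to be replaced with measurements.

```python
SECONDS_PER_DAY = 86_400


def storage_bytes(docs_per_second: float, avg_doc_bytes: float,
                  days: float, index_overhead: float = 0.2) -> float:
    """Projected data + index size after `days` of steady inserts.

    index_overhead is an assumed ratio of index size to data size.
    """
    data = docs_per_second * avg_doc_bytes * SECONDS_PER_DAY * days
    return data * (1 + index_overhead)


# Hypothetical workload: 100 inserts/s of 1 KiB documents for 30 days.
projected = storage_bytes(100, 1024, 30)
print(f"{projected / 2**30:.0f} GiB")
```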
Memory
• Working Set
– Active Data in Memory
– Measured Over Periods
• And other operations
– Sorting
– Aggregation
– Connections
• MongoDB Storage Engine
– MMAP
– Memory Mapped Files
Memory Mapped Files
Memory Usage
• Data & indexes are memory mapped into the virtual address space
• Data is paged into RAM on access
• The OS evicts pages using LRU
• More frequently used pages stay in RAM
How to do it!
Basic Rules
• Determine your Working Set
• Use good Measuring and Monitoring practices
• Plan ahead but be flexible!
• Iterate
– Review Requirements
– Review Capacity
Working Set
Number of Active Users on
the system at any one time
Number of distinct pages
accessed per second
Working Set
4 distinct pages per second
RAM
Disk
Worst case 4 disk accesses
Working Set
6 distinct pages per second
Worst case disk access on every op
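The two scenarios above can be reproduced with a small LRU simulation. It assumes RAM holds exactly 4 pages (the slides do not state the size) and that pages are accessed in a repeating cycle; real OS page replacement is more nuanced than strict LRU.

```python
from collections import OrderedDict


def count_disk_reads(accesses, ram_pages):
    """Simulate an LRU page cache and count accesses that miss RAM."""
    cache = OrderedDict()  # keys ordered least to most recently used
    disk_reads = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recent
        else:
            disk_reads += 1                # miss: page read from disk
            cache[page] = True
            if len(cache) > ram_pages:
                cache.popitem(last=False)  # evict least recently used
    return disk_reads


RAM_PAGES = 4  # assumed RAM capacity, in pages

four_pages = [0, 1, 2, 3] * 10       # working set fits in RAM
six_pages = [0, 1, 2, 3, 4, 5] * 10  # working set exceeds RAM

print(count_disk_reads(four_pages, RAM_PAGES))  # 4
print(count_disk_reads(six_pages, RAM_PAGES))   # 60
```

With 4 distinct pages only the first touch of each page goes to disk. With 6, every page is evicted just before it is needed again, so all 60 accesses hit the disk.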
Memory & Storage
MOPs
PFs
Working Set
• Capacity sizing to hold working set + indexes
• Allow room to grow
• If the working set is larger than RAM and you can't reasonably add more resources
– Shard!
– Lots of little instances vs few big instances
• Think about architecture
– Local disk vs central storage
– How many copies of data do I need for availability reasons?
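A first-pass shard count follows from fitting working set plus indexes, with room to grow, into the RAM of each node. This is a rough sketch: the 30% headroom, the sizes, and the three-member replica sets are assumptions, and not all of a node's RAM is available to the database in practice.

```python
import math


def shards_needed(working_set_gb: float, index_gb: float,
                  ram_per_node_gb: float, headroom: float = 0.3) -> int:
    """Shards required so working set + indexes (+ headroom) fit in RAM."""
    required = (working_set_gb + index_gb) * (1 + headroom)
    return max(1, math.ceil(required / ram_per_node_gb))


def total_nodes(shards: int, copies_per_shard: int) -> int:
    """Each shard holds `copies_per_shard` copies of its data."""
    return shards * copies_per_shard


# Hypothetical deployment: 90 GB working set, 30 GB indexes, 64 GB nodes.
shards = shards_needed(90, 30, 64)
print(shards)                  # 3
print(total_nodes(shards, 3))  # 9
```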
Measuring & Monitoring
• What to measure
– IOPS
– Page Faults
– Resident Memory (Working Set)
– Connections
– Lock %
• How to measure and monitor
– iostat
– vmstat
– mongostat
– mongoperf
– MMS
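Several of these metrics are cumulative counters, so they only become useful as rates between two samples. The sketch below works on plain dictionaries shaped like the `serverStatus` output (`opcounters`, `extra_info.page_faults`, `connections.current`, `mem.resident`); the field layout is my understanding of that command, and the sample values are invented for illustration.

```python
def rates(before: dict, after: dict, seconds: float) -> dict:
    """Turn two serverStatus-style samples into per-second rates."""
    ops_before = sum(before["opcounters"].values())
    ops_after = sum(after["opcounters"].values())
    faults = (after["extra_info"]["page_faults"]
              - before["extra_info"]["page_faults"])
    return {
        "ops_per_sec": (ops_after - ops_before) / seconds,
        "page_faults_per_sec": faults / seconds,
        "connections": after["connections"]["current"],
        "resident_mb": after["mem"]["resident"],
    }


# Invented samples taken 10 seconds apart.
sample_1 = {
    "opcounters": {"insert": 1000, "query": 5000},
    "extra_info": {"page_faults": 200},
    "connections": {"current": 40},
    "mem": {"resident": 3500},
}
sample_2 = {
    "opcounters": {"insert": 1600, "query": 8000},
    "extra_info": {"page_faults": 450},
    "connections": {"current": 42},
    "mem": {"resident": 3600},
}

print(rates(sample_1, sample_2, seconds=10))
```

A rising page-fault rate alongside flat resident memory is the kind of pattern that suggests the working set no longer fits in RAM.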
iostat
vmstat
mongostat
• Quick overview of the status of mongodb nodes
mongoperf
• Utility to check disk I/O performance
mongotop
• Utility to track the time spent reading and writing per
namespace
MMS
• Comprehensive Tool
– Monitoring
– Backup
– Deployment
Monitoring
• Key Metrics
– Storage
– Memory
– CPU
– Network
– Application Metrics
Models
• Load / Users
– Response Time / TTFB
• System Performance
– Peak Usage
– Min Usage
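A simple load model answers "when do we run out?". The sketch below assumes compound monthly growth at a constant rate, which real traffic rarely follows exactly; the load, capacity, and growth figures are hypothetical.

```python
import math


def months_until_full(current_load: float, capacity: float,
                      monthly_growth: float) -> int:
    """Months until load reaches capacity under compound growth."""
    if current_load >= capacity:
        return 0
    if monthly_growth <= 0:
        raise ValueError("load never reaches capacity without growth")
    return math.ceil(math.log(capacity / current_load)
                     / math.log(1 + monthly_growth))


# Hypothetical: 1000 ops/s at peak today, system tested up to 4000 ops/s,
# peak load growing 10% per month.
print(months_until_full(1000, 4000, 0.10))  # 15
```

Re-running the model with fresh measurements each iteration keeps the estimate honest.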
Velocity of Change
• Limitations -> takes time
– Data Movement
– Allocation / Provisioning (servers/mem/disk)
• Improvement
– Limit Size of Change
– Increase Frequency
– MEASURE its effect
– Practice
Long story short …
Capacity Planning is …
• Needed
– Involves resource allocation
– Hardware specification and sizing
– Cost!
• Vital
– Translate Requirements and Expectations into Experience and Functionality
• And meeting those
• Requires understanding your application
– Measuring resource needs
– Monitoring
– Iterating
– Repeating process
For More Information
Resource               Location
Case Studies           mongodb.com/customers
Presentations          mongodb.com/presentations
Free Online Training   education.mongodb.com
Webinars and Events    mongodb.com/events
Documentation          docs.mongodb.org
MongoDB Downloads      mongodb.com/download
Additional Info        [email protected]
Questions?
@nleite