Sizing Your Couchbase Cluster: Couchbase Connect 2015

HOW MANY NODES? PROPERLY SIZING YOUR COUCHBASE CLUSTER Perry Krug Sr. Solutions Architect

Upload: couchbase

Post on 26-Jul-2015

TRANSCRIPT

Page 1: Sizing Your Couchbase Cluster: Couchbase Connect 2015

HOW MANY NODES? PROPERLY SIZING YOUR COUCHBASE CLUSTER
Perry Krug
Sr. Solutions Architect

Page 3: Sizing Your Couchbase Cluster: Couchbase Connect 2015

©2015 Couchbase Inc. 3

Size Couchbase Server

Sizing == performance
• Serve reads out of RAM
• Enough IO for writes and disk operations
• Mitigate inevitable failures

[Figure: Reading Data. Application Server: "Give me document A"; Couchbase Server: "Here is document A".]

[Figure: Writing Data. Application Server: "Please store document A"; Couchbase Server: "OK, I stored document A".]

Page 4: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Scaling out permits matching of aggregate flow rates so queues do not grow

[Figure: three Application Servers connected over the network to three Couchbase Servers]
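One way to read this slide: a queue grows whenever items arrive faster than the cluster drains them, and adding nodes raises the aggregate drain rate. The following is a minimal Python sketch of that idea. The function names and the sample rates are hypothetical, and it assumes drain capacity adds up linearly across nodes.

```python
import math


def queue_growth_per_sec(incoming_ops: float, per_node_drain_ops: float, nodes: int) -> float:
    """Net growth of the aggregate queue in items/sec (0.0 if the cluster keeps up)."""
    return max(0.0, float(incoming_ops) - per_node_drain_ops * nodes)


def nodes_to_keep_up(incoming_ops: float, per_node_drain_ops: float) -> int:
    """Smallest node count whose combined drain rate matches the incoming rate."""
    return math.ceil(incoming_ops / per_node_drain_ops)


# Hypothetical workload: 50,000 writes/sec arriving, each node drains 15,000/sec.
print(queue_growth_per_sec(50_000, 15_000, nodes=3))  # 5000.0, so the queue keeps growing
print(nodes_to_keep_up(50_000, 15_000))               # 4
```

With three nodes the queue grows by 5,000 items every second; a fourth node brings the aggregate drain rate above the incoming rate.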

Page 5: Sizing Your Couchbase Cluster: Couchbase Connect 2015

5 Factors of Sizing

Page 6: Sizing Your Couchbase Cluster: Couchbase Connect 2015

How many nodes?

5 key factors determine the number of nodes needed:

1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety

(per-bucket, multiple buckets aggregate)

[Figure: Application user, Web application server, Couchbase Servers]

Page 7: Sizing Your Couchbase Cluster: Couchbase Connect 2015

RAM sizing

1) Total RAM
Managed document cache:
• Working set
• Metadata
• Active+Replicas
Index caching (I/O buffer)

Keep the working set in RAM for best read performance

[Figure: Reading Data. Application Server: "Give me document A"; Server: "Here is document A".]

Page 8: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Working set depends on your application

Late stage social game: Many users no longer active; few logged in at any given time. working/total set = .01

Ad Network: Any cookie can show up at any time. working/total set = 1

Business application: Users logged in during the day. Day moves around the globe. working/total set = .33
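The working-set ratio feeds directly into a RAM estimate together with metadata and replica copies. The sketch below shows one plausible way to combine them in Python. The 56 bytes of metadata per item, the 30% headroom and the 85% high water mark are assumptions of this sketch and are not stated on the slides; check them against the sizing documentation for your version. The dataset figures are hypothetical.

```python
def ram_quota_needed_gb(
    items: int,
    avg_key_bytes: int,
    avg_value_bytes: int,
    working_set_ratio: float,
    replicas: int = 1,
    metadata_bytes_per_item: int = 56,  # assumption, not from the slides
    headroom: float = 0.30,             # assumption: allowance for overhead
    high_water_mark: float = 0.85,      # assumption: usable fraction of the quota
) -> float:
    """Estimate the cluster-wide bucket RAM quota in GB."""
    copies = 1 + replicas  # active + replicas
    metadata = items * (metadata_bytes_per_item + avg_key_bytes) * copies
    dataset = items * avg_value_bytes * copies
    working_set = dataset * working_set_ratio
    needed_bytes = (metadata + working_set) * (1 + headroom) / high_water_mark
    return needed_bytes / 1024**3


# Hypothetical dataset: 100 million items, 36-byte keys, 2 KB values, one replica.
for app, ratio in [("social game", 0.01), ("business app", 0.33), ("ad network", 1.0)]:
    print(f"{app}: {ram_quota_needed_gb(100_000_000, 36, 2048, ratio):.0f} GB")
```

Metadata is counted for every item regardless of the working set, which is why the estimate never drops to zero even for a very small ratio.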

Page 9: Sizing Your Couchbase Cluster: Couchbase Connect 2015

RAM Sizing - View/Index cache (disk I/O)

File system cache availability for the index has a big impact on performance:

• Test runs were based on 10 million items with a 16GB bucket quota and 4GB or 8GB of system RAM available for indexes
• Results show that doubling the available system cache halves query latency and increases throughput by 50%

Set quotas so that some RAM is left free for the file system cache

Page 10: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Disk Sizing: Space and I/O

2) Disk
• Sustained write rate
• Rebalance capacity
• Backups
• XDCR
• Views/Indexes
• Compaction
• Total dataset (active + replicas + indexes)
• Append-only

[Figure: Writing Data. Application Server: "Please store document A"; Server: "OK, I stored document A". The slide brackets the items above under two labels: I/O and Space.]

Page 11: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Disk Sizing: Space and I/O

Disk writes are buffered
• Bursts of data expand the disk write queue
• Sustained writes need corresponding throughput

Disk throughput is affected by disk speed: SSD > 10K RPM > EBS
• SSDs give a huge boost to write throughput and startup/warmup times
• RAID can provide redundancy and increase throughput

Throughput = read/write + compaction + indexing + XDCR

2.1 introduces multiple disk threads
• Default is 3 (1 writer / 2 readers), max is 8 combined

Best to configure different paths for data and indexes

Plan on about 3x space (append-only, compaction, backups, etc.)
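The "about 3x" rule can be turned into a quick space estimate. This Python sketch applies the multiplier to the total dataset (active + replicas + indexes). It assumes indexes are not themselves replicated, and the sample sizes are hypothetical.

```python
def disk_space_needed_gb(
    active_data_gb: float,
    index_gb: float,
    replicas: int = 1,
    space_multiplier: float = 3.0,  # "about 3x" from the slide
) -> float:
    """Cluster-wide disk space: (active + replicas + indexes) times the multiplier."""
    # Assumption of this sketch: index_gb is counted once, not per replica.
    total_dataset = active_data_gb * (1 + replicas) + index_gb
    return total_dataset * space_multiplier


# Hypothetical cluster: 200 GB of active data, one replica, 50 GB of indexes.
print(disk_space_needed_gb(200, 50))  # 1350.0
```

This covers space only. I/O throughput for sustained writes, compaction, indexing and XDCR has to be checked separately.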

Page 12: Sizing Your Couchbase Cluster: Couchbase Connect 2015

CPU sizing

3) CPU
• Disk writing
• Views/compaction/XDCR
• RAM r/w performance not impacted

Min. production requirement:
• 4 cores
• +1 core per bucket
• +1 core per Design Doc
• +1 core per XDCR stream
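The minimum-core rule is simple arithmetic, sketched here in Python. The function name is hypothetical. The base core count is a parameter because this slide quotes 4 cores while the later Hardware Minimums slide quotes 8.

```python
def min_cores_per_node(
    buckets: int,
    design_docs: int,
    xdcr_streams: int,
    base_cores: int = 4,  # this slide says 4; the Hardware Minimums slide says 8
) -> int:
    """Base cores plus one core per bucket, design document and XDCR stream."""
    return base_cores + buckets + design_docs + xdcr_streams


# Hypothetical deployment: 2 buckets, 3 design documents, 1 XDCR stream.
print(min_cores_per_node(buckets=2, design_docs=3, xdcr_streams=1))  # 10
```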

Page 13: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Network sizing

4) Network
• Client traffic
• Replication (writes)
• Rebalancing
• XDCR

[Figure: three Application Servers and three Couchbase Servers connected over the network. Links between applications and the cluster carry Reads+Writes; links between Couchbase Servers carry Replication (multiply writes) and Rebalancing.]
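A rough steady-state bandwidth estimate follows from the two traffic types in the diagram: client reads and writes, plus each write repeated once per replica. The Python sketch below is an approximation under those assumptions. It ignores protocol overhead, rebalancing and XDCR, and all sample numbers are hypothetical.

```python
def network_mbps(
    reads_per_sec: float,
    writes_per_sec: float,
    avg_doc_bytes: float,
    replicas: int = 1,
) -> float:
    """Approximate cluster-wide traffic in megabits per second."""
    client_bytes = (reads_per_sec + writes_per_sec) * avg_doc_bytes
    # Each write is sent again to every replica ("multiply writes").
    replication_bytes = writes_per_sec * replicas * avg_doc_bytes
    return (client_bytes + replication_bytes) * 8 / 1_000_000


# Hypothetical workload: 20,000 reads/sec, 5,000 writes/sec, 2,000-byte documents.
print(network_mbps(20_000, 5_000, 2_000))  # 480.0
```

Leave extra headroom on top of this figure for rebalances, which move large amounts of data between nodes.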

Page 14: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Network Considerations

Low latency, high throughput (LAN) within the cluster

Eliminate router hops:
• Between cluster nodes
• Between clients and the cluster

Check who else is sharing the network

Increase bandwidth by:
• Adding more nodes (will scale linearly)
• Upgrading routers/switches/NICs/etc.

Page 15: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Data Distribution

5) Data Distribution / Safety (assuming one replica):
• 1 node = Single point of failure
• 2 nodes = +Replication
• 3+ nodes = Best for production
  • Autofailover
  • Upgrade-ability
  • Further scale-ability

Note: Many applications will need more than 3 nodes

Servers fail; be prepared. The more nodes, the less impact a failure will have.
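The last point can be made concrete. Assuming data and load are spread evenly, a failed node takes 1/n of the active data with it, and each surviving node picks up 1/(n-1) more load. This Python sketch computes both. The function name is hypothetical and even distribution is an assumption.

```python
def failure_impact(nodes: int) -> tuple[float, float]:
    """Return (share of active data on the failed node, extra load per surviving node)."""
    if nodes < 2:
        raise ValueError("a single node has no survivors to fail over to")
    share_lost = 1 / nodes
    # Each survivor goes from 1/n to 1/(n-1) of the load: a relative rise of 1/(n-1).
    extra_load = 1 / (nodes - 1)
    return share_lost, extra_load


for n in (2, 3, 5, 10):
    lost, extra = failure_impact(n)
    print(f"{n} nodes: {lost:.0%} of active data affected, +{extra:.0%} load per survivor")
```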

Page 16: Sizing Your Couchbase Cluster: Couchbase Connect 2015

How many nodes recap

5 key factors determine the number of nodes needed:

1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety

(per-bucket, multiple buckets aggregate)

[Figure: Application user, Web application server, Couchbase Servers]
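One way to combine the factors is to compute a node count for each and take the largest, with a floor of three nodes from the Data Distribution slide. The slides only say that the factors determine the count, so taking the maximum is an assumption of this Python sketch. The names and sample capacities are hypothetical.

```python
import math


def nodes_needed(
    required: dict[str, float],
    per_node: dict[str, float],
    min_nodes: int = 3,  # "3+ nodes = Best for production"
) -> int:
    """Largest per-factor node count, never below min_nodes."""
    per_factor = {k: math.ceil(required[k] / per_node[k]) for k in required}
    return max([min_nodes, *per_factor.values()])


# Hypothetical cluster-wide requirements and per-node capacities.
required = {"ram_gb": 300, "disk_gb": 1350, "network_mbps": 480}
per_node = {"ram_gb": 64, "disk_gb": 500, "network_mbps": 1000}
print(nodes_needed(required, per_node))  # 5, driven by RAM
```

CPU is left out here because the core rule on the CPU slide is a per-node minimum rather than a cluster-wide total.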

Page 17: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Deployment Considerations

Page 18: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Hardware Minimums

RAM: At least ~4GB (highly dependent on data set)

Disk: Fastest “local” storage available
• SSD is better
• RAID 0 or 10, not 5

CPU (minimums): 8 cores
• +1 per bucket
• +1 per design document
• +1 per XDCR stream

Hardware requirements/recommendations are the intersection of what’s needed and what’s available.

Page 19: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Hardware Considerations

Designed for commodity hardware
Scale out, not up: more, smaller nodes are better than fewer, larger ones (can scale up later)
Tested and deployed in EC2
Physical hardware offers best performance and efficiency
Certain considerations when using VMs:
• RAM use is inefficient / disk IO is usually not as fast
• Local storage is better than shared SAN
• 1 Couchbase VM per physical host
• You will generally need more nodes
• Don’t overcommit

Page 20: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Couchbase in AWS

R3 or C3 instances are the best value for performance
• Higher RAM-to-CPU ratios
• Come with SSDs

Disk choice:
• SSDs are best
• Ephemeral is okay
• Single EBS is not great; use LVM/RAID
• Views/indexes on ephemeral, main data on EBS, or both on SSD

Backups:
• Use cbbackup locally on each node and migrate to EBS/S3
• Can use EBS snapshots

Page 21: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Couchbase in AWS

Deploy across AZs with rack/zone awareness
Use an EIP/public hostname instead of a private IP:
• Easier connectivity from outside AWS
• Easier restoration/better availability
• Couchbase XDCR across regions must use hostname

In AWS, as with any cloud/virtual deployment, you will likely need more nodes than you would with physical infrastructure.

Page 22: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Effects of…

Page 23: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Views/Indexes

Effect on scale/sizing:
• Increase the CPU and disk IO requirements
• More complex views require more CPU
• More view output requires more disk IO
• More RAM should be left out of the quota for better IO caching

Indication:
• Indexes significantly behind data writes (or growing delays)

What to do:
• Make sure you follow best practices in view writing
• Add more nodes to distribute processing “work”
• Look into SSDs

Page 24: Sizing Your Couchbase Cluster: Couchbase Connect 2015

XDCR

Effect on scale/sizing:
• XDCR is CPU intensive
• Disk IO will double
• Memory needs to be sized accordingly (bi-directional may mean more data)

Indication:
• A rising XDCR queue on source

What to do:
• More nodes on source and destination will drain the queue faster (scales linearly)
• Tune replication streams according to CPU availability

Page 25: Sizing Your Couchbase Cluster: Couchbase Connect 2015

As your workload grows…

Effects on scale/sizing:
• More reads:
  • Individual documents will not be impacted (static working set)
  • Views may require faster disks, more disk IO caching
• More writes will increase disk IO needs

Indications:
• Cache miss ratio rising
• Growing disk write queue / XDCR queue
• Compaction not keeping up

What to do:
• Revise sizing calculations and add more nodes if needed

Most applications don’t need to scale the number of nodes based upon normal workload variation.

Page 26: Sizing Your Couchbase Cluster: Couchbase Connect 2015

As your dataset grows…

Effects on scale/sizing:
• Your RAM needs will grow:
  • Metadata needs increase with item count
  • Is your working set increasing?
• Your disk space will likely grow (duh?)

Indications:
• Dropping resident ratio
• Rising ejections/cache miss ratio

What to do:
• Revise sizing calculations, add more nodes
• Remove un-needed data

This is the most common need for scaling and will most likely result in needing more nodes.

Page 27: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Rebalancing

Yes, there is resource utilization during a rebalance, but a “properly” sized cluster should not see any effect on performance during a rebalance:
• Distribution of data and work across all nodes
• Managed caching layer separates RAM-based performance from IO utilization
• Rebalance automatically manages working set in RAM
• Rebalance automatically throttles itself if needed
• Can be stopped midway without endangering data or progress

Proper sizing includes not maxing out all resources: leave some headroom in preparation.

Page 28: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Couchbase 4.0

Page 29: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0

Multi-Dimensional Scalability (MDS): optionally scale each service independently:
• Data
• Index
• Query

5 factors still apply:
• RAM
• Disk
• CPU
• Network
• Data Safety/Distribution

Page 30: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0 - Data

Data Service in 4.0 is the same as in previous Couchbase Server versions:
• Enough RAM to cache reads
• Enough disk to eventually persist writes
• CPU primarily for Views and XDCR
• At least 3 nodes; replication is at the bucket level

Minimum requirements: 4GB RAM, 8-core CPU

Page 31: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0 - Index

Index service is new to 4.0 (a.k.a. GSI or “Secondary Indexes”):
• Primarily RAM and disk IO bound
• ForestDB persistence engine
• At least 2 nodes for HA, each index replicated individually

Minimum requirements: 8GB RAM, 8-core CPU, “fast disk”

Note: 4.0 is still in beta; final sizing numbers are being formulated

Page 32: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0 - Query

Query Service is new to 4.0 (a.k.a. N1QL):
• Primarily CPU bound
• Optimized for multi-core systems
• Very low RAM and disk requirements
• At least 2 nodes for HA; queries are automatically load balanced

Minimum requirements: 4GB RAM, 16+ core CPU

Note: 4.0 is still in beta; final sizing numbers are being formulated

Page 33: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0 - MDS

Multi-Dimensional Scalability (MDS)

Option 1: All 3 services enabled on all nodes. Size for aggregate requirements (Data+Index+Query).

Option 2: Separated services. Size nodes independently for different workloads, e.g.:

• Data Service: More nodes with more RAM, less disk, less CPU

• Index Service: Fewer nodes with more RAM, more disk, less CPU

• Query Service: Fewer nodes with less RAM, less disk, more CPU
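To illustrate Option 1, the per-service minimums quoted on the Data, Index and Query slides can be added up for a node that runs several services. Summing the minimums is an assumption of this Python sketch, not a published formula, and the slides themselves note that the 4.0 figures are beta numbers. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    ram_gb: int
    cores: int


# Per-service minimums as quoted on the slides (4.0 beta figures).
MINIMUMS = {
    "data": Minimum(ram_gb=4, cores=8),
    "index": Minimum(ram_gb=8, cores=8),
    "query": Minimum(ram_gb=4, cores=16),
}


def colocated_minimum(services: list[str]) -> Minimum:
    """Option 1: services share a node, so this sketch adds their minimums."""
    return Minimum(
        ram_gb=sum(MINIMUMS[s].ram_gb for s in services),
        cores=sum(MINIMUMS[s].cores for s in services),
    )


print(colocated_minimum(["data", "index", "query"]))  # Minimum(ram_gb=16, cores=32)
```

Under Option 2 each node would instead be sized from the single entry in `MINIMUMS` for the service it runs.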

Page 34: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing Couchbase Server 4.0 - MDS

Independent load distribution
Modular architecture to construct the database for your need:
• Pick HW capacity: scale up and/or scale out
• Pick services layout: overlap and/or isolate services
• Pick data/index partitioning

[Figure: Couchbase Cluster of node1 through node8 running the Data Service, Index Service and Query Service]

Page 35: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Sizing is tricky business…

Work with the Couchbase Team

Validate your “on-paper” numbers with testing

Constantly monitor production

Page 36: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Dive in…

Gather your workload and dataset requirements:
• Item counts and sizes, read/write/delete ratios

Review our documentation and formulas

Test, Deploy, Monitor…rinse and repeat

Page 37: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Want more?

Lots of details and best practices in our documentation:

http://www.couchbase.com/docs/

And my sizing blog: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster

Page 38: Sizing Your Couchbase Cluster: Couchbase Connect 2015
Page 39: Sizing Your Couchbase Cluster: Couchbase Connect 2015

Get Started with Couchbase Server 4.0: www.couchbase.com/beta

Get Trained on Couchbase: training.couchbase.com

Thank you [email protected] | @couchbase