TUNING N1QL QUERY PERFORMANCE & SCALE IN COUCHBASE SERVER 4.0Cihan BiyikogluDir. Product Management
1
©2015 Couchbase Inc. 2
Goals Deeper look at query performance and scale
Look at Query and Index Service Scale Characteristics Understand Query Execution Flow Understand Index Usage
Tune queries with a few techniques
©2015 Couchbase Inc. 3
Agenda Part I - Architectural Overview
New Cluster Architecture with Couchbase Server 4.0 Query Processing & Indexing
Part II - Optimizing Queries Execution Plans and Operators Optimizing Queries - Filtering, Index Selection and Joins Optimizing Apps - Consistency Dials
QA
Demos & More Demos…
©2015 Couchbase Inc. 4
DisclaimerCouchbase Server 4.0 and ForestDB are still in development and the final version of the products may not be identical in details discussed on this session.
Architecture OverviewPart I
©2015 Couchbase Inc. 6
Couchbase Server Cluster Architecture
STORAGE
Couchbase Server 1
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Managed CacheStorage
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 2
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 3
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 4
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 5
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 6
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
©2014 Couchbase Inc.
Couchbase Server Cluster Architecture
STORAGE
Couchbase Server 1
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Managed CacheStorage
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 2
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 3
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 4
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 5
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
STORAGE
Couchbase Server 6
SHARD7
SHARD9
SHARD5
SHARDSHARDSHARD
Managed Cache
Cluster ManagerCluster Manager
Data Service
Index Service
Query Service
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
Managed CacheStorage
Query Processing Overview
©2015 Couchbase Inc. 9
Query Execution Submitting Queries in N1QL
Stateless Connectivity through REST Load-Balance across Query Service nodes Prepared vs Ad-hoc Query Execution Consistency Dials – more on this later…
©2015 Couchbase Inc. 10
Query Execution Parallelization factor is #cores on Query Service
Node
Execution Flow
©2015 Couchbase Inc. 11
Query Service - Capacity Management
Scaling the Query Service Pro: Load Balance Queries across all nodes Con: Compete with Index and Data Workloads
Index Service
Couchbase Cluster
Query ServiceData Service
node1 node8
©2015 Couchbase Inc. 12
Query Service - Capacity ManagementScaling the Query Service
Added CPU: higher intra-query parallelization Added RAM: improved caching with larger result sets Added Node: better availability and load balancing
Couchbase Cluster
node1 node8
Data ServiceIndex Service
Query Service
Indexing Overview
©2015 Couchbase Inc. 14
Indexing in Couchbase Server 4.0 Multiple Indexers
GSI – Index ServiceNew indexing for N1QL for low latency queries without compromising on mutation performance (insert/update/delete)Independently partitioned and independently scalable indexes in Indexing Service
Map/Reduce Views – Data ServicePowerful programmable indexer for complex reporting and indexing logic. Full partition alignment and paired scalability with Data Service.
Spatial View – Data ServiceIncremental R-tree indexing for powerful bounding-box queriesFull partition alignment and paired scalability with Data Service
New
Index Scan
©2015 Couchbase Inc. 15
Which to choose – GSI vs Views
Workloads New GSI in v4.0
Map/Reduce Views
Complex Reporting
Just In Time Pre-aggregated
Workload Optimization
Optimized for Scan Latency & Throughput
Optimized for Insertion
Flexible Index Logic
N1QL Functions Javascript
Secondary Lookups
Single Node Lookup Scatter-Gather
Tunable Consistency
Staleness false or ok or everything in between
Staleness false or ok
©2015 Couchbase Inc. 16
Which to choose – GSI vs Views
Capabilities New GSI in v4.0
Map/Reduce Views
Partitioning Model Independent – Indexing Service
Aligned to Data – Data Service
Scale Model Independently Scale Index Service
Scale with Data Service
Fetch with Index Key Single Node Scatter-Gather
Range Scan Single Node Scatter-Gather
Grouping, Aggregates With N1QL Built-in with Views API
Caching Managed Not Managed
Storage ForestDB Couchstore
Availability Multiple Identical Indexes load balanced
Replica Based
©2015 Couchbase Inc. 17
Query Service - Capacity Management
Scaling the Index Service Pro: Load balance scans across all nodes Con: Compete with Query and Data Workloads
Index Service
Couchbase Cluster
Query ServiceData Service
node1 node8
©2015 Couchbase Inc. 18
Index Service Capacity Management
Scaling the Index Service Added RAM: better caching of indexes Added CPU: faster index maintenance & parallelized index scans Add Faster IO Path: faster index persistence Added Node: better availability and load balancing
Couchbase Cluster
node1 node8
Data Service
Index Service
Query Service
Optimizing QueriesPart II
©2015 Couchbase Inc. 20
Execution Plans & Explain EXPLAIN query
Plan is assembled into an execution flow expressed through the operators
Operators stream results up and down the stream
Sequence ParallelPrimary
Scan
InitialProjectFetch
InitialProjectFetch
InitialProjectFetch
…
Limit
©2015 Couchbase Inc. 21
Operators Main Operations
ScansPrimaryScan: Scan of the Primary Index based on document keysIndexScan: Scan of the Secondary Index based on a predicate
Fetch Fetch: Reach into the Data service with a document key
Projection OperationsInitialProject: reducing the stream size to the fields involved in query. FinalProject: final shaping of the result to the requested JSON shape
©2015 Couchbase Inc. 22
Operators cont. Operator AssemblyParallel: execute all child operations in parallelSequence: execute child items in a sequence
Filtering OperatorsFilter: Apply a filter expression (ex. WHERE field = “value”)Limit: limit the number of items returned to NOffset: start returning items from a specified item count
©2015 Couchbase Inc. 23
Operators cont. Join OperatorsJoin: Join left and right keyspaces on attributes and document key
Unnest: Join operation between a parent and a child with a nested array where parent is repeated for each child array item.
Nest: Grouping operation between a parent and a child array where child array is embedded into the parent.
DEMOExecution Plans
Demo #1
Common Techniques for Tuning Queries
©2015 Couchbase Inc. 26
Minimize Items Scanned Primary Index Scan vs. Index Scan
Primary Index can only filter on document keys thus typically means “full-scan” of the bucket
Secondary Index is typically done with predicates and are smaller in size thus better to scan
Index Selection: Based on matching expressions matching in Index and WHERE clause
DEMO #2SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
Vs.
CREATE INDEX i_type on `beer-sample`(type) USING GSI;SELECT name,updated FROM `beer-sample` WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
©2015 Couchbase Inc. 27
Minimize Items Scanned HINT index usage to queries
There can be multiple indexes with to choose from and you can hint index choice to us.
SELECT name,updated FROM `beer-sample` USE INDEX(i_type using gsi) WHERE type="beer" AND abv>0 ORDER BY name LIMIT 10;
©2015 Couchbase Inc. 28
Minimize Items Scanned Limit & Filters help eliminate rows early in the
execution plan With Limit, Upstream operators are signaled to stop by limit when
enough rows accumulate Ex: Remember to Filter on Document type with buckets that contain
multiple types.
DEMO #3SELECT b1.name as beer_name, b2.name as brewery_name, b2.country FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 on KEYS b1.brewery_idWHERE abv>0;
vs
SELECT b1.name as beer_name, b2.name as brewery_name, b2.country FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 on KEYS b1.brewery_id WHERE b1.type="beer” and abv>0;
©2015 Couchbase Inc. 29
Joins Joins are efficient by nature
Left hand value is joined to the right hand document key with nested loop.
Query: Get brewery location for each beer:SELECT …FROM `beer-sample` AS b1 JOIN `beer-sample` AS b2 on KEYS b1.brewery_idWHERE b1.type="beer”;
For each document with type=“beer” take b1.brewery_id and look for and equal document key in b2.
Optimizing Applications
©2015 Couchbase Inc. 31
New Consistency Settings! View Stale-ness
Ok: unbounded – query what’s available in the index/view now
False: query after all changes up to the request timestamp (and maybe more) has been indexed for a given index or view.
New Indexes with Couchbase Server 4.0 Improves granularity of the consistency logical-
timestamp. New: Scan Consistency can be set to any logical
timestamp
Indicate stale=false to stale=ok and everything in between
©2015 Couchbase Inc. 32
Flexible Consistency Settings Time
t1 insert (k1, v1)…
t2 do other business logic computation…
t3 issue query/read on (k1,v1) with t3 vs t1Catch up all the
indexes to t3 and then issue query
Identical to “stale=false”
Catch up all the indexes to t1 and then issue query
Improved efficiency over “stale=false”
Recap
©2015 Couchbase Inc. 34
Recap New Unique Query and Indexing Architecture
Workload isolation with MDS gives you a great performance and scale advancement.
Familiar Concepts from your past life will help tune queries Understand Execution Plans Understand Indexes and Index Selection Filter & Limit aggressively Understand JOINs
Use powerful new Consistency Dials for best efficiency
Get Started Today Couchbase Server 4.0 & N1QL
Couchbase.com/beta
Q&ACihan Biyikoglu
[email protected]@cihangirb
Thank you.