scaling solrcloud to a large number of collections: presented by shalin shekhar mangar, lucidworks

Scaling SolrCloud to a Large Number of CollectionsShalin Shekhar Mangar Lucidworks Inc.

Apache Solr has tremendous momentum

8M+ total downloads

Solr is both established & growing

250,000+monthly downloads

Largest community of developers.

2500+open Solr jobs.

The most widely used search solution on the planet.

Lucene/Solr Revolutionworld’s largest open sourceuser conference dedicated

to Lucene/Solr.

Solr has tens of thousands of applications in production.

You use Solr everyday.

Solr Scalability is unmatched

The traditional search use-case

Example: eCommerce Product Catalogue

• One large index distributed across multiple nodes • A large number of users searching on the same data • Searches happen across the entire cluster

“The limits of the possible can only be defined by going beyond them into the impossible.”

—Arthur C. Clarke

Analyze, measure and optimize

•Analyze and find missing features •Setup a performance testing environment on AWS •Devise tests for stability and performance •Find bugs and bottlenecks and fixes

Problem #1: Cluster state and updates

•The SolrCloud cluster state has information about all collections, their shards and replicas

•All nodes and (Java) clients watch the cluster state •Every state change is notified to all nodes •Limited to (slightly less than) 1MB by default •1 node restart triggers a few 100 watcher fires and pulls

from ZK for a 100 node cluster (three states: down, recovering and active)

Solution: Split cluster state and scale

• Each collection gets it’s own state node in ZK • Nodes selectively watch only those states

which they are a member of • Clients cache state and use smart cache

updates instead of watching nodes • http://issues.apache.org/jira/browse/

SOLR-5473

http://issues.apache.org/jira/browse/SOLR-5473

Problem #2: Overseer Performance

• Thousands of collections create a lot of state updates

• Overseer falls behind and replicas can’t recover or can’t elect a leader

• Under high indexing/search load, GC pauses can cause overseer queue to back up

Solution - Improve the overseer

• Optimize polling for new items in overseer queue (SOLR-5436)

• Dedicated overseers nodes (SOLR-5476) • New Overseer Status API (SOLR-5749) • Asynchronous execution of collection

commands (SOLR-5477, SOLR-5681)

Problem #3: Moving data around

• Not all users are born equal - A tenant may have a few very large users

• We wanted to be able to scale an individual user’s data — maybe even as it’s own collection

• SolrCloud can split shards with no downtime but it only splits in half

• No way to ‘extract’ user’s data to another collection or shard

Solution: Improved data management

• Shard can be split on arbitrary hash ranges (SOLR-5300)

• Shard can be split by a given key (SOLR-5338, SOLR-5353)

• A new ‘migrate’ API to move a user’s data to another (new) collection without downtime (SOLR-5308)

Problem #4: Exporting data

• Lucene/Solr is designed for finding top-N search results

• Trying to export full result set brings down the system due to high memory requirements as you go deeper

Solution: Distributed deep paging

Testing scale at scale

• Performance goals: 6 billion documents, 4000 queries/sec, 400 updates/sec, 2 seconds NRT sustained performance

• 5% large collections (50 shards), 15% medium (10 shards), 85% small (1 shard) with replication factor of 3

• Target hardware: 24 CPUs, 126G RAM, 7 SSDs (460G) + 1 HDD (200G)

• 80% traffic served by 20% of the tenants

Test Infrastructure

Logging

How to manage large clusters?

• Tim Potter wrote the Solr Scale Toolkit • Fabric based tool to setup and manage

SolrCloud clusters in AWS complete with collectd and SiLK

• Backup/Restore from S3. Parallel clone commands.

• Open source! • https://github.com/LucidWorks/solr-scale-tk

https://github.com/LucidWorks/solr-scale-tk

Gathering metrics and analyzing logs

• Lucidworks SiLK (Solr + Logstash + Kibana) • collectd daemons on each host • rabbitmq to queue messages before delivering to log stash • Initially started with Kafka but discarded thinking it is overkill • Not happy with rabbitmq — crashes/unstable • Might try Kafka again soon • http://www.lucidworks.com/lucidworks-silk

http://www.lucidworks.com/lucidworks-silk

Generating data and load

• Custom randomized data generator (re-producible using a seed)

• JMeter for generating load • Embedded CloudSolrServer (Solr Java client)

using JMeter Java Action Sampler • JMeter distributed mode was itself a bottleneck! • Not open source (yet) but we’re working on it!

Numbers

•30 hosts, 120 nodes, 1000 collections, 6B+ docs, 15000 queries/second, 2000 writes/second, 2 second NRT sustained over 24-hours

•More than 3x the numbers we needed •Unfortunately, we had to stop testing at that

point :( •Our biggest cluster cost us just $120/hour :)

Not over yet

• We continue to test performance at scale • Published indexing performance benchmark, working on others

• 15 nodes, 30 shards, 1 replica, 157195 docs/sec • 15 nodes, 30 shards, 2 replicas, 61062 docs/sec • http://searchhub.org/introducing-the-solr-scale-toolkit/

• Setting up an internal performance testing environment • Jenkins CI • Jepsen tests • Single node benchmarks • Cloud tests

• Stay tuned!

http://searchhub.org/introducing-the-solr-scale-toolkit/

Pushing the limits

Not over yet

• SolrCloud continues to be improved • SOLR-6220 - Replica placement strategy • SOLR-6273 - Cross data center replication • SOLR-5656 - Auto-add replicas on HDFS • SOLR-5986 - Don’t allow runaway queries to harm the cluster • SOLR-5750 - Backup/Restore API for SolrCloud • Many, many more

Thank [email protected] @shalinmangar

scaling solrcloud to a large number of collections: presented by shalin shekhar mangar, lucidworks

Software

apache solr

overseerqueue solr

solr scalability

split cluster state

open solr jobs

new overseer status

dedicated overseers

state node