running high performance and fault tolerant elasticsearch clusters on docker

Post on 06-Jan-2017

26.897 Views

Category:

Data & Analytics

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Elasticsearch & Docker

Rafał Kuć – Sematext Group, Inc.@kucrafal @sematext sematext.com

Running High PerformanceFault Tolerant

Elasticsearch Clusters On Docker

About me…

Sematext consultant & engineerSolr.pl co-founderFather and husband :)

Next 30 minutes

You Are Probably Familiar With This

Development

You Are Probably Familiar With This

Development Test

You Are Probably Familiar With This

Development Test QA

You Are Probably Familiar With This

Development Test QA

Production environment

And The Problems That Come With It

Resources not utilized

And The Problems That Come With It

Resources not utilized

OverprovisionedServers

And The Problems That Come With It

Resources not utilized

OverprovisionedServers

≠ ≠

The solution

Development Test QA Production

Container Technologies

What is Docker?

Lightweight

Based onOpen Standards

Secure

Containers vs Virtual Machines

Hardware

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Traditional Virtual Machine

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Traditional Virtual MachineContainer

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Docker Engine

Traditional Virtual MachineContainer

Containers vs Virtual Machines

Hardware

Host Operating System

Hypervisor

Guest OS Guest OS

Libraries Libraries

Application 1 Application 2

Hardware

Host Operating System

Docker Engine

Libraries Libraries

Application 1 Application 2

Traditional Virtual MachineContainer

What is Elasticsearch?

Reasonabledefaults { JSON }{ JSON }

Distributed by design

http://www.dailypets.co.uk/2007/06/17/kittens-rest-at-half-time/

Running Official Elasticsearch Container

$ docker run -d elasticsearch

Running Official Elasticsearch Container

$ docker run -d elasticsearch == docker run -d elasticsearch:latest

Running Official Elasticsearch Container

$ docker run -d elasticsearch:1.7

$ docker run -d elasticsearch == docker run -d elasticsearch:latest

Running Official Elasticsearch Container

$ docker run -d elasticsearch == docker run -d elasticsearch:latest

$ docker run --name es_1 -h es_master_1 elasticsearch

$ docker run -d elasticsearch:1.7

Running Official Elasticsearch Container

$ docker run -d elasticsearch == docker run -d elasticsearch:latest

$ docker run --name es_1 -h es_master_1 elasticsearch

$ docker run -d elasticsearch:1.7

Container Constraints

$ docker run -d -m 2G elasticsearch

$ docker run -d -m 2G --memory-swappiness=0 elasticsearch

$ docker run -d --cpuset-cpus="1,3" elasticsearch

http://docs.docker.com/engine/reference/run/

Container Constraints

$ docker run -d -m 2G elasticsearch

$ docker run -d -m 2G --memory-swappiness=0 elasticsearch

$ docker run -d --cpuset-cpus="1,3" elasticsearch

http://docs.docker.com/engine/reference/run/

$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

$ docker build -t devops/example .

Creating Optimized Image

Dockerfile:FROM elasticsearchADD ./elasticsearch.yml /usr/share/elasticsearch/config/

$ docker build -t devops/example .

Sending build context to Docker daemon 3.072 kBStep 1 : FROM elasticsearch ---> 8112755253f1Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/ ---> Using cache ---> c9ca48a22e58Successfully built c9ca48a22e58

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

$ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

$ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

$ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=0.0.0.0

Network - Good Practices

Separate network for Elasticsearch cluster

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h es_node_1 elasticsearch

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h es_node_1 elasticsearch

Expose 9200 & 9300 ports only for client nodes

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers$ docker run -d -h es_node_1 elasticsearch

Expose 9200 & 9300 ports only for client nodes

Elasticsearch data & client nodes point to masters only

Dealing With Storage

By default in /usr/share/elasticsearch/data

Dealing With Storage

By default in /usr/share/elasticsearch/data

By default not persisted

Dealing With Storage

By default in /usr/share/elasticsearch/data

By default not persisted

$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch

Dealing With Storage

$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch

By default in /usr/share/elasticsearch/data

By default not persisted

Use data only containers

Permissions

Data-Only Docker Volumes

Bypasses Union File System

Data-Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data-Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

Data-Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch

Permissions

Data-Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch

$ docker run --volumes-from esdata elasticsearch

Highly Available Cluster

Master only

Master only

Master only

Data only

Data only

Data only

Data only

Data only

Data only

Client only

Client only

Highly Available Cluster

Master only

Master only

Master only

Data only

Data only

Data only

Data only

Data only

Data only

Client only

Client only

minimum_master_nodes = N/2 + 1

Highly Available Cluster

Master only

Master only

Master only

Data only

Data only

Data only

Data only

Data only

Data only

Client only

Client only

minimum_master_nodes = N/2 + 1

recovery.after.nodes recovery.expected.nodes

cluster.routing.allocation.node_concurrent_recoveries

index.unassigned.node_left.delayed_timeoutindex.priority

Master Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=true -Dnode.data=false -Dnode.client=false

Client Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=false -Dnode.client=true

Data Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=true -Dnode.client=false

Scaling

Elasticsearch Node Elasticsearch Node

Elasticsearch Node Elasticsearch Node

Scaling

curl -XPUT 'http://localhost:9200/devops/' -d '{ "settings" : { "index" : { "number_of_shards" : 4, "number_of_replicas" : 0 } }}'

Scaling

P P

P P

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'

Scaling

P P

P P

R

R R

R

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 2}'

Scaling

P P

P P

R R

R R

R R

R R

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'

Scaling

P P

P P

R

R R

R

Scaling

P P

P P

R

R R

R

Scaling

P P PP

UnassignedR

RR

R

RAM Buffer

indices.memory.index_buffer_size: 10%indices.memory.min_index_buffer_size: 48mb

indices.memory.max_index_buffer_size (unbounded) indices.memory.min_shard_index_buffer_size: 4mb

RAM Buffer

indices.memory.index_buffer_size: 10%indices.memory.min_index_buffer_size: 48mb

indices.memory.max_index_buffer_size (unbounded) indices.memory.min_shard_index_buffer_size: 4mb

Higher IndexingThroughput

Lower IndexingThroughput

defaults ><

Time-Based Data?

2015-11-23

TODAY

WEEK

Time-Based Data?

curl -XPOST 'http://localhost:9200/_aliases' -d '{ "actions" : [ { "add" : {"index":"2015-11-23","alias":"today"} }, { "add" : {"index":"2015-11-23","alias":"week"} } ]}'

Time-Based Data?

2015-11-23 2015-11-24

TODAY

WEEK

Time-Based Data?

2015-11-23 2015-11-24 2014-11-25

TODAY

WEEK

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

Multiple Tiers

curl -XPUT 'localhost:9200/data_2015-11-23' -d '{ "settings": { "index.routing.allocation.include.tag" : "hot" }}'

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

Multiple Tiers

curl -XPUT 'localhost:9200/data_2015-11-23/_settings' -d '{ "settings": { "index.routing.allocation.exclude.tag" : "hot", "index.routing.allocation.include.tag" : "cold", }}'

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

data_2015-11-24

data_2015-11-24

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

data_2015-11-25

data_2015-11-25

data_2015-11-24

data_2015-11-24

Multiple Tenants

Multiple Tenants

Hot

Hot

Cold

Cold

Cold

Cold

Multiple Tenants

Hot

Hot

Cold

Cold

Cold

Cold

ROUTING

Indexing Without Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

userA

userA

userA

userAuserAuserA

userAuserA

Indexing With Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

user

A

Querying Without Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

Querying With Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

Routing vs No Routing

Queries without routing (200 shards, 1 replica)#thre

adsAvg response time Throughput 90%

lineMedian

CPU Utilization

1 3169ms 19,0/min 5214ms

2692ms

95 – 99%

Routing vs No Routing

Queries without routing (200 shards, 1 replica)#thre

adsAvg response time Throughput 90%

lineMedian

CPU Utilization

1 3169ms 19,0/min 5214ms

2692ms

95 – 99%

Queries with routing (200 shards, 1 replica)#thre

adsAvg response time Throughput 90%

lineMedian

CPU Utilization

10 196ms 50,6/sec 642ms

29ms 25 – 40%

20 218ms 91,2/sec 718ms

11ms 10 – 15%

Short summary

http://www.soothetube.com/2013/12/29/thats-all-folks/

We Are Hiring!Dig Search?Dig Analytics?Dig Big Data?Dig Performance?Dig Logging?Dig working with, and in, open–source?We’re hiring worldwide!

http://sematext.com/about/jobs.html

top related