Elasticsearch - key features (source: files.meetup.com/4046992/Elastic-key-features_2015(Alan).pdf)
Elasticsearch - key features
Alan Hardy Solutions Architect
www.elastic.co Copyright Elastic 2015 Copying, publishing and/or distributing without written permission is strictly prohibited
Elasticsearch
Store, Search and Analyze
• Distributed, scalable, and resilient: designed for scale-out; high availability
• Developer friendly: API-first; schemaless, native JSON, client libraries for any language
• Real-time Search & Analytics: real-time aggregations, geospatial, full-text search; query structured and unstructured data
Terminology
“node”: a running instance of Elasticsearch
≈ one server
Terminology
“shard”: holds just a slice of the data
lives on one node
physical worker unit (a single Lucene instance)
Terminology
“index”: logical namespace
points to one or more shards
shard = hash(_id) % no_of_shards
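The routing rule above can be sketched in a few lines of Python. This is only an illustration: `pick_shard` is a hypothetical helper, and `zlib.crc32` is used as a stand-in for a stable hash, which is an assumption of this sketch and not the hash function Elasticsearch uses internally.

```python
import zlib
from collections import Counter

def pick_shard(doc_id: str, no_of_shards: int) -> int:
    """Map a document _id to a shard number: hash(_id) % no_of_shards."""
    # zlib.crc32 is only a stand-in for a stable hash; it is NOT the
    # function Elasticsearch uses internally.
    return zlib.crc32(doc_id.encode("utf-8")) % no_of_shards

# Spread 1000 hypothetical document ids over 3 primary shards.
counts = Counter(pick_shard(f"doc-{i}", 3) for i in range(1000))
print(sorted(counts.items()))
```

Because the result depends on `no_of_shards`, the same `_id` would generally map to a different shard if the shard count changed.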
Terminology
[Diagram: many segments → one shard; many shards → one index]
scale out, not up
Create an Index
curl -XPUT 'http://localhost:9200/logs' -d '{
  "settings" : {
    "number_of_shards" : 3,
    "number_of_replicas" : 1
  }
}'
• To add data we need an index (one or more shards)
• A shard can be either a primary shard or a replica shard
• A document belongs to a single primary shard
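The settings in the request above can also be built programmatically. The following is a minimal sketch that only constructs the JSON body and counts the resulting shard copies; it sends nothing over the network, and `create_index_body` and `total_shard_copies` are hypothetical helper names.

```python
import json

def create_index_body(primaries: int, replicas: int) -> dict:
    """Build the JSON body for creating an index with the given shard layout."""
    return {"settings": {"number_of_shards": primaries,
                         "number_of_replicas": replicas}}

def total_shard_copies(body: dict) -> int:
    """Each primary shard has `number_of_replicas` replica copies of its own."""
    s = body["settings"]
    return s["number_of_shards"] * (1 + s["number_of_replicas"])

body = create_index_body(3, 1)
print(json.dumps(body))
print(total_shard_copies(body))  # 3 primaries + 3 replicas
```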
Single node cluster
• one node with three primary shards
• creates a cluster of one node
• node is elected to master role within the cluster
• replica shards not allocated
Add Resiliency
• second node started with same cluster.name
• node joins cluster (discovery unicast/multicast)
• replica shards automatically allocated to second node
Scale Horizontally
• add another node
• Elasticsearch automatically balances data
Scaling out more (number_of_replicas: n)
• number of primary shards is fixed at index creation
• can dynamically increase the number of replica shards
• more copies of your data means higher read throughput
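As a rough illustration of changing the replica count on a live index, the sketch below builds the method, path, and body for an index settings update. It only constructs the request and does not contact a cluster; the helper names are hypothetical, and the index name `logs` is taken from the earlier slide.

```python
import json

def replica_update_request(index: str, replicas: int):
    """Return (method, path, body) for changing the replica count of an index."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index}/_settings", json.dumps(body)

def copies_per_shard(replicas: int) -> int:
    """One primary plus its replicas can all serve reads for a shard."""
    return 1 + replicas

method, path, body = replica_update_request("logs", 2)
print(method, path, body)
print(copies_per_shard(2))
```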
Coping with failure
• previous master node fails
• triggers a new master node election
• new master instantly promotes replicas to primary
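The promotion step can be pictured with a toy model. The sketch below is an assumption-laden simplification: it tracks only which node holds each shard copy, ignores master election and the later re-allocation of fresh replicas, and uses hypothetical node names and a hypothetical `fail_node` helper.

```python
def fail_node(shards: dict, dead: str) -> dict:
    """Return a new shard table after node `dead` leaves the cluster."""
    result = {}
    for shard, copy in shards.items():
        replicas = [n for n in copy["replicas"] if n != dead]
        primary = copy["primary"]
        if primary == dead:
            # Promote a surviving replica, if any; otherwise the shard is lost.
            primary = replicas.pop(0) if replicas else None
        result[shard] = {"primary": primary, "replicas": replicas}
    return result

shards = {
    0: {"primary": "node1", "replicas": ["node2"]},
    1: {"primary": "node2", "replicas": ["node3"]},
    2: {"primary": "node3", "replicas": ["node1"]},
}
after = fail_node(shards, "node1")
print(after)
```

In this model every shard still has a primary after `node1` fails, but shards 0 and 2 are left without a replica until new copies are allocated.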
Distributed
• Replication: Data duplication
  • read scalability
  • high availability
• Sharding: Data partitioning
  • split logical data over several machines
  • write scalability
  • control data flow
Search
• mapping
• analysis
• query dsl
flexible, powerful query language
query dsl
query dsl
queries
• relevance
• full text
• not cached
• slower
filters
• boolean yes/no
• exact values
• cached
• faster
Filter first, then query remaining docs
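The "filter first, then query" idea can be sketched without a cluster. In the toy Python example below the documents, the `status` filter, and the term-frequency score are all hypothetical stand-ins; real relevance scoring is considerably more involved.

```python
docs = [
    {"id": 1, "title": "full text search", "status": "active"},
    {"id": 2, "title": "search basics", "status": "archived"},
    {"id": 3, "title": "search search everywhere", "status": "active"},
]

def passes_filter(doc: dict) -> bool:
    """Filter: a yes/no test on an exact value, no scoring involved."""
    return doc["status"] == "active"

def score(doc: dict, term: str) -> int:
    """Query: a toy relevance score, here just the term frequency."""
    return doc["title"].split().count(term)

# Filter first, then score only the remaining documents.
candidates = [d for d in docs if passes_filter(d)]
ranked = sorted(candidates, key=lambda d: score(d, "search"), reverse=True)
print([d["id"] for d in ranked])
```

The cheap yes/no test shrinks the candidate set, so the more expensive scoring step runs on fewer documents.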
query dsl: basic query
GET /_search
{
  "query": {...}
}
query dsl: basic query
GET /_search
{
  "query": {
    "match": { "title": "search" }
  }
}
query dsl: filtered query
GET /_search
{
  "query": {
    "filtered": {
      "query": {...},
      "filter": {...}
    }
  }
}
query dsl: filtered query
GET /_search
{
  "query": {
    "filtered": {
      "query": { "match": { "title": "search" }},
      "filter": { "term": { "status": "active" }}
    }
  }
}
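The `filtered` query shown here reflects the Elasticsearch versions current in 2015; it was deprecated in 2.0 and removed in 5.0, where the same intent is expressed as a `bool` query with a `filter` clause. The sketch below shows one way to rewrite the body; `filtered_to_bool` is a hypothetical helper and handles only the simple shape used on this slide.

```python
def filtered_to_bool(query: dict) -> dict:
    """Rewrite {"filtered": {...}} as a {"bool": {...}} query."""
    inner = query["filtered"]
    bool_query = {}
    if "query" in inner:
        bool_query["must"] = inner["query"]     # scored part
    if "filter" in inner:
        bool_query["filter"] = inner["filter"]  # non-scoring part
    return {"bool": bool_query}

old = {"filtered": {"query": {"match": {"title": "search"}},
                    "filter": {"term": {"status": "active"}}}}
new = filtered_to_bool(old)
print(new)
```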
other filter types
term filter: WHERE field CONTAINS "value"
"term": { "title": "brown" }
terms filter: WHERE field IN ["val",…]
"terms": { "title": ["quick", "pets"] }
other filter types
range filter: WHERE field >= x AND field < y
"range": { "content": { "gte": 10, "lt": 80 } }
"range": { "date": { "gte": "2014-01-01", "lt": "2014-02-01" } }
boolean filter types
"bool": {
  "must":     [ <filters> ],    AND
  "should":   [ <filters> ],    OR
  "must_not": [ <filters> ]     NOT
}
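To make the AND / OR / NOT reading concrete, here is a small Python sketch that evaluates a tiny subset of filters (`term`, `range`, `bool`) against in-memory documents. It is a toy evaluator, not how Elasticsearch executes filters. It assumes clauses are given as lists, and it treats `should` as "at least one must match", following the OR label on the slide. The field names and documents are hypothetical.

```python
def matches(flt: dict, doc: dict) -> bool:
    """Evaluate a tiny subset of filters (term, range, bool) against one doc."""
    kind, body = next(iter(flt.items()))
    if kind == "term":
        field, value = next(iter(body.items()))
        return doc.get(field) == value
    if kind == "range":
        field, bounds = next(iter(body.items()))
        value = doc.get(field)
        if value is None:
            return False
        return (("gte" not in bounds or value >= bounds["gte"]) and
                ("lt" not in bounds or value < bounds["lt"]))
    if kind == "bool":
        must = body.get("must", [])
        should = body.get("should", [])
        must_not = body.get("must_not", [])
        return (all(matches(f, doc) for f in must) and              # AND
                (not should or any(matches(f, doc) for f in should)) and  # OR
                not any(matches(f, doc) for f in must_not))         # NOT
    raise ValueError(f"unsupported filter: {kind}")

flt = {"bool": {
    "must": [{"range": {"price": {"gte": 10, "lt": 80}}}],
    "should": [{"term": {"featured": True}}, {"term": {"starred": True}}],
    "must_not": [{"term": {"deleted": True}}],
}}
docs = [
    {"id": 1, "price": 20, "featured": True, "deleted": False},
    {"id": 2, "price": 90, "featured": True, "deleted": False},
    {"id": 3, "price": 30, "starred": True, "deleted": True},
    {"id": 4, "price": 40, "deleted": False},
]
print([d["id"] for d in docs if matches(flt, d)])
```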
query dsl: full example
{
  "filtered": {
    "query": { "match": { "title": "full text search" }},
    "filter": {
      "bool": {
        "must": { "range": { "created": { "gte": "now-1d/d" }}},
        "should": [
          { "term": { "featured": true }},
          { "term": { "starred": true }}
        ],
        "must_not": { "term": { "deleted": true }}
      }
    }
  }
}
query dsl: filters cached individually
{
  "filtered": {
    "query": { "match": { "title": "full text search" }},
    "filter": {
      "bool": {
        "must": { "range": { "created": { "gte": "now-1d/d" }}},
        "should": [
          { "term": { "featured": true }},
          { "term": { "starred": true }}
        ],
        "must_not": { "term": { "deleted": true }}
      }
    }
  }
}
analytics (aggregations dsl)
Types of Aggregations
Buckets
• Terms
• Date Histogram
• Filter
• Range
• Nested
• Children
• ….
Metrics
• Stats
• Percentile
• Cardinality
• Top hits
• Scripted
• Max | Min | Avg
• ….
aggs = buckets + calculated metric
[Figure: documents grouped into buckets labelled CA, TX, MA, CO, AZ]
How do aggs work?
[Diagram: coordinating node and data nodes]
• ‘inline’ with search query
• execute in isolation on each shard
• 4 phases
  • parse
  • collect
  • combine
  • reduce
Phase 1: Parse
• coordinating node splits the request into shard requests
• shards parse the aggregation and initialize data structures
Phase 2 + 3: Collect & Combine
• shards process all matching documents
• once done, they combine the aggregated data into an aggregation
Phase 4: Reduce
• shards send their aggregation to the coordinating node
• coordinating node reduces them into a single aggregation
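The collect, combine, and reduce phases can be mimicked in plain Python. The sketch below is a simplified model, assuming a terms-style count per value and three hypothetical in-memory "shards"; the function names are illustrative and not part of any Elasticsearch API.

```python
from collections import Counter

def collect_on_shard(shard_docs, field):
    """Phases 2 + 3: one pass over a shard's matching docs, combined into
    a partial aggregation (here, a count per distinct value)."""
    partial = Counter()
    for doc in shard_docs:
        partial[doc[field]] += 1
    return partial

def reduce_on_coordinator(partials):
    """Phase 4: merge the per-shard partial results into one aggregation."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return dict(total)

shards = [
    [{"state": "CA"}, {"state": "TX"}, {"state": "CA"}],
    [{"state": "MA"}, {"state": "CA"}],
    [{"state": "TX"}, {"state": "CO"}],
]
partials = [collect_on_shard(s, "state") for s in shards]
result = reduce_on_coordinator(partials)
print(result)
```

Each shard works only on its own documents, and only the small partial results travel to the coordinating node.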
Aggregation DSL Example
Request
..
"aggs": {
  "by_date": {
    "date_histogram": {
      "field": "timestamp",
      "interval": "day"
    },
    "aggs": {
      "max_temperature": {
        "max": { "field": "temperature" }
      }
    }
…
Response
..
"aggregations": {
  "by_date": {
    "buckets": [
      {
        "key": "2015-01-01T00:00:00.000Z",
        "doc_count": 24,
        "max_temperature": { "value": 23 }
      }
    ]
  }
}
…
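As a hedged sketch of how a client might work with this example, the Python below builds the request body as a dictionary and pulls the per-day maxima out of a response shaped like the one on the slide. The helper names are hypothetical, nothing is sent to a cluster, and the sample response is copied from the slide (Elasticsearch returns the results under the `aggregations` key). The `interval` parameter follows the 2015-era slide; later Elasticsearch versions renamed it.

```python
import json

def max_per_day_request(ts_field: str, value_field: str) -> dict:
    """Build a date_histogram aggregation with a nested max metric."""
    return {"aggs": {"by_date": {
        "date_histogram": {"field": ts_field, "interval": "day"},
        "aggs": {"max_temperature": {"max": {"field": value_field}}},
    }}}

def daily_maxima(response: dict):
    """Extract (bucket key, max value) pairs from the aggregation response."""
    buckets = response["aggregations"]["by_date"]["buckets"]
    return [(b["key"], b["max_temperature"]["value"]) for b in buckets]

request = max_per_day_request("timestamp", "temperature")
print(json.dumps(request, indent=2))

sample_response = {"aggregations": {"by_date": {"buckets": [
    {"key": "2015-01-01T00:00:00.000Z", "doc_count": 24,
     "max_temperature": {"value": 23}}]}}}
print(daily_maxima(sample_response))
```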
Designed for speed and scale
• Single network round-trip
• Single pass through the data on shards
• Aggregates are computed in-memory
• Trades accuracy for speed in some use cases
• Aggregations can be composed
• Near real-time response times
Q & A