Post on 21-Jan-2018

What's New in Elasticsearch 2.0?

Ryan Ernst Elastic Engineering

www.elastic.co Copyright Elastic 2015 Copying, publishing and/or distributing without written permission is strictly prohibited

About Elastic

• Founded: July 2012
• Renamed Elasticsearch → Elastic: Mar 2015
• Headquarters: Amsterdam and Mountain View, CA
• Develops Elasticsearch, Logstash, Kibana, Beats
• Provides:
  • Training (public and onsite)
  • Development and production support
  • Hosted Elasticsearch (Found)
  • Commercial plugins: Marvel, Shield, Watcher

The Elastic Stack

• User Interface: Kibana
• Store, Index, & Analyze: Elasticsearch
• Ingest: Logstash, Beats
• Plugins: Monitoring, Security, Alerting
• Hosted Service: Found (Elasticsearch as a Service)

Elasticsearch 2.0!

• Very large release
  • >2,500 Pull Requests
  • 469 contributors
• Four themes

Four Main Themes in 2.0

• Simplification
  • Removing, deprecating features
  • Query DSL / Doc improvements
• Security
  • Always high on customer wish lists
• Resiliency
  • Started in 1.x, but ongoing
• Features
  • Pipeline aggs
  • Compression

Theme 1: Simplification

Removed Entirely

• Rivers - use Logstash or create your own ingestion layer
• Facets - replaced by aggregations
• _shutdown API - use platform-specific services
• Support for Thrift and Memcached protocols
• Bulk UDP - use the standard bulk API, or use UDP to send documents to Logstash first

Moved to Plugins

• Delete by query
  • Problematic, not a "core" feature
• Types:
  • murmur3
  • _size
• Multicast discovery
  • Unicast was always recommended in production

Mappings

• Conflicting field mappings
• Fields cannot be referenced by short name
• Type name prefix removed
• Field names cannot contain dots
• Type names cannot start with a dot
• Types may no longer be deleted
• index_analyzer is removed
• _analyzer field is removed
• Date format changes
• ... and more ...
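Two of the naming rules above (no dots in field names, no leading dot in type names) are easy to check before an upgrade. The following is a minimal sketch, assuming the mappings are available as plain Python dicts shaped like `{type: {"properties": {...}}}`; the function name `find_invalid_names` is hypothetical and not part of any Elasticsearch client.

```python
def find_invalid_names(mappings):
    """Return a list of problems for the dot-related naming rules in 2.0.

    `mappings` maps type name -> {"properties": {...}} (an assumed shape).
    """
    problems = []

    def walk(properties, path):
        for field, spec in properties.items():
            full = f"{path}.{field}" if path else field
            if "." in field:
                problems.append(f"field name contains a dot: {full!r}")
            # Object fields nest their own "properties".
            walk(spec.get("properties", {}), full)

    for type_name, type_mapping in mappings.items():
        if type_name.startswith("."):
            problems.append(f"type name starts with a dot: {type_name!r}")
        walk(type_mapping.get("properties", {}), "")
    return problems
```

An empty result only means these two rules pass; the other mapping changes in the list still need their own checks.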

Conflicting Mappings

PUT my_index
{
  "mappings": {
    "type_one": {
      "properties": {
        "name": { "type": "string" }
      }
    },
    "type_two": {
      "properties": {
        "name": { "type": "string", "analyzer": "english" }
      }
    }
  }
}

What is the mapping for name? Unexpected results. This is not allowed in Elasticsearch 2.0.
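A conflict like the one above can be found by comparing the settings of same-named fields across all types in an index. This is a rough sketch under assumptions: the mappings are plain Python dicts, `find_conflicts` is a hypothetical helper, and any difference in settings is treated as a conflict, which may be stricter than what Elasticsearch 2.0 actually enforces.

```python
def find_conflicts(mappings):
    """Report fields mapped differently in different types of one index."""
    seen = {}       # full field path -> (type name, field settings)
    conflicts = []

    def walk(type_name, properties, prefix=""):
        for field, spec in properties.items():
            path = prefix + field
            # Compare only the field's own settings, not nested sub-fields.
            settings = {k: v for k, v in spec.items() if k != "properties"}
            if path in seen and seen[path][1] != settings:
                conflicts.append((path, seen[path][0], type_name))
            seen.setdefault(path, (type_name, settings))
            walk(type_name, spec.get("properties", {}), path + ".")

    for type_name, type_mapping in mappings.items():
        walk(type_name, type_mapping.get("properties", {}))
    return conflicts
```

For the mapping on this slide, the helper reports the `name` field as conflicting between `type_one` and `type_two`.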

Ambiguous Mappings in < 2.0

PUT my_index
{
  "mappings": {
    "name": {
      "properties": {
        "title": { "type": "string" },
        "name": {
          "properties": {
            "title": { "type": "string" }
          }
        }
      }
    }
  }
}

What does name refer to? name.title? name.name.title?

Refactored Mappings in 2.0

PUT my_index
{
  "mappings": {
    "name": {
      "properties": {
        "title": { "type": "string" },
        "name": {
          "properties": {
            "title": { "type": "string" }
          }
        }
      }
    }
  }
}

Fields are referenced by their full path, without the type name:

• title
• name.title

name.name.title is not a thing.
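The full-path rule can be illustrated by flattening the `properties` of a type into dotted paths. A small sketch, assuming the mapping is a plain Python dict; `field_paths` is a hypothetical helper name.

```python
def field_paths(properties, prefix=""):
    """Yield the full dotted path of every leaf field under `properties`."""
    for field, spec in properties.items():
        path = prefix + field
        if "properties" in spec:
            # Object field: descend, extending the path with a dot.
            yield from field_paths(spec["properties"], path + ".")
        else:
            yield path
```

Applied to the `properties` of the `name` type above, this yields `title` and `name.title`; the type name never becomes part of a path.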

Analyzer Mappings

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": { "type": "string", "analyzer": "my_analyzer" }
      }
    }
  }
}

There are some changes in how field-specific analyzers are now set. This format, which sets both search and index analyzers, is still acceptable in 2.0.

Analyzer Mappings

• Before 2.0:
  • analyzer - sets index and search analyzers
  • search_analyzer - sets search analyzer
  • index_analyzer - sets index analyzer
• Starting with 2.0:
  • analyzer - sets index and search analyzers
  • search_analyzer - overrides search analyzer
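Since index_analyzer is gone, a 1.x field mapping that used it has to be expressed with analyzer plus an optional search_analyzer override. The sketch below shows one way to do that rewrite. It rests on assumptions: `migrate_analyzers` is a hypothetical helper, and the search-side fallback (search_analyzer, then analyzer, then a default passed in as `default_search`) is a simplification that should be checked against the actual index settings.

```python
def migrate_analyzers(field, default_search="standard"):
    """Rewrite a 1.x field mapping that uses index_analyzer into 2.0 form."""
    migrated = dict(field)
    index_analyzer = migrated.pop("index_analyzer", None)
    if index_analyzer is None:
        return migrated  # nothing to do: analyzer/search_analyzer still work
    # Assumed 1.x search side: search_analyzer, else analyzer, else the
    # default (`default_search` is an assumption, not read from the index).
    search = field.get("search_analyzer", field.get("analyzer", default_search))
    migrated["analyzer"] = index_analyzer
    if search != index_analyzer:
        migrated["search_analyzer"] = search
    else:
        # Same analyzer on both sides: no override needed.
        migrated.pop("search_analyzer", None)
    return migrated
```

For example, `{"type": "string", "index_analyzer": "ngram"}` becomes a mapping with `analyzer` set to `ngram` and `search_analyzer` set to the assumed default.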

Query and Filter Execution Changes

• Before 2.0
  • Queries:
    • Typically contribute to scoring
    • No caching
  • Filters:
    • Don't contribute to scoring
    • Can be cached

Query and Filter Execution Changes

Before 2.0:

{
  "filtered" : {
    "query": { query definition },
    "filter": { filter definition }
  }
}

After 2.0:

{
  "bool" : {
    "must": { query definition },
    "must_not": { query definition },
    "should": { query definition },
    "filter": { filter definition }
  }
}
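The rewrite from the old form to the new one is mechanical: the scoring part goes into must and the non-scoring part into filter. A minimal sketch, assuming query bodies are plain Python dicts and only a top-level filtered query needs converting; `filtered_to_bool` is a hypothetical helper name.

```python
def filtered_to_bool(query):
    """Rewrite a top-level 1.x `filtered` query into a 2.0 `bool` query."""
    if "filtered" not in query:
        return query
    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]      # scoring part
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]   # non-scoring part
    return {"bool": bool_query}
```

Nested filtered queries inside other clauses are not handled here; a real migration would need to walk the whole query tree.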

Query and Filter Execution Changes

Two-Phase Query Execution

• Approximation phase
  • quickly iterates over a superset of the matching documents
• Verification phase
  • checks whether a document in this superset actually matches the query

Query and Filter Execution Changes

Two-Phase Query Execution Example

{
  "bool" : {
    "must": [
      { "match_phrase": { "body": "quick fox" } },
      { "match_phrase": { "body": "brown dog" } }
    ]
  }
}

• Approximation phase
  • all docs with "quick", "fox", "brown", and "dog"
• Verification phase
  • actual phrase matching
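The two phases can be imitated on a tiny in-memory collection. This is a toy model only, assuming whitespace tokenization and a plain dict of documents; it is not how Lucene implements the approximation, and the names `contains_phrase` and `two_phase_search` are hypothetical.

```python
def contains_phrase(tokens, phrase):
    """True if the words of `phrase` occur consecutively in `tokens`."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def two_phase_search(docs, phrases):
    """Match docs containing every phrase, using two phases.

    `docs` maps doc id -> text; `phrases` is a list of phrase strings.
    Returns (candidates, matches).
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    phrase_tokens = [p.lower().split() for p in phrases]
    all_terms = {term for p in phrase_tokens for term in p}

    # Approximation: cheap superset, docs that contain every term somewhere.
    candidates = [d for d, toks in tokenized.items() if all_terms <= set(toks)]

    # Verification: the costly positional check, only on the candidates.
    matches = [d for d in candidates
               if all(contains_phrase(tokenized[d], p) for p in phrase_tokens)]
    return candidates, matches
```

The point of the split is that the positional check runs only on documents that already contain all four terms, instead of on every document that contains any of them.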

Query and Filter Execution Changes

Query Caching

• Fully automatic
• Keeps track of the 256 most recently used queries
• Only caches those that appear 5 times or more
• Does not cache on segments that have fewer than 10000 documents or less than 3% of the documents of the index
• More efficient query cache (roaring bitmaps)
• Non-scoring components are cacheable
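The caching heuristics listed above can be written down as a small decision function. This is a toy model built only from the numbers on the slide, not the actual Lucene or Elasticsearch implementation; the class and method names are hypothetical.

```python
from collections import deque

HISTORY_SIZE = 256        # most recently used queries that are tracked
MIN_FREQUENCY = 5         # appearances needed before a query is cached
MIN_SEGMENT_DOCS = 10000  # smaller segments are never cached
MIN_SEGMENT_RATIO = 0.03  # segment must hold at least 3% of the index


class CachingPolicy:
    """Toy model of the caching heuristics listed on the slide."""

    def __init__(self):
        # Old entries fall out automatically once 256 uses are recorded.
        self.history = deque(maxlen=HISTORY_SIZE)

    def on_use(self, query_key):
        """Record one use of a query (identified by a hashable key)."""
        self.history.append(query_key)

    def should_cache(self, query_key, segment_docs, index_docs):
        """Decide whether to cache this query on a segment of given size."""
        if segment_docs < MIN_SEGMENT_DOCS:
            return False
        if segment_docs < MIN_SEGMENT_RATIO * index_docs:
            return False
        return self.history.count(query_key) >= MIN_FREQUENCY
```

A query used five times is cached on a large segment, but stops qualifying once 256 other uses have pushed it out of the history.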

Theme 2: Security

Security Enhancements

• Elasticsearch now binds to local interfaces ONLY
• Unicast discovery is now the default
• Makes Elasticsearch more secure by default
• Protects Elasticsearch in the wild (don't do that!)
• Security Manager
  • Prevents access to resources outside of Elasticsearch even if the Elasticsearch process is compromised
  • All resources that Elasticsearch can access are defined on node startup

Plugins

• Isolated from each other (separate class loaders)
• Extendable security policy (2.2)
  • Warns user on install when any additional permissions are requested
• Shared setup to allow common build and test
  • Maven parent POM in 2.x
  • Gradle plugin in 3.x
• Plugin descriptor
  • Contains version of Elasticsearch built against

Theme 3: Resiliency

Durability of Transaction Log

• Before 2.0, the transaction log was fsynced every 5 sec
• Transaction log is now fsynced after each operation
  • Configurable
  • On SSDs, bulk indexing is about 7% - 10% slower compared to async translog flushes

Index operations are now durable by default!
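The "Configurable" part can be sketched as an index settings body. This assumes the `index.translog.durability` and `index.translog.sync_interval` settings as documented for the 2.x line; the helper name `translog_settings` is hypothetical, and the values should be verified against the version in use.

```python
def translog_settings(durable=True, sync_interval="5s"):
    """Build an index settings body for translog durability."""
    if durable:
        # fsync the translog before a request is acknowledged (2.0 default)
        return {"index": {"translog": {"durability": "request"}}}
    # fsync in the background every `sync_interval`, trading safety for speed
    return {"index": {"translog": {"durability": "async",
                                   "sync_interval": sync_interval}}}
```

Calling `translog_settings(durable=False)` gives a body that restores the pre-2.0 behavior of syncing every 5 seconds, at the cost of possibly losing the most recent operations on a crash.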

Multiple data path striping

Take advantage of striping in path.data configuration:

Multiple data path striping

Before Elasticsearch 2.0:

Multiple data path striping

Now safer in Elasticsearch 2.0!

Cluster State Diffs

• Before 2.0, the entire cluster state was shipped to every node on every change
• Starting with 2.0, only changes are sent
  • This can be a massive improvement on clusters with large cluster states!
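The idea of shipping only changes can be shown with a flat dictionary diff. This is a conceptual sketch, not the Elasticsearch wire format: the cluster state is modeled as a plain dict of top-level keys, and `diff_state` and `apply_diff` are hypothetical names.

```python
def diff_state(old, new):
    """Compute a flat diff between two cluster-state-like dicts."""
    return {
        # Keys that are new or whose value changed.
        "upserts": {k: v for k, v in new.items()
                    if k not in old or old[k] != v},
        # Keys that disappeared.
        "deletes": [k for k in old if k not in new],
    }


def apply_diff(state, diff):
    """Return a new state with the diff applied; `state` is not modified."""
    updated = {k: v for k, v in state.items() if k not in diff["deletes"]}
    updated.update(diff["upserts"])
    return updated
```

When one index is added to a state that already holds many, the diff contains only that index and the new version number, while unchanged entries are never sent.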

Non-Ambiguous Setting Units

Settings now require units (when appropriate)

curl -XPUT "localhost:9200/test/_settings" -d '{
  "index" : {
    "refresh_interval" : "5"
  }
}'

5 what??
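The stricter behavior amounts to rejecting a bare number where a duration is expected. A simplified sketch of such a check follows; the unit table and the function name `parse_time_value` are assumptions for illustration, and special values that Elasticsearch may accept without a unit are not modeled.

```python
import re

# Milliseconds per unit; a deliberately small, assumed set of units.
_UNITS_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_PATTERN = re.compile(r"^(\d+)(ms|s|m|h|d)$")


def parse_time_value(text):
    """Parse a duration such as "5s" into milliseconds; reject bare numbers."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"missing or unknown time unit: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS_MS[unit]
```

With this check, "5s" parses to 5000 milliseconds, while the ambiguous "5" from the request above raises an error instead of being silently interpreted.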

Doc Values by Default

• Fielddata was a common culprit in OOMs
• Doc Values: Lucene data structure (disk-based)
  • Dramatic heap memory reduction by default
  • Values for sorting, aggs, etc. are moved onto disk
  • Let the OS deal with it!
• Indexed, not_analyzed fields now use doc values
  • Only for indices created with 2.0
  • Reindex required for older data

Previous Resiliency Improvements

• Sync-flush (1.6)
• Async shard allocation (1.6)
• Delayed allocation (1.7)
  • Better handling of nodes leaving/rejoining
• Resiliency page contains latest information:
  • https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html

Theme 4: Features

Pipeline Aggregations

• Derivatives
• Moving average
• Holt-Winters (prediction / anomaly detection)
• Stats: min/max/avg
• Time-series math
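Pipeline aggregations work on the output of other aggregations, for example the per-bucket values of a date histogram. The sketch below shows what a derivative and a simple moving average compute over such a list of bucket values. It is plain Python for illustration; the windowing details are an assumption and may differ from what Elasticsearch does.

```python
def derivative(values):
    """Difference between each bucket and the previous one (first is None)."""
    if not values:
        return []
    return [None] + [b - a for a, b in zip(values, values[1:])]


def moving_average(values, window):
    """Unweighted average over up to `window` trailing buckets."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages
```

Given monthly totals as input, `derivative` returns the month-over-month change and `moving_average` a smoothed series of the same length.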

Index Compression

• 10-30% reduction in index size
• Some indexing/merging impact
• Dynamic setting - could be set before optimization for time-based indices

Upgrading to Elasticsearch 2.0

• Major Version Upgrade!!!
  • No rolling upgrades
  • One way - no way to downgrade back to 1.x
• Take Snapshot (and test restore) before proceeding
• Test! Test! Test!
• Use the Migration plugin
  • Site plugin for 1.x that checks for potential issues
  • https://github.com/elastic/elasticsearch-migration

Thank you!
