Machine Learning and the Elastic Stack

Upload: yann-cluchey

Post on 15-Feb-2017

Page 1: Machine Learning and the Elastic Stack

Dr. Stephen Dodson, Tech Lead Machine Learning, Elastic

Machine Learning and the Elastic Stack

Page 2: Machine Learning and the Elastic Stack

Overview

•  Background

•  Machine Learning Overview

•  Machine Learning and the Elastic Stack

•  Demo

•  Architecture

Page 3: Machine Learning and the Elastic Stack

Background

•  Me
   –  Currently, Tech Lead, Machine Learning @ Elastic
   –  Formerly, Founder and CTO of Prelert (acquired by Elastic September 2016)
      ‒  Presented overview of Prelert at Elastic London User Group in May 2016

•  Prelert
   –  VC-backed software company, founded 2009
   –  Behavioural analytics for machine data based (mainly) on unsupervised machine learning
   –  100+ customers + OEMs with CA, Bluecoat, NetApp + others
      ‒  IT Operations, IT Security, Retail analytics, IoT etc.

Page 4: Machine Learning and the Elastic Stack

Machine Learning

•  Algorithms and methods for data-driven prediction, decision making, and modelling [1]
   ‒  Learn models from past behaviour (training, modelling)
   ‒  Use models to predict future behaviour (prediction)
   ‒  Use predictions to make decisions

•  Examples
   ‒  Image Recognition
   ‒  Language Translation
   ‒  Anomaly Detection

[1] Machine Learning Overview, Tommi Jaakkola, MIT

Page 5: Machine Learning and the Elastic Stack

How is this relevant to the Elastic Stack?

•  Extracting useful, valuable information is hard

[Figure labels: Search, Aggregations, Visualization, Machine Learning]

Page 6: Machine Learning and the Elastic Stack

How is this relevant to the Elastic Stack?

•  What if we want to search for:
   ‒  Has my order rate dropped significantly?
   ‒  Do my application logs contain unusual messages?
   ‒  Are any users behaving unusually?
   ‒  What transactions are fraudulent?

•  Goal of ML at Elastic: Extend the Elastic Stack to allow the user to ask these types of questions and get understandable answers

•  Constraints:
   ‒  Data may be limited: no markup may be available or relevant
   ‒  Compute resource dedicated to machine learning may be limited
   ‒  User should not need to be a machine learning expert or data scientist

Page 7: Machine Learning and the Elastic Stack

Has my order rate dropped significantly?

Page 8: Machine Learning and the Elastic Stack

Has my order rate dropped significantly?

•  Learn models from past behaviour (training, modelling)

•  Use models to predict future behaviour (prediction)

•  Use predictions to make decisions

Expected value @ 15:05 = 1859

Actual value @ 15:05 = 280

Probability = 0.0000174025
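
The learn, predict, decide loop on this slide can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the history values are made up, a plain normal distribution stands in for whatever model the product actually learns, and the names `fit`, `tail_probability`, `is_anomalous` and the `threshold` parameter are hypothetical. It will not reproduce the probability shown on the slide.

```python
from statistics import NormalDist, mean, stdev


def fit(history):
    """Learn a model from past behaviour (assumption: a normal distribution)."""
    return NormalDist(mean(history), stdev(history))


def tail_probability(model, actual):
    """Two-sided probability of a value at least this far from the expected value."""
    z = abs(actual - model.mean) / model.stdev
    return 2.0 * (1.0 - NormalDist().cdf(z))


def is_anomalous(model, actual, threshold=1e-3):
    """Decide: flag the observation when its probability falls below the threshold."""
    return tail_probability(model, actual) < threshold


# Hypothetical order counts observed at 15:05 on previous days (made-up data).
history = [1850, 1900, 1795, 1920, 1830, 1875, 1810, 1890]
model = fit(history)

print("expected:", model.mean)
print("probability of 280:", tail_probability(model, 280))
print("anomalous:", is_anomalous(model, 280))
```

With this made-up history the expected value is close to the slide's 1859, and an actual value of 280 is so improbable under the fitted distribution that it is flagged, while a value near the mean is not.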

Page 9: Machine Learning and the Elastic Stack

Demo: Simple Time Series

Page 10: Machine Learning and the Elastic Stack

Do my application logs contain unusual messages?

Page 11: Machine Learning and the Elastic Stack

Do my application logs contain unusual messages?

Classify unstructured log messages by clustering similar messages

[Figure labels: Normal Log Messages, Unusual Log Messages]
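
Classifying unstructured log messages by grouping similar ones can be illustrated with a very small sketch. The slides do not describe the clustering algorithm, so this substitutes a simple stand-in: tokens containing digits are masked so that messages differing only in variable parts share one template, and messages whose template is rare are treated as unusual. The function names, the `<*>` placeholder, the `max_count` parameter and the sample log lines are all hypothetical.

```python
import re
from collections import Counter


def template(message):
    """Mask tokens that contain digits so similar messages share one key."""
    tokens = message.split()
    return " ".join("<*>" if re.search(r"\d", t) else t for t in tokens)


def categorize(messages):
    """Count how many messages fall into each template (category)."""
    return Counter(template(m) for m in messages)


def unusual(messages, max_count=1):
    """Return messages whose category occurs at most max_count times."""
    counts = categorize(messages)
    return [m for m in messages if counts[template(m)] <= max_count]


# Made-up log lines for illustration only.
logs = [
    "Connection from 10.0.0.1 accepted",
    "Connection from 10.0.0.2 accepted",
    "Connection from 10.0.0.7 accepted",
    "Disk /dev/sda1 usage at 91%",
    "Disk /dev/sda1 usage at 93%",
    "Transaction log corrupt, service shutting down",
]

print(categorize(logs))
print(unusual(logs))
```

Once messages are reduced to categories, the count per category over time becomes an ordinary time series, so the same kind of rate-based anomaly detection used for metrics can be applied to it.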

Page 12: Machine Learning and the Elastic Stack

Demo: Multiple Data Sources

Page 13: Machine Learning and the Elastic Stack

Analytics Outside of Elastic Architecture

[Diagram labels: Beats; Logstash; Kibana; X-Pack; Elasticsearch; Data; Prelert analysis node; Kibana Prelert UI]

•  Issues
   –  Data Gravity: data from Elasticsearch needs to be sent to Prelert analytics node
   –  Context: anomalies and data are stored in different data stores and viewed in different UIs
   –  Scale: Prelert analysis was not easily distributable across nodes
   –  Resilience: Prelert analysis needed to be restored manually on failover

Page 14: Machine Learning and the Elastic Stack

Architecture

•  Machine Learning will be part of X-Pack

•  Machine Learning jobs will be automatically distributed across the Elasticsearch cluster

•  Machine Learning jobs will be resilient to failover

•  Machine Learning results and data can be in the same cluster

[Diagram labels: Beats; Logstash; Kibana; Elasticsearch; X-Pack: Security, Alerting, Monitoring, Reporting, Graph, Machine Learning (icon TBD)]

Page 15: Machine Learning and the Elastic Stack

Status

•  Demo on Elastic 5.4 available at Elastic{ON} (March 7th 2017)

•  GA shortly after… (ask Sophie!)

•  Focus of initial ML product is time series analysis in real-time
   ‒  Metric anomaly detection
   ‒  Log message classification and anomaly detection
   ‒  Population analysis (entity profiling)

•  Shrink-wrapped configurations on Beats data - full Elastic Stack experience!
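
Population analysis (entity profiling) compares each entity against its peers rather than against its own history, which is one way to answer "Are any users behaving unusually?". The sketch below is an assumption-laden illustration, not the product's method: it uses a modified z-score based on the median and the median absolute deviation, and the function name, the `threshold` default and the per-user counts are hypothetical.

```python
from statistics import median


def population_outliers(counts, threshold=3.5):
    """Return entities whose count is far from the population median.

    Uses the modified z-score 0.6745 * |x - median| / MAD, a robust
    stand-in chosen for this sketch.
    """
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the population, nothing to compare against
    return [
        entity
        for entity, value in counts.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]


# Made-up request counts per user for one time bucket.
requests_per_user = {"alice": 102, "bob": 97, "carol": 110, "dave": 95, "erin": 2400}

print(population_outliers(requests_per_user))
```

A median-based score is used here so that a single extreme entity does not distort the baseline it is being compared against.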

[Diagram labels: Beats; Elasticsearch; X-Pack: Alerting, Machine Learning (icon TBD); Kibana]