1
Dr. Stephen Dodson, Tech Lead Machine Learning, Elastic
Machine Learning and the Elastic Stack
2
Overview
• Background
• Machine Learning Overview
• Machine Learning and the Elastic Stack
• Demo
• Architecture
Background
• Me
– Currently, Tech Lead, Machine Learning @ Elastic
– Formerly, Founder and CTO of Prelert (acquired by Elastic September 2016)
‒ Presented overview of Prelert at Elastic London User Group in May 2016
• Prelert
– VC-backed software company, founded 2009
– Behavioural analytics for machine data based (mainly) on unsupervised machine learning
– 100+ customers + OEMs with CA, Bluecoat, NetApp + others
‒ IT Operations, IT Security, Retail analytics, IoT etc.
4
Machine Learning
• Algorithms and methods for data-driven prediction, decision making, and modelling¹
‒ Learn models from past behaviour (training, modelling)
‒ Use models to predict future behaviour (prediction)
‒ Use predictions to make decisions
• Examples
‒ Image Recognition
‒ Language Translation
‒ Anomaly Detection
¹ Machine Learning Overview, Tommi Jaakkola, MIT
5
How is this relevant to the Elastic Stack?
• Extracting useful, valuable information is hard
Search
Aggregations
Visualization
Machine Learning
6
How is this relevant to the Elastic Stack?
• What if we want to search for:
‒ Has my order rate dropped significantly?
‒ Do my application logs contain unusual messages?
‒ Are any users behaving unusually?
‒ What transactions are fraudulent?
• Goal of ML at Elastic: Extend the Elastic Stack to allow the user to ask these types of questions and get understandable answers
• Constraints:
‒ Data may be limited: no markup may be available or relevant
‒ Compute resource dedicated to machine learning may be limited
‒ User should not need to be a machine learning expert or data scientist
7
Has my order rate dropped significantly?
8
Has my order rate dropped significantly?
• Learn models from past behaviour (training, modelling)
• Use models to predict future behaviour (prediction)
• Use predictions to make decisions
Expected value @ 15:05 = 1859
Actual value @ 15:05 = 280
Probability = 0.0000174025
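To make the expected/actual/probability readout above concrete, here is a minimal sketch of the idea, not the model used in the product. It assumes a plain normal distribution fitted to past counts; the `history` values and the `tail_probability` helper are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev

def tail_probability(history, actual):
    """Return (expected value, two-sided tail probability) for `actual`
    under a normal model fitted to `history` (an illustrative assumption)."""
    mu = mean(history)
    sigma = stdev(history)
    z = abs(actual - mu) / sigma
    # For a standard normal variable Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return mu, math.erfc(z / math.sqrt(2))

# Hypothetical order counts seen at the same time of day on previous days.
history = [1500, 2200, 1700, 2300, 1400, 2100, 1900, 1750]

expected, p = tail_probability(history, actual=280)
print(f"expected={expected:.0f} actual=280 probability={p:.3g}")
```

A very small probability means the observed value is unlikely under the learned model, which is what turns a prediction into a decision (for example, raising an alert).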
Demo: Simple Time Series
10
Do my application logs contain unusual messages?
11
Do my application logs contain unusual messages?
• Classify unstructured log messages by clustering similar messages
[Figure: normal log messages vs. unusual log messages]
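The clustering idea can be sketched roughly as follows. This is a simplified stand-in, not the algorithm used by Prelert or Elastic: it assumes that masking tokens containing digits is enough to group similar messages, and the `template` and `rare_messages` helpers and the sample log lines are hypothetical.

```python
import re
from collections import Counter

def template(message):
    """Reduce a log line to a rough template by masking tokens that contain digits."""
    tokens = message.split()
    return " ".join("<*>" if re.search(r"\d", tok) else tok for tok in tokens)

def rare_messages(messages, max_count=1):
    """Return the messages whose template occurs at most `max_count` times."""
    counts = Counter(template(m) for m in messages)
    return [m for m in messages if counts[template(m)] <= max_count]

# Hypothetical log lines, for illustration only.
logs = [
    "Connection from 10.0.0.1 accepted",
    "Connection from 10.0.0.2 accepted",
    "Connection from 10.0.0.7 accepted",
    "Request 4821 served in 12ms",
    "Request 4822 served in 9ms",
    "Disk /dev/sda1 failure detected",
]

unusual = rare_messages(logs)
print(unusual)
```

Messages that share a template fall into the same cluster; a message whose cluster is rarely seen is a candidate unusual message.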
Demo: Multiple Data Sources
13
Analytics Outside of Elastic Architecture
[Diagram labels: Beats, Logstash, Kibana, X-Pack, Elasticsearch, Prelert analysis node, Data, Prelert UI]
• Issues
– Data Gravity: data from Elasticsearch needs to be sent to the Prelert analysis node
– Context: anomalies and data are stored in different data stores and viewed in different UIs
– Scale: Prelert analysis was not easily distributable across nodes
– Resilience: Prelert analysis needed to be restored manually on failover
14
Architecture
• Machine Learning will be part of X-Pack
• Machine Learning jobs will be automatically distributed across the Elasticsearch cluster
• Machine Learning jobs will be resilient to failover
• Machine Learning results and data can be in the same cluster
[Diagram labels: Beats, Logstash, Kibana, Elasticsearch; X-Pack: Security, Alerting, Monitoring, Reporting, Graph, Machine Learning]
15
Status
• Demo on Elastic 5.4 available at Elastic{ON} (March 7th 2017)
• GA shortly after… (ask Sophie!)
• Focus of initial ML product is time series analysis in real-time
‒ Metric anomaly detection
‒ Log message classification and anomaly detection
‒ Population analysis (entity profiling)
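As a rough illustration of population analysis, the sketch below compares each entity against the rest of the population rather than against its own history. It assumes a simple median and median-absolute-deviation score; the `population_outliers` helper, the threshold, and the per-user counts are hypothetical and not taken from the product.

```python
from statistics import median

def population_outliers(counts_by_entity, threshold=5.0):
    """Return entities whose count is far from the population median,
    measured in units of the median absolute deviation (MAD)."""
    values = list(counts_by_entity.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: most entities are identical, so any deviation stands out.
        return sorted(e for e, v in counts_by_entity.items() if v != med)
    return sorted(
        e for e, v in counts_by_entity.items() if abs(v - med) / mad > threshold
    )

# Hypothetical number of requests per user in one time bucket.
requests = {"alice": 12, "bob": 15, "carol": 11, "dave": 14, "eve": 240}

outliers = population_outliers(requests)
print(outliers)
```

This kind of profiling addresses questions such as "Are any users behaving unusually?" without needing labelled examples of bad behaviour.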
• Shrink-wrapped configurations on Beats data - full Elastic Stack experience!
[Diagram labels: Beats, X-Pack, Elasticsearch, Alerting, Machine Learning, Kibana]