Post on 17-Aug-2019

TRANSCRIPT

1

Steve Kearns

Sr. Director, Product Management

@skearns64

Statistical, Geospatial, Temporal, Graph

Multi-Modal Analysis with the Elastic Stack

2

Mo’ Data, Mo’ Problems

Notorious B.I.G.*

3

Today’s Missions Have Complex Requirements

Critical Mission Requirements

Many users / many needs

Security / Integrity

Cross-Source Insights

Speed

Scale

Data Enrichment / Quality

Data

Complex/Diverse

Location

Machine/Log Files

User-Activity

Documents

Social

Operational Requirements

Real-Time Availability

Rapid Query Execution

Flexible Data Model

High Availability

Horizontal Scale

Simple APIs, Powerful UI

4

Elastic Cloud

Security

Monitoring

Alerting

Graph

X-Pack

Kibana

User Interface

Elasticsearch

Store, Index, & Analyze

Ingest

Logstash, Beats

+

Elastic Stack

How Do We Help?

5

Multi-Modal Analysis with the Elastic Stack

• Statistical
  ‒ Count, summarize, maybe do some math

• Temporal
  ‒ How does your data change over time?

• Geospatial
  ‒ Where are things happening? Combine with statistical, soon temporal!

• Graph
  ‒ Entirely new way of exploring your data.

Many ways to explore, navigate and discover
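The statistical, temporal, and geospatial modes listed above correspond to Elasticsearch aggregations. As a rough sketch only, assuming an index whose documents carry a `@timestamp` date field, a `lang` field, and a hypothetical `coordinates` geo_point field (the talk does not specify a mapping), one search body could combine all three:

```python
import json

def build_multimodal_aggs(time_field="@timestamp", term_field="lang", geo_field="coordinates"):
    """Build a search body mixing temporal, statistical, and geospatial aggregations.

    All field names are illustrative assumptions, not taken from the talk.
    """
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            # Temporal: one bucket per day.
            # Releases from the era of this talk use "interval";
            # newer releases use "calendar_interval" instead.
            "per_day": {
                "date_histogram": {"field": time_field, "interval": "day"},
                "aggs": {
                    # Statistical: most common values inside each daily bucket.
                    "top_values": {"terms": {"field": term_field, "size": 5}}
                },
            },
            # Geospatial: bucket documents into geohash grid cells.
            "cells": {"geohash_grid": {"field": geo_field, "precision": 4}},
        },
    }

body = build_multimodal_aggs()
print(json.dumps(body, indent=2))
```

The body would be sent to an index's `_search` endpoint; the HTTP call is left out because it depends on the cluster and client in use.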

6

7

8

9

10

11

New: Timelion

• Kibana Plugin

• New Expression Syntax
  ‒ Describe query, transformation(s), and visualization in one line

• Highly configurable charts

Advanced date math and visualization, without the fuss
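To make the "one line" idea concrete, here is a small sketch that assembles a Timelion expression from its three parts. The helper `timelion_expression`, the query text, the window size, and the label are all hypothetical; `.es()`, `.movingaverage()`, `.lines()`, and `.label()` are used here on the assumption that they match the Timelion function set available in your Kibana version.

```python
def timelion_expression(query, window, label):
    """Compose a one-line Timelion expression from query, transformation, and styling."""
    source = f".es(q='{query}')"                 # query: matching documents over time
    smoothed = f".movingaverage({window})"       # transformation: smooth the series
    styled = f".lines(fill=1).label('{label}')"  # visualization: filled line plus legend label
    return source + smoothed + styled

expression = timelion_expression("text:HLTCON", 7, "HLTCON mentions")
print(expression)
```

The resulting string is what would be pasted into the Timelion expression bar; chaining more functions onto the end adds further transformations or styling.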

12

13

Graph

The Origin Story

14

Data is not Flat

Much like the world

"_source": {
  "created_at": "Tuesday Mar 28 12:10:52 +0000 2016",
  "text": "Can’t wait for #HLTCON!",
  "user": {
    "name": "Steve Kearns",
    "screen_name": "skearns64",
    "location": "Boston, MA"
  },
  "hashtags": [{"text": "HLTCON"}],
  "lang": "en",
  "@timestamp": "2016-03-24T12:09:52.000Z"
}

15

Relationships live in our data

• Direct: one document references multiple entities

"user": {
  "screen_name": "skearns64",
  "location": "Boston, MA"
}

• Indirect: two or more documents share a reference

"user": {
  "screen_name": "skearns64",
  "location": "Boston, MA"
}

"user": {
  "screen_name": "imotov",
  "location": "Boston, MA"
}
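The indirect case can be sketched in a few lines: group documents by the shared value, then pair up the entities that land in the same group. This is a toy in-memory illustration using the two `user` fragments from the slide, not how Elasticsearch resolves relationships internally; the helper name `indirect_links` is made up for the example.

```python
from collections import defaultdict
from itertools import combinations

# The two user fragments shown on the slide, wrapped as documents.
docs = [
    {"user": {"screen_name": "skearns64", "location": "Boston, MA"}},
    {"user": {"screen_name": "imotov", "location": "Boston, MA"}},
]

def indirect_links(documents):
    """Pair up screen names whose documents share the same location value."""
    by_location = defaultdict(set)
    for doc in documents:
        user = doc["user"]
        by_location[user["location"]].add(user["screen_name"])
    links = []
    for location, names in by_location.items():
        # Every pair of users at the same location is indirectly related.
        for a, b in combinations(sorted(names), 2):
            links.append((a, b, location))
    return links

links = indirect_links(docs)
print(links)  # [('imotov', 'skearns64', 'Boston, MA')]
```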

16

What is Graph Technology Good for?

17

Fraud Detection

• Given credit card purchase histories…
  ‒ Where did people with fraudulent purchases shop most often?
  ‒ What purchasing patterns are unique to this group of suspects? New persons involved?

• Given car emissions data…
  ‒ Which car manufacturer fails emissions tests most often?
  ‒ At which shops?

18

Identifying Relationships

• Given a set of documents with extracted entities…
  ‒ What topics / entities / locations are meaningfully related?
  ‒ If I know one bad actor, can I find others?

• Given network traffic data…
  ‒ What external IPs do machines on my network talk to?
  ‒ If I know one bad actor/IP, can I find others?

19

Recommendations

• Given my purchase history…
  ‒ What am I most likely to buy next?

• Given Last.FM music preferences…
  ‒ What music do people who like Mozart also like?
  ‒ Can I use this to identify new hate groups?

• Given search and click data…
  ‒ What results do people who searched for “Belgium” tend to click on?

20

“…There’s no limit to how complicated things can get, on account of one thing always leading to another…”

E.B. White, American essayist, columnist, poet and editor

21

Theoretical Challenges with Graph Technology

• Zipf’s Law results in super-connected entities

• Super connected entities make graph exploration difficult

• Graph exploration is typically done by “most frequent” connections

Dataset         Doc type   Shared reference points (concepts)   “Super-connected” values

Twitter         tweet      account IDs, hashtags                #YOLO

MovieLens       user       liked movie IDs                      Shawshank Redemption

LastFM          user       listened-to bands                    Coldplay, Radiohead, Beatles

Wikipedia       article    linked article ID                    United States, Living People

Phone records   call       phone number                         Taxi firms
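A toy example shows why "most frequent" ranking struggles with super-connected values. The listening data below is invented, and the lift score (share among the seed's listeners divided by share among all listeners) is a simple stand-in chosen for illustration; it is not claimed to be the scoring Elasticsearch uses.

```python
from collections import Counter

# Invented listening profiles: each set is one user's artists.
users = [
    {"Mozart", "Bach", "Coldplay"},
    {"Mozart", "Bach", "Coldplay"},
    {"Mozart", "Coldplay"},
    {"Coldplay", "Radiohead"},
    {"Coldplay", "Radiohead"},
    {"Coldplay", "Beatles"},
    {"Radiohead", "Beatles"},
    {"Coldplay"},
]

def related(seed, profiles):
    """Rank artists related to `seed` by raw count and by lift over the background."""
    background = Counter(a for p in profiles for a in p)
    fg_profiles = [p for p in profiles if seed in p]
    foreground = Counter(a for p in fg_profiles for a in p if a != seed)
    n_fg, n_bg = len(fg_profiles), len(profiles)
    # Count-based: most frequent co-occurring artists first.
    by_count = [a for a, _ in sorted(foreground.items(), key=lambda kv: (-kv[1], kv[0]))]
    # Lift-based: how much more common the artist is among the seed's listeners.
    lift = {a: (c / n_fg) / (background[a] / n_bg) for a, c in foreground.items()}
    by_lift = [a for a, _ in sorted(lift.items(), key=lambda kv: (-kv[1], kv[0]))]
    return by_count, by_lift

by_count, by_lift = related("Mozart", users)
print("by count:", by_count)  # ['Coldplay', 'Bach']
print("by lift: ", by_lift)   # ['Bach', 'Coldplay']
```

Counting alone puts the super-connected artist first because nearly everyone listens to it; comparing against the background pushes it down and surfaces the connection that is specific to the seed.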

22

Simple API that combines Search and Graph Techniques

• Simple graph-walking API

• Leverages full Elasticsearch query language

• Relevance or count-based

• Explore your existing indexes

• Distributed query execution

• Near-real-time data availability

What have we built?

23

24

Simple API that combines Search and Graph Techniques

GET /lastfm_raw/_graph/explore

{
  "query": { "query_string": { "query": "Mozart" } },
  "vertices": [{ "field": "artists.raw" }],
  "connections": {
    "vertices": [{ "field": "artists.raw" }]
  }
}
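The same request can be assembled programmatically. The sketch below only builds the path and body shown on the slide; the helper name `build_explore_request` is hypothetical, and actually sending the request (which needs an HTTP client and a cluster with Graph installed) is left out.

```python
import json

def build_explore_request(index, query_text, field):
    """Return the path and body for a graph explore request like the one on the slide."""
    path = f"/{index}/_graph/explore"
    body = {
        # Seed query: any Elasticsearch query can go here.
        "query": {"query_string": {"query": query_text}},
        # First hop: terms to collect from documents matching the seed query.
        "vertices": [{"field": field}],
        # Second hop: terms connected to the first-hop vertices.
        "connections": {"vertices": [{"field": field}]},
    }
    return path, body

path, body = build_explore_request("lastfm_raw", "Mozart", "artists.raw")
print("GET", path)
print(json.dumps(body, indent=2))
```

Using the same field for both hops, as the slide does, asks for artists related to the artists found by the seed query.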

25

Simple UI to Explore Your Data in New Ways
