sf solr meetup - interactively search and visualize your big data

INTERACTIVELY SEARCH AND VISUALIZE YOUR DATA WITH SOLR AND SPARK Romain Rigaux

Upload: gethue

Post on 09-Jan-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: SF Solr Meetup - Interactively Search and Visualize Your Big Data

INTERACTIVELY SEARCH AND VISUALIZE YOUR DATA WITH SOLR AND SPARK

Romain Rigaux

Page 2

GOALS

Build a Web app

Quickly explore data

… with Solr

Make Solr / Hadoop easier to use

Page 3

ARCHITECTURE

“Just a view” on top of the standard Solr API

REST

Page 4

HISTORY: V1 USER

Page 5

HISTORY: V1 ADMIN

Page 6

ARCHITECTURE: NEXT!

Lots of learning, UX boost needed

Simple: users don’t need to know it is Solr

Page 7

HISTORY: V2 USER

Page 8

HISTORY: V2 ADMIN

Page 9

HISTORY: V2 BETTER UX

Page 10

ARCHITECTURE

REST: /select /admin/collections /get /luke...

AJAX: /add_widget /zoom_in /select_facet /select_range...

Templates + JS Model

www….

Page 11

ARCHITECTURE: UI FOR FACETS

Layout: All the 2D positioning (cell ids), visual, drag&drop

Collection: Dashboard, fields, template, widgets (ids)

Query: Search terms, selected facets (q, fqs)
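As a rough illustration of the three-part state above, the sketch below models it as plain Python dictionaries and turns the query part into Solr request parameters. The dictionary keys (`cell_id`, `widget_id`, `filter`) and the function name `to_solr_params` are hypothetical and chosen only for this example; they are not taken from Hue's code. Only `q`, `fq`, `wt` and the `{!tag=...}` local parameter are standard Solr.

```python
from urllib.parse import urlencode

# Hypothetical shape of the dashboard state; all key names are illustrative.
dashboard = {
    "layout": [{"cell_id": "c1", "row": 0, "column": 0, "widget_id": "w1"}],
    "collection": {"name": "logs", "fields": ["bytes", "os"],
                   "widgets": [{"id": "w1", "field": "bytes"}]},
    "query": {"q": "Chrome",
              "fqs": [{"field": "bytes", "filter": "[900000 TO 1800000]"}]},
}


def to_solr_params(query):
    """Translate the (hypothetical) query model into Solr request parameters."""
    params = [("q", query.get("q") or "*:*"), ("wt", "json")]
    for fq in query.get("fqs", []):
        # Tag each filter with its field name so that a facet on the same
        # field can exclude it later with {!ex=<field>}.
        params.append(("fq", "{!tag=%s}%s:%s" % (fq["field"], fq["field"], fq["filter"])))
    return params


solr_params = to_solr_params(dashboard["query"])
query_string = urlencode(solr_params)  # ready to append to /solr/<collection>/select?
```

Only the query part reaches Solr in this sketch; the layout and collection parts stay on the UI side.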

Page 12

ADDING A WIDGET: LIFECYCLE

Load the initial page

Edit mode and Drag&Drop

/solr/zookeeper/clusterstate.json /solr/admin/luke…

/get_collection

Page 13

ADDING A WIDGET: LIFECYCLE

/solr/select?stats=true

/new_facet

Select the field

Guess ranges (number or dates)

Rounding (number or dates)
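The "guess ranges" and "rounding" steps could look roughly like the sketch below, which handles numeric fields only. It takes the `min` and `max` that a `stats=true` request returns for a field, rounds the bucket width up to one significant digit, and emits Solr range-facet parameters. The function names and the rounding rule are assumptions made for illustration, not the logic Hue actually uses; the `facet.range` and `f.<field>.facet.range.*` parameters are standard Solr.

```python
import math


def guess_range(stats_min, stats_max, slots=10):
    """Pick rounded start/end/gap values that cover [stats_min, stats_max]."""
    raw_gap = (stats_max - stats_min) / slots
    if raw_gap <= 0:
        gap = 1
    else:
        # Round the gap up to one significant digit, e.g. 873419.8 -> 900000.
        magnitude = 10 ** math.floor(math.log10(raw_gap))
        gap = math.ceil(raw_gap / magnitude) * magnitude
    start = math.floor(stats_min / gap) * gap
    end = math.ceil(stats_max / gap) * gap
    if end <= start:
        end = start + gap
    return {"start": start, "end": end, "gap": gap}


def range_facet_params(field, rng):
    """Build the Solr parameters for a range facet on one field."""
    prefix = "f.%s.facet." % field
    return {
        "facet": "true",
        # {!ex=<field>} ignores the filter tagged with the same name.
        "facet.range": "{!ex=%s}%s" % (field, field),
        prefix + "range.start": rng["start"],
        prefix + "range.end": rng["end"],
        prefix + "range.gap": rng["gap"],
        prefix + "mincount": 0,
    }


rng = guess_range(12, 8734210)
params = range_facet_params("bytes", rng)
```

With a minimum of 12 and a maximum of 8734210 (made-up values), this produces start 0, end 9000000 and gap 900000.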

Page 14

ADDING A WIDGET: LIFECYCLE

Query part 1

facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10

Query part 2

q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]

Augment Solr response

{ 'facet_counts':{ 'facet_ranges':{ 'bytes':{ 'start':10000, 'counts':[ '900000', 3423, '1800000', 339, ... ] } } } }

{ ..., 'normalized_facets':[ { 'extraSeries':[ ], 'label':'bytes', 'field':'bytes', 'counts':[ { 'from':'900000', 'to':'1800000', 'selected':True, 'value':3423, 'field':'bytes', 'exclude':False } ], ... } ] }

Page 15

JSON TO WIDGET

{ "field":"rate_code", "counts":[ { "count":97797, "exclude":true, "selected":false, "value":"1", "cat":"rate_code" } ...

{ "field":"medallion", "counts":[ { "count":159, "exclude":true, "selected":false, "value":"6CA28FC49A4C49A9A96", "cat":"medallion" } ...

{ "extraSeries":[ ], "label":"trip_time_in_secs", "field":"trip_time_in_secs", "counts":[ { "from":"0", "to":"10", "selected":false, "value":527, "field":"trip_time_in_secs", "exclude":true } ...

{ "field":"passenger_count", "counts":[ { "count":74766, "exclude":true, "selected":false, "value":"1", "cat":"passenger_count" } ...

Page 16

REPEAT UNTIL…

Page 17

GAME CHANGER!

Possibilities

5.1 / 5.2

Analytic Facets

Page 18

FACET FUNCTIONS

Count, Sum, Avg, Percentile, Max, ...

Count(id)

Sum(bytes)

Avg(mul(price, quantity))

Percentile(salary, 50, 90)

Max(temperature)

...
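As an illustration of how such functions might be sent to Solr, the sketch below packs a few of them into a `json.facet` request parameter, the entry point of Solr's JSON Facet API. The helper name and the output keys (`total_bytes`, `avg_revenue`, and so on) are made up for this example. Bucket counts are returned by the API on their own, so no count function is requested here.

```python
import json


def stats_facet_params(q, functions):
    """functions maps an output name to a Solr aggregation expression."""
    return {
        "q": q,
        "rows": 0,  # only the aggregates are needed, not the documents
        "wt": "json",
        "json.facet": json.dumps(functions, sort_keys=True),
    }


facet_params = stats_facet_params("*:*", {
    "total_bytes": "sum(bytes)",
    "avg_revenue": "avg(mul(price,quantity))",
    "salary_pct": "percentile(salary,50,90)",
    "max_temp": "max(temperature)",
})
```

The resulting dictionary can be URL-encoded and sent to a collection's `/select` handler.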

Page 19

FACET FUNCTIONS

Page 20

SUB “NESTED” FACETS

top_os {
    type: term,
    field: os,
    limit: 5
}

top_os {
    type: term,
    field: os,
    limit: 5,
    facet : {
        by_country: {
            type: term,
            field: country
        }
    }
}

Page 21

FUNCTION + NESTED = ANALYTICS

states {
    type: term,
    field: state,
    facet : {
        by_month : {
            type: range,
            field: time,
            start: "TODAY-6MONTHS",
            end: "TODAY",
            gap: "MONTH",
            facet : {
                avg_sal: "avg(salary)"
            }
        }
    }
}

states {
    type: term,
    field: state,
    facet : {
        avg_sal: "avg(salary)"
    }
}
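The nested definitions above can be assembled programmatically. The sketch below builds the state / month / average-salary facet as a Python dictionary and serializes it for the `json.facet` parameter. Two points are assumptions on my part: Solr's JSON Facet API documents the facet type as `terms` (the slides write `term`), and its date math uses forms such as `NOW/MONTH-6MONTHS` and `+1MONTH` where the slides use the shorthand `TODAY-6MONTHS` and `MONTH`. The helper names are hypothetical.

```python
import json


def terms_facet(field, limit=None, sub=None):
    """Bucket documents by the distinct values of a field."""
    facet = {"type": "terms", "field": field}
    if limit is not None:
        facet["limit"] = limit
    if sub:
        facet["facet"] = sub
    return facet


def range_facet(field, start, end, gap, sub=None):
    """Bucket documents by ranges of a numeric or date field."""
    facet = {"type": "range", "field": field, "start": start, "end": end, "gap": gap}
    if sub:
        facet["facet"] = sub
    return facet


# Average salary per month, per state.
nested = {
    "states": terms_facet("state", sub={
        "by_month": range_facet(
            "time", "NOW/MONTH-6MONTHS", "NOW", "+1MONTH",
            sub={"avg_sal": "avg(salary)"},
        ),
    }),
}
json_facet = json.dumps(nested)
```

Each extra level of nesting adds one more dimension to the result.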

Page 22

OPERATIONS ON BUCKETS OF DATA

Counts → Functions

Page 23

OPERATIONS ON BUCKETS OF DATA

Nested → nD functions
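To chart an n-dimensional result, the nested buckets have to be flattened into rows. The sketch below walks a JSON Facet API style response (buckets carrying `val`, `count` and any sub-facets) and emits one dictionary per innermost bucket. The function name is hypothetical and the sample response contains made-up values.

```python
def flatten(facet_name, facet, parents=()):
    """Flatten nested facet buckets into one row per innermost bucket."""
    rows = []
    for bucket in facet.get("buckets", []):
        key = parents + ((facet_name, bucket["val"]),)
        # Sub-facets are the dictionary values that have their own buckets.
        children = {k: v for k, v in bucket.items()
                    if isinstance(v, dict) and "buckets" in v}
        if children:
            for name, child in children.items():
                rows.extend(flatten(name, child, key))
        else:
            metrics = {k: v for k, v in bucket.items() if k != "val"}
            rows.append({**dict(key), **metrics})
    return rows


# Made-up sample in the shape of a nested facet response.
sample = {"buckets": [
    {"val": "CA", "count": 5, "by_month": {"buckets": [
        {"val": "2015-01-01T00:00:00Z", "count": 3, "avg_sal": 1000.0},
        {"val": "2015-02-01T00:00:00Z", "count": 2, "avg_sal": 1200.0},
    ]}},
    {"val": "NY", "count": 1, "by_month": {"buckets": [
        {"val": "2015-01-01T00:00:00Z", "count": 1, "avg_sal": 900.0},
    ]}},
]}
rows = flatten("states", sample)
```

Each row then maps directly to one point of a grouped bar chart or one cell of a pivot table.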

Page 24

SEARCH AS ONLY APP IN HUE

gethue.com/solr-search-ui-only/

Page 25

SPARK INDEXING: WHAT

• Spark in your browser

• Notebooks

• New REST Server

Page 26

SPARK INDEXING: WHAT

• Open source REST for Spark Shell

• Runs locally or inside YARN

• Spark Scala, PySpark and jar/py submission

https://github.com/cloudera/hue/tree/master/apps/spark/java
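A REST front end for the Spark shell is typically driven by two calls: create a session, then post code statements to it. The sketch below only builds those requests, following the `/sessions` and `/sessions/{id}/statements` endpoints documented by the Livy project. The server embedded in Hue at the time of this talk may have used different paths or payload fields, and the base URL and helper names here are assumptions.

```python
import json
from urllib.request import Request

LIVY_URL = "http://localhost:8998"  # assumption: a local Livy server on its default port


def _post(path, payload):
    """Build (but do not send) a JSON POST request to the Livy server."""
    return Request(
        LIVY_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind="pyspark"):
    """Request for starting an interactive session of the given kind."""
    return _post("/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    """Request for running a code snippet in an existing session."""
    return _post("/sessions/%d/statements" % session_id, {"code": code})


session_req = create_session_request()
statement_req = run_statement_request(0, "sc.parallelize(range(10)).sum()")
```

Passing either object to `urllib.request.urlopen` would send it; the JSON reply to the statement call is what a notebook cell would then poll for its output.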

Page 27

LIVY ARCH

[Diagram: two deployment modes, with steps numbered 1 to 4]

LOCAL: Livy Server, Livy REPL, Spark Contexts, Spark Worker

YARN: Livy Server, YARN Master, YARN Node (Livy REPL, Spark Context / PySpark), YARN Node (Spark Worker), YARN Node (Spark Worker)

Page 28

SPARK STREAMING

Real time!

Spark, Solr

Page 29

NOTEBOOKS / SHELL: WHAT

• Python

• Scala

• Charts

Page 30

DEMO TIME

• Analyze Bay Area bike share

• Visualize one year of data

• Know your users, predict behavior

Page 31

MISSED SOMETHING?

demo.gethue.com

Page 32

WHAT’S NEXT: NEW FEATURES

• Full Analytics

• Easier indexing

• Geo

• Export/Share results

• Solr Joins, Solr SQL

• Spark, SQL... integration, Hue 4

Page 33

TWITTER

@gethue

USER GROUP

hue-user@

WEBSITE

http://gethue.com

LEARN

http://learn.gethue.com

THANKS!