Chicago Solr Meetup - June 10th: Exploring Hadoop with Search

Post on 11-May-2015

Category:

Technology


Exploring Hadoop with Search
Pritesh Patel, Principal Architect, Search and Big Data Analytics @ Avalon Consulting, LLC

Hadoop Ecosystem

Possible Integration Points

Why Search + Big Data?

What Hadoop is good at       What Search is good at
Distributed file storage     Free text retrieval
Store large data sets        Index large data sets
Distributed processing       Textual analysis
                             Filtering and sorting

= Intelligence Discovery System of large textual data sets

How We Integrated Search and Big Data

HBase Replication Facade
- Take advantage of the results of analytical Pig and Hive jobs in Hadoop to make retrieval more intelligent
- Done with built-in replication, and it scales
- Fast access, since it is in memory
- Push architecture, so CRUD is near real time
- Store in HDFS and search in LW/Solr
- Gives a reference to the source when integrated this way
- HBase has a RESTful API to retrieve data given the ID that Solr would have after replication/indexing
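The last point above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's code: it assumes the HBase REST server is queried as `GET /<table>/<row>` with an `Accept: application/json` header, and that the JSON response carries base64-encoded column names and values. The host, table name, and row ID used in the usage note are hypothetical.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def hbase_row_url(base_url, table, row_id):
    """Build the REST URL for one row; table and row ID are URL-escaped."""
    return f"{base_url.rstrip('/')}/{quote(table, safe='')}/{quote(row_id, safe='')}"


def decode_hbase_row(payload):
    """Turn an HBase REST JSON row into a {column: value} dict.

    Column names and values arrive base64-encoded, so both are decoded.
    """
    doc = json.loads(payload)
    decoded = {}
    for row in doc.get("Row", []):
        for cell in row.get("Cell", []):
            column = base64.b64decode(cell["column"]).decode("utf-8")
            value = base64.b64decode(cell["$"]).decode("utf-8")
            decoded[column] = value
    return decoded


def fetch_row(base_url, table, row_id):
    """Fetch the source record for an ID returned by a Solr search hit."""
    request = urllib.request.Request(
        hbase_row_url(base_url, table, row_id),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return decode_hbase_row(response.read().decode("utf-8"))
```

With a search hit whose ID is `tweet-1`, a call such as `fetch_row("http://localhost:8080", "tweets", "tweet-1")` would return the original record from HBase, which is how the index can point back to its source.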

Our Demo Architecture

Diagram by Varun Rao @ Avalon Consulting, LLC

A Use Case of This Architecture
- Monitor tweets with the words “Hadoop”, “Lucidworks”, and “Big Data”
- Automatically extract URLs mentioned when talking about these terms
- In near real time, visualize which URLs seem to be mentioned with these terms
- Discover URLs that are becoming the most popular when mentioned with the topics “Big Data”, “Lucidworks”, and “Hadoop”; those might be URLs you want to read
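The use case above boils down to three steps: filter tweets by term, pull out URLs, and rank them by mention count. A rough sketch follows, assuming tweets are already available as plain strings. It runs in memory over a list, whereas the demo works on a near-real-time stream, and the regular expression and function names are illustrative choices rather than anything from the demo.

```python
import re
from collections import Counter

# Terms from the use case, lowercased for case-insensitive matching.
TERMS = ("hadoop", "lucidworks", "big data")

# Deliberately simple pattern: anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s,;\"'<>]+", re.IGNORECASE)


def mentions_term(text):
    """True if the tweet text contains at least one monitored term."""
    lowered = text.lower()
    return any(term in lowered for term in TERMS)


def extract_urls(text):
    """Return URLs found in the text, with trailing punctuation trimmed."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(text)]


def popular_urls(tweets, top_n=10):
    """Count URLs across matching tweets and return the most mentioned."""
    counts = Counter()
    for text in tweets:
        if mentions_term(text):
            counts.update(url.lower() for url in extract_urls(text))
    return counts.most_common(top_n)
```

In a search-backed setup the counting step would more likely be done by the search engine at query time; the in-memory `Counter` here only shows the logic.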

Demo
- Anyone want to send a tweet? Just use one or more of the words “Hadoop”, “Lucidworks”, “Big Data”
- Add any URL to the tweet that you’d like to share. Try: www.avalonconsult.com or www.lucidworks.com

So Much Potential
- You can apply this to so many things.
- Do intelligent entity extraction to discover topics with the UIMA integration of Solr
- Do similar analysis of popular mentions and people for the topics of your choice
- Endless …

Any questions?

Team

Client Implementation
- Kevin Risden @ Avalon (risdenk@avalonconsult.com)

Demo Architecture Team
- Varun Rao @ Avalon (raov@avalonconsult.com)
- Pritesh Patel @ Avalon (patelp@avalonconsult.com)
