Building a Recommendation Engine with Spring and Hadoop


Upload: spring-io

Post on 08-Jul-2015


Category:

Software



DESCRIPTION

Speaker: Michael Minella

Track: Big Data

The Amazons and Googles of the world have had Ph.D.s locked up in back rooms for years, creating algorithms to get you to click on things and subsequently buy stuff. One of the big things those smart people have been working on is the recommendation engine. Today, a recommendation engine isn’t something that only the Amazons of the world can have. With an hour and a handful of open source tools, we’ll build a recommendation engine based on the data from the website we probably spend the most time on: StackOverflow. We’ll use Spring XD and Spring Batch to orchestrate the full lifecycle of Hadoop processing (ingest, process, export) and use Apache Mahout to provide the recommendation processing. A basic understanding of Hadoop concepts (what Map/Reduce is) and Spring (basic D/I configuration) is expected for this talk.

TRANSCRIPT

Page 1: Building a Recommendation Engine with Spring and Hadoop

BUILDING

ENGINES

WITH SPRING

Page 2: Building a Recommendation Engine with Spring and Hadoop

MICHAEL MINELLA

TWITTER: @MICHAELMINELLA

HOME PAGE: SPRING.IO/TEAM/MMINELLA

Page 3: Building a Recommendation Engine with Spring and Hadoop

WHAT I’M NOT

Page 5: Building a Recommendation Engine with Spring and Hadoop

https://github.com/SpringOne2GX-2014/

Page 6: Building a Recommendation Engine with Spring and Hadoop

THANK YOU

SEBASTIAN SCHELTER

PAT FERREL

Page 8: Building a Recommendation Engine with Spring and Hadoop

13

Page 10: Building a Recommendation Engine with Spring and Hadoop

RECOMMENDATION

ALGORITHMS

Page 12: Building a Recommendation Engine with Spring and Hadoop

LET’S SET SOME

EXPECTATIONS

Page 18: Building a Recommendation Engine with Spring and Hadoop

SCALE OF THE PROBLEM

Page 19: Building a Recommendation Engine with Spring and Hadoop

MILLIONS OF

USERS

Page 20: Building a Recommendation Engine with Spring and Hadoop

100,000s OF

ITEMS

Page 21: Building a Recommendation Engine with Spring and Hadoop

TOOLS AND

TECHNOLOGIES

Page 22: Building a Recommendation Engine with Spring and Hadoop

1. SPRING BOOT

Page 23: Building a Recommendation Engine with Spring and Hadoop

2. MYSQL

Page 24: Building a Recommendation Engine with Spring and Hadoop

3. HADOOP

Page 25: Building a Recommendation Engine with Spring and Hadoop

4. SPRING XD

Page 26: Building a Recommendation Engine with Spring and Hadoop

5. MAHOUT

Page 27: Building a Recommendation Engine with Spring and Hadoop

SPRING XD

EXTREME DATA

Page 28: Building a Recommendation Engine with Spring and Hadoop

APPLICATION COMPLEXITY

Page 29: Building a Recommendation Engine with Spring and Hadoop

LOTS OF

BOILERPLATE

Page 30: Building a Recommendation Engine with Spring and Hadoop

MANY DOMAINS TO

BRIDGE

Page 31: Building a Recommendation Engine with Spring and Hadoop

INCONSISTENT

APIS

Page 32: Building a Recommendation Engine with Spring and Hadoop

DATA FLOW MODEL: SOURCE, CHANNEL, SINK

=

EIP PATTERNS: ADAPTER, CHANNEL, FILTER, TRANSFORMER, ETC.

Page 33: Building a Recommendation Engine with Spring and Hadoop

IMPORT/EXPORT: JOB, CONNECTOR

=

BATCH PROCESSING: JOB, ITEMREADER/ITEMWRITER

Page 34: Building a Recommendation Engine with Spring and Hadoop

WORKFLOW ORCHESTRATION: WORKFLOW, ACTION

=

BATCH PROCESSING: JOB, STEP

Page 35: Building a Recommendation Engine with Spring and Hadoop

SPRING XD

EXTREME DATA

Page 36: Building a Recommendation Engine with Spring and Hadoop

SPRING

Ingestion

Orchestration

Extraction

Real-time

Analytics

Page 37: Building a Recommendation Engine with Spring and Hadoop

DISTRIBUTED

RUNTIME

Page 38: Building a Recommendation Engine with Spring and Hadoop

BATCH & STREAMING

Page 40: Building a Recommendation Engine with Spring and Hadoop

http --port=8181 | filter --expression="payload?.price > 3.00" | hdfs --directory=/xd/dir1
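
The stream definition on this slide reads as source, processor, sink: an http source listening on port 8181, a filter that keeps only payloads with a price above 3.00, and an hdfs sink writing under /xd/dir1. As a rough illustration of what the filter stage does, here is a plain-Java sketch. It does not use any Spring XD API; the `Order` type, the class name, and the in-memory lists standing in for the source and sink are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FilterStageSketch {

    // Hypothetical payload type; the slide only shows that payloads expose a "price".
    static final class Order {
        final String id;
        final double price;

        Order(String id, double price) {
            this.id = id;
            this.price = price;
        }
    }

    // Plain-Java analogue of the expression payload?.price > 3.00.
    // Assumption: a missing (null) payload never passes the filter.
    static final Predicate<Order> PRICE_ABOVE_THREE =
            order -> order != null && order.price > 3.00;

    // Source -> filter -> sink, with lists standing in for the http source and hdfs sink.
    static List<Order> run(List<Order> source) {
        List<Order> sink = new ArrayList<>();
        for (Order order : source) {
            if (PRICE_ABOVE_THREE.test(order)) {
                sink.add(order);
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        List<Order> received = Arrays.asList(
                new Order("a", 2.50), new Order("b", 3.00), new Order("c", 7.25), null);
        for (Order kept : run(received)) {
            System.out.println(kept.id + " " + kept.price);
        }
    }
}
```

With the sample data above, only order "c" reaches the sink, because the comparison is strictly greater than 3.00.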

Page 41: Building a Recommendation Engine with Spring and Hadoop

BATCH PROCESSING FOR

HEAVY LIFTING

Page 42: Building a Recommendation Engine with Spring and Hadoop

JOB

Page 43: Building a Recommendation Engine with Spring and Hadoop

STEP

Page 44: Building a Recommendation Engine with Spring and Hadoop

TASKLET

Page 45: Building a Recommendation Engine with Spring and Hadoop

CHUNK
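
The four slides above name the building blocks of a batch job: a job is made of steps, and a step is either a tasklet (one unit of work) or chunk-oriented (read, process, write in groups). The following is a minimal plain-Java sketch of the chunk-oriented idea only. It is not the Spring Batch API; `runChunkStep`, its parameters, and the convention that a null result filters an item out are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class ChunkStepSketch {

    // Reads one item at a time, transforms it, and hands items to the writer in chunks.
    // Returns the number of chunks written. A null from the processor drops the item.
    static <I, O> int runChunkStep(Iterator<I> reader,
                                   Function<I, O> processor,
                                   Consumer<List<O>> writer,
                                   int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunksWritten = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            O processed = processor.apply(reader.next());
            if (processed != null) {
                buffer.add(processed);
            }
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // write a full chunk
                buffer.clear();
                chunksWritten++;
            }
        }
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer)); // write the final partial chunk
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("java", "spring", "hadoop", "mahout", "scala");
        int chunks = runChunkStep(
                tags.iterator(),
                (String tag) -> tag.toUpperCase(),
                chunk -> System.out.println("writing " + chunk),
                2);
        System.out.println("chunks written: " + chunks);
    }
}
```

Writing in chunks rather than item by item is what lets a real batch framework commit, retry, or restart at chunk boundaries; the sketch leaves all of that out.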

Page 46: Building a Recommendation Engine with Spring and Hadoop

SPRING FOR

APACHE HADOOP

Page 48: Building a Recommendation Engine with Spring and Hadoop

TOTAL LINES OF CUSTOM CODE

47 Lines of Java

29 Lines of XML

6 Spring XD Shell Commands

Page 49: Building a Recommendation Engine with Spring and Hadoop

RECOMMENDATION

ALGORITHMS

Page 50: Building a Recommendation Engine with Spring and Hadoop

PREDICTING THE

FUTURE

Page 51: Building a Recommendation Engine with Spring and Hadoop

COLLABORATIVE

FILTERING

Page 52: Building a Recommendation Engine with Spring and Hadoop

TWO OPTIONS

Page 53: Building a Recommendation Engine with Spring and Hadoop

USER BASED

Page 54: Building a Recommendation Engine with Spring and Hadoop

USER   ITEM 1   ITEM 2   ITEM 3   ITEM 4   ITEM 5

DEREK

MICHAEL

PHIL

DARREL ?

Page 55: Building a Recommendation Engine with Spring and Hadoop

USER BASED

Page 56: Building a Recommendation Engine with Spring and Hadoop

USER BASED
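
User-based collaborative filtering answers the "?" in Darrel's row by finding users whose tastes overlap with his and suggesting what they have that he does not. The sketch below is one simple way to do that, using Jaccard similarity over sets of items. The user names come from the slide, but the preferences are invented (the slide's cell values did not survive in this transcript), and the class and method names are hypothetical. It is not how Mahout implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class UserBasedSketch {

    // Jaccard similarity: items both users have, divided by items either user has.
    static double similarity(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return (double) shared.size() / union.size();
    }

    // Scores each item the user lacks by summing the similarity of the other users who have it.
    static List<String> recommend(String user, Map<String, Set<String>> prefs, int howMany) {
        Set<String> own = prefs.getOrDefault(user, Collections.emptySet());
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, Set<String>> other : prefs.entrySet()) {
            if (other.getKey().equals(user)) {
                continue;
            }
            double sim = similarity(own, other.getValue());
            if (sim == 0.0) {
                continue;
            }
            for (String item : other.getValue()) {
                if (!own.contains(item)) {
                    scores.merge(item, sim, Double::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((x, y) -> Double.compare(scores.get(y), scores.get(x))); // highest score first
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }

    public static void main(String[] args) {
        // Invented sample data; only the user names appear on the slide.
        Map<String, Set<String>> prefs = new LinkedHashMap<>();
        prefs.put("Derek", new HashSet<>(Arrays.asList("Item 1", "Item 2", "Item 3")));
        prefs.put("Michael", new HashSet<>(Arrays.asList("Item 1", "Item 4")));
        prefs.put("Phil", new HashSet<>(Arrays.asList("Item 2", "Item 3", "Item 5")));
        prefs.put("Darrel", new HashSet<>(Arrays.asList("Item 1", "Item 2")));
        System.out.println(recommend("Darrel", prefs, 2));
    }
}
```

Note that every recommendation compares the target user against every other user, which is part of why this approach gets expensive at millions of users.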

Page 57: Building a Recommendation Engine with Spring and Hadoop

ITEM BASED

Page 58: Building a Recommendation Engine with Spring and Hadoop

?

ITEM   DEREK   MICHAEL   PHIL   DARREL

ITEM 1

ITEM 2

ITEM 3

ITEM 4

ITEM 5

Page 59: Building a Recommendation Engine with Spring and Hadoop

ITEM BASED

Page 60: Building a Recommendation Engine with Spring and Hadoop

ITEM BASED

Page 61: Building a Recommendation Engine with Spring and Hadoop

PEOPLE ARE

FUNNY

Page 66: Building a Recommendation Engine with Spring and Hadoop

USER_ID, TAG_ID, VOTES

TAG_ID, TAG_ID, SCORE
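
The two lines above describe the shape of the processing step: rows of USER_ID, TAG_ID, VOTES go in, and rows of TAG_ID, TAG_ID, SCORE (how strongly two tags are related) come out. As a simplified stand-in for that transformation, the sketch below scores a pair of tags by counting the users who have positive votes on both. This co-occurrence count is an assumption chosen for illustration, not the similarity measure Mahout computes, and the class name and sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class TagCooccurrenceSketch {

    // Input rows:  "USER_ID,TAG_ID,VOTES"
    // Output rows: "TAG_ID,TAG_ID,SCORE", where SCORE is the number of users
    // with positive votes on both tags (a simplifying assumption).
    static List<String> similarities(List<String> rows) {
        Map<String, TreeSet<String>> tagsByUser = new TreeMap<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            if (Integer.parseInt(fields[2].trim()) > 0) {
                tagsByUser.computeIfAbsent(fields[0].trim(), u -> new TreeSet<>())
                          .add(fields[1].trim());
            }
        }
        Map<String, Integer> pairCounts = new TreeMap<>();
        for (TreeSet<String> tags : tagsByUser.values()) {
            for (String first : tags) {
                // tailSet(first, false) yields only tags sorted after "first",
                // so each unordered pair is counted once per user.
                for (String second : tags.tailSet(first, false)) {
                    pairCounts.merge(first + "," + second, 1, Integer::sum);
                }
            }
        }
        List<String> out = new ArrayList<>();
        pairCounts.forEach((pair, count) -> out.add(pair + "," + count));
        return out;
    }

    public static void main(String[] args) {
        // Invented sample rows in the USER_ID,TAG_ID,VOTES layout.
        List<String> rows = Arrays.asList(
                "1,10,5", "1,20,2", "1,30,1",
                "2,10,3", "2,20,4",
                "3,20,1", "3,30,0");
        similarities(rows).forEach(System.out::println);
    }
}
```

Because the output relates items to items rather than users to users, it can be computed offline in a batch job and then looked up at request time.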

Page 68: Building a Recommendation Engine with Spring and Hadoop

LOOKING INTO THE

FUTURE

Page 69: Building a Recommendation Engine with Spring and Hadoop

SNAPSHOTS AHEAD!

Page 71: Building a Recommendation Engine with Spring and Hadoop

MAP REDUCE

Page 72: Building a Recommendation Engine with Spring and Hadoop

MAP REDUCE

PROBLEMS

Page 73: Building a Recommendation Engine with Spring and Hadoop

API IS VERY

LOW LEVEL

Page 74: Building a Recommendation Engine with Spring and Hadoop

HIGH

LATENCY

Page 75: Building a Recommendation Engine with Spring and Hadoop

NOT ALWAYS A

GOOD FIT

Page 77: Building a Recommendation Engine with Spring and Hadoop

POTENTIALLY

FASTER

Page 78: Building a Recommendation Engine with Spring and Hadoop

HIGHER LEVEL

APIS

Page 79: Building a Recommendation Engine with Spring and Hadoop

scala> textFile.count()

res0: Long = 126

Page 81: Building a Recommendation Engine with Spring and Hadoop

USER_ID, TAG_ID, VOTES

TAGID,TAGID:RANK…

Page 82: Building a Recommendation Engine with Spring and Hadoop

1. USE A SEARCH ENGINE

Page 83: Building a Recommendation Engine with Spring and Hadoop

2. DATA NORMALIZATION

Page 86: Building a Recommendation Engine with Spring and Hadoop

Learn More. Stay Connected.

Spring Batch
Project: spring.io/spring-batch
Github: github.com/spring-projects/spring-batch
Jira: jira.spring.io/browse/BATCH

Spring Boot
Project: spring.io/spring-boot
Github: github.com/spring-projects/spring-boot

Spring XD
Project: spring.io/spring-xd
Github: github.com/spring-projects/spring-xd
Jira: jira.spring.io/browse/XD

Twitter: twitter.com/springcentral

YouTube: spring.io/video

LinkedIn: spring.io/linkedin

Google Plus: spring.io/gplus

Page 87: Building a Recommendation Engine with Spring and Hadoop

Servers by Jaime Carrion, from The Noun Project
Question by Jessica Lock, from The Noun Project
Check Box by Hrag Chanchanian, from The Noun Project
Crane by Kenneth Von Alt, from The Noun Project
Nut by Naomi Atkinson, from The Noun Project
Funnel by Volodin Anton, from The Noun Project
Circuit by Piotrek Chuchla, from The Noun Project
Puzzle by Matthew Hall, from The Noun Project
Database by Anton Outkine, from The Noun Project
Network by Mister Pixel, from The Noun Project
Puzzle by Eric M. Ellis, from The Noun Project
People by Wilson Joseph, from The Noun Project
Maze by Gilbert Bages, from The Noun Project
Fork by Dmitry Baranovskiy, from The Noun Project
Algebra by Ilsur Aptukov, from The Noun Project
Thumbs Up by Jørgen Bovolden, from The Noun Project
Scale by Edward Boatman, from The Noun Project
Users by Vittorio Maria Vecchi, from The Noun Project
Flow Chart by Michael Wohlwend, from The Noun Project
Running by Dimiter Petrov, from The Noun Project
Move by Dmitry Baranovskiy, from The Noun Project
Running by Dimiter Petrov, from The Noun Project
Abacus by Alice Mortaro, from The Noun Project
Stopwatch by Scott Lewis, from The Noun Project
Lego by jon trillana, from The Noun Project
Lego by jon trillana, from The Noun Project
Lego by jon trillana, from The Noun Project
Lego by Jake Dunham, from The Noun Project

Page 88: Building a Recommendation Engine with Spring and Hadoop

The End