Recommendation engine using Aerospike and/or MongoDB


Upload: peter-milne

Post on 16-Jul-2015

Page 1: Recommendation engine using Aerospike and/OR MongoDB

© 2014 Aerospike. All rights reserved. Confidential 1

Aerospike: aer·o·spike [air-oh-spahyk]

noun, 1. tip of a rocket that enhances speed and stability

RESTFUL RECOMMENDATION ENGINE

USING SPRING, AEROSPIKE AND MONGODB

IN-MEMORY + NOSQL + ACID

PETER MILNE

DIRECTOR OF APPLICATION ENGINEERING

QCON SAN FRANCISCO

NOVEMBER 2014

Page 2: Recommendation engine using Aerospike and/OR MongoDB

Sagely Advice

Don’t let:

■ Information oversaturate your Knowledge

■ Knowledge distract you from Wisdom

“Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” - T. S. Eliot, 1934

Page 3: Recommendation engine using Aerospike and/OR MongoDB

Recommendation Engines

Recommendation engines are used in applications to personalize the user experience.

Page 4: Recommendation engine using Aerospike and/OR MongoDB

Movie recommendation engine

■ Similar to Hulu, Netflix, YouTube, Dailymotion, etc.

■ Context free

■ RESTful service

  ■ Spring Boot

■ ~100,000 movies

■ ~800,000 recommendations

■ Data store

  ■ Aerospike

  ■ MongoDB

Page 5: Recommendation engine using Aerospike and/OR MongoDB

RESTful web services

■ HTTP based

  ■ GET

  ■ POST

  ■ PUT

  ■ DELETE

■ Payload

  ■ XML

  ■ JSON

Commonly called “APIs”

■ “APIs are like … Everybody has one”

Page 6: Recommendation engine using Aerospike and/OR MongoDB

Spring Boot

“Spring Boot makes it easy to create stand-alone, production-grade Spring based Applications that you can just run.”

-- spring.io

Page 7: Recommendation engine using Aerospike and/OR MongoDB

MongoDB

■ NoSQL

■ Schemaless

■ Data in JSON documents

  ■ BSON on server

■ Seductive programmatic interface

  ■ Fashionable JSON

■ In-memory, a.k.a. RAM

  ■ Fast

Page 8: Recommendation engine using Aerospike and/OR MongoDB

Aerospike

1) No Hotspots: DHT simplifies data partitioning

2) Smart Client: 1 hop to data, no load balancers

3) Shared Nothing Architecture: every node is identical

4) Single row ACID: synchronous replication in cluster

5) Smart Cluster, Zero Touch: auto-failover, rebalancing, rack aware, rolling upgrades

6) Flash Optimized

Page 9: Recommendation engine using Aerospike and/OR MongoDB

Cosine Similarity

“Cosine similarity is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. … Cosine similarity is particularly used in positive space, where the outcome is neatly bounded in [0,1].” - Wikipedia
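For two vectors A and B, the measure quoted above is cos θ = (A · B) / (‖A‖ ‖B‖). A minimal Java sketch of that formula follows. The class and method names are illustrative and are not taken from the talk's code, and returning 0 for a zero-magnitude vector is an assumption of this sketch, since the cosine is undefined in that case.

```java
public class CosineSimilarity {

    /**
     * Cosine of the angle between two equal-length vectors.
     * Returns 0.0 when either vector has zero magnitude
     * (an assumption of this sketch, not part of the definition).
     */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];     // inner product A · B
            normA += a[i] * a[i];   // squared magnitude of A
            normB += b[i] * b[i];   // squared magnitude of B
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical "watched" vectors: 1 = watched, 0 = not watched.
        double[] jane = {1, 1, 0, 1};
        double[] john = {1, 0, 0, 1};
        System.out.println(cosine(jane, john));
    }
}
```

With non-negative components, as in these watched/not-watched vectors, the result stays within [0, 1]; the example pair scores 2/√6, roughly 0.816.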

Page 10: Recommendation engine using Aerospike and/OR MongoDB

The recommendation algorithm

1. Jane Doe accesses the application

2. Retrieve Jane’s User Profile

3. Retrieve the Movie record for each movie that Jane has watched. For each movie:

■ Retrieve the user profile of each user who has watched it

■ See if this profile is similar to Jane’s by giving it a score

4. Using the user profile with the highest similarity score, recommend the movies in this user profile that Jane has not seen.

Relative complement: Aᶜ ∩ B = B \ A, where A is the set of movies Jane has watched and B is the set watched by the most similar user.

This is a very elementary technique, useful only as an illustration, and it has several flaws.
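The final step, recommending what the most similar user has watched and Jane has not, is a plain set difference. A minimal Java sketch is below; the class name, method name and movie titles are made up for illustration, and in-memory sets stand in for whatever the data store returns.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RelativeComplement {

    /** Returns B \ A: the movies in 'candidate' that are not in 'seen'. */
    public static Set<String> notYetSeen(Set<String> seen, Set<String> candidate) {
        // Copy first so the caller's set is not modified; LinkedHashSet keeps order.
        Set<String> result = new LinkedHashSet<>(candidate);
        result.removeAll(seen);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical watched lists.
        Set<String> jane = new LinkedHashSet<>(List.of("Alien", "Brazil", "Casablanca"));
        Set<String> similarUser = new LinkedHashSet<>(
                List.of("Brazil", "Casablanca", "Dune", "Eraserhead"));
        System.out.println(notYetSeen(jane, similarUser)); // prints [Dune, Eraserhead]
    }
}
```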

Page 11: Recommendation engine using Aerospike and/OR MongoDB

Data model

Page 12: Recommendation engine using Aerospike and/OR MongoDB

One to Many

Page 13: Recommendation engine using Aerospike and/OR MongoDB

The code

[Code listing not preserved in the transcript. Callouts: RESTful; List of movies watched; Make a vector]
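The listing for this slide did not survive in the transcript, so the following is only a guess at what "make a vector" might look like, not the author's code. It assumes a shared, ordered list of movie IDs defines the dimensions and marks each position 1 if the user has watched that movie and 0 otherwise. All names here are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

public class WatchedVector {

    /**
     * Builds a 0/1 vector with one position per entry in movieIndex.
     * Position i is 1.0 when the user has watched movieIndex.get(i).
     */
    public static double[] toVector(List<String> movieIndex, Set<String> watched) {
        double[] vector = new double[movieIndex.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = watched.contains(movieIndex.get(i)) ? 1.0 : 0.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        // Hypothetical movie IDs; the index fixes the order of the dimensions.
        List<String> movieIndex = List.of("m1", "m2", "m3");
        Set<String> watched = Set.of("m1", "m3");
        System.out.println(Arrays.toString(toVector(movieIndex, watched))); // prints [1.0, 0.0, 1.0]
    }
}
```

Two users' vectors built over the same index can then be compared position by position. A variant could store a rating instead of 1.0 if the watched list carries one.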

Page 14: Recommendation engine using Aerospike and/OR MongoDB

Finding a similar customer

[Code listing not preserved in the transcript. Callouts: Cosine similarity; Watched list; For each movie; For each customer]
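The callouts suggest a nested loop: for each movie in the watched list, for each customer who watched it, compute a similarity score and keep the best. A hedged Java sketch of that shape follows. It is not the author's code. Two in-memory maps (`watchedByUser`, `watchersByMovie`, both hypothetical names) stand in for the Aerospike or MongoDB lookups, and similarity uses the set form of cosine for 0/1 vectors, |A ∩ B| / √(|A|·|B|).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SimilarCustomerFinder {

    /** Cosine similarity of two watched sets treated as 0/1 vectors. */
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0; // assumption: an empty list is similar to nothing
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size() / Math.sqrt((double) a.size() * b.size());
    }

    /** Returns the ID of the most similar other customer, or null if none. */
    public static String mostSimilar(String userId,
                                     Map<String, Set<String>> watchedByUser,
                                     Map<String, Set<String>> watchersByMovie) {
        Set<String> mine = watchedByUser.getOrDefault(userId, Set.of());
        String best = null;
        double bestScore = -1.0;
        for (String movie : mine) {                                             // for each movie
            for (String other : watchersByMovie.getOrDefault(movie, Set.of())) { // for each customer
                if (other.equals(userId)) {
                    continue;
                }
                double score = cosine(mine, watchedByUser.getOrDefault(other, Set.of()));
                if (score > bestScore) {
                    bestScore = score;
                    best = other;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical data: user -> watched movies, movie -> users who watched it.
        Map<String, Set<String>> watchedByUser = Map.of(
                "jane", Set.of("m1", "m2", "m3"),
                "john", Set.of("m1", "m2", "m4"),
                "mary", Set.of("m3", "m7", "m8", "m9"));
        Map<String, Set<String>> watchersByMovie = Map.of(
                "m1", Set.of("jane", "john"),
                "m2", Set.of("jane", "john"),
                "m3", Set.of("jane", "mary"));
        System.out.println(mostSimilar("jane", watchedByUser, watchersByMovie)); // prints john
    }
}
```

A customer who shares several movies with Jane is scored once per shared movie in this version. Caching scores per customer would avoid the repeated work, at the cost of a little more code.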

Page 15: Recommendation engine using Aerospike and/OR MongoDB

Conclusions

■ MongoDB

  ■ Technically seductive programmer interface

  ■ Uses JSON (the current fashion)

  ■ Painful to scale: requires lots of servers

  ■ Needed a big cluster; ran out of RAM

■ Aerospike

  ■ Tiny cluster required

  ■ Easy to scale

  ■ Faster than MongoDB

  ■ Large Data Types are different from JSON, but good

Page 16: Recommendation engine using Aerospike and/OR MongoDB

Books

Page 17: Recommendation engine using Aerospike and/OR MongoDB

Resources

➤ GitHub: Recommendation engine example

https://github.com/aerospike/recommendation-engine-example.git

➤ Spring Boot

http://projects.spring.io/spring-boot/

➤ Aerospike

http://www.aerospike.com/

➤ MongoDB

http://www.mongodb.org/

Page 18: Recommendation engine using Aerospike and/OR MongoDB

helipilot50

@helipilot50

Questions?