Realtime Recommendations with Redis
Torben Brodt, plista GmbH
April 25th, 2013
NoSQL Search Roadshow
http://nosqlroadshow.com/nosql-berlin-2013/
Introduction
● Torben Brodt, Head of Data Engineering
○ computer science studies
○ 5 years at plista
○ publication "collaborative filtering"
○ evangelist for "power of algorithms"
● plista GmbH
○ recommendations & advertising
○ founded in 2008, Berlin [DE]
○ ~5k recommendations/second
Contents
1. How to feed a recommender?
2. How to build a recommendation?
3. How to scale a recommender?
How to feed a recommender?
● to show recommendations, we are integrated directly into the publisher's website
● we have the URL + HTTP headers
○ user agent
○ IP address -> geolocation
● push the data away quickly
● make use of data quickly
RULE: be quick
src http://en.wikipedia.org/wiki/Pac-Man
Technology overview
● Apache Lucene for Content
● MySQL for relational data
● Machine Learning
○ Hadoop? No! It's batch + slow
○ In Memory? Yes, stream computing
● Redis for Statistics
○ Live
○ Backup
How to build a recommendation?
Classification
● different recommender families
Behavioral: based on interaction between user and article
○ Most Popular
○ Collaborative Filtering
○ Item to Item
Content: based on the articles
○ Content Similarity
○ Latest Item
Most popular with Redis
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de" 1 berlin_wins
● ZREVRANGEBYSCORE
p:welt.de
berlin_wins 689 +1
summer_is_coming 420
plista_company 135
Live Read + Live Write = Real Time Recommendations
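The write and read path on this slide can be sketched in plain Python. This is a minimal in-memory stand-in, not the plista implementation: the `SortedSets` class, its `top` helper, and the `split_url` rule for deriving the publisher and item id from the URL are assumptions made for illustration. Against a real server, the two operations would correspond to ZINCRBY for the write and a reverse range query for the read.

```python
from collections import defaultdict


def split_url(url):
    """Split an article URL into (publisher, item id).

    Assumption: the item id is the last path segment without its extension.
    """
    without_scheme = url.split("://", 1)[-1]
    host, _, path = without_scheme.partition("/")
    item = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return host, item


class SortedSets:
    """In-memory stand-in for Redis sorted sets (hypothetical helper)."""

    def __init__(self):
        self._sets = defaultdict(dict)  # key -> {member: score}

    def zincrby(self, key, amount, member):
        """Add `amount` to the member's score and return the new score."""
        scores = self._sets[key]
        scores[member] = scores.get(member, 0) + amount
        return scores[member]

    def top(self, key, n):
        """Return the n highest-scored (member, score) pairs, best first."""
        ranked = sorted(self._sets[key].items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


store = SortedSets()
# Seed the counters with the numbers shown on the slide.
for member, score in [("berlin_wins", 689),
                      ("summer_is_coming", 420),
                      ("plista_company", 135)]:
    store.zincrby("p:welt.de", score, member)

publisher, item = split_url("welt.de/football/berlin_wins.html")
new_score = store.zincrby(f"p:{publisher}", 1, item)  # live write: one more hit
top_two = store.top(f"p:{publisher}", 2)              # live read: current ranking
print(new_score, top_two)
```

Because the counter is updated on every hit and the ranking is read from the same structure, there is no batch step between a click and its effect on the recommendation.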
Recap Data types
● String, Lists, Set, ..
● Hash
○ map between string fields and string values, very fast
○ HINCRBY complexity: O(1)
● Sorted Set
○ ZINCRBY complexity: O(log(N)), where N is the number of elements in the sorted set
○ allows limiting the number of results: ZREVRANGEBYSCORE ... LIMIT
○ UNION + INTERSECT (ZUNIONSTORE, ZINTERSTORE)
p:welt.de
berlin_wins 689 +1
summer_is_coming 420
plista_company 135
Most popular with timeseries
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION
○ "p:welt.de:1360007000"
○ "p:welt.de:1360006000"
○ "p:welt.de:1360005000"
● ZREVRANGEBYSCORE
p:welt.de:1360005000
berlin_wins 420
summer_is_coming 135
plista_best_company 689
p:welt.de:1360006000
berlin_wins 420
summer_is_coming 135
plista_best_company 689
p:welt.de:1360007000
berlin_wins 689
summer_is_coming 420
plista_best_company 135
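A short sketch of the time-bucket idea, using the numbers from the slide. The function names and the bucket width of 1000 seconds are assumptions inferred from the example keys (1360005000, 1360006000, 1360007000). The union is computed in plain Python here and mimics a sorted-set union with the default SUM aggregate.

```python
def bucket_key(prefix, timestamp, bucket_seconds=1000):
    """Return the sorted-set key of the time bucket that contains `timestamp`.

    Assumption: buckets are aligned to multiples of `bucket_seconds`.
    """
    bucket_start = timestamp - timestamp % bucket_seconds
    return f"{prefix}:{bucket_start}"


def zunion_sum(sorted_sets):
    """Sum scores member by member across several {member: score} dicts."""
    total = {}
    for scores in sorted_sets:
        for member, score in scores.items():
            total[member] = total.get(member, 0) + score
    return total


# The three buckets as shown on the slide.
buckets = {
    "p:welt.de:1360005000": {"berlin_wins": 420, "summer_is_coming": 135,
                             "plista_best_company": 689},
    "p:welt.de:1360006000": {"berlin_wins": 420, "summer_is_coming": 135,
                             "plista_best_company": 689},
    "p:welt.de:1360007000": {"berlin_wins": 689, "summer_is_coming": 420,
                             "plista_best_company": 135},
}

# A hit at an arbitrary moment is written to the bucket it falls into.
write_key = bucket_key("p:welt.de", 1360007421)

# Reading combines all buckets and ranks by the summed score.
combined = zunion_sum(buckets.values())
ranking = sorted(combined, key=combined.get, reverse=True)
print(write_key, combined, ranking)
```

Writing into per-interval keys means old data can be left out of the union (or expired) without touching the current counters.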
Most popular with timeseries
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION ... WEIGHTS
○ "p:welt.de:1360007000" .. 4
○ "p:welt.de:1360006000" .. 2
○ "p:welt.de:1360005000" .. 1
● ZREVRANGEBYSCORE
p:welt.de:1360005000
berlin_wins 420
summer_is_coming 135
plista_best_company 689
p:welt.de:1360006000
berlin_wins 420
summer_is_coming 135
plista_best_company 689
p:welt.de:1360007000
berlin_wins 689
summer_is_coming 420
plista_best_company 135
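The effect of the weights can be checked by hand with the slide's numbers. The sketch below computes the weighted sum in plain Python; `zunion_weighted` is a hypothetical helper that mirrors what a weighted sorted-set union with SUM aggregation returns, and is not taken from the talk.

```python
def zunion_weighted(weighted_sets):
    """Sum weight * score per member over (scores, weight) pairs."""
    total = {}
    for scores, weight in weighted_sets:
        for member, score in scores.items():
            total[member] = total.get(member, 0) + weight * score
    return total


# Buckets :1360005000 and :1360006000 hold the same counts on the slide.
older = {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689}
newest = {"berlin_wins": 689, "summer_is_coming": 420, "plista_best_company": 135}

# Weights 4, 2, 1 from newest to oldest, as on the slide.
weighted = zunion_weighted([(newest, 4), (older, 2), (older, 1)])
ranking = sorted(weighted.items(), key=lambda kv: -kv[1])
print(ranking)
```

With equal weights, berlin_wins (1529) leads plista_best_company (1513) by only 16 hits. With weights 4, 2, 1 the gap widens to 4016 versus 2607, because the most recent bucket dominates. With the redis-py client, the same weighting can be requested by passing a dict of key to weight to `zunionstore`.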
Most popular with timeseries
[Figure: timeline of buckets from -1h to -8h before :1360007000; diagram labels: 42, 1]
Most popular to any context
● it's not only the publisher: we use ~50 context attributes
context attributes:
● publisher
● weekday
● geolocation
● demographics
● ...
publisher = welt.de
berlin_wins 689 +1
summer_is_coming 420
plista_company 135
weekday = sunday
berlin_wins 400 +1
dortmund_wins 200
... 100
geolocation = dortmund
dortmund_wins 200
berlin_wins 10 +1
... 5
Most popular to any context
● what it looks like in Redis
ZUNION ... WEIGHTS
p:welt.de:1360007 4
p:welt.de:1360006 2
p:welt.de:1360005 1
w:sunday:1360007 4
w:sunday:1360006 2
w:sunday:1360005 1
g:dortmund:1360007 4
g:dortmund:1360006 2
g:dortmund:1360005 1
publisher = welt.de
berlin_wins 689 +1
summer_is_coming 420
plista_company 135
weekday = sunday
berlin_wins 400
dortmund_wins 200
... 100
geolocation = dortmund
dortmund_wins 200
berlin_wins 10
... 5
Most popular with Effect size
● which context has an influence?
ZUNION ... WEIGHTS
p:welt.de:1360007 4 * 70%
p:welt.de:1360006 2 * 70%
p:welt.de:1360005 1 * 70%
w:sunday:1360007 4 * 10%
w:sunday:1360006 2 * 10%
w:sunday:1360005 1 * 10%
g:dortmund:1360007 4 * 30%
g:dortmund:1360006 2 * 30%
g:dortmund:1360005 1 * 30%
Effect Size
Examples:
small effect: weather
big effect: publisher
Data with a small effect should not be taken into account; otherwise we get average results.
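One way to read this slide is that each key's final weight is its time weight multiplied by the effect size of its context. The sketch below builds such a weight table; `build_union_weights` and the `min_effect` cutoff are hypothetical, and the effect sizes are the example percentages from the slide rather than measured values.

```python
def build_union_weights(effect_sizes, bucket_weights, min_effect=0.0):
    """Return {redis_key: weight} with weight = time weight * effect size.

    Contexts whose effect size is below `min_effect` are left out entirely,
    so that weak signals do not pull the result toward the average.
    """
    weights = {}
    for context, effect in effect_sizes.items():
        if effect < min_effect:
            continue
        for bucket, time_weight in bucket_weights.items():
            weights[f"{context}:{bucket}"] = time_weight * effect
    return weights


# Example values from the slide: publisher 70%, weekday 10%, geolocation 30%.
effect_sizes = {"p:welt.de": 0.70, "w:sunday": 0.10, "g:dortmund": 0.30}
# Time weights from newest to oldest bucket.
bucket_weights = {1360007: 4, 1360006: 2, 1360005: 1}

all_weights = build_union_weights(effect_sizes, bucket_weights)
# Assumed cutoff of 0.2: drops the weekday context in this example.
strong_only = build_union_weights(effect_sizes, bucket_weights, min_effect=0.2)
print(all_weights)
print(sorted(strong_only))
```

The resulting dict has the shape a weighted union expects: one entry per key, one weight per entry.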
SUM over..
● timeseries
● different context
● previous hits of the user
● similar publisher
knowledge
publisher = welt.de
berlin_wins 689
summer_is_coming 420
plista_company 135
Σ
ZUNION ... WEIGHTS
p:welt.de:1360007 4
p:welt.de:1360006 2
p:welt.de:1360005 1
w:sunday:1360007 4
w:sunday:1360006 2
w:sunday:1360005 1
g:dortmund:1360007 4
g:dortmund:1360006 2
g:dortmund:1360005 1
... Redis can do it ;)
Even more Matrix Operations ;)
● Similarity Matrix
● Human Control Matrix
● Meta-learning Matrix
○ cooperation with
○ aided from
∏Σ
More recommenders possible
this was only about most popular
● other algorithms using Redis
○ incremental collaborative filtering
○ article to article paths (~graph)
○ .. using external data sources
How to scale a recommender?
Distribution to many servers
● 1 client to access n servers
● partitioning of data using hashing
● for UNION we run into problems
○ combined keys need to be on the same server
○ NO consistent hashing possible
○ workaround: prefix hashing
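The slides do not say which part of the key is hashed, so the following is only one possible reading of "prefix hashing". It assumes a hypothetical key layout in which a routing prefix is placed before a `|` delimiter, and it picks the server from that prefix alone. All keys that share the prefix then land on the same server and can be combined in a single union.

```python
import zlib


def shard_for(key, num_shards, delimiter="|"):
    """Pick a shard from the routing prefix of `key` only.

    Assumption: keys look like "<routing prefix>|<sorted-set name>"; the
    delimiter and layout are invented for this example.
    """
    prefix = key.split(delimiter, 1)[0]
    # crc32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(prefix.encode("utf-8")) % num_shards


keys = [
    "welt.de|p:welt.de:1360007",
    "welt.de|w:sunday:1360007",
    "welt.de|g:dortmund:1360007",
]
shards = {key: shard_for(key, num_shards=4) for key in keys}
print(shards)
```

The trade-off of this design is that load is balanced per prefix rather than per key, so a very busy prefix cannot be spread over several servers.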
Low Latency
● master/slave replication
● should be close to edge servers
● e.g. 1 Redis instance per 1 webserver
src http://en.wikipedia.org/wiki/Flash_(comics)
Application in Database
● Lua support is shipped
● but Redis is a single-core process
● a long read blocks all writes
● concurrency issue
src http://lua.org
In spite of all those disadvantages:
● Redis is a perfect fit for simple operations
○ SUM + AGGREGATE + MIN + MAX
● in-memory operations are pretty fast
● real-time features feel better in a real-time database (e.g. time series)
● we don't need batch
What else in Redis?
● message bus
● many recommenders
● live statistics
● caching
"One technology to rule them all"
Questions?
www.plista.com
@torbenbrodt
xing.com/profile/Torben_Brodt
http://goo.gl/pvXm5
http://lnkd.in/MUXXuv