recommender trends 2014

Post on 27-Jun-2015

DESCRIPTION

Coming straight from the Recommender Systems conference in San Francisco, I present some of the latest developments in the field of large-scale recommendation engines and machine learning.

TRANSCRIPT

@torbenbrodt #recsys

Recommender Trends
ACM RecSys 2014
Silicon Valley, USA

Torben Brodt plista GmbH

inspired by ..
Stammtisch
Nov 13th 2014

Silicon Valley

Image by New Media at the University of Maine

RecSys 2014 was ..

● 1 day workshop
● 3 day tech conference (see )
● 1 day conference

biased with my experience

● Head of Data Engineering
● > 6y plista
  ○ News, advertising, real-time
● Open!

DevOps MathCore

Contents

1. Product
2. Algorithms
3. Metrics
4. Openness
5. Crazy Stuff
6. Missing

Product, ”Data Driven Decisions”

“We take a proposal for an original production or for a piece of content we’re going to buy and we plug in all the data we can about it into our models. We’re able to predict reach and hours for that piece of content even before it exists with reasonable precision in a way that helps us to say, ‘this is worth funding’ or ‘that’s not worth funding.’”

NEIL HUNT Netflix

Product, “Search & Recommendation should (not?) converge”

HECTOR GARCIA-MOLINA Professor, Stanford University

DEBORA DONATO StumbleUpon

Product, “Use Human Experts”

ERIC COLSON Stitch Fix

Humans send you customized outfits. Machines suggest clothes and judge stuff.

Product, “Explain your knowledge”

● Xbox explains why its recommendations are useful

● Cortana builds an ML model of the user and still lets the user change it

Build Trust!

Product, “Care about Privacy”

Once you lose a customer because of privacy, you will never get them back.

solutions
● store user history on client side
● ..

Product, ”Allow User Interaction”

HECTOR GARCIA-MOLINA Professor, Computer Science and Electrical Engineering Departments of Stanford University

Product, “active learning”

Why do vague passive learning when you can ask the user?

.. implicitly or explicitly

http://en.wikipedia.org/wiki/Active_learning_(machine_learning)

SMRITI BHAGAT Technicolor

Algorithms

Algorithms, ”Matrix Factorization”

[...] faster by replacing inner product with PCA trees

NOAM KOENIGSTEIN Microsoft R&D

Algorithms, “Ensembles”

● Multi-Armed Bandits
● Ensemble Methods
● Global Optimization

https://github.com/Yelp/MOE
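
To make the bandit bullet concrete, here is a minimal sketch of one common multi-armed bandit strategy, UCB1, choosing between several recommenders. The slide does not say which bandit variant was discussed, so UCB1 is an assumption, and the click-through rates in `CTR` are hypothetical values used as deterministic rewards for illustration.

```python
import math


def ucb1_select(counts, sums, t):
    """Return the index of the arm to play at round t (1-based)."""
    # Play every arm once before trusting the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = observed mean reward + exploration bonus that shrinks with plays.
    scores = [
        sums[arm] / counts[arm] + math.sqrt(2.0 * math.log(t) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(reward_fn, n_arms, rounds):
    """Run UCB1 for a number of rounds and return how often each arm was played."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, sums, t)
        counts[arm] += 1
        sums[arm] += reward_fn(arm)
    return counts


# Hypothetical click-through rates of three recommenders (illustration only).
CTR = [0.2, 0.5, 0.8]
counts = run_ucb1(lambda arm: CTR[arm], n_arms=3, rounds=1000)
```

In this toy run the best recommender ends up receiving most of the traffic, while the weaker ones are still tried occasionally.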

Algorithms, “How does MOE work”

DR. SCOTT CLARK Yelp

1. Build Gaussian Process (GP) with points sampled so far
2. Optimize covariance hyperparameters of GP
3. Find point(s) of highest Expected Improvement within parameter domain
4. Return optimal next best point(s) to sample

https://github.com/Yelp/MOE
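
Step 3 can be illustrated with the standard closed form of Expected Improvement for a Gaussian posterior. This is a sketch, not MOE's own code or API: it assumes the GP has already produced a posterior mean and standard deviation for each candidate point, it is written for maximization, and the function names and candidate values are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Expected Improvement (maximization) for posterior mean mu and std sigma."""
    if sigma <= 0.0:
        # No uncertainty left: the improvement is known exactly.
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best_so_far) * cdf + sigma * pdf


def next_point(candidates, best_so_far):
    """candidates: list of (x, mu, sigma). Return the x with the highest EI."""
    best = max(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best_so_far),
    )
    return best[0]


# Hypothetical GP posterior at three parameter values (illustration only).
candidates = [(0.1, 0.50, 0.01), (0.5, 0.48, 0.20), (0.9, 0.30, 0.05)]
chosen = next_point(candidates, best_so_far=0.50)
```

The uncertain middle candidate wins here even though its mean is slightly below the best value seen so far, which is how EI trades off exploration against exploitation.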

Algorithms, “Topic Modelling”

● LDA is standard
● data science tasks
  ○ where to cut
  ○ how many topics
● where to use?

http://en.wikipedia.org/wiki/Topic_model

Algorithms, “Content”

● Sense identifiers (int) instead of keywords
● Word sense disambiguation

Metrics, “Stakeholders”

● Business Value
● Consumer Value
● Conflicting goals?
● Diversity?

NEIL HUNT Netflix

Metrics, “Dwell Time”

● Client-side implementation

● Yahoo ensures dwell time is comparable across different contexts (device, etc.)

● It correlates with clicks, but is more meaningful

XING YI Yahoo Labs
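
One way to make dwell time comparable across contexts is to standardize it within each context. The slide does not describe Yahoo's actual method, so the sketch below is an assumption: it takes the z-score of log dwell time per context, and the function name, context labels, and sample values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def normalize_dwell(events):
    """events: list of (context, dwell_seconds) with dwell_seconds > 0.

    Returns one z-score of log dwell time per event, computed per context.
    """
    logs = defaultdict(list)
    for context, seconds in events:
        logs[context].append(math.log(seconds))
    # Mean and population standard deviation of log dwell time per context.
    stats = {c: (mean(v), pstdev(v)) for c, v in logs.items()}
    scores = []
    for context, seconds in events:
        mu, sd = stats[context]
        scores.append((math.log(seconds) - mu) / sd if sd > 0 else 0.0)
    return scores


# Hypothetical events: desktop readers stay longer than mobile readers.
events = [
    ("desktop", 30), ("desktop", 60), ("desktop", 120),
    ("mobile", 10), ("mobile", 20), ("mobile", 40),
]
z = normalize_dwell(events)
```

After normalization, 60 seconds on desktop and 20 seconds on mobile receive the same score, because each is typical for its own context.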

Metrics, “Increasing signals”

Get the full lifetime journey
● reservation
● rating
● billing / tipping

JEREMY SCHIFF OpenTable

Openness, “Software Side”

Companies share software
● credits to Twitter, Yelp, others

Finally, paper results can be reused (GitHub)

Openness, “Data Side”

Wikipedia, DBpedia, Common Crawl

Companies share Data & Challenges
● credits to Netflix, Tmall, Criteo

Openness, “Connectivity”

Everything is possible! To Me and to You

● Connect to Facebook
  ○ access open graph
● Get Fulltext without 10k servers
● Use Apache Mahout, Azure ML, etc.

Openness, “Connectivity”

● Give students the chance to learn

● CoLaboratory Notebook

http://venturebeat.com/2014/08/08/google-whips-up-a-chrome-app-to-let-data-scientists-work-together/

Openness, “Connectivity”

● Azure Marketplace allows exchanging machine learning models

● RapidMiner makes workflows reproducible

https://datamarket.azure.com/browse/data

Crazy Stuff

Industry Sessions…
● Facebook News
● Shopkick
● StumbleUpon
● climate institute
● ...

Crazy Stuff, “music genome project”

1 song = 450 musical characteristics from a trained music analyst

ERIK M. SCHMIDT Pandora

Crazy Stuff, “LinkedIn A/B testing”

● XLNT Platform
● Key Component!
● Continuous Deployment

YA XU LinkedIn
https://engineering.linkedin.com/ab-testing/xlnt-platform-driving-ab-testing-linkedin

Crazy Stuff, “Google Deep Learning”

● Application?
  ○ Pixels, Audio, Searches, Translation
● Embeddings
● Language Models
● Scalability

JEFF DEAN Google

Missing? “Uncovered Topics”

Missing, “Probabilistic Data Structures”

probabilistic counting, HyperLogLog, etc.

http://research.neustar.biz/
https://streamdrill.com/
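
As a rough illustration of probabilistic counting, here is a compact HyperLogLog sketch that estimates the number of distinct items with a fixed-size register array. It is not taken from the talk or the linked sites. The class name, the choice of SHA-1 as hash function, and the precision `p=12` are assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers and a 64-bit hash."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # First p bits pick the register, the rest feed the rank.
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = position of the first 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        alpha = 0.7213 / (1.0 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the registers
estimate = hll.count()
```

With 4096 registers the typical relative error is on the order of 1.04 / sqrt(4096), about 1.6%, regardless of how many items are added.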

Missing

Large Scale?
● Computational Costs
● Real-Time Recs

Questions?

Torben Brodt plista GmbH

Metrics, “Long Term Satisfaction”

● hard to convince mgmt (?!)
● start measuring

example
● coupons 1/week might decrease revenue

JEREMY SCHIFF OpenTable

Summary, ”we enhance services”

Large companies cannot exist without data science
● Netflix
● Zalando
● etc.
