Enhancing Recommendations on Uber Eats with Graph Convolutional Networks
Ankit Jain / Piero Molino
ankit.jain/[email protected]


TRANSCRIPT

Page 1

Enhancing Recommendations on Uber Eats with Graph Convolutional Networks
Ankit Jain / Piero Molino

ankit.jain/[email protected]

Page 2

Agenda

1. Graph Representation Learning

2. Dish Recommendation on Uber Eats

3. Graph Learning on Uber Eats

Page 3

Graph Representation Learning

Page 4

Graph data

Examples: Linked Open Data, social networks, biomedical networks, information networks, the Internet, networks of neurons

Page 5

Tasks on graphs

Node classification: predict the type of a given node

Link prediction: predict whether two nodes are linked

Community detection: identify densely linked clusters of nodes

Network similarity: how similar are two (sub)networks

Page 6

Learning framework

Define an encoder mapping from nodes to embeddings

Define a node similarity function based on the network structure

Optimize the parameters of the encoder so that similarity in the embedding space reflects similarity in the original graph

[Figure: nodes in the original graph mapped to points in the embedding space]

Page 7

Shallow encoding

Simplest encoding approach: the encoder is just an embedding lookup

Algorithms like Matrix Factorization, node2vec and DeepWalk fall in this category

[Figure: embedding matrix with one column per node; each column, of length equal to the embedding size, is the embedding vector for a specific node]
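A minimal sketch of such a lookup encoder, assuming a PyTorch-style setup (class and variable names are illustrative, not the deck's implementation):

```python
import torch
import torch.nn as nn

class ShallowEncoder(nn.Module):
    """Shallow encoding: the encoder is just a learned embedding matrix, one vector per node."""
    def __init__(self, num_nodes: int, embedding_dim: int):
        super().__init__()
        # |V| x d parameters: every node gets its own trainable vector.
        self.embeddings = nn.Embedding(num_nodes, embedding_dim)

    def forward(self, node_ids: torch.Tensor) -> torch.Tensor:
        # Encoding a node is a plain lookup of its row in the embedding matrix.
        return self.embeddings(node_ids)

encoder = ShallowEncoder(num_nodes=10_000, embedding_dim=128)
z = encoder(torch.tensor([0, 42, 97]))  # embeddings for three nodes, shape (3, 128)
```

The O(|V|) parameter count and the inability to embed unseen nodes discussed on the next slide both follow directly from this lookup structure.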

Page 8

Shallow encoding limitations

O(|V|) parameters are needed, since every node has its own embedding vector

Either not possible or very time consuming to generate embeddings for nodes not seen during training

Does not incorporate node features

Page 9

Graph Neural Network

Key idea: to obtain node representations, use a neural network that aggregates information from neighbors recursively, via a limited Breadth-First Search (BFS)

[Figure: input graph with nodes A-F; a depth-2 BFS rooted at node A defines the neighborhoods whose representations are combined by neural networks (NN)]

Page 10

Neighborhood aggregation

Each Layer is one level of depth in the BFS

Nodes have embeddings at each layer

Layer 0 embedding of node v is its input feature x_v

Layer k embedding of node v is h_v^k

[Figure: layers 0, 1 and 2 of the computation for node A. Layer 0 holds the input features x_v of each node; at each layer, one neural network (weights W_k) transforms the aggregated neighborhood representation and another (weights B_k) transforms the node's own embedding from the previous layer, producing the computed representations up to h_A^2]
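A minimal sketch of one such layer in PyTorch, assuming mean aggregation over the sampled neighborhood (the aggregator choice and all names are illustrative; a weighted, order-frequency-based variant appears later in the deck):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeighborhoodAggregationLayer(nn.Module):
    """One BFS level: combine a node's own embedding with an aggregate of its neighbors' embeddings."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.self_nn = nn.Linear(in_dim, out_dim)   # neural network for the self-embedding (B_k)
        self.neigh_nn = nn.Linear(in_dim, out_dim)  # neural network for the neighborhood representation (W_k)

    def forward(self, h_self: torch.Tensor, h_neighbors: torch.Tensor) -> torch.Tensor:
        # h_self: (batch, in_dim) previous-layer embedding of each node
        # h_neighbors: (batch, num_neighbors, in_dim) previous-layer embeddings of its neighbors
        agg = h_neighbors.mean(dim=1)  # simple mean aggregation (an assumption for this sketch)
        return F.relu(self.self_nn(h_self) + self.neigh_nn(agg))

# Layer 0 embeddings are the raw features x_v; stacking K layers yields h_v^K.
layer1 = NeighborhoodAggregationLayer(64, 32)
h1 = layer1(torch.randn(8, 64), torch.randn(8, 10, 64))  # shape (8, 32)
```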

Page 11

Inductive capability

In many real applications new nodes are often added to the graph

We need to generate embeddings for new nodes without retraining, which is hard to do with shallow methods

[Figure: train on a snapshot of the graph; when a new node arrives, generate its embedding for the new node without retraining]
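Because the aggregation weights are shared across nodes rather than tied to node IDs, a node that appears after training can be embedded from its own features and its neighbors' features alone. A hedged usage sketch, reusing the NeighborhoodAggregationLayer from the previous sketch (the new node and its features are hypothetical):

```python
import torch

# Hypothetical new dish added after training: we know its features and the
# features of the users who ordered it, but it was never seen during training.
x_new = torch.randn(1, 64)            # features of the new node
x_neighbors = torch.randn(1, 6, 64)   # features of its current neighbors

trained_layer = NeighborhoodAggregationLayer(64, 32)  # in practice, the layer trained on the snapshot
with torch.no_grad():
    h_new = trained_layer(x_new, x_neighbors)          # embedding generated without any retraining
```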

Page 12

Dish Recommendation on Uber Eats

Page 13

Suggested Dishes

Recommended Dishes carousel: Picked for You

Pages 14-18

Graph Learning in Uber Eats

Page 19

Graph properties

Bipartite graph for dish recommendation: users are connected to the dishes they have ordered in the last M days

Edge weights are the frequency of orders

The graph is dynamic: new users and dishes are added every day

Each node has features, e.g. word2vec of dish names

[Figure: bipartite graph with users U1, U2 and dishes D1, D2; edge weights 3, 1 and 4]
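A minimal sketch of building such a bipartite graph with NetworkX (which the pipeline later in the deck also uses); node IDs, weights and feature values here are illustrative:

```python
import networkx as nx

# Users are connected to dishes they ordered in the last M days;
# the edge weight is the order frequency, and nodes carry feature vectors.
G = nx.Graph()
G.add_node("U1", kind="user")
G.add_node("U2", kind="user")
G.add_node("D1", kind="dish", features=[0.12, -0.30, 0.57])  # e.g. word2vec of the dish name
G.add_node("D2", kind="dish", features=[0.05, 0.44, -0.21])
G.add_edge("U1", "D1", weight=3)  # U1 ordered D1 three times in the window
G.add_edge("U1", "D2", weight=1)
G.add_edge("U2", "D2", weight=4)
```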

Page 20

Max Margin Loss

For dish recommendation we care about ranking, not the actual similarity score

Max margin loss: the score of the positive pair should exceed the score of a negative sample by at least a margin

[Figure: loss terms for the positive pair, the negative sample and the margin]
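In the standard formulation used by graph-based recommenders such as PinSage [1], the loss for a positive pair (u, v) and a sampled negative n is max(0, z_u · z_n - z_u · z_v + margin). A minimal sketch (function and tensor names are illustrative, not the deck's exact code):

```python
import torch
import torch.nn.functional as F

def max_margin_loss(z_u, z_pos, z_neg, margin: float = 0.1):
    """Require the positive pair to score at least `margin` higher than the negative sample."""
    pos_score = (z_u * z_pos).sum(dim=-1)  # similarity to a dish the user actually ordered
    neg_score = (z_u * z_neg).sum(dim=-1)  # similarity to a sampled negative dish
    return F.relu(neg_score - pos_score + margin).mean()

loss = max_margin_loss(torch.randn(32, 64), torch.randn(32, 64), torch.randn(32, 64))
```

Only the relative ordering of scores is penalized, which matches the ranking objective rather than an absolute similarity target.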

Page 21

New loss with Low Rank Positives

[Figure: bipartite graph with users U1, U2 and dishes D1-D4, edge weights 5, 2, 2, 1 and 3. Example for user U1: (U1, D1) is a positive, (U1, D4) is a negative, and (U1, D3) is a low rank positive]
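A low rank positive is a dish the user ordered, but less frequently than the positive. A hedged sketch of one way such a loss can be written: in addition to the standard term against negatives, a second max margin term with a smaller margin asks the positive to also score above the low rank positives. The specific margins and weighting are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def max_margin_loss_low_rank(z_u, z_pos, z_neg, z_low,
                             neg_margin: float = 0.3,   # margin against true negatives
                             low_margin: float = 0.1):  # smaller margin against low rank positives
    """Rank the positive above negatives by a large margin and above low rank positives by a smaller one."""
    pos = (z_u * z_pos).sum(dim=-1)  # e.g. U1-D1: the dish the user orders most
    neg = (z_u * z_neg).sum(dim=-1)  # e.g. U1-D4: a dish the user never ordered
    low = (z_u * z_low).sum(dim=-1)  # e.g. U1-D3: ordered, but less frequently
    return (F.relu(neg - pos + neg_margin) + F.relu(low - pos + low_margin)).mean()
```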

Page 22

Weighted pool aggregation

Aggregate neighborhood embeddings based on edge weight

[Figure: neighbor embeddings h_B, h_C and h_D, with edge weights 5, 2 and 1, are each passed through Q and pooled into h_A; Q denotes a fully connected layer]
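A minimal sketch of a weighted pool aggregator of this shape; normalizing the order counts into weights and the choice of activation are assumptions of the sketch, not details given in the deck:

```python
import torch
import torch.nn as nn

class WeightedPoolAggregator(nn.Module):
    """Pool neighbor embeddings, weighting each neighbor by its edge weight (order frequency)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.Q = nn.Linear(in_dim, out_dim)  # the fully connected layer Q from the slide

    def forward(self, h_neighbors: torch.Tensor, edge_weights: torch.Tensor) -> torch.Tensor:
        # h_neighbors: (num_neighbors, in_dim); edge_weights: (num_neighbors,), e.g. [5., 2., 1.]
        transformed = torch.relu(self.Q(h_neighbors))
        w = edge_weights / edge_weights.sum()           # normalize the order counts (an assumption)
        return (w.unsqueeze(-1) * transformed).sum(dim=0)

agg = WeightedPoolAggregator(64, 64)
h_A = agg(torch.randn(3, 64), torch.tensor([5.0, 2.0, 1.0]))  # neighbor weights from the figure
```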

Page 23

Offline evaluation

Trained the downstream Personalized Ranking Model using graph node embeddings

~12% improvement in test AUC over the previous production model

Previous production model: 0.784 test AUC
With graph embeddings: 0.877 test AUC

Page 24

Feature Importance

Graph learning cosine similarity is the top feature in the model

Page 25

Online evaluation

Ran an A/B test of the Recommended Dishes carousel in San Francisco

Significant uplift in Click-Through Rate with respect to the previous production model

Conclusion: dish recommendations with graph learning features are live in San Francisco, and will soon roll out everywhere else

Page 26

[Pipeline diagram]

Data Pipeline Step 1: daily ingestion of source tables into versioned nodes and edges

Data Pipeline Step 2: collapse the versioned history into a graph with the latest nodes and edges; a back-dated graph, cut at a past date, is produced for offline analysis

Data Pipeline Step 3: partition the graph by city using Cypher

Data Pipeline Step 4: build a NetworkX graph for model training and embedding generation

Training Pipeline Step 1: GNN model training produces the node embeddings

Training Pipeline Step 2: the node embeddings feed Personalized Ranker model training, producing the ranker model

Serving: the Personalized Ranker serves recommendations online
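As a rough illustration of the collapse step (Data Pipeline Step 2), with hypothetical table and column names; the real pipeline is only described at the level of the diagram above:

```python
import pandas as pd

# Versioned edges: one row per (user, dish) per ingestion date.
edges = pd.DataFrame({
    "user_id": ["U1", "U1", "U2"],
    "dish_id": ["D1", "D1", "D2"],
    "weight":  [2, 3, 4],
    "date":    ["2019-10-01", "2019-10-02", "2019-10-02"],
})

# Collapsed graph: keep only the latest version of each edge.
latest = edges.sort_values("date").groupby(["user_id", "dish_id"], as_index=False).last()

# Back-dated graph for offline analysis: cut the history at a past date instead.
past = edges[edges["date"] <= "2019-10-01"]
```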

Page 27

More Resources

Uber Eng Blog Post

Learn better representations in data-scarce regimes, such as small or new cities, through meta-learning [NeurIPS Graph Representation Learning Workshop 2019]

Page 28

Team

Ankit Jain Isaac Liu Ankur Sarda

Piero Molino Long Tao Jimin Jia

Jan Pedersen Nathan Berrebbi Santosh Golecha

Ramit Hora Alex Danilychev

Page 29

Thank you for your attention

Enhancing Recommendations on Uber Eats with Graph Convolutional Networks
Ankit Jain / Piero Molino

ankit.jain/[email protected]

Page 30

Proprietary © 2019 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under applicable law. All recipients of this document are notified that the information contained herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with authorized personnel of Uber.

Page 31

References

[1] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton and Jure Leskovec: Graph Convolutional Neural Networks for Web-Scale Recommender Systems. KDD 2018
[2] Bryan Perozzi, Rami Al-Rfou' and Steven Skiena: DeepWalk: Online Learning of Social Representations. KDD 2014
[3] Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan and Qiaozhu Mei: LINE: Large-scale Information Network Embedding. WWW 2015
[4] Aditya Grover and Jure Leskovec: node2vec: Scalable Feature Learning for Networks. KDD 2016
[5] Marco Gori, Gabriele Monfardini and Franco Scarselli: A New Model for Learning in Graph Domains. IJCNN 2005
[6] William L. Hamilton, Rex Ying and Jure Leskovec: Inductive Representation Learning on Large Graphs. NIPS 2017
[7] Joey Bose, Ankit Jain, Piero Molino and Will Hamilton: Meta-Graph: Few Shot Link Prediction via Meta-Learning. Graph Representation Learning Workshop @ NeurIPS 2019