Game Theory and Privacy Preservation in Recommendation Systems
Iordanis Koutsopoulos
U of Thessaly
Thalis project CROWN Kick-off Meeting
Volos, May 11, 2012
State of the art
• Internet-enabled services rely on end-user data to provide personalized feedback to the user
– User-provisioned data constitute the user profile
• Escalating privacy concerns for end users:
– Profile revelation, with possible consequences from correlating various segments of the profile
– Possible improper data use by third parties, data monetization
Examples
Case Study | User Profile | Entities involved | Scope (Service)
Web browsing | Web browsing behavior (sites visited, frequency of visit, …) | Online service providers (Google, Facebook, …) | Targeted advertising
Location | Locations visited, length of stay, trajectory, … | Mobile telecom operators, mobile equipment manufacturers (e.g. Apple, …) | Location-based services (location-based ads, locating on map, receiving alerts/notifications, …)
Smart metering | Power consumption profile (appliances on/off, …) | Electric utility operators | Smart grid services (e.g. demand response), ads, …
Recommendation systems | Ratings for items bought/seen, … | Online retailers (Amazon), media providers (Netflix), … | Recommendation of items of likely interest to the user
Basic Questions
• How to model a user's personal profile?
• How to quantify privacy?
• What does privacy preservation mean and how to quantify it?
• How to measure the quality of personalized services received by the user?
• Can users do something besides individually trying to hide their personal profiles?
User Profile and privacy
• Finite set of attributes A that characterize the user profile
• Profile can be modeled as a real vector of dimension |A|
– Vector entries are values with respect to the various attributes
• When does privacy increase?
– When the user reveals a profile vector that is as far as possible from the real profile vector
Example User Profiles
Location-based services: P = (L(1), L(2), …, L(t), …): locations visited
Web browsing: P = (w1, …, wt): categories of websites visited
Smart metering: P = (a1, …, an): power consumption of electric appliances
Recommendation systems: P = (a1, …, an): private ratings of items, e.g. movies watched
Recommendation Systems
• Recommendation systems: data exchange between the users and a central entity (server) that performs recommendations, which raises user privacy concerns
• User goals:
– Preserve privacy by not revealing much information about private preferences and ratings to the third party
– Receive high-quality personalized recommendation results
• Fundamental tradeoff between privacy preservation and recommendation quality
Model: Ratings and recommendation
• Set U of N users and set I of items available for recommendation.
• Si ⊂ I: a small subset of items that user i has already viewed, purchased (or otherwise experienced).
• pi = (pik : k ∈ Si): vector of ratings of user i for the items he has viewed, with 0 ≤ pik ≤ P (continuous-valued).
• The vector of ratings pi is private information of user i.
• qi = (qik : k ∈ Si): vector of ratings that user i declares to the server (which may differ from pi).
Model: Ratings and recommendation (2)
• P = (pi : i ∈ U): ensemble of private ratings of users.
• Q = (qi : i ∈ U): ensemble of ratings declared by users to the server.
• The recommendation server collects the declared user profiles and issues personalized recommendations to the different users.
• It computes a recommendation vector ri = fi(Q) = fi(q1, . . . , qN) for each user i.
• The vector ri has dimension |I \ Si| (the items that user i has not viewed).
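The notation above can be made concrete with a small data layout. The sketch below is only illustrative: the item names, the rating values, the bound P_MAX and the helper names are all assumptions, not taken from the talk.

```python
# Hypothetical toy data; every name and value here is an illustrative assumption.
ITEMS = {"A", "B", "C"}   # item set I
P_MAX = 5.0               # upper bound P on ratings (assumed value)

# Private ratings p_i: user -> {item: rating}. The keys of each inner dict form S_i.
private = {
    1: {"A": 4.0, "B": 2.0},
    2: {"B": 3.0, "C": 5.0},
}

# Declared ratings q_i: same items as p_i, possibly different values.
declared = {
    1: {"A": 3.5, "B": 2.5},
    2: {"B": 3.0, "C": 4.0},
}


def viewed(profile, user):
    """Return S_i, the set of items the user has rated."""
    return set(profile[user])


def unseen(profile, user, items=ITEMS):
    """Return the items outside S_i, i.e. those the server scores for this user."""
    return items - viewed(profile, user)


def is_valid(profile, p_max=P_MAX):
    """Check that every rating lies in the interval [0, P]."""
    return all(
        0.0 <= rating <= p_max
        for ratings in profile.values()
        for rating in ratings.values()
    )
```

With this layout, the recommendation vector for user 1 would have one entry, for item C, since that is the only item outside S_1.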
2 popular classes of recommendation systems
• Collaborative filtering (CF):
– For each user i and each item not tried (viewed) by i, compute a statistic based on other users’ ratings of the item
• Content-based (CB):
– For each user i and each item k not tried (viewed) by i, compute a statistic based on the relation of k to other items that i has viewed
• Simple example for intuition: 3 items {A, B, C}, 2 users
– User 1 has seen and rated {A, B}
– User 2 has seen and rated {B, C}
• Question: Will C be recommended to user 1 or not? With what rating?
• Depends on:
– Ratings of user 1 for A and B, and “similarity” of A, B to C (content-based)
– Rating of user 2 for C (collaborative filtering)
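The 3-item, 2-user example can be worked through numerically. The sketch below assumes two simple statistics that the slides do not specify: the CF score is the mean of other users' ratings of the item, and the CB score is a similarity-weighted mean of the user's own ratings. The rating and similarity values are made up for illustration.

```python
# Assumed ratings for the example: user 1 rated {A, B}, user 2 rated {B, C}.
ratings = {
    1: {"A": 4.0, "B": 2.0},
    2: {"B": 3.0, "C": 5.0},
}

# Assumed "similarity" of A and B to C (hypothetical values in [0, 1]).
similarity = {("A", "C"): 0.8, ("B", "C"): 0.2}


def cf_score(ratings, user, item):
    """Collaborative filtering: mean rating of the item by the other users."""
    others = [r[item] for u, r in ratings.items() if u != user and item in r]
    return sum(others) / len(others) if others else None


def cb_score(ratings, similarity, user, item):
    """Content-based: similarity-weighted mean of the user's own ratings."""
    own = ratings[user]
    total_weight = sum(similarity[(k, item)] for k in own)
    if total_weight == 0:
        return None
    return sum(similarity[(k, item)] * own[k] for k in own) / total_weight


cf_c = cf_score(ratings, 1, "C")              # uses only user 2's rating of C
cb_c = cb_score(ratings, similarity, 1, "C")  # uses only user 1's ratings of A and B
```

Under these assumptions the CF score of C for user 1 equals user 2's rating of C, while the CB score leans towards user 1's rating of A because A is assumed to be more similar to C than B is.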
Case Study: A Hybrid Recommendation system
• Collaborative filtering + content-based approach
• For each user i, the recommendation server applies a measure that combines a collaborative filtering term and a content-based term to compute metrics riℓ for the items ℓ that user i has not viewed, so as to rate them and use them in the recommendation vector for user i.
• ρkℓ ∈ [0, 1] is the correlation between items k and ℓ.
• The recommendation ri = fi(qi, q−i) that each user i receives depends on:
– the declared profiles of the other users to the server, q−i = (q1, . . . , qi−1, qi+1, . . . , qN)
– the declared profile of this specific user, qi
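To illustrate how ri = fi(qi, q−i) depends on both the user's own declared profile and the profiles of others, here is a minimal hybrid sketch. The exact measure used in the talk is not reproduced here; the equal-weight average of a CF term and a CB term, the function name `recommend`, and all numeric values are assumptions made for this example.

```python
def recommend(declared, rho, user, items):
    """Score each item the user has not rated with an assumed hybrid measure.

    CF term: mean of the other users' declared ratings of the item.
    CB term: correlation-weighted mean of the user's own declared ratings.
    The 0.5 / 0.5 weighting is an illustrative assumption.
    """
    q_i = declared[user]
    scores = {}
    for item in sorted(items - set(q_i)):
        others = [q[item] for u, q in declared.items() if u != user and item in q]
        cf = sum(others) / len(others) if others else 0.0
        cb = (
            sum(rho[frozenset((k, item))] * q_i[k] for k in q_i) / len(q_i)
            if q_i
            else 0.0
        )
        scores[item] = 0.5 * cf + 0.5 * cb
    return scores


# Hypothetical inputs: declared ratings and symmetric item correlations in [0, 1].
items = {"A", "B", "C"}
declared = {1: {"A": 4.0, "B": 2.0}, 2: {"B": 3.0, "C": 5.0}}
rho = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.2,
}

r1 = recommend(declared, rho, 1, items)  # recommendation vector for user 1
```

Keying `rho` by `frozenset` keeps the correlation symmetric, so ρkℓ and ρℓk are the same entry. Changing user 2's declared rating of C changes r1 through the CF term, while changing user 1's own declared ratings changes r1 through the CB term.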
Privacy metric
• Quantifies the degree to which privacy is preserved for user i.
• Simplest form: depends on the Euclidean distance between the private profile pi and the declared profile qi of user i.
• The function taken to quantify privacy preservation for user i has the following properties:
– Privacy increases as the Euclidean distance increases
– The distance is weighted by the private rating pik: among items whose private and declared ratings are equally far apart, it is preferable from a privacy preservation perspective to change the rating of items that are rated higher in reality
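One function with the two properties listed above is a squared distance in which each item's term is weighted by its private rating. The sketch below uses that form as an assumption; it is consistent with the bullets but is not claimed to be the exact expression from the talk, and the ratings are made-up values.

```python
def privacy(p, q):
    """Assumed privacy metric: sum over rated items of p_k * (p_k - q_k)^2.

    p: private ratings {item: rating}; q: declared ratings for the same items.
    """
    return sum(p[k] * (p[k] - q[k]) ** 2 for k in p)


p = {"A": 4.0, "B": 2.0}            # hypothetical private ratings

shift_high = {"A": 3.0, "B": 2.0}   # move the higher-rated item A by 1
shift_low = {"A": 4.0, "B": 1.0}    # move the lower-rated item B by 1

gain_high = privacy(p, shift_high)
gain_low = privacy(p, shift_low)
```

Both declared profiles are at the same Euclidean distance from p, but under this assumed form the change applied to the higher-rated item A yields the larger privacy value, which matches the preference stated in the last bullet.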
Quality of Personalized Service
• Quality of personalized recommendation
• Measured in terms of the Euclidean distance between
– the recommendation vector user i gets if he declares profile qi, and
– the recommendation vector he would get if he declared the real private rating pi, regardless of what other users do
• This distance should not exceed a level D
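The quality constraint can be checked for any recommender by computing the recommendation twice, once with the declared profile and once with the private one, while holding the other users' profiles fixed. In the sketch below the function names and the toy recommender are hypothetical; only the distance-versus-D check reflects the description above.

```python
import math


def quality_loss(recommend, q_i, p_i, q_others):
    """Euclidean distance between recommendations under q_i and under p_i."""
    r_declared = recommend(q_i, q_others)
    r_truthful = recommend(p_i, q_others)
    return math.sqrt(sum((r_declared[k] - r_truthful[k]) ** 2 for k in r_truthful))


def is_feasible(recommend, q_i, p_i, q_others, d_max):
    """True if declaring q_i keeps the quality loss within the level D."""
    return quality_loss(recommend, q_i, p_i, q_others) <= d_max


def toy_recommend(own, others):
    """Hypothetical recommender scoring the single unseen item "C"."""
    cf = sum(o["C"] for o in others) / len(others)  # others' ratings of C
    cb = 0.5 * sum(own.values()) / len(own)         # assumed content-based term
    return {"C": cf + cb}


p_i = {"A": 4.0, "B": 2.0}        # private ratings (made-up values)
q_i = {"A": 3.0, "B": 2.0}        # declared ratings
q_others = [{"B": 3.0, "C": 5.0}]

loss = quality_loss(toy_recommend, q_i, p_i, q_others)
```

In this toy recommender the collaborative term is identical in both evaluations and cancels, so the loss comes only from the content-based term.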
OBJECTIVE of each user:
MAXIMIZE privacy while satisfying a certain quality of personalized recommendation
• If each user selfishly pursues his own objective, a game emerges
• M. Halkidi and I. Koutsopoulos, “A game theoretic framework for data privacy preservation in recommender systems,” European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2011.
System architecture
(Slide source: ECML PKDD 2011, Athens, Greece / M. Halkidi, I. Koutsopoulos. Figure labels: the server passes data to user i; the agent of user i solves the optimization problem.)
MAXIMIZE privacy s.t. satisfying a certain quality of personalized recommendation
• A Nash equilibrium (N.E.) exists; Best Response converges to the N.E.
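Best-response dynamics can be sketched on a toy instance: users take turns choosing the declared rating that maximizes their own privacy subject to the quality constraint, given what the others currently declare, until nobody changes. Everything specific below is an assumption made for illustration and is not the model from the talk or the paper: one rated item per user, a discrete grid of candidate ratings, and a made-up quality loss in which other users' deviations also count against a user's tolerance.

```python
GRID = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # candidate declared ratings (assumed grid)
D_MAX = 2.0                            # quality tolerance D (assumed value)


def privacy(p, q):
    """Assumed privacy metric for a single item: p * (p - q)^2."""
    return p * (p - q) ** 2


def quality_loss(user, q, private):
    """Hypothetical coupling: own deviation plus half of the others' deviations."""
    own = abs(q[user] - private[user])
    others = sum(abs(q[j] - private[j]) for j in q if j != user)
    return own + 0.5 * others


def best_response(user, q, private):
    """Most private feasible rating for `user`, holding the others fixed."""
    feasible = [
        c for c in GRID if quality_loss(user, {**q, user: c}, private) <= D_MAX
    ]
    if not feasible:
        return private[user]  # fall back to the truthful rating
    # max() returns the first maximal candidate, so ties break toward lower ratings.
    return max(feasible, key=lambda c: privacy(private[user], c))


def best_response_dynamics(private, max_rounds=50):
    """Sequential updates from truthful profiles; returns (profiles, converged)."""
    q = dict(private)
    for _ in range(max_rounds):
        changed = False
        for user in q:
            new = best_response(user, q, private)
            if new != q[user]:
                q[user] = new
                changed = True
        if not changed:
            return q, True
    return q, False


private = {1: 4.0, 2: 2.0}  # made-up private ratings
q_star, converged = best_response_dynamics(private)
```

When the loop stops with `converged` set to True, every user's declared rating is a best response to the others' ratings, which is the defining property of a Nash equilibrium of this toy game. The loop terminating on one instance is not a convergence proof; the `max_rounds` cap is there because best-response updates can cycle in general.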
Conclusion
• Step towards characterizing the fundamental tradeoff between privacy preservation and good-quality recommendation; more to do
– Quantifying privacy is a non-trivial challenge!
• Introduced a game theoretic framework for capturing the interaction and conflicting interests of users in the context of privacy preservation in recommendation systems
– Each user selfishly attempts to maximize its own privacy
• Can users coordinate and jointly determine their rating revelation strategy so as to have mutual benefit in terms of privacy preservation?
• More enriched definition of privacy?
• What if there exist conflicting goals in the third parties that make the recommendation?
– Incentives to users for revealing their profile?