Privacy-Preserving Eigentaste-based Collaborative Filtering, Ibrahim Yakut and Huseyin Polat...
Post on 19-Dec-2015
Privacy-Preserving Eigentaste-based Collaborative Filtering
Ibrahim Yakut and Huseyin Polat
{iyakut,polath}@anadolu.edu.tr
Department of Computer Engineering
Anadolu University, Turkey
Collaborative Filtering (CF)
18.04.23 IWSEC'07 2
Problem: Information Overload
Solution: Collaborative Filtering
Collaborative Filtering
• Recent technique for filtering and recommendation
• Applications
◦ E-commerce
◦ Search engines
◦ Direct recommendations
Collaborative Filtering Process
[Figure: n × m user-item rating matrix with users u1, u2, ..., ua, ..., un as rows and items i1, i2, ..., iq, ..., im as columns. ua is the active user and iq is the item for which a prediction is sought.]
Paq = prediction on item q for the active user a
EigenTaste
• Proposed by Goldberg et al. in 2001
• Main feature: online computation in constant time
• Second, flexible use of several clustering algorithms
• Based on Principal Component Analysis
• Application in Jester: online joke recommendation. http://eigentaste.berkeley.edu/
Eigentaste Algorithm
D: n × m user-item matrix (n users, m items)
A: n × k matrix of the ratings on the k gauge items
Step 1. Find the correlation matrix of A:
$C = \frac{1}{n-1} A^T A$
Step 2. Find the eigenvectors (E) and eigenvalues of C
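Steps 1 and 2 can be sketched with NumPy. This is only an illustration under stated assumptions: the gauge matrix here is random stand-in data, its columns are standardized before use, and the function names `gauge_correlation` and `eigen_decompose` are hypothetical.

```python
import numpy as np

def gauge_correlation(A):
    """Step 1: C = A^T A / (n - 1) for an n x k normalized gauge matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return A.T @ A / (n - 1)

def eigen_decompose(C):
    """Step 2: eigenvalues (descending) and eigenvectors (as rows of E)."""
    vals, vecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]      # reorder to descending
    return vals[order], vecs[:, order].T

# Stand-in data (assumption): 200 users, 5 gauge items, columns standardized.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 5))
A = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

C = gauge_correlation(A)
eigvals, E = eigen_decompose(C)
```

With rows of `E` as eigenvectors, `E @ C @ E.T` is the diagonal matrix of eigenvalues, which is the property the later projection step relies on.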
Eigentaste Algorithm cont’d
Step 3. Take the first m = 2 eigenvectors and project A:
$x = A E_m^T = A E_2^T$
Step 4. Cluster the projected data using Recursive Rectangular Clustering (RRC).
Step 5. Construct a lookup table with the mean of the non-gauge item ratings for each cluster.
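A rough sketch of steps 3 and 5 follows. It does not implement RRC: the helper `quadrant_labels` is a deliberately simple stand-in that labels users by the quadrant of their projected point, and all names and data are hypothetical.

```python
import numpy as np

def project(A, E, m=2):
    """Step 3: project the n x k gauge matrix onto the first m eigenvectors (rows of E)."""
    return np.asarray(A, dtype=float) @ np.asarray(E, dtype=float)[:m].T

def quadrant_labels(x):
    """Stand-in for step 4 (NOT RRC): one cluster per quadrant of the 2-D plane."""
    return (x[:, 0] >= 0).astype(int) * 2 + (x[:, 1] >= 0).astype(int)

def build_lookup(nongauge, labels, n_clusters):
    """Step 5: mean non-gauge rating per cluster; NaN rows mark empty clusters."""
    table = np.full((n_clusters, nongauge.shape[1]), np.nan)
    for c in range(n_clusters):
        members = nongauge[labels == c]
        if len(members):
            table[c] = np.nanmean(members, axis=0)
    return table

# Stand-in data (assumption): 100 users, 4 gauge items, 6 non-gauge items.
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 4))
nongauge = rng.uniform(-10, 10, size=(100, 6))

C = A.T @ A / (len(A) - 1)
vals, vecs = np.linalg.eigh(C)
E = vecs[:, np.argsort(vals)[::-1]].T   # rows are eigenvectors, largest first

x = project(A, E)
labels = quadrant_labels(x)
table = build_lookup(nongauge, labels, n_clusters=4)
```

Any clustering that assigns one label per projected user can replace `quadrant_labels` without changing how the lookup table is built.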
Eigentaste: online
When the active user (a) enters:
◦ a rates the items in the gauge set.
◦ a is projected using the principal components.
◦ The representative cluster is found.
◦ Items are recommended based on the preconstructed lookup table.
[Figure: rating scale from Disapprove to Approve]
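The online phase might look like the sketch below. It assumes the representative cluster is the one with the nearest centroid in the projected plane, which is an illustrative choice and not necessarily how the original RRC cells are looked up. The function `recommend` and the toy values are hypothetical.

```python
import numpy as np

def recommend(gauge_ratings, E2, centroids, lookup, top_n=3):
    """Project the active user, pick the nearest cluster, rank items by table value."""
    x = E2 @ np.asarray(gauge_ratings, dtype=float)        # 2-D projection
    cluster = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    ranked = np.argsort(lookup[cluster])[::-1]              # best-rated items first
    return cluster, ranked[:top_n]

# Toy values (assumption): 3 gauge items, 2 clusters, 3 non-gauge items.
E2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
centroids = np.array([[-1.0, -1.0],
                      [1.0, 1.0]])
lookup = np.array([[5.0, 1.0, 3.0],
                   [0.0, 4.0, 2.0]])

cluster, items = recommend([0.9, 1.2, -0.3], E2, centroids, lookup, top_n=2)
```

The work per active user is one fixed-size matrix-vector product, a comparison against a fixed number of clusters, and a table read, which is why the online cost does not grow with the number of users.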
Motivation
• The algorithm described above is successful.
• However, because of privacy risks, collecting truthful and trustworthy data is a challenge.
• How, then, can users provide data for CF purposes without jeopardizing their privacy?
• Is it possible to use perturbed data in Eigentaste-based algorithms?
Modifications on Original
• Normalization: the user mean and standard deviation are used instead of the item mean and standard deviation.
$z_{uj} = \frac{v_{uj} - \bar{v}_u}{\sigma_u}$
• Clustering: k-means clustering is used instead of RRC.
• Prediction: the lookup-table value is denormalized before predicting, instead of being used directly.
$p_{aq} = \bar{v}_a + \sigma_a z_q$
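The user-based normalization and the denormalizing prediction can be illustrated as follows. The sketch assumes unrated items are stored as NaN and that the population standard deviation is used; the slides do not say which variant is intended, and the function names are hypothetical.

```python
import numpy as np

def user_zscores(v):
    """z_uj = (v_uj - mean_u) / std_u, computed over the user's rated items only."""
    v = np.asarray(v, dtype=float)
    mean_u = np.nanmean(v)
    std_u = np.nanstd(v)          # population std (assumption)
    return (v - mean_u) / std_u, mean_u, std_u

def denormalize(z_q, mean_a, std_a):
    """p_aq = mean_a + std_a * z_q, mapping a table z-score back to the user's scale."""
    return mean_a + std_a * z_q

ratings = [2.0, 4.0, 6.0, np.nan]          # last item is unrated
z, mean_a, std_a = user_zscores(ratings)
restored = denormalize(z[2], mean_a, std_a)
```

Denormalizing a user's own z-score returns the original rating, so a cluster-level z-score from the lookup table lands on that user's personal rating scale.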
Masking data
Randomized Perturbation Technique (RPT), Agrawal & Srikant, 2000
[Figure: each user (User1, User2, ..., Usern-1, Usern) adds random noise (+R1, +R2, ..., +Rn-1, +Rn) to their data before sending it to the central database used by the CF process.]
Masking Process
1. Users and the server agree on γ, θ, and δ.
2. Each user u computes the z-scores of their ratings.
3. u selects σu uniformly at random over [0, γ] and uses it as the standard deviation of the masking data.
4. u selects ru over [0, 1]; if ru ≤ θ, u uses the uniform distribution, otherwise the Gaussian distribution.
5. u selects xer over [0, δ]; xer% of the unrated cells are to be filled with noise.
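Masking Process
• u creates mu random numbers, where
◦ mu = number of rated cells + number of unrated cells chosen for filling (xer% of the unrated cells)
◦ the numbers have mean μ = 0 and standard deviation σu, and are drawn from a Gaussian distribution or from a uniform distribution over [−√3·σu, √3·σu], depending on ru
• u masks the private data by adding this noise. The empty cells to be filled are selected randomly.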
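The masking steps above can be sketched for a single user as below. Several details are assumptions made for illustration: unrated cells are stored as NaN, ru and xer are drawn uniformly and xer is treated as a continuous percentage, and a filled cell holds noise only. The function name `mask_ratings` is hypothetical.

```python
import numpy as np

def mask_ratings(z, gamma, theta, delta, rng):
    """Mask one user's z-score vector (NaN = unrated) with zero-mean random noise."""
    z = np.asarray(z, dtype=float)
    rated_idx = np.flatnonzero(~np.isnan(z))
    empty_idx = np.flatnonzero(np.isnan(z))

    sigma_u = rng.uniform(0.0, gamma)              # step 3: noise std
    use_uniform = rng.uniform(0.0, 1.0) <= theta   # step 4: choose distribution
    x_er = rng.uniform(0.0, delta)                 # step 5: percent of empty cells

    # Randomly choose which empty cells to fill.
    n_fill = int(round(len(empty_idx) * x_er / 100.0))
    fill_idx = rng.permutation(empty_idx)[:n_fill]

    target = np.concatenate([rated_idx, fill_idx])
    m_u = len(target)                              # rated cells + filled cells
    if use_uniform:
        half = np.sqrt(3.0) * sigma_u              # uniform on [-half, half] has std sigma_u
        noise = rng.uniform(-half, half, size=m_u)
    else:
        noise = rng.normal(0.0, sigma_u, size=m_u)

    masked = z.copy()
    masked[fill_idx] = 0.0                         # filled cells carry noise only
    masked[target] += noise
    return masked

rng = np.random.default_rng(0)
z = np.array([0.5, np.nan, -1.0, np.nan, np.nan, 1.2])
masked_all = mask_ratings(z, gamma=1.0, theta=0.5, delta=100.0, rng=rng)
masked_none = mask_ratings(z, gamma=1.0, theta=0.5, delta=0.0, rng=rng)
unmasked = mask_ratings(z, gamma=0.0, theta=0.5, delta=0.0, rng=rng)
```

Setting δ = 0 leaves every unrated cell empty, and γ = 0 forces σu = 0 so the rated cells are returned unchanged; both are useful sanity checks on the parameters.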
Eigentaste-based CF with Privacy
• The server now holds the disguised user-item matrix D' and the disguised user-gauge matrix A'.
• In some steps, the effects of perturbation must be considered and handled:
◦ Correlation matrix construction
◦ Projection
◦ Active user’s entry of the gauge set
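Correlation Matrix Construction
For f ≠ g, i.e. the off-diagonal entries of C', where each masked z-score is $z_{uf} + r_{uf}$ and $r_{uf}$ is the added noise:
$C'_{fg} = \frac{1}{n-1}\sum_{u=1}^{n}(z_{uf}+r_{uf})(z_{ug}+r_{ug}) = \frac{1}{n-1}\left[\sum_{u=1}^{n} z_{uf}z_{ug} + \sum_{u=1}^{n} z_{uf}r_{ug} + \sum_{u=1}^{n} r_{uf}z_{ug} + \sum_{u=1}^{n} r_{uf}r_{ug}\right]$
The expected values of the last three sums are 0 since μ = 0. Then
$C'_{fg} \approx \frac{1}{n-1}\sum_{u=1}^{n} z_{uf}z_{ug}$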
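Correlation Matrix Construction
For f = g, i.e. the diagonal entries of C':
$C'_{ff} = \frac{1}{n-1}\sum_{u=1}^{n}(z_{uf}+r_{uf})^2 = \frac{1}{n-1}\sum_{u=1}^{n} z_{uf}^2 + \frac{2}{n-1}\sum_{u=1}^{n} z_{uf}r_{uf} + \frac{1}{n-1}\sum_{u=1}^{n} r_{uf}^2$
The expected value of the cross term is 0 since μ = 0.
Then, assuming n ≈ n−1, the last term is approximately the average variance of the masking noise:
$C'_{ff} \approx \frac{1}{n-1}\sum_{u=1}^{n} z_{uf}^2 + \frac{1}{n}\sum_{u=1}^{n} r_{uf}^2$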
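A small simulation can make the two cases concrete. It is a sketch on synthetic data, not the authors' experiment: the "true" scores are random stand-ins that are not exactly z-scored, the noise is Gaussian only, and γ = 1 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 20000, 4

# Synthetic stand-in for the users' true normalized gauge ratings (assumption).
base = rng.normal(size=(n, 1))
Z = 0.7 * base + 0.7 * rng.normal(size=(n, k))

# Per-user noise std drawn from [0, gamma] with gamma = 1, zero-mean Gaussian noise.
sigma = rng.uniform(0.0, 1.0, size=(n, 1))
R = rng.normal(size=(n, k)) * sigma
Z_masked = Z + R

C = Z.T @ Z / (n - 1)                        # from original data
C_masked = Z_masked.T @ Z_masked / (n - 1)   # what the server can compute

off_diag = ~np.eye(k, dtype=bool)
max_off_diag_error = np.max(np.abs(C_masked[off_diag] - C[off_diag]))
diag_inflation = np.diag(C_masked) - np.diag(C)
mean_noise_variance = float(np.mean(sigma ** 2))
```

In this setup the off-diagonal entries of the masked matrix stay close to the originals, while each diagonal entry is larger by roughly the mean noise variance (about 1/3 for σu uniform on [0, 1]).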
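Projection
$x = A E_2^T$
$x_{ij} = \sum_{l=1}^{k}(z_{il}+r_{il})(e_{lj}+R_{lj}) = \sum_{l=1}^{k} z_{il}e_{lj} + \sum_{l=1}^{k} z_{il}R_{lj} + \sum_{l=1}^{k} r_{il}e_{lj} + \sum_{l=1}^{k} r_{il}R_{lj}$
Similarly, the expected values of the terms containing the random values r or R are 0, so the approximated matrix is obtained:
$x_{ij} \approx \sum_{l=1}^{k} z_{il}e_{lj}$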
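The zero-expectation argument for the projection can be checked numerically. The sketch below is simplified on purpose: it keeps the projection vectors fixed (so the R terms are ignored), uses an arbitrary orthonormal pair in place of real eigenvectors, and uses synthetic ratings and noise.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 50000, 10

Z = rng.normal(size=(n, k))                 # stand-in true z-scores (assumption)
r = rng.normal(scale=0.5, size=(n, k))      # zero-mean masking noise, std 0.5

# Arbitrary orthonormal 2 x k matrix standing in for the first two eigenvectors.
Q, _ = np.linalg.qr(rng.normal(size=(k, 2)))
E2 = Q.T

x_clean = Z @ E2.T
x_noisy = (Z + r) @ E2.T

error = x_noisy - x_clean                   # equals r @ E2.T
mean_error = error.mean(axis=0)             # close to 0 in expectation
error_std = float(error.std())              # individual errors do not vanish
```

The average error across users is near zero, but each user's projected point still carries noise with a standard deviation of about 0.5 here, so the approximation holds in expectation and not per user.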
Remaining Parts
• After the clusters are determined from the estimated data:
◦ The z-score means of the non-gauge items are stored in the lookup table.
◦ When the active user enters disguised gauge ratings, the effect of randomization is removed in the same way.
◦ The representative cluster is determined, the corresponding value from the table is denormalized, and the prediction is obtained.
Experiments
• Data set
◦ Jester is a web-based joke data set
◦ 17,988 users, 100 jokes
◦ Continuous ratings over the range (-10, +10)
◦ 50% of all ratings are present
• Evaluation metrics
$\mathrm{MAE} = \frac{1}{d}\sum_{i=1}^{d} |p_i - r_i|$
$\mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}$
p: predicted value, r: original value, d: size of the test set, r_max: maximum rating, r_min: minimum rating
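The two metrics are simple to compute. A minimal sketch follows, with hypothetical function names and made-up predictions on the (-10, +10) scale.

```python
import numpy as np

def mae(predicted, actual):
    """Mean absolute error over the d test ratings."""
    p = np.asarray(predicted, dtype=float)
    r = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(p - r)))

def nmae(predicted, actual, r_min, r_max):
    """MAE normalized by the width of the rating range."""
    return mae(predicted, actual) / (r_max - r_min)

# Made-up example values (assumption), rating range -10 to +10.
p = [2.0, -4.0, 7.5]
r = [1.0, -6.0, 9.0]
example_mae = mae(p, r)                 # (1 + 2 + 1.5) / 3
example_nmae = nmae(p, r, -10.0, 10.0)  # MAE / 20
```

Because the rating range has width 20, an NMAE on this scale is always the MAE divided by 20, which makes it easy to check the two metrics against each other.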
Eigentaste vs. Modified
9000 training users, 5000 test users (10 test items)

| | MAE | NMAE |
|---|---|---|
| Eigentaste | 3.740 | 0.187 |
| Modified Eigentaste | 3.334 | 0.167 |
Protecting active users’ privacy

| | M1 | M2 | M3 |
|---|---|---|---|
| MAE | 3.3508 | 3.4710 | 3.4807 |
| NMAE | 0.1676 | 0.1735 | 0.1741 |

M1: No disguise, but requires additional cost
M2: Considering only the gauge mean and std
M3: Considering the whole mean and std
Accuracy vs. Varying Numbers of Users
Fixed: 5000 users and 10 random test items

| n | 500 | 1000 | 2000 | 4000 | 8000 |
|---|---|---|---|---|---|
| MAE | 4.678 | 4.242 | 3.832 | 3.624 | 3.483 |
| NMAE | 0.234 | 0.212 | 0.192 | 0.181 | 0.174 |

• Accuracy improves as the number of users increases, since the added random numbers average out toward zero.
• For n ≥ 2000, the results are satisfactory.
Accuracy with Varying δ Values

| δ | 0 | 35 | 70 | 100 |
|---|---|---|---|---|
| MAE | 3.4460 | 3.4567 | 3.4615 | 3.4710 |
| NMAE | 0.1723 | 0.1728 | 0.1730 | 0.1735 |

Accuracy improves slightly as δ decreases.
Conclusion
• We showed how to achieve privacy-preserving CF using Eigentaste-based algorithms.
• We will study
◦ whether other clustering algorithms can be employed
◦ how to improve recommendation quality by using correlation-based CF algorithms.
Thanks for your interest! Questions?