Collaborative Filtering with Spark
DESCRIPTION
Spotify uses a range of machine learning models to power its music recommendation features, including the Discover page and Radio. Because training these models is iterative, they suffer from the I/O overhead of Hadoop and are a natural fit for the Spark programming paradigm. In this talk I will present both the right way and the wrong way to implement collaborative filtering models with Spark. Additionally, I will deep dive into how Matrix Factorization is implemented in the MLlib library.
TRANSCRIPT
![Page 1: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/1.jpg)
May 9, 2014
Collaborative Filtering with Spark
Chris Johnson (@MrChrisJohnson)
Friday, May 9, 14
![Page 2: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/2.jpg)
Who am I?
•Chris Johnson
– Machine Learning guy from NYC
– Focused on music recommendations
– Formerly a graduate student at UT Austin
![Page 3: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/3.jpg)
What is MLlib?
Algorithms:
•classification: logistic regression, linear support vector machine (SVM), naive Bayes
•regression: generalized linear regression
•clustering: k-means
•decomposition: singular value decomposition (SVD), principal component analysis (PCA)
•collaborative filtering: alternating least squares (ALS)
http://spark.apache.org/docs/0.9.0/mllib-guide.html
![Page 4: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/4.jpg)
![Page 5: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/5.jpg)
Collaborative Filtering - “The Netflix Prize”
![Page 6: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/6.jpg)
Collaborative Filtering
Hey, I like tracks P, Q, R, S!
Well, I like tracks Q, R, S, T!
Then you should check out track P!
Nice! Btw try track T!
Image via Erik Bernhardsson
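The exchange above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how Spotify's models work: the listener names `alice` and `bob` and the helper `recommend` are hypothetical, and real systems weigh many listeners rather than a single pair.

```python
def recommend(my_tracks, neighbor_tracks):
    """Suggest tracks that a like-minded listener enjoys and I have not played yet."""
    return sorted(set(neighbor_tracks) - set(my_tracks))


# Hypothetical listeners mirroring the dialog on the slide.
alice = {"P", "Q", "R", "S"}
bob = {"Q", "R", "S", "T"}

print(recommend(bob, alice))   # ['P']  -> "Then you should check out track P!"
print(recommend(alice, bob))   # ['T']  -> "Btw try track T!"
```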
![Page 7: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/7.jpg)
Collaborative Filtering at Spotify
• Discover (personalized recommendations)
• Radio
• Related Artists
• Now Playing
![Page 8: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/8.jpg)
![Page 9: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/9.jpg)
Explicit Matrix Factorization
[Figure: users-by-movies ratings matrix, with user Chris and movie Inception labeled]
•Users explicitly rate a subset of the movie catalog
•Goal: predict how users will rate new movies
![Page 10: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/10.jpg)
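Explicit Matrix Factorization
[Figure: ratings matrix (labels: Chris, Inception) approximated by the product of factor matrices X and Y; ? marks a movie the user has not rated]
? 3 5 ?
1 ? ? 1
2 ? 3 2
? ? ? 5
5 2 ? 4
•Approximate ratings matrix by the product of low-dimensional user and movie matrices
•Minimize RMSE (root mean squared error)
$$\min_{x,y,\beta} \sum_{u,i} \left(r_{ui} - x_u^T y_i - \beta_u - \beta_i\right)^2 + \lambda \left(\sum_u \|x_u\|^2 + \sum_i \|y_i\|^2\right)$$
• $r_{ui}$ = user $u$'s rating for movie $i$
• $x_u$ = user $u$'s latent factor vector
• $y_i$ = item $i$'s latent factor vector
• $\beta_u$ = bias for user $u$
• $\beta_i$ = bias for item $i$
• $\lambda$ = regularization parameter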
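As a rough illustration of the explicit objective, the sketch below evaluates a regularized squared-error loss over the observed ratings only. The function names, the dictionary layout for `ratings`, and the example numbers are assumptions made for this example; they do not come from the talk.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def explicit_loss(ratings, X, Y, user_bias, item_bias, lam):
    """Regularized squared error over observed ratings.

    ratings: dict {(user, item): rating} holding only the observed entries.
    X, Y: lists of latent factor vectors for users and items.
    """
    squared_error = 0.0
    for (u, i), r in ratings.items():
        prediction = dot(X[u], Y[i]) + user_bias[u] + item_bias[i]
        squared_error += (r - prediction) ** 2
    penalty = lam * (sum(dot(x, x) for x in X) + sum(dot(y, y) for y in Y))
    return squared_error + penalty


# Hypothetical toy data: 2 users, 3 movies, 2 latent factors.
ratings = {(0, 1): 3.0, (0, 2): 5.0, (1, 0): 1.0}
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 1.0], [2.0, 0.0], [4.0, 1.0]]
user_bias = [0.5, 0.0]
item_bias = [0.0, 0.5, 0.5]

# These factors reproduce every observed rating, so only the penalty remains.
print(explicit_loss(ratings, X, Y, user_bias, item_bias, lam=0.5))  # 12.5
```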
![Page 11: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/11.jpg)
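Implicit Matrix Factorization
[Figure: binary matrix approximated by the product of factor matrices X and Y]
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
•Replace stream counts with binary labels
– 1 = streamed, 0 = never streamed
•Minimize weighted RMSE (root mean squared error) using a function of stream counts as weights
$$\min_{x,y,\beta} \sum_{u,i} c_{ui} \left(p_{ui} - x_u^T y_i - \beta_u - \beta_i\right)^2 + \lambda \left(\sum_u \|x_u\|^2 + \sum_i \|y_i\|^2\right)$$
• $p_{ui}$ = 1 if user $u$ streamed track $i$, else 0
• $c_{ui}$ = weight for the pair, a function of user $u$'s stream count for track $i$
• $x_u$ = user $u$'s latent factor vector
• $y_i$ = item $i$'s latent factor vector
• $\beta_u$ = bias for user $u$
• $\beta_i$ = bias for item $i$
• $\lambda$ = regularization parameter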
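The conversion from stream counts to binary labels and weights might look like the sketch below. The slide only says the weights are "a function of stream counts"; the linear form `1 + alpha * count` and the value `alpha=40.0` are assumptions borrowed from common practice for implicit-feedback models, and the function name is hypothetical.

```python
def preference_and_confidence(stream_counts, alpha=40.0):
    """Turn raw stream counts into binary labels p and weights c.

    stream_counts: dict {(user, track): count}.
    Pairs missing from the dict are treated as p = 0 with weight c = 1.
    The linear weighting is an assumed choice, not one stated in the talk.
    """
    p = {pair: 1 if count > 0 else 0 for pair, count in stream_counts.items()}
    c = {pair: 1.0 + alpha * count for pair, count in stream_counts.items()}
    return p, c


# Hypothetical counts: user 0 streamed track 0 three times, user 1 streamed track 1 once.
counts = {(0, 0): 3, (0, 2): 0, (1, 1): 1}
p, c = preference_and_confidence(counts)
print(p)  # {(0, 0): 1, (0, 2): 0, (1, 1): 1}
print(c)  # {(0, 0): 121.0, (0, 2): 1.0, (1, 1): 41.0}
```

With this choice a heavily streamed track pulls the model harder than a track streamed once, while never-streamed tracks still contribute with the minimum weight.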
![Page 12: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/12.jpg)
Alternating Least Squares
• Initialize user and item vectors to random noise
• Fix item vectors and solve for optimal user vectors
– Take the derivative of loss function with respect to user’s vector, set equal to 0, and solve
– Results in a system of linear equations with closed form solution!
• Fix user vectors and solve for optimal item vectors
• Repeat until convergence
code: https://github.com/MrChrisJohnson/implicitMF
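The "take the derivative, set equal to 0, and solve" step can be written out for the implicit objective. This is a sketch that assumes the bias terms are dropped for brevity. The notation is introduced here: $c_{ui}$ is the weight and $p_{ui}$ the binary label for user $u$ and track $i$, $C^u$ is the diagonal matrix of user $u$'s weights, $p(u)$ is the vector of that user's binary labels, and $Y$ is the matrix whose rows are the item vectors $y_i$.

```latex
\begin{aligned}
L(x_u) &= \sum_i c_{ui}\,\left(p_{ui} - x_u^T y_i\right)^2 + \lambda \|x_u\|^2 \\
\frac{\partial L}{\partial x_u} &= -2 \sum_i c_{ui}\,\left(p_{ui} - x_u^T y_i\right) y_i + 2 \lambda x_u = 0 \\
\Rightarrow\quad \left(Y^T C^u Y + \lambda I\right) x_u &= Y^T C^u p(u) \\
x_u &= \left(Y^T C^u Y + \lambda I\right)^{-1} Y^T C^u p(u)
\end{aligned}
```

The item update is symmetric, with the roles of users and items swapped.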
![Page 13: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/13.jpg)
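Alternating Least Squares
• Note that: $Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y$, where $C^u$ is the diagonal matrix of user $u$'s weights and $Y$ holds the item vectors
• Then, we can pre-compute $Y^T Y$ once per iteration
– $C^u - I$ and $p(u)$ (the user's binary labels) only contain non-zero elements for tracks that the user streamed
– Using sparse matrix operations we can then compute each user’s vector efficiently in $O(f^2 n_u + f^3)$ time where $f$ is the number of latent factors and $n_u$ is the number of tracks the user streamed
code: https://github.com/MrChrisJohnson/implicitMF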
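A minimal pure-Python sketch of this shortcut follows. It assumes no bias terms, a weight of 1 for every track the user never streamed, and tiny dense vectors; the helper names `solve`, `gram`, and `update_user` are hypothetical and are not taken from the linked repository. The Gram matrix is computed once and each user only adds corrections for the tracks they streamed.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]


def gram(Y):
    """Y^T Y for a list of f-dimensional item vectors; computed once per iteration."""
    f = len(Y[0])
    return [[sum(y[a] * y[b] for y in Y) for b in range(f)] for a in range(f)]


def update_user(YtY, Y, streamed, lam):
    """Solve (Y^T C^u Y + lam * I) x_u = Y^T C^u p(u) for one user.

    streamed: dict {item index: weight c_ui} for the tracks this user streamed.
    Only those tracks are visited, so the accumulation costs O(f^2 * n_u).
    """
    f = len(YtY)
    A = [[YtY[a][b] + (lam if a == b else 0.0) for b in range(f)] for a in range(f)]
    rhs = [0.0] * f
    for i, c in streamed.items():
        y = Y[i]
        for a in range(f):
            rhs[a] += c * y[a]                      # Y^T C^u p(u), since p_ui = 1 here
            for b in range(f):
                A[a][b] += (c - 1.0) * y[a] * y[b]  # Y^T (C^u - I) Y
    return solve(A, rhs)                            # O(f^3)


# Hypothetical toy data: 3 items with 2 latent factors each.
Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
YtY = gram(Y)
x_u = update_user(YtY, Y, streamed={0: 3.0, 2: 2.0}, lam=0.1)
print(x_u)
```

In a full iteration `gram(Y)` would be shared by every user update, which is the point of the identity on the slide.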
![Page 14: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/14.jpg)
Alternating Least Squares
code: https://github.com/MrChrisJohnson/implicitMF
![Page 15: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/15.jpg)
![Page 16: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/16.jpg)
Scaling up Implicit Matrix Factorization with Hadoop
![Page 17: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/17.jpg)
Hadoop at Spotify 2009
![Page 18: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/18.jpg)
Hadoop at Spotify 2014
700 nodes in our London data center
![Page 19: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/19.jpg)
Implicit Matrix Factorization with Hadoop
[Figure: all log entries are split into a K × L grid of blocks, where a block holds the entries with u % K = x and i % L = y (for example u % K = 1, i % L = 1). User vectors are sharded by u % K (0 to K-1) and item vectors by i % L (0 to L-1). The map step processes each block; the reduce step groups the results by u % K.]
Figure via Erik Bernhardsson
![Page 20: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/20.jpg)
Implicit Matrix Factorization with Hadoop
One map task
•Distributed cache: all user vectors where u % K = x
•Distributed cache: all item vectors where i % L = y
•Map input: tuples (u, i, count) where u % K = x and i % L = y
•Mapper: emit contributions
•Reducer: new vector!
Figure via Erik Bernhardsson
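The modulo-based blocking can be imitated in plain Python. This is only a sketch of the bucketing rule, not the Hadoop job itself; `block_log_entries`, `shard_vectors`, and the sample data are hypothetical names and values.

```python
from collections import defaultdict


def block_log_entries(entries, K, L):
    """Group (u, i, count) tuples into blocks keyed by (u % K, i % L)."""
    blocks = defaultdict(list)
    for u, i, count in entries:
        blocks[(u % K, i % L)].append((u, i, count))
    return dict(blocks)


def shard_vectors(vectors, num_shards):
    """Split {id: vector} into shards keyed by id % num_shards."""
    shards = defaultdict(dict)
    for key, vec in vectors.items():
        shards[key % num_shards][key] = vec
    return dict(shards)


# Hypothetical log entries (user, item, stream count) with K = L = 2.
entries = [(0, 0, 5), (1, 3, 2), (2, 1, 1), (3, 3, 7)]
blocks = block_log_entries(entries, K=2, L=2)
print(blocks[(1, 1)])  # [(1, 3, 2), (3, 3, 7)]

# The map task for block (1, 1) would only need these two shards.
user_shards = shard_vectors({0: [0.1], 1: [0.2], 2: [0.3], 3: [0.4]}, 2)
item_shards = shard_vectors({0: [1.0], 1: [2.0], 3: [3.0]}, 2)
print(sorted(user_shards[1]), sorted(item_shards[1]))  # [1, 3] [1, 3]
```

Each map task then touches one block of log entries plus one shard of user vectors and one shard of item vectors, which keeps the per-task memory footprint bounded.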
![Page 21: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/21.jpg)
Hadoop suffers from I/O overhead
I/O Bottleneck
![Page 22: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/22.jpg)
Spark to the rescue!!
[Figure: Spark vs. Hadoop]
http://www.slideshare.net/Hadoop_Summit/spark-and-shark
![Page 23: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/23.jpg)
![Page 24: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/24.jpg)
First Attempt
[Figure: ratings, user vectors, and item vectors spread across nodes 1 to 6]
• For each iteration:
– Compute YtY over item vectors and broadcast
– Join user vectors along with all ratings for that user and all item vectors for which the user rated the item
– Sum up Yt(Cu - I)Y and YtCuPu and solve for optimal user vectors
![Page 25: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/25.jpg)
![Page 26: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/26.jpg)
![Page 27: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/27.jpg)
First Attempt
![Page 28: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/28.jpg)
First Attempt
•Issues:
– Unnecessarily sending multiple copies of item vector to each node
– Unnecessarily shuffling data across cluster at each iteration
– Not taking advantage of Spark’s in-memory capabilities!
![Page 29: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/29.jpg)
Second Attempt
[Figure: ratings, user vectors, and item vectors spread across nodes 1 to 6]
• For each iteration:
– Compute YtY over item vectors and broadcast
– Group ratings matrix into blocks, and join blocks with necessary user and item vectors (to avoid multiple item vector copies at each node)
– Sum up Yt(Cu - I)Y and YtCuPu and solve for optimal user vectors
![Page 30: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/30.jpg)
![Page 31: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/31.jpg)
![Page 32: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/32.jpg)
Second Attempt
![Page 33: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/33.jpg)
Second Attempt
![Page 34: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/34.jpg)
Second Attempt
•Issues:
– Still unnecessarily shuffling data across cluster at each iteration
– Still not taking advantage of Spark’s in-memory capabilities!
![Page 35: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/35.jpg)
So, what are we missing?...
•Partitioner: Defines how the elements in a key-value pair RDD are partitioned across the cluster.
[Figure: user vectors split into partitions 1, 2, and 3 across nodes 1 to 6]
![Page 36: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/36.jpg)
So, what are we missing?...
•partitionBy(partitioner): Partitions all elements of the same key to the same node in the cluster, as defined by the partitioner.
[Figure: user vectors split into partitions 1, 2, and 3 across nodes 1 to 6]
![Page 37: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/37.jpg)
So, what are we missing?...
•mapPartitions(func): Similar to map, but runs separately on each partition (block) of the RDD, so func must be of type Iterator[T] => Iterator[U] when running on an RDD of type T.
[Figure: function() applied separately to partitions 1, 2, and 3 of the user vectors, across nodes 1 to 6]
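To make the partitioning vocabulary concrete, here is a plain-Python imitation of hash partitioning followed by a per-partition function. It deliberately does not use Spark: `partition_by`, `map_partitions`, and `scale_vectors` are hypothetical stand-ins that mimic the behavior described for `partitionBy` and `mapPartitions` on a single machine.

```python
def partition_by(pairs, num_partitions, partitioner=hash):
    """Place every (key, value) pair in the partition chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partitioner(key) % num_partitions].append((key, value))
    return partitions


def map_partitions(partitions, func):
    """Call func once per partition; func takes an iterator and returns an iterator."""
    return [list(func(iter(part))) for part in partitions]


def scale_vectors(records):
    """Example per-partition function: doubles every user vector in the block."""
    for user_id, vec in records:
        yield user_id, [2.0 * v for v in vec]


# Hypothetical user vectors keyed by integer user ID.
user_vectors = [(0, [1.0]), (1, [2.0]), (2, [3.0]), (3, [4.0])]
parts = partition_by(user_vectors, 3)
print(parts)   # [[(0, [1.0]), (3, [4.0])], [(1, [2.0])], [(2, [3.0])]]
print(map_partitions(parts, scale_vectors))
```

Because records with the same key always land in the same partition, work that needs all data for a key can run inside one partition without another shuffle.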
![Page 38: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/38.jpg)
So, what are we missing?...
• persist(storageLevel): Set this RDD's storage level to persist (cache) its values across operations after the first time it is computed.
[Figure: user vectors split into partitions 1, 2, and 3 across nodes 1 to 6]
![Page 39: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/39.jpg)
Third Attempt
[Figure: ratings, user vectors, and item vectors spread across nodes 1 to 6]
• Partition ratings matrix, user vectors, and item vectors by user and item blocks and cache partitions in memory
• Build InLink and OutLink mappings for users and items
– InLink Mapping: Includes the user IDs and vectors for a given block along with the ratings for each user in this block
– OutLink Mapping: Includes the item IDs and vectors for a given block along with a list of destination blocks to which to send these vectors
• For each iteration:
– Compute YtY over item vectors and broadcast
– On each item block, use the OutLink mapping to send item vectors to the necessary user blocks
– On each user block, use the InLink mapping along with the joined item vectors to update vectors
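The InLink and OutLink idea can be sketched on one machine as follows. This is a simplified illustration, not MLlib's actual data structures: assigning blocks by `id % num_blocks`, the function names `build_links` and `route_item_vectors`, and the sample ratings are all assumptions made for the example.

```python
from collections import defaultdict


def build_links(ratings, num_user_blocks, num_item_blocks):
    """Build simplified InLink and OutLink mappings from (user, item, value) triples.

    in_links[user_block][user]  -> list of (item, value) ratings for that user
    out_links[item_block][item] -> sorted list of user blocks that need the item's vector
    """
    in_links = defaultdict(lambda: defaultdict(list))
    destinations = defaultdict(lambda: defaultdict(set))
    for user, item, value in ratings:
        user_block = user % num_user_blocks
        item_block = item % num_item_blocks
        in_links[user_block][user].append((item, value))
        destinations[item_block][item].add(user_block)
    out_links = {
        block: {item: sorted(dests) for item, dests in items.items()}
        for block, items in destinations.items()
    }
    return {block: dict(users) for block, users in in_links.items()}, out_links


def route_item_vectors(out_links, item_vectors):
    """Send each item vector once per destination user block, not once per rating."""
    inbox = defaultdict(dict)
    for items in out_links.values():
        for item, dests in items.items():
            for user_block in dests:
                inbox[user_block][item] = item_vectors[item]
    return dict(inbox)


# Hypothetical ratings: users 0 and 2 share user block 0, user 1 is in block 1.
ratings = [(0, 10, 1.0), (2, 10, 3.0), (1, 11, 2.0), (2, 11, 1.0)]
in_links, out_links = build_links(ratings, num_user_blocks=2, num_item_blocks=2)
print(out_links)  # {0: {10: [0]}, 1: {11: [0, 1]}}

inbox = route_item_vectors(out_links, {10: [0.5, 0.5], 11: [1.0, 0.0]})
print(sorted(inbox[0]), sorted(inbox[1]))  # [10, 11] [11]
```

Item 10 is rated by two users in the same block, yet its vector is routed to that block only once. The mappings depend only on the ratings, so they can be built once and reused across iterations.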
![Page 40: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/40.jpg)
![Page 41: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/41.jpg)
![Page 42: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/42.jpg)
![Page 43: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/43.jpg)
![Page 44: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/44.jpg)
![Page 45: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/45.jpg)
Third Attempt
![Page 46: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/46.jpg)
Third Attempt
![Page 47: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/47.jpg)
Third Attempt
![Page 48: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/48.jpg)
ALS Running Times
Via Xiangrui Meng (Databricks) http://stanford.edu/~rezab/sparkworkshop/slides/xiangrui.pdf
![Page 49: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/49.jpg)
Fin
![Page 50: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/50.jpg)
![Page 51: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/51.jpg)
![Page 52: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/52.jpg)
![Page 53: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/53.jpg)
![Page 54: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/54.jpg)
![Page 55: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/55.jpg)
![Page 56: Collaborative Filtering with Spark](https://reader033.vdocuments.us/reader033/viewer/2022051207/540dcbb18d7f72767e8b4b48/html5/thumbnails/56.jpg)