outlier-robust matrix completion via - sigport · 2019. 12. 18. · matrix completion via l...
TRANSCRIPT
![Page 1: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/1.jpg)
H. C. So Page 1
Outlier-Robust
Matrix Completion via
lp-Minimization
Hing Cheung So
http://www.ee.cityu.edu.hk/~hcso
Department of Electronic Engineering, City University of Hong Kong
W.-J.Zeng and H.C.So, “Outlier-robust matrix completion via lp-minimization,” IEEE Transactions on Signal Processing, vol.66, no.5, pp.1125-1140, March 2018
![Page 2: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/2.jpg)
H. C. So Page 2
Outline
Introduction
Matrix Completion via lp-norm Factorization
Iterative lp-Regression
Alternating Direction Method of Multipliers (ADMM)
Numerical Examples
Concluding Remarks
List of References
![Page 3: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/3.jpg)
H. C. So Page 3
Introduction
What is Matrix Completion?
The aim is to recover a low‐rank matrix given only a subset
of its possibly noisy entries, e.g.,
1 4
2 5
5 4
4 5
?
? ?
? ? ?
?
?
?
?? ?
![Page 4: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/4.jpg)
H. C. So Page 4
Let be a matrix with missing entries:
where is a subset of the complete set of entries ,
while the unknown entries are assumed zero.
Matrix completion refers to finding , given the incomplete observations with the low‐rank information of
, which can be mathematically formulated as:
That is, among all matrices consistent with the observed
entries, we look for the one with minimum rank.
![Page 5: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/5.jpg)
H. C. So Page 5
Why Matrix Completion is Important?
It is a core problem in many applications including:
Collaborative Filtering
Image Inpainting and Restoration
System Identification
Node Localization
Genotype Imputation
It is because many real-world signals can be approximated by a matrix whose rank is .
Netflix Prize, whose goal was to accurately predict user preferences with the use of a database of over 100 million movie ratings made by 480,189 users in 17,770 films,
![Page 6: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/6.jpg)
H. C. So Page 6
which corresponds to the task of completing a matrix with
around 99% missing entries.
Alice
Bob
Carol
Dave
1 4
2 5
5 4
4 5
![Page 7: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/7.jpg)
H. C. So Page 7
How to Recover an Incomplete Matrix?
Directly solving the noise-free version:
or noisy version:
is difficult because the rank minimization problem is NP-hard.
A popular and practical solution is to replace the nonconvex rank by convex nuclear norm:
![Page 8: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/8.jpg)
H. C. So Page 8
or
where equals the sum of singular values of .
However, complexity of nuclear norm minimization is still high and this approach is not robust if contains outliers.
Another popular direction which is computationally simple is
to apply low-rank matrix factorization:
where and . Again, the Frobenius norm is
not robust against impulsive noise.
![Page 9: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/9.jpg)
H. C. So Page 9
Matrix Completion via lp-norm Factorization
To achieve outlier resistance, we robustify the matrix
factorization formulation via generalization of the Frobenius norm to ‐norm where :
where denotes the element-wise -norm of a matrix:
![Page 10: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/10.jpg)
H. C. So Page 10
Iterative lp-Regression To ‐norm minimization, our first idea is to adopt the
alternating minimization strategy:
and
where the algorithm is initialized with , and represents
the estimate of at the th iteration.
After determining and , the target matrix is obtained as .
![Page 11: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/11.jpg)
H. C. So Page 11
We now focus on solving:
for a fixed . Note that is dropped for notational
simplicity.
Denoting the th row of and the th column of as and
, where , , , the problem can
be rewritten as:
Since is decoupled w.r.t. , it is equivalent to solving
the following independent subproblems:
![Page 12: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/12.jpg)
H. C. So Page 12
where denotes the set
containing the row indices for the th column in . Here,
stands for the cardinality of and in general .
For example, consider :
For , the and entries are observed, and thus
. Similarly, and . Combining
the results yields .
![Page 13: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/13.jpg)
H. C. So Page 13
Define containing the rows indexed by :
and , then we obtain:
which is a robust linear regression in ‐space.
For , it is a least squares (LS) problem with solution
being , and the corresponding computational
complexity is .
![Page 14: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/14.jpg)
H. C. So Page 14
For , the ‐regression can be efficiently solved by
the iteratively reweighted least squares (IRLS). At the th
iteration, the IRLS solves the following weighted LS problem:
where with
The is the th element of and . As only
one LS problem is required to solve in each IRLS iteration, its complexity is . Hence the total complexity for
all -regressions is due to .
![Page 15: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/15.jpg)
H. C. So Page 15
Due to the same structure in ,
The th row of is updated by
where is the set containing the
column indices for the th row in .
Using previous example, only entry is observed for ,
and thus . Similarly, , and
. Here, contains columns indexed
by and . The involved complexity
is and hence the total complexity for solving all
‐regressions is due to .
![Page 16: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/16.jpg)
H. C. So Page 16
![Page 17: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/17.jpg)
H. C. So Page 17
ADMM
Assign:
The proposed robust formulation is then equivalent to:
Its augmented Lagrangian is:
where with for contains
Lagrange multipliers.
![Page 18: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/18.jpg)
H. C. So Page 18
The Lagrange multiplier method aims to find a saddle point
of:
The solution is obtained by applying the ADMM via the
following iterative steps:
![Page 19: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/19.jpg)
H. C. So Page 19
Ignoring the constant term independent of , it is shown
that
is equivalent to:
which can be solved by Algorithm 1 with , with a
complexity bound of , where is the required
iteration number.
![Page 20: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/20.jpg)
H. C. So Page 20
For the problem of
It can be simplified as:
where
We only need to consider the entries indexed by because other entries of and which are not in are zero.
![Page 21: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/21.jpg)
H. C. So Page 21
Defining , , , and as the vectors that contain
the observed entries in , , , and , we have the
equivalent vector optimization problem:
whose solution can be written in proximity operator:
Denoting and , , as the th entry of and ,
and noting the separability of the problem, we solve
independent scalar problems instead:
![Page 22: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/22.jpg)
H. C. So Page 22
For , closed‐form solution exists:
with a marginal complexity of .
For , the solution of the scalar minimization problem is:
where with being the unique root of:
in and the bisection method can be used.
![Page 23: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/23.jpg)
H. C. So Page 23
Although computing the proximity operator for still has
a complexity of , it is more complicated than
because there is no closed‐form solution.
On the other hand, the solution for the case of can
be obtained in a similar manner. Again, there is no closed‐
form solution and calculating the proximity operator for has a complexity of although an iterative
procedure for root finding is required.
Note that the choice of is more robust than employing
and is computationally simpler.
![Page 24: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/24.jpg)
H. C. So Page 24
For
It is converted in vector form:
whose complexity is .
Note that at each iteration, instead of is needed to
compute, whose complexity is because only inner
products are calculated.
The algorithm is terminated when
![Page 25: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/25.jpg)
H. C. So Page 25
![Page 26: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/26.jpg)
H. C. So Page 26
Numerical Examples
is generated by multiplying and
whose entries are standard Gaussian distribution.
45% entries of are randomly selected as observations.
, and .
Performance measure is:
CPU times for attaining of SVT, SVP, -
regression with and and ADMM with are
10.7s, 8.0s, 0.28s, 4.5s, and 0.28s, respectively.
![Page 27: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/27.jpg)
H. C. So Page 27
RMSE versus iteration number in noise-free case
![Page 28: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/28.jpg)
H. C. So Page 28
RMSE versus iteration number in GMM noise at SNR=6dB
![Page 29: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/29.jpg)
H. C. So Page 29
RMSE versus SNR
![Page 30: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/30.jpg)
H. C. So Page 30
Results of image inpainting in salt-and-pepper noise
![Page 31: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/31.jpg)
H. C. So Page 31
Concluding Remarks
Two algorithms for robust matrix completion using low-rank factorization via -norm minimization with
are devised.
The first tackles the nonconvex factorization with missing data by iteratively solving multiple independent linear -
regressions.
The second applies ADMM in ‐space: At each iteration,
it requires solving a LS matrix factorization problem and calculating proximity operator of the th power of ‐
norm. The LS factorization can be efficiently solved using linear LS regression while the proximity operator has
![Page 32: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/32.jpg)
H. C. So Page 32
closed‐form solution for or can be obtained by root
finding of a scalar nonlinear equation for .
Both are based on alternating optimization, and have
comparable recovery performance and computational complexity of where is a fixed constant of
several hundreds to thousands.
Their superiority over the SVT and SVP in terms of
implementation complexity, recovery capability and outlier-robustness is demonstrated.
![Page 33: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/33.jpg)
H. C. So Page 33
List of References
[1] E. J. Candès and Y. Plan, “Matrix completion with noise,”
Proceedings of the IEEE, vol. 98, no. 6, pp. 9255–936,
Jun. 2010.
[2] J.-F. Cai, E. J. Candès and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM
Journal on Optimization, vol. 20, no. 4, pp. 1956–1982,
2010.
[3] P. Jain, R. Meka and I. S. Dhillon, “Guaranteed rank minimization via singular value projection,” Advances in
Neural Information Processing Systems , pp. 937–945,
2010.
[4] Y. Koren, R. Bell and C. Volinsky, “Matrix factorization
techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37, 2009.
![Page 34: Outlier-Robust Matrix Completion via - SigPort · 2019. 12. 18. · Matrix Completion via l p-Minimization Hing Cheung So ... Genotype Imputation It is because many real-world signals](https://reader033.vdocuments.us/reader033/viewer/2022060521/604fde8652277b17f1661246/html5/thumbnails/34.jpg)
H. C. So Page 34
[5] R. H. Keshavan, A. Montanari and S. Oh, “Matrix
completion from a few entries,” IEEE Transactions on
Information Theory, vol. 56, no. 6, pp. 2980–2998, Jun. 2010.
[6] R. Sun and Z.-Q. Luo, “Guaranteed matrix completion
via nonconvex factorization,” IEEE Transactions on
Information Theory, vol. 62, no. 11, pp. 6535–6579,
Nov. 2016. [7] G. Marjanovic and V. Solo, “On optimization and
matrix completion,” IEEE Transactions on Signal
Processing, vol. 60, no. 11, pp. 5714–5724, Nov. 2012.
[8] J. Liu, P. Musialski, P. Wonka and J. Ye, “Tensor
completion for estimating missing values in visual data,”
IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 35, no. 1, pp. 208–220, Jan. 2013.