![Page 1: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/1.jpg)
1
High-dimensional Error Analysis of Regularized M-Estimators
Ehsan Abbasi, Christos Thrampoulidis, Babak Hassibi
Allerton Conference, Wednesday September 30, 2015
2
Linear Regression Model
Estimate the unknown signal x0 from noisy linear measurements y = A x0 + z:
• A: measurement/design matrix
• x0: unknown signal
• z: noise vector
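As a toy illustration of this measurement model, the sketch below generates a synthetic instance; the dimensions and noise level are made up for illustration, not from the talk.

```python
import numpy as np

# Hypothetical toy instance of the model y = A x0 + z.
rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.standard_normal((m, n))    # measurement/design matrix
x0 = rng.standard_normal(n)        # unknown signal
z = 0.1 * rng.standard_normal(m)   # noise vector
y = A @ x0 + z                     # noisy linear measurements
print(y.shape)
```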
![Page 3: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/3.jpg)
3
M-estimatorsFor some convex loss function solve:
• Maximum Likelihood (ML) estimators
?
• least-squares, least-absolute deviationsHuber-loss, etc…
Fisher information, consistency, asymptotic normality,Cramer-Rao bound, ML, robust statistics, Huber loss, optimal loss …
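The simplest such M-estimator is least squares, which admits a closed-form solution. A minimal sketch, with an illustrative problem instance:

```python
import numpy as np

# Least-squares M-estimator: x_hat = argmin_x ||y - A x||^2,
# solved in closed form; sizes and noise level are illustrative.
rng = np.random.default_rng(1)
m, n = 200, 20
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
y = A @ x0 + 0.1 * rng.standard_normal(m)

x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
rel_err = np.linalg.norm(x_hat - x0) / np.linalg.norm(x0)
print(rel_err)  # small: many clean measurements per unknown
```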
![Page 4: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/4.jpg)
4
Why revisit & what changes?
• Modern: n is increasingly large machine learning, image processing, sensor/social networks, DNA microarrays, ...
• Structured signals: sparse, low-rank, block-sparse, low-varying …
Regularized M-estimators
• Compressive sensing:
• Traditional: but the ambient dimension n is fixed
• Regularizer is structure inducing, convex, typically non-smoothL1 , nuclear, L1/L2 norms, total variation …atomic norms
atomic norms
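A canonical regularized M-estimator is the LASSO: least-squares loss with an L1 regularizer. The sketch below solves it with a plain proximal-gradient (ISTA) loop; the solver and the problem instance are illustrative, not the talk's method.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=2000):
    """ISTA for the LASSO: min_x 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)           # gradient of the least-squares loss
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(2)
m, n, k = 80, 200, 5                        # compressed setting: m < n, k-sparse signal
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n); x0[:k] = 1.0
y = A @ x0 + 0.01 * rng.standard_normal(m)
x_hat = ista(A, y, lam=0.02)
print(np.linalg.norm(x_hat - x0))  # small: sparsity is recovered despite m < n
```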
![Page 5: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/5.jpg)
5
Classical question - Modern regime: New results & phenomena
• High-dimensional Proportional regime
?
• Question goes back to 50’s (Huber, Kolmogorov…)• Only very recent advances, special instances, strict assumptions• No general theory!
has entries iid GaussianAssumption:
• benchmark in CS/statistics theory• universality
![Page 6: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/6.jpg)
6
Contribution
• at a rate Assume
• has entries iid Gaussian
• mild regularity conditions on , pz, f, and px0
Then, with probability one,
where is the unique solution to a system of four nonlinear equationsin four unknowns :
![Page 7: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/7.jpg)
7
The Equations
Let’s parse them,to get some insight …
![Page 8: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/8.jpg)
8
The Explicit ones
and appear in the equations explicitly.
![Page 9: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/9.jpg)
9
The Loss and the Regularizer
The loss function and the regularizer appear through their Moureau envelope approximations.
In the traditional regime instead of the Moureau envelopes the functions themselves appear
![Page 10: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/10.jpg)
10
The Distributions
The convolution of the pdf of the noise with a gaussian is a completely new phenomenon compared to the traditional regime
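This convolution can be visualized numerically. The sketch below convolves an illustrative Laplace noise density with a standard Gaussian on a grid and checks that the result is again a probability density; the noise choice and the grid are assumptions for the demo.

```python
import numpy as np

# Discretize the two densities on a common grid and convolve.
t = np.linspace(-10, 10, 2001)
dt = t[1] - t[0]
p_z = 0.5 * np.exp(-np.abs(t))                   # Laplace noise pdf (illustrative)
gauss = np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)   # standard Gaussian pdf
p_conv = np.convolve(p_z, gauss, mode="same") * dt  # (p_z * N(0,1))(t)

mass = p_conv.sum() * dt
print(mass)  # ~1: the convolution is again a probability density
```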
![Page 11: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/11.jpg)
11
The Expected Moureau Envelope• The role of and is summarized in
• how they affect error performance of the M-estimator • (strictly) convex and continuously differentiable
even if is non-differentiable!
• generalizes the “Gaussian width” or “Gaussian distance squared” or “statistical dimension”.
• same for and
![Page 12: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/12.jpg)
12
Reminder: Moureau EnvelopesMoureau-Yoshida envelope of evaluated at with parameter :
• always underestimates f at x. The smaller the τ the closer to f
• smooth approximation always continuously differentiable in both x and τ
( even if f is non-differentiable )• jointly convex in x and τ
• optimal v is unique (proximal operator)
• everything extends to vector-valued function f
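These properties can be checked numerically for f(x) = |x|, whose Moreau envelope is the Huber function (a standard closed form):

```python
import numpy as np

# Moreau envelope of f = |.|: e_f(x; tau) = min_v |v| + (x - v)^2 / (2 tau),
# which evaluates to the Huber function.
def moreau_env_abs(x, tau):
    return np.where(np.abs(x) <= tau, x**2 / (2 * tau), np.abs(x) - tau / 2)

x = np.linspace(-3, 3, 601)
for tau in (1.0, 0.1):
    # the envelope underestimates f everywhere
    assert np.all(moreau_env_abs(x, tau) <= np.abs(x) + 1e-12)
# the smaller the tau, the closer the envelope is to f
assert np.all(moreau_env_abs(x, 0.1) >= moreau_env_abs(x, 1.0) - 1e-12)
print("ok")
```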
![Page 13: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/13.jpg)
13
Examples
![Page 14: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/14.jpg)
14
Set Indicator Function
Gaussian width
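One concrete instance of these geometric summary parameters can be checked by Monte Carlo: for the indicator of the nonnegative orthant in R^n, the expected squared distance of a standard Gaussian vector to the orthant equals n/2 (a standard statistical-dimension computation). The simulation setup below is illustrative.

```python
import numpy as np

# dist(g, R^n_+)^2 is the sum of squared negative parts of g.
rng = np.random.default_rng(3)
n, trials = 100, 20000
g = rng.standard_normal((trials, n))
dist2 = np.sum(np.minimum(g, 0.0) ** 2, axis=1)
print(dist2.mean())  # ~ n/2 = 50
```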
![Page 15: High-dimensional Error Analysis of Regularized M-Estimators Ehsan AbbasiChristos ThrampoulidisBabak Hassibi Allerton Conference Wednesday September 30,](https://reader035.vdocuments.us/reader035/viewer/2022070412/5697bf811a28abf838c854f2/html5/thumbnails/15.jpg)
15
Summarizing Key Features
• Squared error of general Regularized M-estimators• Minimal and generic regularity assumptions
– non-smooth, heavy-tails, non-separable, …• Key role of Expected Moureau envelopes
– strictly convex and smooth– generalize known geometric summary parameters
• Observation: fast solution by simple iterative scheme!
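The four equations themselves are not reproduced in this transcript, but the kind of simple iterative scheme alluded to can be sketched generically: write the 4x4 nonlinear system as a fixed point u = F(u) and iterate. The map F below is a made-up placeholder contraction, not the paper's actual equations.

```python
import numpy as np

def F(u):
    # Placeholder contraction in 4 variables, standing in for the real update map.
    a, b, c, d = u
    return np.array([0.5 * np.cos(b) + 0.1,
                     0.5 * np.sin(a) + 0.2,
                     0.3 * np.tanh(d) + 0.1,
                     0.4 * np.exp(-c**2)])

u = np.zeros(4)
for _ in range(100):   # plain fixed-point iteration
    u = F(u)

print(np.linalg.norm(u - F(u)))  # ~0: u solves the 4x4 nonlinear system
```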
16
Simulations
Optimal tuning?
17
Non-smooth losses
18
Non-smooth losses
Optimal loss?
19
Non-smooth losses
Consistent Estimators?
20
Heavy-tailed noise
• Huber loss function + i.i.d. Cauchy noise. Robustness?
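A rough numerical illustration of this robustness (the solver and the instance are illustrative, not the talk's setup): fit under i.i.d. Cauchy noise with the Huber loss via gradient descent, and compare with plain least squares.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Gradient of the Huber loss: linear near 0, clipped for large residuals."""
    return np.clip(r, -delta, delta)

rng = np.random.default_rng(4)
m, n = 500, 10
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
y = A @ x0 + rng.standard_cauchy(m)          # heavy-tailed noise

# Gradient descent on the (convex, smooth) Huber objective.
x = np.zeros(n)
step = 1.0 / (np.linalg.norm(A, 2) ** 2)
for _ in range(2000):
    x -= step * A.T @ huber_grad(A @ x - y)

x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)  # least squares for comparison
# Under Cauchy noise the Huber error is typically much smaller than the LS error.
print(np.linalg.norm(x - x0), np.linalg.norm(x_ls - x0))
```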
21
Non-separable loss
Square-root LASSO
22
Beyond Gaussian Designs
• The analysis framework applies directly to elliptically distributed designs
• For the LASSO, we have extended the ideas to IRO matrices
• Universality over i.i.d. entries (empirical observation): modified equations
23
Convex Gaussian Min-max Theorem
Apply CGMT to
(PO)
(AO)
Theorem (CGMT) [TAH’15,TOH’15]
24
Proof Diagram
M-estimator -> [duality] -> (PO) -> [CGMT] -> (AO) -> (DO): deterministic min-max optimization in 4 variables -> [first-order optimality conditions] -> The Equations
25
Related Literature
• [El Karoui 2013, 2015]: ridge regularization, smooth loss, no structured x0; elliptical distributions; i.i.d. entries beyond Gaussian
• [Donoho, Montanari 2013]: no regularizer; smooth + strongly convex loss, bounded noise
26
Conclusions
• Master Theorem for general M-estimators
– minimal assumptions
– 4 nonlinear equations, unique solution, fast iterative solution (why?)
– summary parameters: expected Moreau envelopes
• Opportunities, lots to be asked...
– Optimal loss function? Optimal regularizer?
– When can we be consistent?
– Optimally tuning the tuning parameter?
• LASSO: Linear = Non-linear [TAH'15 NIPS]
• The CGMT framework is powerful
– non-linear measurements, y = g(A x0)
– beyond squared-error analysis: apply the CGMT for a different set S [TAYH'15 ICASSP]
Thank You!