All you wanted to know about (sparse / high dimensional) PCA and Covariance Estimation
— but didn’t dare to ask

Boaz Nadler
Department of Computer Science and Applied Mathematics
The Weizmann Institute of Science
June 2018




Part 0: Introduction


Introduction

In many applications we need to analyze multivariate data,

x_1, . . . , x_n ∈ R^p

Throughout the talk: p denotes the dimension, n the number of samples.

Many different scenarios:

- p small, n large.

- both p and n large.

- p large, n small.


Unsupervised Tasks

- Visualization

- Dimensionality Reduction

- Compression, Denoising

- Exploratory Data Analysis - finding structure in data

- Many hypothesis testing problems.
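Several of the tasks above (visualization, dimensionality reduction, denoising) are commonly attacked with PCA. A minimal sketch via the SVD of the centered data matrix; the synthetic data and all parameter choices here are illustrative assumptions, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n = 500 samples in p = 10 dimensions, with extra
# variance along the first coordinate axis (an assumed toy structure).
n, p = 500, 10
X = rng.standard_normal((n, p))
X[:, 0] += 3.0 * rng.standard_normal(n)

# PCA via the SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project onto the top k right singular vectors, e.g. for visualization.
k = 2
Z = Xc @ Vt[:k].T
print(Z.shape)  # (500, 2)
```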


Supervised Tasks

In some cases we also observe a response y_i for each x_i.

Common tasks:

- Classification (Linear Discriminant Analysis)

- Regression (Multivariate Linear Regression).


Density Estimation and Moments

Typical assumption: the x_i are i.i.d. draws from some random variable X with unknown density f(x).

In principle, f (x) describes everything about X .

So, we can try to estimate it.

Non-parametric (kernel) density estimation achieves

|f̂(x) − f(x)| ∼ n^(−2β/(2β+p)),

where β is a measure of the smoothness of f(x).

Curse of dimensionality: to reach a small error ε, the number of samples must grow exponentially in p, n = O(exp(Cp)).

Typically not practical if p > 15
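To see the curse numerically, one can invert the rate above: requiring n^(−2β/(2β+p)) ≤ ε gives n ≈ ε^(−(2β+p)/(2β)). A small sketch; the smoothness value β = 2 and target error ε = 0.1 are arbitrary choices for illustration:

```python
# Invert the nonparametric rate |f_hat - f| ~ n^(-2*beta/(2*beta+p)):
# the sample size needed for a target error eps is
#   n = eps^(-(2*beta + p) / (2*beta)).
def required_samples(eps, p, beta=2.0):
    return eps ** (-(2 * beta + p) / (2 * beta))

for p in (1, 5, 15, 50):
    print(p, required_samples(0.1, p))
# n grows from tens of samples at p = 1 to astronomically many at p = 50.
```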


First and Second Moments of X

Since already at moderate dimensions we cannot estimate f(x), we opt for the first and second moments:

Definition: The population mean vector µ ∈ R^p is

µ = E[X] = ∫ x f(x) dx

Definition: The population covariance matrix Σ ∈ R^(p×p) is defined as

Σ = E[(X − µ)(X − µ)^T] = ∫ (x − µ)(x − µ)^T f(x) dx

In many problems, estimates of µ and Σ suffice to solve the data analysis task.


Example I: Quadratic Discriminant Analysis

Binary (2-class) classification problem:

Observe {x_i}_{i=1}^{n+} from class +1 and {x′_i}_{i=1}^{n−} from class −1.

Goal: classify the label y ∈ {±1} of a new sample x.

QDA: Assume each class is multivariate Gaussian. Then

Likelihood Ratio = C(Σ+, Σ−) · exp(−(x − µ+)^T Σ+^(−1) (x − µ+)/2) / exp(−(x − µ−)^T Σ−^(−1) (x − µ−)/2)

To apply QDA, we need to estimate µ± and Σ± (actually their inverses).
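A plug-in sketch of this rule: estimate each class mean and covariance from data and compute the log-likelihood ratio (the class sizes, dimensions, and Gaussian parameters below are toy assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy Gaussian classes with different means and covariances.
n, p = 200, 3
mu_pos, mu_neg = np.full(p, 1.0), np.full(p, -1.0)
Xp = rng.multivariate_normal(mu_pos, np.eye(p), size=n)
Xn = rng.multivariate_normal(mu_neg, 2 * np.eye(p), size=n)

def gaussian_estimates(X):
    """Plug-in estimates of a class mean and covariance."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

m_pos, S_pos = gaussian_estimates(Xp)
m_neg, S_neg = gaussian_estimates(Xn)

def qda_log_ratio(x):
    """Log-likelihood ratio of class +1 vs class -1.

    The (p/2) log(2*pi) constant cancels in the ratio and is omitted.
    """
    def log_density(x, m, S):
        d = x - m
        _, logdet = np.linalg.slogdet(S)
        return -0.5 * (d @ np.linalg.solve(S, d) + logdet)
    return log_density(x, m_pos, S_pos) - log_density(x, m_neg, S_neg)

# Positive log-ratio => classify as +1, negative => classify as -1.
print(qda_log_ratio(mu_pos) > 0, qda_log_ratio(mu_neg) < 0)
```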


Example II: Markowitz Portfolio Optimization

[Harry Markowitz, Nobel Prize ’90]
An investor can invest in p stocks, with weight vector w = (w_1, . . . , w_p), s.t. ∑ w_i = 1.

Stock i has expected return µ_i. The p stocks have joint covariance matrix Σ, where Σ_ii is the variance of the i-th stock.

Goal: Construct a portfolio that achieves an average target return µ_R, namely ∑_i w_i µ_i = µ_R, but with minimal risk (variance):

min_w  w^T Σ w

To construct the portfolio, one needs to estimate the µ_i and Σ.
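With known (or estimated) µ and Σ, the constrained minimization above has a closed-form solution via its KKT conditions: stationarity gives 2Σw + A^T λ = 0 together with the linear constraints Aw = b. A sketch with hypothetical returns and covariance (all numbers invented for illustration):

```python
import numpy as np

# Hypothetical expected returns and covariance for p = 3 stocks.
mu = np.array([0.05, 0.08, 0.12])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
mu_R = 0.10  # target portfolio return

# Minimize w' Sigma w subject to sum(w) = 1 and mu' w = mu_R.
# KKT system: [[2*Sigma, A'], [A, 0]] [w; lambda] = [0; b].
p = len(mu)
A = np.vstack([np.ones(p), mu])          # 2 x p constraint matrix
KKT = np.block([[2 * Sigma, A.T],
                [A, np.zeros((2, 2))]])
rhs = np.concatenate([np.zeros(p), [1.0, mu_R]])
w = np.linalg.solve(KKT, rhs)[:p]

print(w, w @ mu, w.sum())  # weights satisfy both constraints
```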


Example III: Dimensionality of Data / Signals in Noise

Analytical Chemistry, Array Signal Processing, ...:

Assume the “signal+noise” model

x = ∑_{j=1}^{K} s_j v_j + noise

Given observed data x1, . . . , xn, a basic question is: what is K ?

The covariance matrix of the data has K spiked eigenvalues, larger than the noise variance σ², with the remaining eigenvalues all equal to σ².

Question: How can K be estimated, and which signal strengths can be detected?
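The spiked spectrum is easy to reproduce numerically. A sketch with K = 2 assumed signal directions (dimensions, strengths, and sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# "signal + noise" model with K = 2 signal directions in p = 25 dimensions.
p, n, K, sigma = 25, 2000, 2, 1.0
V = np.linalg.qr(rng.standard_normal((p, K)))[0]  # orthonormal directions v_j
strengths = np.array([5.0, 3.0])                  # signal standard deviations

S = rng.standard_normal((n, K)) * strengths       # random coefficients s_j
X = S @ V.T + sigma * rng.standard_normal((n, p))

# Population eigenvalues are (26, 10, 1, ..., 1); the sample spectrum
# shows K = 2 spikes above a bulk of eigenvalues near sigma^2.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
print(eigvals[:4])
```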


Detection of Structure / Signals in Noise

Eigenvalues of S_n (log-scale), for p = 25 with n = 1000 and with n = 50. [figure omitted]


Estimation of Signal Direction

Suppose there is one signal in the data.

x = sv + σξ

Goal: Estimate signal direction (vector v).

Questions:
- How should it be estimated?
- How well can it be estimated?
- What happens when the dimension p is high?
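The natural estimator is the leading eigenvector of the sample covariance (the first principal component), and its quality can be measured by the overlap ⟨v̂, v⟩. A sketch under assumed toy parameters (signal strength, p, n, and σ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# One signal direction: x = s*v + sigma*xi, with true direction v = e_1.
p, n, sigma = 50, 1000, 1.0
v = np.zeros(p); v[0] = 1.0
s = 2.0 * rng.standard_normal(n)              # signal coefficients
X = np.outer(s, v) + sigma * rng.standard_normal((n, p))

# Estimate v by the eigenvector of the largest sample-covariance eigenvalue.
Sn = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(Sn)         # ascending eigenvalues
v_pca = eigvecs[:, -1]

overlap = abs(v_pca @ v)                      # |<v_pca, e_1>|, in [0, 1]
print(round(overlap, 3))
```

Here the signal is strong relative to √(p/n), so the overlap is close to 1; shrinking n or growing p degrades it, which is exactly the high-dimensional phenomenon the next figures illustrate.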


λ1, λ2 and the overlap ⟨v_k, e1⟩, as a function of the noise level σ. [figure omitted]


λ1, λ2 and the overlap ⟨v_PCA, e1⟩, as a function of n. [figure omitted]


Common Statistical Inference Problems

Given data x1, . . . , xn:

- Estimate the population covariance matrix Σ.

- Estimate the precision matrix Ω = Σ^(−1).

- Estimate the largest eigenvalues and eigenvectors of Σ.

- Suppose Σ (or Σ^(−1)) has many zeros; find their support.

- Many hypothesis testing problems: Is Σ = Σ_0?

- Uncorrelated variables: Is Σ diagonal? Is Σ banded?


Inference Problems

Typically, µ and Σ are unknown and must be estimated from data.

Classical Estimates:

Sample Mean:

x̄ = (1/n) ∑_i x_i

Sample Covariance Matrix:

S_n = (1/(n−1)) ∑_{i=1}^{n} (x_i − x̄)(x_i − x̄)^T
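A sketch of the two classical estimators in NumPy, on synthetic data; note that `np.cov` with `rowvar=False` uses the same 1/(n−1) normalization as S_n above:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 5))   # n = 100 samples, p = 5

# Sample mean.
xbar = X.mean(axis=0)

# Sample covariance, explicitly with the unbiased 1/(n-1) normalization.
Sn = (X - xbar).T @ (X - xbar) / (X.shape[0] - 1)

# np.cov(rowvar=False) computes the same estimator.
assert np.allclose(Sn, np.cov(X, rowvar=False))
print(Sn.shape)  # (5, 5)
```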


Outline

Q-1: How close are S_n and its eigenvalues/eigenvectors to those of Σ?

Part 1. p small, n→∞ [classical asymptotic statistics]

Part 2. p, n both large, [modern high dimensional statistics]

Q-2: What if we know additional information, such as sparsity?

Part 3. Sparse Principal Component Analysis, Sparse Covariance Estimation


The Important Question

Why is this interesting, relevant, and important?


Covariance Estimation will be important to you


References (Partial List)

* Nadler, Finite sample approximation results for PCA, Annals of Statistics, 2008.
* Kritchman and Nadler, Determining the number of components in a factor model from limited noisy data, 2008.
* Birnbaum et al., Minimax bounds for sparse PCA with noisy high-dimensional data, Annals of Statistics, 2013.
* Baik and Silverstein, Eigenvalues of large sample covariance matrices..., J. Mult. Anal., 2006.
* Paul, Asymptotics of sample eigenstructure..., Stat. Sinica, 2008.
* Bickel and Levina, Covariance regularization by thresholding, Annals of Statistics, 2008.
* Cai and Liu, Adaptive thresholding for sparse covariance matrix estimation, J. Am. Stat. Assoc., 2011.
* Cai, Zhang, and Zhou, Optimal rates of convergence for covariance matrix estimation, Annals of Statistics, 2010.

and many others...
