Online Principal Component Analysis

Boutsidis, Garber, Karnin, Liberty

PRESENTED BY Zohar Karnin | November 23, 2014

Data Matrix

2 Yahoo labs

•  Often, data is represented as a huge matrix

•  Sometimes, we can’t store the entire matrix

Principal Component Analysis

•  Often, we require a low-rank approximation of a matrix $A$
    ›  Recommender systems, images, LSA, …

•  The approximation is used to save space and, often, to clean up noise

[Figure: the matrix $A$ shown as a sum of component matrices]

Column by Column Stream

•  Data arrives column by column
•  column = item, and we see the items one at a time

The Formal Stream Setup

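•  Observe $x_1 \in \mathbb{R}^d$, output $y_1 \in \mathbb{R}^k$
•  …
•  Observe $x_t \in \mathbb{R}^d$, output $y_t \in \mathbb{R}^k$

Cost $= \min_{\Phi} \sum_t \|x_t - \Phi y_t\|^2$
s.t. $\Phi$ is an embedding from $\mathbb{R}^k$ to $\mathbb{R}^d$: $\|y_i - y_j\| = \|\Phi y_i - \Phi y_j\|$

[Figure: input matrix $X$ and output matrix $Y$]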
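
For a fixed output $Y$, the minimization over $\Phi$ in the cost has a closed form. The sketch below is an illustration, not part of the talk. It assumes NumPy, treats an embedding as a $d \times k$ matrix with orthonormal columns, and uses the standard orthogonal Procrustes solution. The function names `best_isometry` and `embedding_cost` are hypothetical.

```python
import numpy as np

def best_isometry(X, Y):
    """Return Phi (d x k, orthonormal columns) minimizing ||X - Phi Y||_F.

    Assumption: orthogonal Procrustes solution Phi = W V^T, where
    X Y^T = W S V^T is a thin SVD. Requires d >= k.
    """
    W, _, Vt = np.linalg.svd(X @ Y.T, full_matrices=False)
    return W @ Vt

def embedding_cost(X, Y):
    """Cost = min over isometries Phi of sum_t ||x_t - Phi y_t||^2."""
    Phi = best_isometry(X, Y)
    return float(np.linalg.norm(X - Phi @ Y, "fro") ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 40))   # d = 6, n = 40 (toy sizes)
Y = rng.standard_normal((2, 40))   # k = 2
Phi = best_isometry(X, Y)
print(Phi.T @ Phi)                 # approximately the 2 x 2 identity
print(embedding_cost(X, Y))
```

Because $\Phi^\top \Phi = I$, distances between outputs are preserved, which is the constraint stated on the slide.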

The Cost Function

[Figure: output $Y$ and input $X$ drawn as matrices, with $X - \Phi Y = R$]

•  $Y$ = output, $X$ = input
•  $\Phi Y$ = embedding of $Y$ into the same space as $X$
•  $R = X - \Phi Y$ = error matrix
•  Frobenius error $= \|R\|_F^2 = \sum_{ij} (X_{ij} - (\Phi Y)_{ij})^2$ = MSE (up to normalization)
•  Spectral error $= \|R\|_2^2 = \max_{\|v\|=1} \|v^\top X - v^\top (\Phi Y)\|^2$
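
The two error measures can be compared numerically. The following is a minimal sketch assuming NumPy. The helper names are hypothetical, the spectral error is taken as the squared spectral norm, and the reconstruction `PhiY` is a placeholder chosen only for illustration.

```python
import numpy as np

def frobenius_error(X, PhiY):
    """||R||_F^2: squared error summed over all entries of R = X - PhiY."""
    R = X - PhiY
    return float(np.sum(R ** 2))

def spectral_error(X, PhiY):
    """||R||_2^2: largest squared error along any single unit direction v."""
    R = X - PhiY
    return float(np.linalg.norm(R, 2) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 30))
PhiY = 0.5 * X        # placeholder reconstruction, for illustration only
print(frobenius_error(X, PhiY), spectral_error(X, PhiY))
```

The spectral error never exceeds the Frobenius error, and the Frobenius error is at most rank($R$) times the spectral error.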


Secondary Costs: Computational Resources

•  Run time: number of operations required per observed column
•  Memory

Previous Works

•  Regret minimization setting [WK 07], [NKW 13]

•  At time $t$, before observing $x_t$, predict $U_t$, a projection matrix onto a $k$-dimensional subspace. The loss is $\|x_t - U_t x_t\|^2$

•  Each $U_t$ can be completely different

•  Stochastic setting [ACS 13], [MCJ 13], [BDF 13]
    ›  $x_t$ are drawn i.i.d. from some distribution. Objective: find $U$ as quickly as possible, minimizing $\mathbb{E}[\|x_t - U x_t\|^2]$

•  Reconstruction matrix (not an embedding) [CW 09]
    ›  $\min_{\Phi} \sum_t \|x_t - \Phi y_t\|^2$ s.t. $\Phi$ is an arbitrary linear transformation from $\mathbb{R}^k$ to $\mathbb{R}^d$

Results

•  $X$ = $d \times n$ matrix whose columns are observed
•  $k \ll d$
•  $X_k$ = best rank-$k$ approximation of $X$ (top $k$ directions)
•  $\mathrm{OPT} = \|X - X_k\|_F^2$

•  Theorem 1: Given $\|X\|_F$, $k$, $\varepsilon$: Error $= \mathrm{OPT} + \varepsilon \|X\|_F^2$
    ›  Memory, target dimension, processing time per column $= O(k/\varepsilon^2)$

•  Theorem 2: Given $k$, $\varepsilon$: Error $= \mathrm{OPT} + \varepsilon \|X\|_F^2$
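
The benchmark quantities $X_k$ and OPT can be computed offline with a truncated SVD. The sketch below assumes NumPy and access to the whole matrix at once, which the streaming setting does not allow. It is meant only to make the definitions concrete, and the name `best_rank_k` is hypothetical.

```python
import numpy as np

def best_rank_k(X, k):
    """Return (X_k, OPT) with OPT = ||X - X_k||_F^2, via a truncated SVD."""
    W, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xk = (W[:, :k] * s[:k]) @ Vt[:k, :]     # keep the top k directions
    opt = float(np.sum(s[k:] ** 2))         # energy in the discarded directions
    return Xk, opt

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 50))
Xk, opt = best_rank_k(X, k=2)
print(opt, np.linalg.norm(X - Xk, "fro") ** 2)   # the two values agree up to rounding
```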


The “Operator Norm” Cost Function

•  $Y$ = output matrix $[y_1, \dots, y_n]$

•  Cost $= \|X - \Phi Y\|_F^2$
    ›  Interpretation: mean squared error

•  $\|X - X_k\|_F^2$ (noise) $\ll \|X_k\|_F^2$ (signal)

•  $\|X - X_k\|_F^2 \gg \|X_k\|_F^2$ … but … $\|X - X_k\|_2^2$ (noise) $\ll \|X_k\|_2^2$ (signal)

•  Alternative cost: $\|X - \Phi Y\|_2^2$
    ›  Interpretation: bounds, over all unit vectors $v$, $\|v^\top X - v^\top \Phi Y\|$
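
A toy matrix shows how the noise can dominate in Frobenius norm while staying small in operator norm. The construction below is an assumption made for illustration (one strong direction of strength 5 plus 400 weak directions of strength 1). It is not taken from the talk and assumes NumPy.

```python
import numpy as np

def noise_vs_signal(signal=5.0, noise_dims=400):
    """Toy matrix: one strong direction plus many weak, equally sized ones."""
    X = np.diag([signal] + [1.0] * noise_dims)
    Xk = np.zeros_like(X)
    Xk[0, 0] = signal                  # best rank-1 approximation (k = 1)
    N = X - Xk                         # the "noise" part X - X_k
    return {
        "frob_noise": float(np.linalg.norm(N, "fro") ** 2),
        "frob_signal": float(np.linalg.norm(Xk, "fro") ** 2),
        "spec_noise": float(np.linalg.norm(N, 2) ** 2),
        "spec_signal": float(np.linalg.norm(Xk, 2) ** 2),
    }

print(noise_vs_signal())
```

With these defaults the squared Frobenius norm of the noise is about 400 against 25 for the signal, while the squared operator norm of the noise is about 1 against 25 for the signal.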

Results

•  Theorem 3 [under construction]: Given $\|X\|_2$, $\|X - X_k\|_2$, $k$, $\varepsilon$: Operator norm error $= \mathrm{OPT}_{\text{operator}} + \varepsilon \|X\|_2^2$
    ›  Target dimension $= O(k/\varepsilon)$

Algorithm

•  Maintain $U : \mathbb{R}^d \to \mathbb{R}^\ell$

•  Directions are only added, never removed (for now)

•  $r$ = tolerable error radius $= \|X\|_F / \ell^{1/2}$
•  “Error ellipsoid”
•  Add vector $u_1$ to $U$

[Animation frames omitted: the error ellipsoid and the tolerable error radius $r$]
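
The add-only scheme can be sketched in code. This is a simplified illustration under stated assumptions, not necessarily the authors' exact procedure. It assumes NumPy, that $\|X\|_F$ is known in advance, and that every column satisfies $\|x_t\|^2 \le \|X\|_F^2/\ell$. It tests the ellipsoid against $2r^2$ rather than $r^2$ (an assumption that guarantees the inner loop terminates), and it stores the full $d \times d$ residual covariance. The function name is hypothetical.

```python
import numpy as np

def online_pca_add_only(X, ell):
    """Add-only online PCA sketch; see the stated assumptions above."""
    d, n = X.shape
    r2 = np.linalg.norm(X, "fro") ** 2 / ell      # squared tolerable radius r^2
    if np.max(np.sum(X ** 2, axis=0)) > r2:
        raise ValueError("a column is too heavy for this sketch")
    U = np.zeros((d, 0))                          # directions added so far (columns)
    C = np.zeros((d, d))                          # residual covariance ("error ellipsoid")
    outputs = []
    for t in range(n):
        x = X[:, t]
        r = x - U @ (U.T @ x)                     # residual of the new column
        # Assumed threshold 2 * r2: while the ellipsoid is too long, add its top axis.
        while np.linalg.eigvalsh(C + np.outer(r, r))[-1] > 2 * r2:
            lam, V = np.linalg.eigh(C)            # eigenvalues in ascending order
            u = V[:, -1]                          # longest axis of the ellipsoid
            U = np.column_stack([U, u])           # add it to U
            C = C - lam[-1] * np.outer(u, u)      # and remove it from the ellipsoid
            r = x - U @ (U.T @ x)
        C = C + np.outer(r, r)
        outputs.append(U.T @ x)                   # y_t uses the directions known at time t
    Y = np.zeros((U.shape[1], n))
    for t, y in enumerate(outputs):
        Y[: y.size, t] = y                        # pad earlier outputs with zeros
    return U, Y

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 300))
X += 0.1 * rng.standard_normal((30, 300))
X /= np.linalg.norm(X, axis=0)                    # unit-norm columns (toy data)
U, Y = online_pca_add_only(X, ell=10)
print(U.shape[1], np.linalg.norm(X - U @ Y, 2) ** 2)
```

Under these assumptions the number of added directions stays below $\ell$, and the squared operator norm of the residual $X - UY$ stays at or below $2\|X\|_F^2/\ell$.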

Analysis: Target Dimension

•  $r$ = tolerable error radius $= \|X\|_F / \ell^{1/2}$

•  Target dimension = number of vectors added to $U$
•  Observation: adding a vector to $U$ requires $\|X\|_F^2 / \ell$ weight from $\|X\|_F^2$
    $\Rightarrow$ number of vectors added to $U$ $\le \ell$
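
The counting argument can be written out as one chain of inequalities. This is a sketch: $m$ (the number of directions added) and $\lambda_j$ (the weight consumed when the $j$-th direction is added) are names introduced here, not on the slides.

```latex
\[
  m \cdot \frac{\|X\|_F^2}{\ell}
  \;\le\; \sum_{j=1}^{m} \lambda_j
  \;\le\; \|X\|_F^2
  \quad\Longrightarrow\quad
  m \le \ell .
\]
```

The first inequality holds because each added direction must have accumulated at least $\|X\|_F^2/\ell$ of weight. The second holds because the weights are drawn from disjoint parts of the total mass $\|X\|_F^2$.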

Analysis: Cost

•  “Error ellipsoid”
•  $Y$ = output matrix
•  $R$ = error matrix $= X - U_n^\top Y$
•  Operator norm cost $= \|R\|_2^2 = \max\{r_1^2, r_2^2\}$
•  Cost $= \|R\|_F^2 = r_1^2 + r_2^2$

[Figure: error ellipsoid with semi-axes $r_1$ and $r_2$]

Statements:
•  $\|R\|_2^2 \le r^2 = \|X\|_F^2 / \ell$
•  $\|R\|_F^2 \le$ loss from $X_k$ + loss from $X - X_k$ $\le \|X\|_F^2 (k/\ell)^{1/2} + \|X - X_k\|_F^2$

Implementation: Memory and Run-time Complexity

$r_t = x_t - U_t x_t$

$R = [r_1, r_2, \dots, r_t]$

•  Straightforward version requires maintaining $RR^\top$
    ›  Update time, memory requirements $= d^2$

•  Instead: maintain $Z$, a $d \times \ell$ matrix such that $ZZ^\top \approx RR^\top$

•  $\|ZZ^\top - RR^\top\|_2 < \|R\|_F^2 / \ell$

•  [Lib 12] Update time, memory requirements $= d\ell$
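
A sketch $Z$ with this kind of guarantee can be maintained with a Frequent Directions style update, which is presumably what [Lib 12] refers to. The version below is a minimal illustration assuming NumPy and $\ell \le d$. It runs one SVD per column for clarity, so it costs more per update than the $d\ell$ quoted on the slide, which would need batched updates. The function name is hypothetical.

```python
import numpy as np

def frequent_directions(R, ell):
    """Return Z (d x ell) with ||Z Z^T - R R^T||_2 <= ||R||_F^2 / ell."""
    d, n = R.shape
    if ell > d:
        raise ValueError("this sketch assumes ell <= d")
    Z = np.zeros((d, ell))
    for t in range(n):
        Z[:, -1] = R[:, t]                       # the last column is zero at this point
        W, s, _ = np.linalg.svd(Z, full_matrices=False)
        delta = s[-1] ** 2                       # weakest squared singular value
        shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        Z = W * shrunk                           # shrink every direction; last column becomes 0
    return Z

rng = np.random.default_rng(0)
R = rng.standard_normal((20, 200))
Z = frequent_directions(R, ell=5)
err = np.linalg.norm(R @ R.T - Z @ Z.T, 2)
print(err, np.linalg.norm(R, "fro") ** 2 / 5)    # the error stays below the bound
```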

Implementation: Unknown Horizon

•  Error radius parameter $= \|X\|_F / \ell^{1/2}$

•  Def: $X_t = [x_1, \dots, x_t]$
•  Idea: use growing radius parameter $\|X_t\|_F / \ell^{1/2}$

•  Thm: works as before, but target dimension $= \ell \cdot \log(n)$
    ›  Divide time into epochs; in each epoch, $N \le \|X_t\|_F \le 2N$
    ›  At most $\ell$ directions are added in each epoch

•  Idea 2: if direction $u$ becomes weak ($\|u^\top X_t\| \ll \|X_t\|_F / \ell^{1/2}$), remove it

•  Thm: works as before, target dimension $= \ell / \varepsilon$
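
The epoch argument can be illustrated by counting how often the running norm $\|X_t\|_F$ doubles. The sketch below is plain Python. The unit-norm columns in the example are an assumption made here so that the number of epochs grows like $\log(n)$, and the function name is hypothetical.

```python
def count_epochs(column_sq_norms):
    """Count epochs such that N <= ||X_t||_F <= 2N holds inside each epoch."""
    epochs = 0
    total = 0.0        # running ||X_t||_F^2
    upper_sq = 0.0     # (2N)^2 for the current epoch
    for s in column_sq_norms:
        total += s
        if total > upper_sq:        # the norm left [N, 2N]: open a new epoch
            epochs += 1
            upper_sq = 4.0 * total  # new N is the current ||X_t||_F
    return epochs

n, ell = 1024, 8
epochs = count_epochs([1.0] * n)    # unit-norm columns, so ||X_t||_F = sqrt(t)
print(epochs, "epochs, hence at most", ell * epochs, "directions")
```

With at most $\ell$ directions added per epoch, the target dimension is bounded by $\ell$ times the number of epochs.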

Conclusions and Open Questions

•  We obtain error $= \mathrm{OPT} + \varepsilon \|X\|_F^2$ with target dimension $O(k/\varepsilon^3)$. Can we reduce the dependence on $\varepsilon$?
•  Improve to $\mathrm{OPT}(1+\varepsilon)$?
•  Lower bound? (currently same for arbitrary reconstruction matrix)
•  Obtain approximation of $\mathrm{OPT} + \varepsilon \|X - X_k\|_2^2$

Thank you!
