Semi-supervised Discriminant Analysis, Lishan Qiao, 2009.03.13

Page 1: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Semi-supervised Discriminant Analysis

Lishan Qiao

2009.03.13

Page 2: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Outline

• Motivation

• Locality Preserving Regularization based…

– Laplacian Linear Discriminant Analysis(LapLDA)[1]

– Semi-supervised Discriminant Analysis(SDA)[2]

– Comments: Does Locality Preserving Reg. really work?

• Optimization based…

– Semi-supervised Discriminant Analysis Via CCCP(SSDACCCP)[3]

• Conclusion

Semi-supervised Discriminant Analysis Lishan Qiao 2009-3

[1] J.H. Chen, J.P. Ye, Q. Li, Integrating global and local structures: A least squares framework for dimensionality reduction, CVPR07
[2] D. Cai, X.F. He, J.W. Han, Semi-supervised discriminant analysis, ICCV07
[3] Y. Zhang, D.Y. Yeung, Semi-supervised Discriminant Analysis Via CCCP, ECML PKDD 08

Page 3: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Motivation: Why extend LDA?

Linear Discriminant Analysis (LDA) is a popular supervised dimensionality reduction (DR) method.

Objective function:

$$\max_w \frac{w^T S_b w}{w^T S_t w}$$

However, for $n$ samples in $c$ classes, $\operatorname{rank}(S_w) \le n - c$ and $\operatorname{rank}(S_t) \le n - 1$.

Limitations of LDA and the extensions that address them:

• Small Sample Size (SSS): PseudoLDA, PCA+LDA, NullLDA, RLDA,…; 2DLDA, TensorLDA,…
• Global DR method: LapLDA, SDA, SSLDA
• Completely supervised method: SDA, SSLDA, SSDACCCP

Besides, Semi-supervised Learning:

• Co-Training
• Transductive, e.g. Label Propagation
• Inductive, e.g. LapSVM
• …
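The rank bounds on this slide can be checked numerically. The following is a minimal sketch, assuming samples are stored as the columns of a `d x n` array; the helper name `scatter_matrices` and the random data are illustrative choices, not something taken from the slides.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class, within-class and total scatter for X of shape (d, n)."""
    m = X.mean(axis=1, keepdims=True)              # global mean
    S_t = (X - m) @ (X - m).T                      # total scatter
    d = X.shape[0]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for k in np.unique(y):
        Xk = X[:, y == k]                          # samples of class k
        mk = Xk.mean(axis=1, keepdims=True)        # class mean
        S_b += Xk.shape[1] * (mk - m) @ (mk - m).T
        S_w += (Xk - mk) @ (Xk - mk).T
    return S_b, S_w, S_t

rng = np.random.default_rng(0)
d, n, c = 50, 12, 3                                # d > n: the small-sample-size regime
X = rng.normal(size=(d, n))                        # hypothetical data
y = np.repeat(np.arange(c), n // c)                # 4 samples per class
S_b, S_w, S_t = scatter_matrices(X, y)

rank_w = np.linalg.matrix_rank(S_w)
rank_t = np.linalg.matrix_rank(S_t)
print("rank(S_w) =", rank_w, " bound n - c =", n - c)
print("rank(S_t) =", rank_t, " bound n - 1 =", n - 1)
print("S_t is singular:", rank_t < d)
```

When the dimension exceeds the number of samples, both matrices are singular, which is the small sample size problem that the regularized variants of LDA try to work around.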

Page 4: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

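LapLDA Motivation & Objective function

Motivation: LDA captures the global geometric structure of the data by simultaneously maximizing the between-class distance and minimizing the within-class distance. However, local geometric structure has recently been shown to be effective for dimensionality reduction.

Graph weight $w_{ij}$ between $x_i$ and $x_j$:

$$w_{ij} = \begin{cases} \exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right), & \text{if } x_i \text{ is among the } k\text{NN of } x_j \text{ or } x_j \text{ is among the } k\text{NN of } x_i \\ 0, & \text{otherwise} \end{cases}$$

Objective function:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}, \qquad L = D - W$$

where $W = [w_{ij}]$ and $D$ is the diagonal matrix with $D_{ii} = \sum_j w_{ij}$.

LapLDA = LDA + LPP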
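To make the locality preserving term concrete, here is a small sketch of the kNN heat-kernel graph and its Laplacian, assuming samples are the columns of `X`. The function name `heat_kernel_knn_graph` and the values of `k` and `sigma` are illustrative assumptions. The last lines check the standard identity that the penalty $w^T X L X^T w$ equals half the weighted sum of squared differences between projected neighbors.

```python
import numpy as np

def heat_kernel_knn_graph(X, k=3, sigma=1.0):
    """Symmetric kNN graph with heat-kernel weights; X has shape (d, n)."""
    sq = np.sum(X**2, axis=0)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X.T @ X, 0.0)  # squared distances
    n = X.shape[1]
    D2_noself = D2.copy()
    np.fill_diagonal(D2_noself, np.inf)            # a point is not its own neighbor
    nn = np.argsort(D2_noself, axis=1)[:, :k]      # indices of the k nearest neighbors
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask = mask | mask.T                           # "or" rule: either point lists the other
    return np.where(mask, np.exp(-D2 / (2.0 * sigma**2)), 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 30))                       # hypothetical data, d = 5, n = 30
W = heat_kernel_knn_graph(X, k=3, sigma=2.0)
L = np.diag(W.sum(axis=1)) - W                     # graph Laplacian L = D - W

w = rng.normal(size=5)                             # an arbitrary projection direction
z = w @ X                                          # projected samples w^T x_i
penalty = w @ X @ L @ X.T @ w
pairwise = 0.5 * np.sum(W * (z[:, None] - z[None, :])**2)
print(penalty, pairwise)                           # the two values agree
```

A small penalty therefore means that points joined by an edge stay close after projection, which is the sense in which the regularizer preserves locality.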

Page 5: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

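LapLDA Experiments & Discussion

[Table: accuracies shown on the slide, row and column labels lost in extraction: 88.67 ± 2.32, 81.02 ± 1.31, 82.27 ± 1.72, 90.90 ± 0.61; annotations "↑4.54" and "RLDA: ?"]

Does the locality preserving regularizer really work?

It seems to only play the role of a Tikhonov regularizer!

LapLDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}$$

RLDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T I w}$$

[Figure: accuracy vs. K on Letter (a-m) for K = 1, 2, 3, 5, 10, 15, 20; x-axis from 0 to 20, y-axis from 0.8 to 0.81]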

Page 6: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SDA Motivation & Objective function

Motivation: The labeled data points are used to maximize the separability between different classes and the unlabeled data points are used to estimate the intrinsic geometric structure of the data.

Objective function:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \alpha\, w^T X L X^T w + \beta \|w\|^2}$$

SDA = RLDA + LPP = LapLDA + Tikhonov Reg. = LDA + LPP + Tikhonov Reg.

Globality Preserving DA:

$$\max_w \frac{w^T S_b w + \alpha\, w^T X X^T w}{w^T S_t w + \beta \|w\|^2}$$

$$\max_w \frac{w^T X X^T w}{w^T S_t w}$$

Only 1 labeled training sample per class
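One way to read the single-labeled-sample case is sketched below. It assumes that $S_b$ and $S_t$ are computed from the labeled samples only, that the Tikhonov term is dropped, and that the globality term enters the numerator with a weight written here as $\alpha$ (the symbol is an assumption). With one labeled sample per class, each class mean coincides with its only sample, so $S_w = 0$ and $S_b = S_t$.

```latex
% Assumptions: S_w = 0 (one labeled sample per class), hence S_b = S_t;
% Tikhonov term omitted; w^T S_t w > 0.
\frac{w^T S_b w + \alpha\, w^T X X^T w}{w^T S_t w}
  = \frac{w^T S_t w}{w^T S_t w} + \alpha\, \frac{w^T X X^T w}{w^T S_t w}
  = 1 + \alpha\, \frac{w^T X X^T w}{w^T S_t w}
\quad\Longrightarrow\quad
\arg\max_w \frac{w^T S_b w + \alpha\, w^T X X^T w}{w^T S_t w}
  = \arg\max_w \frac{w^T X X^T w}{w^T S_t w}
\quad (\alpha > 0).
```

Under these assumptions the maximizer does not depend on $\alpha$, so the reduced problem has no parameter left to tune.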

Page 7: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SDA Experiments & Discussion

% Graph construction options
WOptions = [];
WOptions.Metric = 'Cosine';
WOptions.NeighborMode = 'KNN';
WOptions.k = 2;
WOptions.WeightMode = 'Cosine';
WOptions.bSelfConnected = 0;
WOptions.bNormalized = 1;

% Regularization options
options = [];
options.ReguType = 'Ridge';
options.ReguAlpha = 0.01;
options.beta = 0.1;
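For readers without the MATLAB toolbox, the following Python sketch shows one plausible reading of graph options like these: a cosine-similarity kNN graph with k = 2 and no self-connections. It is not the toolbox's code. The function name `cosine_knn_graph`, the symmetrization by elementwise maximum, and the nonnegative toy data are all assumptions made for illustration.

```python
import numpy as np

def cosine_knn_graph(X, k=2):
    """Cosine-weighted kNN graph; X has shape (d, n), one sample per column."""
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)   # unit-length columns
    S = Xn.T @ Xn                                       # pairwise cosine similarities
    n = S.shape[0]
    S_noself = S.copy()
    np.fill_diagonal(S_noself, -np.inf)                 # exclude self-connections
    nn = np.argsort(-S_noself, axis=1)[:, :k]           # k most similar samples per row
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nn.ravel()] = S[rows, nn.ravel()]           # keep similarity as edge weight
    return np.maximum(W, W.T)                           # symmetrize (assumed rule)

rng = np.random.default_rng(0)
X = rng.random(size=(20, 10))       # nonnegative features, so all similarities are positive
W = cosine_knn_graph(X, k=2)
print(W.shape, int((W > 0).sum(axis=1).min()))
```

Every sample ends up with at least k neighbors, and possibly more, because an edge is kept whenever either endpoint selects the other.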

No parameters at all!

3.27.32 1 labeled + 29 unlabeled

1.35.37 1 labeled + 1 unlabeled

Page 8: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Discussion About Locality Preserving Reg.

1) Graph Construction

Although the graph is at the heart of graph-based semi-supervised learning methods, its construction has not been studied extensively. [X. Zhu, SSL_survey, 05-08]

$$w_{ij} = \begin{cases} \exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right), & \text{if } x_i \text{ is among the } k\text{NN of } x_j \text{ or } x_j \text{ is among the } k\text{NN of } x_i \\ 0, & \text{otherwise} \end{cases}$$

Issue 1: Curse of dimensionality

For example, the face space is estimated to have at least 100 dimensions [4].

[4] M. Meytlis, L. Sirovich. On the dimensionality of face space. PAMI, 29(7):1262–1267, 2007

Page 9: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Discussion About Locality Preserving Reg.

Issue 2: The performance of classification relies heavily on how well the nearest neighbor criterion works in the original high-dimensional space [5].

[Figure: example annotated with the values 1.90×10³, 0.84×10³ and 0.92×10³]

[Figure: plot with x-axis from 0 to 70 and y-axis from 0 to 0.7, annotated "↑ 4.35%"; legend entries include LDA]

[5] H. T. Chen, H. W. Chang, and T. L. Liu, Local discriminant embedding and its variants. CVPR, 2005.

Page 10: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Discussion About Locality Preserving Reg.

Issue 3: Difficulty of parameter selection. Cross-validation? [6]

2) Parametric model vs. non-parametric model

LDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w}$$

RLDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T I w}$$

LapLDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}$$

SDA:

$$\max_w \frac{w^T S_b w}{w^T S_t w + \alpha\, w^T X L X^T w + \beta \|w\|^2}$$

gpDA:

$$\max_w \frac{w^T X X^T w}{w^T S_t w}$$

[6] D. Zhou, O. Bousquet, B. Schölkopf. Learning with Local and Global Consistency. NIPS, 2004
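The five objectives compared on this slide (LDA, RLDA, LapLDA, SDA, gpDA) are all ratios of two quadratic forms, so one solver covers them. The sketch below is a minimal illustration, assuming centered data stored column-wise, a fully connected heat-kernel graph instead of a kNN graph (for brevity), and arbitrary parameter values; the coefficient names `lam`, `alpha`, `beta` and the helper `top_direction` are assumptions, not the authors' code.

```python
import numpy as np

def top_direction(num, den, eps=1e-8):
    """Maximize (w^T num w) / (w^T den w) via a Cholesky-whitened eigenproblem."""
    d = den.shape[0]
    Lc = np.linalg.cholesky(den + eps * np.eye(d))   # eps: numerical safeguard (assumption)
    Li = np.linalg.inv(Lc)
    C = Li @ num @ Li.T
    _, vecs = np.linalg.eigh((C + C.T) / 2)          # eigenvalues in ascending order
    w = Li.T @ vecs[:, -1]                           # map back from whitened coordinates
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
d, n_lab, n_unl, c = 5, 30, 60, 3
y = np.repeat(np.arange(c), n_lab // c)              # labels of the labeled samples
centers = 3.0 * rng.normal(size=(d, c))              # hypothetical class centers
X_lab = centers[:, y] + rng.normal(size=(d, n_lab))
X_unl = centers[:, rng.integers(0, c, n_unl)] + rng.normal(size=(d, n_unl))
X = np.hstack([X_lab, X_unl])
X = X - X.mean(axis=1, keepdims=True)                # center all data (assumption)
X_lab = X[:, :n_lab]

# Scatter matrices from the labeled samples only
m = X_lab.mean(axis=1, keepdims=True)
S_t = (X_lab - m) @ (X_lab - m).T
S_b = np.zeros((d, d))
for k in range(c):
    mk = X_lab[:, y == k].mean(axis=1, keepdims=True)
    S_b += np.sum(y == k) * (mk - m) @ (mk - m).T

# Graph Laplacian on labeled + unlabeled samples (fully connected, for brevity)
sq = np.sum(X**2, axis=0)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X.T @ X, 0.0)
W = np.exp(-D2 / (2.0 * 3.0**2))
np.fill_diagonal(W, 0.0)
G = X @ (np.diag(W.sum(axis=1)) - W) @ X.T           # X L X^T
I = np.eye(d)

lam, alpha, beta = 0.1, 0.1, 0.01                    # illustrative values only
problems = {
    "LDA":    (S_b, S_t),
    "RLDA":   (S_b, S_t + lam * I),
    "LapLDA": (S_b, S_t + lam * G),
    "SDA":    (S_b, S_t + alpha * G + beta * I),
    "gpDA":   (X @ X.T, S_t),
}
directions = {name: top_direction(num, den) for name, (num, den) in problems.items()}
for name, w in directions.items():
    print(name, np.round(w, 3))
```

Written this way, the methods differ only in which matrix is added to the denominator (or, for gpDA, which matrix replaces the numerator), which makes the count of tunable parameters per method easy to see.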

Page 11: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

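Related Works: Semi-supervised DR

Table columns: Algorithm; Discriminative term (Fisher, MMC, Pairwise); Regularization term (Tikhonov, Globality, Locality); Parameters. The column positions of the check marks were lost in extraction.

Algorithm | Check marks | Parameters
RLDA | √ √ | λ
LapLDA [1] | √ √ | λ, K, σ
SDA [2] / SSLDA | √ √ √ | α, β, K, σ
SSDR | √ √ | α, β
SSDRL | √ √ | λ
SSMMC | √ √ √ | α, β, K, σ

Regularization terms: Tikhonov $\|w\|_p$; Globality $w^T X X^T w$; Locality $w^T X L X^T w$.

Sparsity preserving “regularization”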

Page 12: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Outline

• Motivation

• Locality Preserving Regularization based…

– Laplacian Linear Discriminant Analysis(LapLDA)[1]

– Semi-supervised Discriminant Analysis(SDA)[2]

– Comments: Does Locality Preserving Reg. really work?

• Optimization based…

– Semi-supervised Discriminant Analysis Via CCCP(SSDACCCP)[3]

• Conclusion

Page 13: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

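SSDACCCP Motivation

[Figure: scatter plot of data points (× markers)]

Class indicator matrix (rows: labeled samples $x_1, \ldots, x_l$ and unlabeled samples $x_{l+1}, \ldots, x_n$; columns: classes $1, 2, \ldots, C$):

$$\begin{array}{c|cccc} & 1 & 2 & \cdots & C \\ \hline x_1 & 1 & 0 & \cdots & 0 \\ \vdots & 1 & 0 & \cdots & 0 \\ x_l & 0 & 0 & \cdots & 1 \\ x_{l+1} & ? & ? & \cdots & ? \\ x_n & ? & ? & \cdots & ? \end{array}$$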

LDA:

Page 14: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

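SSDACCCP Formulation

Class indicator matrix with unknown labels for the unlabeled samples:

$$\begin{array}{c|cccc} & 1 & 2 & \cdots & C \\ \hline x_1 & 1 & 0 & \cdots & 0 \\ \vdots & 1 & 0 & \cdots & 0 \\ x_l & 0 & 0 & \cdots & 1 \\ x_{l+1} & ? & ? & \cdots & ? \\ x_n & ? & ? & \cdots & ? \end{array}$$

The same matrix with the unknown rows $x_{l+1}, \ldots, x_n$ filled in by 0/1 class indicators, for example $(0\ 1\ 0)$ and $(0\ 0\ 1)$.

$$A = [A_1, A_2, \ldots, A_C], \qquad D = [X_1, X_2, \ldots, X_C]$$

$$n_k = A_k^T 1_n, \qquad m_k = D A_k / n_k, \qquad m = D 1_n / n$$

$$S_b = \sum_{k=1}^{C} n_k (m_k - m)(m_k - m)^T, \qquad S_t = D D^T$$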
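The quantities on this slide can be computed directly from a class indicator matrix. The sketch below is a hedged illustration: it assumes the data matrix `D` holds one sample per column and has been centered (so that the total scatter can be written as `D @ D.T`), and it uses a small hypothetical label vector. It also checks the compact matrix form of $S_b$ that follows when the global mean is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, C = 4, 9, 3
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])   # hypothetical class assignments
D = rng.normal(size=(d, n))
D = D - D.mean(axis=1, keepdims=True)            # center the data (assumption)

A = np.zeros((n, C))
A[np.arange(n), labels] = 1.0                    # column k is the indicator vector A_k
ones = np.ones(n)

n_k = A.T @ ones                                 # n_k = A_k^T 1_n  (class sizes)
M = (D @ A) / n_k                                # column k is m_k = D A_k / n_k
m = D @ ones / n                                 # global mean (zero after centering)

S_b = sum(n_k[k] * np.outer(M[:, k] - m, M[:, k] - m) for k in range(C))
S_t = D @ D.T

# With m = 0 the between-class scatter has a compact form in terms of A
S_b_compact = D @ A @ np.diag(1.0 / n_k) @ A.T @ D.T
print(np.allclose(S_b, S_b_compact))
```

Expressing $S_b$ through $A$ is what makes it possible to treat the unknown rows of $A$ as optimization variables.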

Page 15: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

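SSDACCCP Formulation

Properties of the trace:

1) $\operatorname{trace}(A + B) = \operatorname{trace}(A) + \operatorname{trace}(B)$

2) $\operatorname{trace}(AB) = \operatorname{trace}(BA)$

3) $\operatorname{trace}(ABC) = \operatorname{trace}(BCA) = \operatorname{trace}(CAB)$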
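As a sketch of how these trace properties are typically used in this setting, assume the data are centered ($m = 0$), $S_t$ is invertible, and the LDA criterion is written in the trace form $\operatorname{trace}(S_t^{-1} S_b)$; the trace form is an assumption here, not something stated on the slide. With $m_k = D A_k / n_k$ and $n_k = A_k^T 1_n$:

```latex
\begin{aligned}
\operatorname{trace}\left(S_t^{-1} S_b\right)
  &= \operatorname{trace}\Big(S_t^{-1} \sum_{k=1}^{C} n_k\, m_k m_k^T\Big) \\
  &= \sum_{k=1}^{C} n_k \operatorname{trace}\left(S_t^{-1} m_k m_k^T\right)
     && \text{(property 1)} \\
  &= \sum_{k=1}^{C} n_k\, m_k^T S_t^{-1} m_k
     && \text{(property 2)} \\
  &= \sum_{k=1}^{C} \frac{A_k^T D^T S_t^{-1} D\, A_k}{A_k^T 1_n}.
\end{aligned}
```

Each summand is a quadratic form divided by a linear term, that is, of the shape $x^T S x / t$ with $x = A_k$, $S = D^T S_t^{-1} D$ and $t = A_k^T 1_n$.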

Page 16: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

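SSDACCCP Formulation

$$\max_{A} \ \ldots$$

Without loss of generality,

$$\max_{x,t} \frac{x^T S x}{t} \quad\Longleftrightarrow\quad \min_{x,t}\ \underbrace{\text{const}}_{g(x)} - \underbrace{\frac{x^T S x}{t}}_{h(x)}$$

D.C. Programming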

Page 17: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP CCCP

Page 18: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP CCCP

[Figure: CCCP iteration, showing $g(x)$, $h(x)$ and the successive iterates $x^p$ and $x^{p+1}$]
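The CCCP iteration can be illustrated on a one-dimensional toy problem that is unrelated to the paper's actual objective. The sketch assumes the hypothetical choice $f(x) = g(x) - h(x)$ with $g(x) = x^4$ and $h(x) = x^2$, both convex. At each step $h$ is replaced by its tangent at the current iterate $x^p$, and the resulting convex problem is solved in closed form.

```python
import numpy as np

def f(x):
    """Toy D.C. objective: g(x) - h(x) with g(x) = x**4 and h(x) = x**2."""
    return x**4 - x**2

def cccp_step(x_p):
    # Linearize h at x_p: h(x) ~ h(x_p) + 2 * x_p * (x - x_p).
    # The convex surrogate x**4 - 2 * x_p * x is minimized where 4 * x**3 = 2 * x_p.
    return np.cbrt(x_p / 2.0)

x = 2.0                         # starting point (arbitrary)
values = [f(x)]
for _ in range(60):
    x = cccp_step(x)
    values.append(f(x))

print("final iterate:", x)
print("objective never increases:",
      all(b <= a + 1e-12 for a, b in zip(values, values[1:])))
```

Starting from a positive point, the iterates approach the stationary point $x = 1/\sqrt{2}$ of $f$, and the objective value decreases monotonically, which is the property that makes CCCP attractive for D.C. programs.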

Page 19: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP CCCP

Page 20: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

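SSDACCCP Formulation

$$\max_{x,t} \frac{x^T S x}{t} \quad\Longleftrightarrow\quad \min_{x,t}\ \underbrace{\text{const}}_{g(x)} - \underbrace{\frac{x^T S x}{t}}_{h(x)}$$

$$h(x, t) = \frac{x^T S x}{t}$$

Gradient:

$$\nabla h(x_p, t_p) = \left[ \left(\frac{2 S x_p}{t_p}\right)^T,\ -\frac{x_p^T S x_p}{t_p^2} \right]^T$$

First-order Taylor expansion:

$$h(x, t) \approx h(x_p, t_p) + \left[ \left(\frac{2 S x_p}{t_p}\right)^T,\ -\frac{x_p^T S x_p}{t_p^2} \right] \begin{bmatrix} x - x_p \\ t - t_p \end{bmatrix}$$

Omit constant term:

$$\left(\frac{2 S x_p}{t_p}\right)^T x - \frac{x_p^T S x_p}{t_p^2}\, t$$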
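The gradient and the first-order expansion of $h(x, t) = x^T S x / t$ can be sanity-checked numerically. The sketch below assumes a symmetric positive semidefinite $S$ and $t > 0$, in which case $h$ is jointly convex and its tangent plane is a global lower bound. The matrix, the expansion point and the sampling ranges are arbitrary illustrative choices.

```python
import numpy as np

def h(x, t, S):
    return (x @ S @ x) / t

def grad_h(x, t, S):
    """Gradient of h with respect to x and t (S symmetric)."""
    return 2.0 * (S @ x) / t, -(x @ S @ x) / t**2

def linearized_h(x, t, x_p, t_p, S):
    """First-order Taylor expansion of h around (x_p, t_p)."""
    g_x, g_t = grad_h(x_p, t_p, S)
    return h(x_p, t_p, S) + g_x @ (x - x_p) + g_t * (t - t_p)

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
S = B @ B.T                                   # symmetric PSD matrix (assumption)
x_p, t_p = rng.normal(size=4), 1.5            # expansion point

# Central finite differences as an independent check of the gradient
eps = 1e-6
g_x, g_t = grad_h(x_p, t_p, S)
fd_x = np.array([(h(x_p + eps * e, t_p, S) - h(x_p - eps * e, t_p, S)) / (2 * eps)
                 for e in np.eye(4)])
fd_t = (h(x_p, t_p + eps, S) - h(x_p, t_p - eps, S)) / (2 * eps)

# The tangent plane should never exceed h for t > 0
gaps = []
for _ in range(1000):
    x, t = rng.normal(size=4), rng.uniform(0.1, 5.0)
    gaps.append(h(x, t, S) - linearized_h(x, t, x_p, t_p, S))
min_gap = min(gaps)
print(np.allclose(fd_x, g_x, atol=1e-4), abs(fd_t - g_t) < 1e-4, min_gap >= -1e-9)
```

Because the linearization underestimates $h$, replacing $h$ by it in $g - h$ yields a convex upper bound on the objective, which is the step CCCP minimizes at each iteration.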

Page 21: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP Formulation

Page 22: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP Experiments

Page 23: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

SSDACCCP Experiments

Page 24: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Conclusion

The power of the Locality Preserving Reg. was somewhat overstated.

Prior knowledge from the practical problem is of paramount importance.

[Figure: scatter plots of data points (× markers)]

1. Data-dependent Regularizer

2. Label estimation via optimization

Page 25: Semi-supervised Discriminant Analysis Lishan Qiao 2009.03.13

Thanks!
