on multiple-case diagnostics in linear subspace method · on multiple-case diagnostics in linear...

On multiple-case diagnostics

in linear subspace method

Kuniyoshi HAYASHI1 Hiroyuki MINAMI2 Masahiro MIZUTA2

1Graduate School of Information Science and Technology,

Hokkaido University2Information Initiative Center, Hokkaido University

19th International Conference on Computational Statistics Paris – France, August 22-27 1/36

Outline

1 Introduction

2 Sensitivity analysis in linear subspace

method

3 A multiple-case diagnostics with clustering

4 Numerical example

5 Concluding remarks

2/36

Outline

1 Introduction


method


4 Numerical example


3/36

Introduction4/36

Sensitivity analysis in pattern recognition

Single-case diagnostics in linear subspace

method (Hayashi et al., 2008)

Multiple-case diagnostics in linear subspace

method

We show the availability through a

simulation study.

Outline

1 Introduction


method


4 Numerical example


5/36

CLAss-Featuring Information Compression

(CLAFIC; Watanabe, 1967)

Training observations ( )KkniR k

pk

i ,,1;,,1 KK ==∈x

Class1

L LClass Classk K

*x

11

1 1,, nxx K

k

n

k

kxx ,,1 K

K

n

K

Kxx ,,1 K

Test observation

6/36

Linear subspace method

[CLAFIC]Autocorrelation matrix of the training data in -th class :

∑=

Τ=

kn

i

k

i

k

i

k

kn

G1

1ˆ xx

0ˆˆˆ21 ≥≥≥≥ k

p

kk λλλ L

k

( )psk

s ,,1ˆ K=u

Eigenvalues :

Eigenvectors :

A projection matrix in -th class:k

( )ppP k

p

s

k

s

k

sk

k

≤≤=∑=

Τ1ˆˆˆ

1

uu

{ }** ˆmaxarg: xxT

kPk =

7/36

CLAFIC

∗∗xx

T

2P

∗x

Class1

Class2

>

∗x Class1

8/36

∗∗xx

T

1P

Test observation

∗∗xx

T

2P

∗∗xx

T

1P

[Discriminant score and Average]

Discriminant score for :kix

( ) ,1ˆˆk

k

ik

k

i

k

i niQz ≤≤= xxT

where .ˆˆ1

1ˆ

1

−

−= ∑

=

K

l

lkk PPKK

Q

.ˆ1ˆ

1

∑=

=kn

i

k

i

k

k zn

Z

Average discriminant score

9/36

where is the number of the basis vectors in -th class.

means without -th observation in -th class

[Sample influence function (SIF)]

( )( ) ( ) ( ) ( ){ },ˆˆ1ˆ; )( kjgkgk

g

j QQnQ vechvechvechxSIF −⋅−−=

( ))(ˆjgkQvech ( )kQvech j g

[Empirical influence function (EIF)]

( )( )( ) ( ) ( )

( ) ( ) ( ),

ˆˆˆˆˆˆˆˆˆ1

1

ˆˆˆˆˆˆˆˆˆ

ˆ;

1 1

1

1 1

1

≠

+−

−−

=

+−

=

∑ ∑

∑ ∑

= +=

−

= +=

−

kgGK

kgG

Qg

g

g

g

p

s

g

s

g

t

g

t

g

s

g

t

jg

g

g

s

p

pt

g

t

g

s

p

s

g

s

g

t

g

t

g

s

g

t

jg

g

g

s

p

pt

g

t

g

s

k

g

j

TTT

TTT

uuuuuuvech

uuuuuuvech

vechxEIF

λλ

λλ

gp g

10/36

Case of

( ) ( )∑=

=kn

i

k

i

jg

k

k

i

k

jg

k Qn

Z1

,ˆ1ˆ xxT

( ) ( ) ( ).ˆˆ1ˆ)( kjgkg

jg

k QQnQ −⋅−−=

( )( )k

g

j Q; vechxSIF

Case of

( )∑=

≠=kn

i

k

i

jg

k

k

i

k

jg

k kgQn

Z1

,ˆ1ˆ xxT

( ) ( ).ˆˆˆˆˆˆˆˆˆ1

1ˆ

1 1

1

∑ ∑= +=

−+−

−−=

g

g

p

s

g

s

g

t

g

t

g

s

g

t

jg

g

g

s

p

pt

g

t

g

s

jg

k GK

QTTT

uuuuuuλλ

( )( )k

g

j Q; vechxEIF

11/36

Single-case diagnostics

Influence of single observation for kZ

Index

12/36

( )jg

kZ

Multiple-case diagnostics

Influence of multiple observations for with( ) ( )gg

jig

k njniZ ,,1;,,1ˆ ,KK ==

kZ

Index

Index( )jigkZ,ˆ

13/36

i

j

Multiple-case diagnostics

( )Tw 1,,1,10 K=g

( )( )gkQL 0ˆ wvech

( )kQvech

hww tgg += 0

( )( )gk gQL wvechw

ˆ

Weights of unperturbed observations

in -th class

Maximum likelihood estimator

Maximum log likelihood

Weights of perturbed case in -th class

Maximum log likelihood in -th

perturbed class

g

14/36

1=h

g

g

Based on Cook (1986) and Tanaka (1994),

( ) ( ) ( ) ( ) ( ).0

2

100 32

2

2

ttdt

Ddt

dt

dDDtD Ο+++=

( ) ( ) .][ 22

0

tL

ConsttDgg

gg⋅

∂∂

∂−⋅≅

=

hww

h

ww

T

T

.][

0

2

hww

h

ww

T

T

h

gg

gg

LC

=∂∂

∂−=

15/36

( ) ( )( ) ( )( )]ˆˆ[2 00

g

k

g

k

ggQLQLD wvechwvechw

w−=

( )( ) ( )

( ).

ˆˆ 2

hw

vech

vechvechw

vechh

w

T

T

wT

h

∂

∂

∂∂

∂−

∂

∂=

g

k

kk

g

k gg Q

QQ

LQC

( )( ) ,][]ˆ][[ 1 hEIFvechacovEIFh TT

h gkkgk QC −∧

≅

[ ] ( )( ){ }k

g

rgk Q;vechxEIFEIF =

and are evaluated

at and , respectively.

( ) ]ˆ[ g

gkQ wvech

T

w

∂∂ ( ) ( ) ][ 2 Tvechvech kk QQL ∂∂∂−

gg

0ww = ( ) ( )kkQQ ˆvechvech =

By and ,( )( ) ( )0

ˆˆ;

=∂

∂⋅=

gr

g

g

r

k

gk

g

r

QnQ

w

w

w

vechvechxEIF ( )( )

( ) ( )

1

2

ˆ

−

∂∂∂

−=T

vechvechvechacov

kk

k

QQ

LEQ

we estimate ashC

where and .( )( )( ) ( )

∂∂

∂−=

∧

T

vechvechvechacov

kk

k

QQ

LQ

ˆˆˆ

2

16/36

( ) ( ) ( ) ( ) ,ˆ1ˆˆ1ˆ1

1,1,1

1 1,1,1

T

vechvechvechvech

−

−

−= ∑∑ ∑

=−−

= =−−

N

jjNkiNk

N

i

N

jjNkiNk Q

NQQ

NQ

N

N

( ) ( )NrQrNk ,,1ˆ,1

K=−vech ( )k

rg

k QQ ˆˆ −

We solve the following eigenvalue problem. 17/36

( )( ) .0][ˆ][

1

=

−

−∧

hIEIFvechEIFT

acov λgkkgk Q

( )( )kQvechacov∧

Here, we estimate with jackknife method.

where we put as .

( )( ) kk VQ /ˆ

JACKvechacov ≅∧

hC : Cook’s Curvature

1 Get the eigenvectors with

2 Plot them in a lower

dimensional space

3 Detect observations that

have similar influence

4 Evaluate the values of

( )( ) .0][ˆ][

1

=

−

−∧

hIEIFvechEIFT

acov λgkkgk Q

18/36

( ) { }( )g

Jg

k nJZ ,,1ˆ K∈

Outline

1 Introduction


method


4 Numerical example


19/36

A multiple-case diagnostics with clustering

We can not always represent influence directions in

low dimensions.

We solve the problem with clustering.

Eigenvalues of :

A component of the eigenvectors : ( )gij nij ,,1, K=a

021 ≥≥≥≥gn

ξξξ L

( )( )

−

−∧

IEIFvechEIF T

acov λ][ˆ][

1

gkkgk Q

20/36

Euclidean norm

( ) ( ) ( ).,,1,1 g

n

i ijijijijjj njjdg

K=−−= ∑ =aaaa

T

( )jjdD=Apply Ward’s method to

Reduce the number of combinations of deleting

observations based on the result

21/36

Outline

1 Introduction


method


4 Numerical example


22/36

Numerical example23/36

[Simulation]

Classes : Group1, Group2

Distribution : Multivariate normal distribution

Number of variables : 6

Number of observations : 10

We compared the result of the proposed method

with the result of deleting observations in all

possible combinations.

Threshold of linear subspace method : 0.997

Group1 :

[ ]Tµ 352.12244.18030.1037.0241.127857.1051 −=

=∑

44.963101.179-5.694-0.045-13.106-245.875

101.179-250.45912.1270.076209.257525.731-

5.694-12.1272.1710.0019.090-21.393-

0.045-0.0760.0010.0000.208-0.384-

13.106-209.2579.090-0.208-1674.438789.092

245.875525.731-21.393-0.384-789.0928676.883

1

Data [,1] [,2] [,3] [,4] [,5] [,6]

[1,] 3.251 74.731 0.04 2.868 19.71 -15.56

[2,] -18.51 87.221 0.038 0.316 17.876 -13.239

[3,] 162.092 130.641 0.03 0.835 -2.196 -2.52

[4,] 120.625 175.7 0.033 3.239 39.097 -17.958

[5,] 70.893 104.692 0.064 -0.848 12.563 -10.372

[6,] 61.044 139.198 0.044 1.157 14.358 -10.993

[7,] 277.224 98.071 0.051 1.425 18.551 -15.139

[8,] 40.983 188.004 0.042 0.915 48.729 -23.67

[9,] 212.658 175.569 0.008 -1.482 -2.177 -1.233

[10,] 128.307 98.585 0.018 1.873 15.924 -12.834

24/36

Group2 : [ ]Tµ 1.3141.3250.089-0.000131.57897.4922 =

=∑

101.04150.9006.4070.119311.49677.281

50.90092.73232.2380.025-171.388145.029

6.40732.23818.1230.00210.82717.550

0.1190.025-0.0020.0010.125-1.018-

311.496171.38810.8270.125-1934.00284.440-

77.281145.02917.5501.018-84.440-5736.272

2

Data [,1] [,2] [,3] [,4] [,5] [,6]

[1,] 90.704 96.139 -0.021 -0.622 -0.666 -3.766

[2,] 39.285 95.397 0.026 3.493 0.681 1.991

[3,] 31.19 243.481 -0.001 1.623 8.291 16.302

[4,] -18.781 113.201 0.011 2.88 7.173 -4.303

[5,] 193.847 161.998 0.024 0.688 15.552 22.948

[6,] 122.154 110.92 0.024 4.865 7.591 -4.361

[7,] 174.834 111.309 0.007 -0.382 -7.5 -1.789

[8,] 48.081 122.082 0.024 -9.056 -15.391 -2.228

[9,] 87.225 122.684 -0.013 -5.637 -9.465 -4.696

[10,] 206.376 138.573 -0.082 1.261 6.988 -6.954

25/36

26/36

{ }10,7 Large influential subset

Average scores in Group1


All possible combinations

( ) { }( )1

1 ,,1;2,1ˆ nJkZ J

k K∈=

27/36

( ) { }( )2

2 ,,1;2,1ˆ nJkZ J

k K∈=

{ }8,5 Large influential subset



All possible combinations

28/36

8 clusters8 clusters

1023 → 255 1023 → 255( )128 −= ( )128 −=

Clusters of influence directions

Proposed method

{ }10,7

Large influential subset



29/36Proposed method

{ }8,5Large influential subset



30/36Proposed method

Case 231/36

[Simulation]

Classes : Group1, Group2

Distribution : Multivariate normal distribution

Number of variables : 6

Number of observations : 100

Threshold of linear subspace method : 0.997

32/36

[ ]Tµ 352.12244.18030.1037.0241.127857.1051 −=

=∑

44.963101.179-5.694-0.045-13.106-245.875

101.179-250.45912.1270.076209.257525.731-

5.694-12.1272.1710.0019.090-21.393-

0.045-0.0760.0010.0000.208-0.384-

13.106-209.2579.090-0.208-1674.438789.092

245.875525.731-21.393-0.384-789.0928676.883

1

[ ]Tµ 1.3141.3250.089-0.000131.57897.4922 =

=∑

101.04150.9006.4070.119311.49677.281

50.90092.73232.2380.025-171.388145.029

6.40732.23818.1230.00210.82717.550

0.1190.025-0.0020.0010.125-1.018-

311.496171.38810.8270.125-1934.00284.440-

77.281145.02917.5501.018-84.440-5736.272

2

Setting up 3 outliers that

have similar influence

direction

{ }94,93,42

Group1 Group2

19 clusters

1.267651 1030

→ 524287

Our Multiple-case

diagnostics in Group1

{ }94,93,423 outliers

33/36

×

Outline

1 Introduction


method


4 Numerical example


34/36

Concluding remarks

We proposed a multiple-case diagnostics with

clustering in linear subspace method.

In simulation study, we could confirm the availability

of our method.

In the future, we would like to show the results of

some simulation studies and their Monte Carlo

simulations.

35/36

Thank you for your attention!

36/36

on multiple-case diagnostics in linear subspace method · on multiple-case diagnostics in linear...

Documents