on multiple-case diagnostics in linear subspace method · on multiple-case diagnostics in linear...
TRANSCRIPT
On multiple-case diagnostics
in linear subspace method
Kuniyoshi HAYASHI1 Hiroyuki MINAMI2 Masahiro MIZUTA2
1Graduate School of Information Science and Technology,
Hokkaido University2Information Initiative Center, Hokkaido University
19th International Conference on Computational Statistics Paris – France, August 22-27 1/36
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
2/36
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
3/36
Introduction4/36
Sensitivity analysis in pattern recognition
Single-case diagnostics in linear subspace
method (Hayashi et al., 2008)
Multiple-case diagnostics in linear subspace
method
We show the availability through a
simulation study.
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
5/36
CLAss-Featuring Information Compression
(CLAFIC; Watanabe, 1967)
Training observations ( )KkniR k
pk
i ,,1;,,1 KK ==∈x
Class1
L LClass Classk K
*x
11
1 1,, nxx K
k
n
k
kxx ,,1 K
K
n
K
Kxx ,,1 K
Test observation
6/36
Linear subspace method
[CLAFIC]Autocorrelation matrix of the training data in -th class :
∑=
Τ=
kn
i
k
i
k
i
k
kn
G1
1ˆ xx
0ˆˆˆ21 ≥≥≥≥ k
p
kk λλλ L
k
( )psk
s ,,1ˆ K=u
Eigenvalues :
Eigenvectors :
A projection matrix in -th class:k
( )ppP k
p
s
k
s
k
sk
k
≤≤=∑=
Τ1ˆˆˆ
1
uu
{ }** ˆmaxarg: xxT
kPk =
7/36
CLAFIC
∗∗xx
T
2P
∗x
Class1
Class2
>
∗x Class1
8/36
∗∗xx
T
1P
Test observation
∗∗xx
T
2P
∗∗xx
T
1P
[Discriminant score and Average]
Discriminant score for :kix
( ) ,1ˆˆk
k
ik
k
i
k
i niQz ≤≤= xxT
where .ˆˆ1
1ˆ
1
−
−= ∑
=
K
l
lkk PPKK
Q
.ˆ1ˆ
1
∑=
=kn
i
k
i
k
k zn
Z
Average discriminant score
9/36
where is the number of the basis vectors in -th class.
means without -th observation in -th class
[Sample influence function (SIF)]
( )( ) ( ) ( ) ( ){ },ˆˆ1ˆ; )( kjgkgk
g
j QQnQ vechvechvechxSIF −⋅−−=
( ))(ˆjgkQvech ( )kQvech j g
[Empirical influence function (EIF)]
( )( )( ) ( ) ( )
( ) ( ) ( ),
ˆˆˆˆˆˆˆˆˆ1
1
ˆˆˆˆˆˆˆˆˆ
ˆ;
1 1
1
1 1
1
≠
+−
−−
=
+−
=
∑ ∑
∑ ∑
= +=
−
= +=
−
kgGK
kgG
Qg
g
g
g
p
s
g
s
g
t
g
t
g
s
g
t
jg
g
g
s
p
pt
g
t
g
s
p
s
g
s
g
t
g
t
g
s
g
t
jg
g
g
s
p
pt
g
t
g
s
k
g
j
TTT
TTT
uuuuuuvech
uuuuuuvech
vechxEIF
λλ
λλ
gp g
10/36
Case of
( ) ( )∑=
=kn
i
k
i
jg
k
k
i
k
jg
k Qn
Z1
,ˆ1ˆ xxT
( ) ( ) ( ).ˆˆ1ˆ)( kjgkg
jg
k QQnQ −⋅−−=
( )( )k
g
j Q; vechxSIF
Case of
( )∑=
≠=kn
i
k
i
jg
k
k
i
k
jg
k kgQn
Z1
,ˆ1ˆ xxT
( ) ( ).ˆˆˆˆˆˆˆˆˆ1
1ˆ
1 1
1
∑ ∑= +=
−+−
−−=
g
g
p
s
g
s
g
t
g
t
g
s
g
t
jg
g
g
s
p
pt
g
t
g
s
jg
k GK
QTTT
uuuuuuλλ
( )( )k
g
j Q; vechxEIF
11/36
Single-case diagnostics
Influence of single observation for kZ
Index
12/36
( )jg
kZ
Multiple-case diagnostics
Influence of multiple observations for with( ) ( )gg
jig
k njniZ ,,1;,,1ˆ ,KK ==
kZ
Index
Index( )jigkZ,ˆ
13/36
i
j
Multiple-case diagnostics
( )Tw 1,,1,10 K=g
( )( )gkQL 0ˆ wvech
( )kQvech
hww tgg += 0
( )( )gk gQL wvechw
ˆ
Weights of unperturbed observations
in -th class
Maximum likelihood estimator
Maximum log likelihood
Weights of perturbed case in -th class
Maximum log likelihood in -th
perturbed class
g
14/36
1=h
g
g
Based on Cook (1986) and Tanaka (1994),
( ) ( ) ( ) ( ) ( ).0
2
100 32
2
2
ttdt
Ddt
dt
dDDtD Ο+++=
( ) ( ) .][ 22
0
tL
ConsttDgg
gg⋅
∂∂
∂−⋅≅
=
hww
h
ww
T
T
.][
0
2
hww
h
ww
T
T
h
gg
gg
LC
=∂∂
∂−=
15/36
( ) ( )( ) ( )( )]ˆˆ[2 00
g
k
g
k
ggQLQLD wvechwvechw
w−=
( )( ) ( )
( ).
ˆˆ 2
hw
vech
vechvechw
vechh
w
T
T
wT
h
∂
∂
∂∂
∂−
∂
∂=
g
k
kk
g
k gg Q
LQC
( )( ) ,][]ˆ][[ 1 hEIFvechacovEIFh TT
h gkkgk QC −∧
≅
[ ] ( )( ){ }k
g
rgk Q;vechxEIFEIF =
and are evaluated
at and , respectively.
( ) ]ˆ[ g
gkQ wvech
T
w
∂∂ ( ) ( ) ][ 2 Tvechvech kk QQL ∂∂∂−
gg
0ww = ( ) ( )kkQQ ˆvechvech =
By and ,( )( ) ( )0
ˆˆ;
=∂
∂⋅=
gr
g
g
r
k
gk
g
r
QnQ
w
w
w
vechvechxEIF ( )( )
( ) ( )
1
2
ˆ
−
∂∂∂
−=T
vechvechvechacov
kk
k
LEQ
we estimate ashC
where and .( )( )( ) ( )
∂∂
∂−=
∧
T
vechvechvechacov
kk
k
LQ
ˆˆˆ
2
16/36
( ) ( ) ( ) ( ) ,ˆ1ˆˆ1ˆ1
1,1,1
1 1,1,1
T
vechvechvechvech
−
−
−= ∑∑ ∑
=−−
= =−−
N
jjNkiNk
N
i
N
jjNkiNk Q
NQQ
NQ
N
N
( ) ( )NrQrNk ,,1ˆ,1
K=−vech ( )k
rg
k QQ ˆˆ −
We solve the following eigenvalue problem. 17/36
( )( ) .0][ˆ][
1
=
−
−∧
hIEIFvechEIFT
acov λgkkgk Q
( )( )kQvechacov∧
Here, we estimate with jackknife method.
where we put as .
( )( ) kk VQ /ˆ
JACKvechacov ≅∧
hC : Cook’s Curvature
1 Get the eigenvectors with
2 Plot them in a lower
dimensional space
3 Detect observations that
have similar influence
4 Evaluate the values of
( )( ) .0][ˆ][
1
=
−
−∧
hIEIFvechEIFT
acov λgkkgk Q
18/36
( ) { }( )g
Jg
k nJZ ,,1ˆ K∈
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
19/36
A multiple-case diagnostics with clustering
We can not always represent influence directions in
low dimensions.
We solve the problem with clustering.
Eigenvalues of :
A component of the eigenvectors : ( )gij nij ,,1, K=a
021 ≥≥≥≥gn
ξξξ L
( )( )
−
−∧
IEIFvechEIF T
acov λ][ˆ][
1
gkkgk Q
20/36
Euclidean norm
( ) ( ) ( ).,,1,1 g
n
i ijijijijjj njjdg
K=−−= ∑ =aaaa
T
( )jjdD=Apply Ward’s method to
Reduce the number of combinations of deleting
observations based on the result
21/36
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
22/36
Numerical example23/36
[Simulation]
Classes : Group1, Group2
Distribution : Multivariate normal distribution
Number of variables : 6
Number of observations : 10
We compared the result of the proposed method
with the result of deleting observations in all
possible combinations.
Threshold of linear subspace method : 0.997
Group1 :
[ ]Tµ 352.12244.18030.1037.0241.127857.1051 −=
=∑
44.963101.179-5.694-0.045-13.106-245.875
101.179-250.45912.1270.076209.257525.731-
5.694-12.1272.1710.0019.090-21.393-
0.045-0.0760.0010.0000.208-0.384-
13.106-209.2579.090-0.208-1674.438789.092
245.875525.731-21.393-0.384-789.0928676.883
1
Data [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3.251 74.731 0.04 2.868 19.71 -15.56
[2,] -18.51 87.221 0.038 0.316 17.876 -13.239
[3,] 162.092 130.641 0.03 0.835 -2.196 -2.52
[4,] 120.625 175.7 0.033 3.239 39.097 -17.958
[5,] 70.893 104.692 0.064 -0.848 12.563 -10.372
[6,] 61.044 139.198 0.044 1.157 14.358 -10.993
[7,] 277.224 98.071 0.051 1.425 18.551 -15.139
[8,] 40.983 188.004 0.042 0.915 48.729 -23.67
[9,] 212.658 175.569 0.008 -1.482 -2.177 -1.233
[10,] 128.307 98.585 0.018 1.873 15.924 -12.834
24/36
Group2 : [ ]Tµ 1.3141.3250.089-0.000131.57897.4922 =
=∑
101.04150.9006.4070.119311.49677.281
50.90092.73232.2380.025-171.388145.029
6.40732.23818.1230.00210.82717.550
0.1190.025-0.0020.0010.125-1.018-
311.496171.38810.8270.125-1934.00284.440-
77.281145.02917.5501.018-84.440-5736.272
2
Data [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 90.704 96.139 -0.021 -0.622 -0.666 -3.766
[2,] 39.285 95.397 0.026 3.493 0.681 1.991
[3,] 31.19 243.481 -0.001 1.623 8.291 16.302
[4,] -18.781 113.201 0.011 2.88 7.173 -4.303
[5,] 193.847 161.998 0.024 0.688 15.552 22.948
[6,] 122.154 110.92 0.024 4.865 7.591 -4.361
[7,] 174.834 111.309 0.007 -0.382 -7.5 -1.789
[8,] 48.081 122.082 0.024 -9.056 -15.391 -2.228
[9,] 87.225 122.684 -0.013 -5.637 -9.465 -4.696
[10,] 206.376 138.573 -0.082 1.261 6.988 -6.954
25/36
26/36
{ }10,7 Large influential subset
Average scores in Group1
Average scores in Group2
All possible combinations
( ) { }( )1
1 ,,1;2,1ˆ nJkZ J
k K∈=
27/36
( ) { }( )2
2 ,,1;2,1ˆ nJkZ J
k K∈=
{ }8,5 Large influential subset
Average scores in Group1
Average scores in Group2
All possible combinations
28/36
8 clusters8 clusters
1023 → 255 1023 → 255( )128 −= ( )128 −=
Clusters of influence directions
Proposed method
{ }10,7
Large influential subset
Average scores in Group1
Average scores in Group2
29/36Proposed method
{ }8,5Large influential subset
Average scores in Group1
Average scores in Group2
30/36Proposed method
Case 231/36
[Simulation]
Classes : Group1, Group2
Distribution : Multivariate normal distribution
Number of variables : 6
Number of observations : 100
Threshold of linear subspace method : 0.997
32/36
[ ]Tµ 352.12244.18030.1037.0241.127857.1051 −=
=∑
44.963101.179-5.694-0.045-13.106-245.875
101.179-250.45912.1270.076209.257525.731-
5.694-12.1272.1710.0019.090-21.393-
0.045-0.0760.0010.0000.208-0.384-
13.106-209.2579.090-0.208-1674.438789.092
245.875525.731-21.393-0.384-789.0928676.883
1
[ ]Tµ 1.3141.3250.089-0.000131.57897.4922 =
=∑
101.04150.9006.4070.119311.49677.281
50.90092.73232.2380.025-171.388145.029
6.40732.23818.1230.00210.82717.550
0.1190.025-0.0020.0010.125-1.018-
311.496171.38810.8270.125-1934.00284.440-
77.281145.02917.5501.018-84.440-5736.272
2
Setting up 3 outliers that
have similar influence
direction
{ }94,93,42
Group1 Group2
19 clusters
1.267651 1030
→ 524287
Our Multiple-case
diagnostics in Group1
{ }94,93,423 outliers
33/36
×
Outline
1 Introduction
2 Sensitivity analysis in linear subspace
method
3 A multiple-case diagnostics with clustering
4 Numerical example
5 Concluding remarks
34/36
Concluding remarks
We proposed a multiple-case diagnostics with
clustering in linear subspace method.
In simulation study, we could confirm the availability
of our method.
In the future, we would like to show the results of
some simulation studies and their Monte Carlo
simulations.
35/36
Thank you for your attention!
36/36