
Comparison of Methods for Classification of Alzheimer's Disease, Frontotemporal Dementia and Asymptomatic Controls

Zhijie Wang 1, Pawel J. Markiewicz 1, Günther Platsch 2, Johannes Kornhuber 3, Torsten Kuwert 4, Dorit Merhof 1

Abstract—Single photon emission computed tomography (SPECT) and positron emission tomography (PET) are commonly used for the study of neurodegenerative diseases such as Alzheimer's disease (AD) and frontotemporal dementia (FTD). Many methods have been proposed to identify different types of dementia based on PET and SPECT images. However, an extensive evaluation and comparison of different methods for feature extraction and classification of such image data has not been performed yet. In this work, two commonly used feature extraction methods, principal component analysis (PCA) and partial least squares analysis (PLS), were used for dimensionality reduction, and three classification methods comprising multiple discriminant analysis (MDA), elastic-net logistic regression (ENLR) and support-vector machine (SVM) were used for classification of SPECT image data of asymptomatic controls (CTR), AD and FTD participants. Hence, six image classification procedures were evaluated and compared. The results indicate that PCA-based procedures have more robust and reliable performance than PLS-based procedures, and PCA-ENLR has the best estimated predictive accuracy among all three PCA-based procedures.

Index Terms—Single photon emission computed tomography (SPECT), Alzheimer's disease (AD), frontotemporal dementia (FTD), multivariate analysis, dimensionality reduction.

I. INTRODUCTION

ALZHEIMER'S DISEASE (AD) and FTD are among the most common neurodegenerative diseases. Automated diagnosis and differentiation of these two dementias is of importance, in particular for choosing the appropriate therapeutic approach. A commonly accepted biomarker of these neurodegenerative diseases is regional cerebral blood flow (rCBF), which shows different patterns in AD and FTD: In AD patients, reduced rCBF has been consistently observed in the temporo-parietal association cortices [1], whereas the frontal association cortex is only affected in some cases. In FTD patients, on the other hand, reduced rCBF was reported to be mainly present in the temporal and frontal regions without posterior extension [2]. As one way of estimating rCBF, Technetium-99m ethyl cysteinate dimer (99mTc-ECD) SPECT imaging has been used in previous research for the differential diagnosis of AD and FTD [2], [3].

Manuscript received Jul. 8, 2013.
1 University of Konstanz, Konstanz, Germany.
2 Siemens Molecular Imaging EU, Erlangen, Germany.
3 Department of Psychiatry and Psychotherapy, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen, Germany.
4 Clinic of Nuclear Medicine, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen, Germany.

Since SPECT data is multivariate in nature, multivariate methods have been previously proposed to differentiate CTR, AD and FTD participants based on acquired SPECT images [3], [4]. Taking the covariance patterns of voxels into account, multivariate analysis methods showed better diagnostic performance than univariate analysis methods [5]. Multivariate analysis of high-dimensional imaging data is commonly performed in two steps: First, the dimensionality of the data is reduced, and then a classifier is applied to the dimensionality-reduced space.

In this work, two well-known methods, PCA and PLS, were investigated to reduce the dimensionality of the SPECT data and to extract group-specific patterns. Subsequently, three linear discriminative classifiers (MDA, ENLR and linear-kernel SVM) were used for differentiating the three groups (CTR, AD and FTD). Hence, six two-step image classification procedures were employed.

Due to the limited number of subjects in a clinical dataset, which is significantly smaller than the number of variables (voxels), two-step classification procedures are more susceptible to the randomness of sampling. Additionally, potential clinical misdiagnosis would also bias the estimated accuracy, since all classifiers in this study (MDA, ENLR, SVM) are supervised methods and therefore susceptible to errors in the class labels of the training set. Small numbers of subjects may further exaggerate the effect of misdiagnosis. Another consequence of small subject numbers is that random patterns which are not characteristic of the disease may accidentally correlate with the class labels.

Therefore, the main focus of this paper is to evaluate these two-step classification procedures in the light of the limited subject numbers of a clinical dataset. The six classification procedures investigated in this work are assessed in the following way:

(1) Accuracy and robustness of the classification procedures are investigated based on bootstrap resampling. The robustness of dimensionality reduction is assessed through the distance between two PCA (or PLS) subspaces, which is also employed to determine an adequate number of PCs (or PLS components) for classification. The robustness of the classifiers (MDA, ENLR, SVM after performing either PCA or PLS) is assessed through the standard error (SE) map of the discriminative vector.


(2) Clinical misdiagnosis is simulated by deliberately mislabeling participants of the clinical dataset. The sensitivity of the investigated classification procedures to misdiagnosis is assessed as the average accuracy over all mislabeled samples.

(3) The statistical significance is evaluated by a permutation test, which uses accuracy as test statistic and is able to indicate the reliability of the accuracy, or in other words, how likely it is that the accuracy of the classification procedure could be obtained by chance.

II. METHODS

A. Image Data

All SPECT images were acquired at the Clinic of Nuclear Medicine, University of Erlangen-Nuremberg. Overall, 26 CTR subjects (minimum age 55, mean age 64.2 ± 5.1), 26 AD subjects (minimum age 52, mean age 67.0 ± 7.2) and 21 FTD subjects (minimum age 51, mean age 65.9 ± 8.3) were included in the analysis (mixed population of female and male subjects). Filtered back projection was used to reconstruct the projection data recorded on a Siemens MultiSPECT 3 scanner, and Chang attenuation correction was applied.

All datasets were further preprocessed based on affine registration, Gaussian smoothing with an FWHM of 12 mm, and intensity normalization according to the 25% brightest voxels within the whole brain region [4]. Finally, a whole-brain mask was applied to each image to discard all voxels outside the brain region from further analysis.
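
As an illustration of the smoothing and intensity-normalization steps only (registration is omitted), a minimal Python sketch is given below. The voxel size and the function name are hypothetical placeholders; the actual study used its own preprocessing pipeline [4], not this code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def preprocess(volume, brain_mask, voxel_size_mm=3.9, fwhm_mm=12.0):
    """Smooth and intensity-normalize one already-registered SPECT volume.

    Gaussian smoothing with a 12 mm FWHM kernel, then normalization to the
    mean of the 25% brightest voxels inside the brain mask (Section II-A).
    The voxel size is a hypothetical placeholder, not taken from the paper.
    """
    sigma_vox = fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / voxel_size_mm
    smoothed = gaussian_filter(volume.astype(float), sigma=sigma_vox)
    brain_values = smoothed[brain_mask]
    threshold = np.percentile(brain_values, 75)        # 25% brightest voxels
    scale = brain_values[brain_values >= threshold].mean()
    return (smoothed / scale)[brain_mask]              # 1-D image vector x_i
```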

B. Dimensionality Reduction and Classification

After preprocessing, the SPECT image data of each subject i is represented by an image vector x_i. Through bootstrap resampling, the vectors x_i were resampled into a training data matrix X, where each column represents the image data vector of one subject. Prior to multivariate analysis, the data matrix was centered by subtracting the mean vector of all contained subjects from each column. Due to the large number of variables (the whole brain region comprises 226,985 voxels), two different dimensionality reduction techniques, namely PCA and PLS [7], were investigated to reduce the number of variables prior to classification. Both PCA and PLS are linear transformations of the original variables. PCA is performed through singular value decomposition (SVD) of the original data matrix X, i.e. X = V S U^T. The columns of V are sorted according to the magnitude of their associated singular values, which are the diagonal elements of S. PC scores for all subjects were computed as V^T X, i.e. each subject was projected onto the PC subspace. In this way, PCA maximizes the variance of the PC scores on each successive new axis. Similarly, PLS is performed through SVD of the covariance matrix between the original data matrix X and the class label matrix Y. Again, X is projected by the matrix of left singular vectors resulting from the SVD of this covariance matrix, yielding the PLS score matrix of X.
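
The two projections can be written compactly with NumPy. The sketch below assumes X is the centered voxels-by-subjects matrix and Y a centered subjects-by-classes indicator matrix as described above; it only illustrates the SVD-based construction and is not the authors' implementation.

```python
import numpy as np

def pca_scores(X, n_comp):
    """PCA via SVD of the centered data matrix X (voxels x subjects): X = V S U^T."""
    V, S, Ut = np.linalg.svd(X, full_matrices=False)
    return V[:, :n_comp], V[:, :n_comp].T @ X      # loadings and PC scores V^T X

def pls_scores(X, Y, n_comp):
    """PLS via SVD of the covariance between X (voxels x subjects) and the
    class label matrix Y (subjects x classes); X is projected onto the left
    singular vectors of this covariance matrix."""
    C = X @ Y                                      # voxel-by-class cross-covariance
    V, S, Ut = np.linalg.svd(C, full_matrices=False)
    return V[:, :n_comp], V[:, :n_comp].T @ X      # loadings and PLS scores
```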

After dimensionality reduction, three linear discriminative classifiers comprising MDA, ENLR and SVM with a linear kernel are applied to differentiate the classes.

As a multiclass generalization of Fisher's discriminant analysis (FDA) [8], MDA can be applied to differentiate between more than two classes. MDA projects multidimensional data comprising n classes into an (n-1)-dimensional space, i.e. y_i = W^T x_i, where W is a matrix containing the n-1 discriminant vectors. MDA searches for a matrix W which maximizes the separation J(W) between the classes

    J(W) = \frac{W^T S_B W}{W^T S_W W},    (1)

where S_B is the between-class scatter matrix and S_W is the within-class scatter matrix.
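
A minimal sketch of this criterion on the low-dimensional PC or PLS scores could look as follows; the names scores (subjects-by-components array) and labels (integer class indices) are assumptions, and the sketch solves the generalized eigenproblem behind Eq. (1) rather than reproducing the authors' exact implementation.

```python
import numpy as np
from scipy.linalg import eigh

def mda_fit(scores, labels, n_classes=3):
    """Multiple discriminant analysis on PC/PLS scores (Eq. 1).

    Returns W with n_classes-1 discriminant vectors; project new scores
    with scores @ W."""
    d = scores.shape[1]
    mean_all = scores.mean(axis=0)
    Sb = np.zeros((d, d))                     # between-class scatter
    Sw = np.zeros((d, d))                     # within-class scatter
    for c in range(n_classes):
        Xc = scores[labels == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
        Sw += (Xc - mc).T @ (Xc - mc)
    # Maximizing J(W) amounts to the generalized eigenproblem Sb w = lambda Sw w.
    vals, vecs = eigh(Sb, Sw)
    order = np.argsort(vals)[::-1]
    return vecs[:, order[: n_classes - 1]]
```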

ENLR is a variant of logistic regression regularized by an elastic-net penalty. It offers good classification performance while using a minimal number of predictors. There have been many reports using dimensionality reduction methods coupled with ENLR as classification procedure in gene research [9], [10]. For class label y_i ∈ {0, 1} of the i-th observation (i.e. subject) x_i, we have

    Pr(y_i = 1 | x_i) = \frac{1}{1 + e^{-(\beta_0 + x_i^T \beta)}},    (2)

    Pr(y_i = 0 | x_i) = \frac{1}{1 + e^{+(\beta_0 + x_i^T \beta)}}.    (3)

The model parameter vector β in ENLR is estimated by minimizing the classification cost criterion

    -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i (\beta_0 + x_i^T \beta) - \log\left(1 + e^{\beta_0 + x_i^T \beta}\right) \right] + \lambda P_\alpha(\beta),    (4)

where N is the number of observations and λ is the penalty parameter. For p predictor variables, the elastic-net penalty P_α is

    P_\alpha(\beta) = \sum_{j=1}^{p} \left( \frac{1}{2}(1 - \alpha)\beta_j^2 + \alpha |\beta_j| \right).    (5)

The elastic-net penalty is a convex combination of the lasso penalty (α = 1) and the ridge regression penalty (α = 0). Instead of simply including the first few PC (or PLS) scores (ordered according to variance) for classification as in MDA, the elastic net selects the most predictive scores after dimensionality reduction. In this work, the ENLR problem is solved with the algorithm proposed by Friedman et al. provided in the Matlab toolbox GLMNET [11], which uses cyclical coordinate descent as optimization method.
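
The paper solves Eq. (4) with the Matlab GLMNET toolbox; as a rough, non-authoritative stand-in, scikit-learn's elastic-net logistic regression can play the same role on the PC or PLS scores. The data below are synthetic placeholders, and C corresponds only loosely to the inverse of the penalty parameter λ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
train_scores = rng.normal(size=(40, 3))      # placeholder PC/PLS scores
train_labels = rng.integers(0, 2, size=40)   # placeholder binary labels

# Elastic-net logistic regression; l1_ratio plays the role of alpha in Eq. (5),
# mixing the lasso (1.0) and ridge (0.0) penalties.
enlr = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, C=1.0, max_iter=10000)
enlr.fit(train_scores, train_labels)
predicted = enlr.predict(train_scores)
```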

Fig. 1. Box plot of the sampling distribution of the largest principal angle between PCA subspaces (left) or PLS subspaces (right) spanned by a number of PCs or PLS components ranging from 2 to 15. Subspaces are regarded as more robust for lower median angles. Subspaces with angles close to 90° can be considered unstable.

For the SVM method, one SVM was built for each one-against-one pair of classes. A linear kernel was chosen for a fair comparison with MDA and ENLR. The linear-kernel SVM was trained to find the optimal classification hyperplane separating two classes. This is equivalent to searching for the solution of the problem

    \min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i,    (6)

subject to

    y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0,    (7)

where the slack variables ξ_i are introduced to permit training errors, i.e. non-separable samples. The penalty parameter C controls the trade-off between the training error and the model complexity. The Matlab toolbox LIBSVM was used for calculating the SVM [12].
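
As with ENLR, this step can be sketched with scikit-learn's SVC, which wraps LIBSVM and trains one binary SVM per class pair; the data below are synthetic placeholders and C is the penalty parameter of Eq. (6).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
scores = rng.normal(size=(60, 3))            # placeholder PC/PLS scores
labels = rng.integers(0, 3, size=60)         # placeholder CTR/AD/FTD labels

# Linear-kernel SVM; SVC handles the three classes with one-against-one
# binary SVMs, matching the formulation of Eqs. (6)-(7).
svm = SVC(kernel="linear", C=1.0, decision_function_shape="ovo")
svm.fit(scores, labels)
predicted = svm.predict(scores)
```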

C. Classification Accuracy

The predictive accuracy of the investigated classification procedures was estimated with .632 bootstrap resampling. For a total of 1000 resampling iterations, subjects were drawn randomly with replacement from the previously stratified (same number of subjects in each class) training set. After that, dimensionality reduction and classification were applied to each bootstrap sample. According to [14], the classification accuracy is estimated as follows

    acc_{boot} = \frac{1}{b} \sum_{i=1}^{b} \left( 0.632 \cdot acc_i + 0.368 \cdot acc_{training} \right),    (8)

where b is the number of bootstrap iterations, acc_i is the accuracy on the individual bootstrap sample corresponding to iteration i, and acc_{training} is the accuracy on the training set. Combining acc_{training} and acc_i provides a more realistic estimate of the true accuracy, which would be underestimated by acc_i alone, as it is based on fewer samples (on average 0.632 of all samples) and is therefore unable to reflect the whole variability in the data.
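
A minimal sketch of this estimator is given below; fit_predict is a placeholder for any of the six dimensionality-reduction-plus-classifier procedures, the resampling here is plain rather than stratified per class as in the paper, and out-of-bag subjects serve as the test set for acc_i.

```python
import numpy as np

def bootstrap_632_accuracy(X, y, fit_predict, b=1000, seed=0):
    """Estimate acc_boot of Eq. (8).

    fit_predict(X_train, y_train, X_test) must run dimensionality reduction
    plus classification and return predicted labels for X_test."""
    rng = np.random.default_rng(seed)
    n = len(y)
    acc_training = np.mean(fit_predict(X, y, X) == y)     # training-set accuracy
    estimates = []
    for _ in range(b):
        idx = rng.integers(0, n, size=n)                   # draw with replacement
        oob = np.setdiff1d(np.arange(n), idx)              # out-of-bag subjects
        if oob.size == 0 or np.unique(y[idx]).size < np.unique(y).size:
            continue                                       # skip degenerate resamples
        acc_i = np.mean(fit_predict(X[idx], y[idx], X[oob]) == y[oob])
        estimates.append(0.632 * acc_i + 0.368 * acc_training)
    return float(np.mean(estimates))
```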

D. Mislabeling Simulation

Since all classifiers (MDA, ENLR, SVM) used in this work are supervised methods, they depend on the accuracy of the clinical diagnosis. For assessing the robustness of the classification procedures to possible misdiagnosis, one participant of the whole sample (including all cases of AD, FTD and CTR) at a time was selected and its diagnosis was changed on purpose to either of the other two classes (class mislabeling). This process was repeated until all participants in the sample had their diagnosis (class) changed to all other possible labels. The robustness of the classification procedures to possible misdiagnosis was evaluated as the average classification accuracy (acc_mislabel) of the mislabeled cases, for different numbers of PCs or PLS components used for multivariate analysis.
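
The mislabeling experiment can be expressed as a simple loop over all subject-label combinations; procedure_accuracy is a placeholder for whatever accuracy estimate (e.g. acc_boot) is computed for one classification procedure on a given labeling.

```python
import numpy as np

def mislabel_accuracy(X, y, procedure_accuracy, n_classes=3):
    """Average accuracy acc_mislabel over all single-subject mislabelings.

    procedure_accuracy(X, y_noisy) trains one classification procedure on the
    deliberately mislabeled data and returns its estimated accuracy."""
    accuracies = []
    for i in range(len(y)):
        for wrong in range(n_classes):
            if wrong == y[i]:
                continue
            y_noisy = y.copy()
            y_noisy[i] = wrong              # flip one diagnosis on purpose
            accuracies.append(procedure_accuracy(X, y_noisy))
    return float(np.mean(accuracies))
```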

E. Robustness of PCA/PLS

Assessing the robustness of PCA and PLS subspaces helps to select a suitable number of PCs or PLS components for dimensionality reduction prior to classification [13]. For calculating the robustness, the distance between two equidimensional PCA or PLS subspaces spanned by a given number of PCs or PLS components is calculated in each bootstrap iteration. One subspace is based on the original dataset whereas the other subspace is based on the bootstrap resampling set. The distance between two subspaces is computed as the largest principal angle between these two subspaces. Further details can be found in [14].
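
The subspace distance can be computed directly from two component bases with SciPy; the sketch below assumes V_orig and V_boot are voxels-by-k matrices whose columns span the PCA or PLS subspaces obtained from the original and from one bootstrap dataset.

```python
import numpy as np
from scipy.linalg import subspace_angles

def largest_principal_angle_deg(V_orig, V_boot):
    """Largest principal angle (in degrees) between two equidimensional subspaces."""
    angles = subspace_angles(V_orig, V_boot)   # returned in descending order
    return float(np.degrees(angles[0]))
```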

F. Robustness of Classifiers

For each classifier (MDA, ENLR or SVM), the discriminative vector can be considered as an image indicating the areas predictive for classification [13]. The corresponding standard error (SE) image of the discriminative vector (estimated by the standard deviation over 1000 bootstrap resampling iterations) indicates the brain areas which are less reliable with respect to discrimination for the different classification procedures. The SE values of the discriminative vector also indicate the robustness of the classifiers based on the PCA or PLS subspaces.
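
A sketch of the SE image computation is shown below; fit_discriminative_vector is a placeholder that, for one bootstrap sample, performs PCA or PLS plus one classifier and maps the resulting discriminant weights back into voxel space (e.g. loadings times classifier coefficients).

```python
import numpy as np

def discriminative_se_map(X, y, fit_discriminative_vector, b=1000, seed=0):
    """Voxel-wise standard error of the discriminative vector (Section II-F)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    vectors = []
    for _ in range(b):
        idx = rng.integers(0, n, size=n)                    # bootstrap resample
        vectors.append(fit_discriminative_vector(X[idx], y[idx]))
    return np.std(np.asarray(vectors), axis=0)              # SE image over voxels
```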

G. Permutation Tests for Classification

The reliability of the classification accuracy was further assessed through statistical significance in a permutation test. The permutation test is based on the null hypothesis that the image data and the disease labels are independent. In this work, the labels of the image data corresponding to the different pairs of classes (CTR-AD, CTR-FTD, AD-FTD) were permuted, and the accuracy of 5-fold cross-validation (CV) was used as the test statistic [15]. The p-value of each of the six image classification procedures was estimated as the fraction of permutations whose CV accuracy reached or exceeded the observed one.

Fig. 2. Classification accuracies of the three PCA-based procedures (solid) and the three PLS-based procedures (dotted) for the bootstrap .632 estimator (corrected, averaged) acc_boot, with 1000 iterations, depending on the number of PCs included in the analysis.
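
To make the procedure of Section II-G concrete, a minimal sketch for one class pair is given below, using 5-fold CV accuracy as test statistic; classifier is any estimator following the scikit-learn interface (a placeholder for the procedures of this paper), and the number of permutations is an assumed value, not taken from the text.

```python
import numpy as np
from sklearn.model_selection import cross_val_score

def permutation_p_value(scores, y, classifier, n_perm=1000, seed=0):
    """p-value of the permutation test of Section II-G for one class pair."""
    rng = np.random.default_rng(seed)
    observed = cross_val_score(classifier, scores, y, cv=5).mean()
    null_accuracies = []
    for _ in range(n_perm):
        y_perm = rng.permutation(y)            # break the image-label association
        null_accuracies.append(cross_val_score(classifier, scores, y_perm, cv=5).mean())
    # Fraction of permuted CV accuracies at least as high as the observed one.
    return float(np.mean(np.asarray(null_accuracies) >= observed))
```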

III. RESULTS AND DISCUSSION

The box plot in Figure 1 shows the relation between the number of PCs or PLS components and the sampling distribution of the angle between PCA subspaces (left) or PLS subspaces (right). The smaller the angle, the more robust are the subspaces in the presence of resampling. On the other hand, large angles (i.e. close to 90°) indicate that PCA or PLS depend on the underlying training set, e.g. if too many PCs (introducing too much noise) are included in the analysis.

Fig. 3. Average accuracy of the six classification procedures on mislabeled samples, acc_mislabel (red dash-dotted line). The predicted classification accuracy for the unchanged (non-mislabeled) samples is shown using the black dashed lines.

Figure 1 indicates that the robustness of both the PCA subspaces and the PLS subspaces deteriorates as the number of PCs or PLS components increases. Including more than the first three PCs or PLS components makes the median angle between PCA subspaces or between PLS subspaces greater than 50°. However, PCA subspaces generally deteriorate much more slowly than PLS subspaces. A potential explanation for the faster deterioration of PLS subspaces is that PLS, as a supervised method, adapts to the class labels of the particular training set and therefore overfits more rapidly. Based on these observations, the first three PCs for PCA and the first two PLS components for PLS were chosen for further analysis.

The predictive accuracy acc_boot of the six classification procedures is shown in Figure 2. Regarding the PCA-based procedures (solid lines), PCA-ENLR has a predictive accuracy of 92.66% when the first three PCs are included in the ENLR model, which is higher than the corresponding accuracies of PCA-MDA (88.26%) and PCA-SVM (91.18%). Meanwhile, all three PLS-based procedures (dotted lines) have relatively high predictive accuracy (PLS-MDA, 90.36%; PLS-ENLR, 93.82%; PLS-SVM, 92.84%) when the first two PLS components are chosen. The accuracies of PLS-ENLR and PLS-SVM are even slightly higher than that of PCA-ENLR.

The classification accuracy for deliberately mislabeled samples (red dash-dotted line) in Figure 3 indicates the robustness to misdiagnosis of each classification procedure, depending on the number of PCs or PLS components included. Although the accuracy of PCA-ENLR is still very high when only three PCs are included, it deteriorates rapidly with an increasing number of PCs. This may indicate a trade-off: ENLR is more effective in utilizing the most predictive variables, but is more likely to be influenced by mislabeled samples as more PCs are included. On the other hand, the accuracies of PLS-MDA and PLS-ENLR on mislabeled samples are always significantly lower than those of the corresponding PCA-based procedures, suggesting less robustness of PLS-based procedures to mislabeled training samples.

Figure 4 presents the SE images of the discriminative vectors of all six classification procedures for classifying CTR and AD. Both PCA- and PLS-based classification procedures show that the greatest SE occurs close to the falx cerebri and around the anterior and posterior lateral ventricles, due to the spatial variability of these regions in dementia patients. Among all classification procedures, the SE values of PCA-ENLR are the smallest. The SE values of PLS-based procedures are always significantly larger than those of the corresponding PCA-based procedures, suggesting that the patterns extracted by PLS-based procedures are more easily influenced by spatial variability (which could not be accounted for by spatial normalization). The SE values of the discriminative vectors were also estimated for more PCs or PLS components included in the model (data not shown). Similarly to [13], it could be observed that more PCs or PLS components result in a higher global SE value and a broader SE distribution, indicating less robustness of the classification procedures with increasing numbers of PCs or PLS components.

Fig. 4. SE map of the discriminative pattern of the six classification procedures for classifying AD and CTR. The first three PCs are included in case of PCA-based procedures and the first two PLS components in case of PLS-based procedures. Visual inspection indicates that PCA-based procedures generally have lower SE values than PLS-based procedures, and PCA-ENLR has the lowest SE values among all classification procedures.

Using the classification accuracy (with the first three PCs included for PCA-based procedures, and the first two PLS components included for PLS-based procedures) of 5-fold CV as test statistic, Table I shows the p-values of the permutation tests for all classification procedures. Most p-values are around the significance level (0.05), indicating the reliability of the classification results. It should be noted that for classifying CTR vs. AD (CTR-AD), the p-values of PLS-based procedures are higher than those for the other class pairs or of PCA-based procedures. These high p-values suggest some degree of overfitting for PLS-based procedures in that particular case. Nevertheless, on the whole the results of the permutation tests indicate that these classification procedures with small numbers of PCs or PLS components are able to detect the class structure of the data.

TABLE I
P-VALUES OF PERMUTATION TESTS

Method     CTR-AD   CTR-FTD   AD-FTD
PCA-MDA    0.0680   0.0568    0.0528
PCA-ENLR   0.0700   0.0548    0.0578
PCA-SVM    0.0508   0.0630    0.0476
PLS-MDA    0.0908   0.0492    0.0566
PLS-ENLR   0.1056   0.0576    0.0714
PLS-SVM    0.0770   0.0584    0.0552

In summary, the aforementioned results, comprising (a) the accuracy results in the presence of mislabeled samples (Figure 3), (b) the SE images of the discriminative vectors (Figure 4), and (c) the p-values of the permutation tests (Table I), reflect different aspects:

(1) The accuracy on mislabeled samples indicates the influence of possible clinical misdiagnosis on each classification procedure. It also supports an educated decision regarding the number of PCs or PLS components included in a classifier model, which should preferably generate stable results even for mislabeled samples. The highest accuracies on mislabeled samples were achieved when the first three PCs (or the first two PLS components) were included for PCA-based (or PLS-based) procedures, respectively. This is consistent with the optimal number of PCs or PLS components suggested in Figure 1.

(2) The SE image of the discriminative vector highlights image regions which are less reliable for discrimination. It shows the robustness of the classification procedures based on bootstrap resampling.

(3) The p-value obtained from a permutation test indicates whether the classification procedure is able to identify true class structures instead of random patterns which happen to correlate with the class labels. It can be used to assess the competence of a classification procedure.

Across all three of these metrics, PCA-based procedures showed generally better performance than PLS-based procedures. However, as illustrated in Figure 2, PLS-ENLR and PLS-SVM showed a slightly higher predictive accuracy than PCA-ENLR. Nevertheless, since all further assessments (accuracy on mislabeled samples (Figure 3), SE maps (Figure 4) and p-values from the permutation tests (Table I)) provide superior results for PCA-based procedures, this suggests that PCA-ENLR is more robust and reliable.

IV. CONCLUSION

Six linear classification procedures were investigated in this work. Among these classification procedures, PLS-based procedures generally showed a better predictive accuracy, estimated by bootstrap resampling, compared to PCA-based procedures. However, PCA-based procedures proved to be more robust (higher accuracies when classifying mislabeled samples; lower SE values of the discriminative vectors in bootstrap resampling) and more reliable (lower p-values in the permutation test) than PLS-based procedures. Among all three PCA-based procedures, PCA-ENLR has the highest predictive accuracy when the first three PCs are included. Overall, PCA-ENLR appears to be more successful than the other classification procedures investigated in this work.

V. ACKNOWLEDGMENT

We gratefully acknowledge the scholarship program "Modern Applications of Biotechnology - Research Grants for Postdoctoral Students from China and Germany" for funding, which is jointly run by the German Academic Exchange Service (DAAD) and the China Scholarship Council (CSC) with the financial support of the Federal Ministry of Education and Research (BMBF) and the Ministry of Education.

REFERENCES

[1] K. Herholz, Perfusion SPECT and FDG-PET, International Psychogeriatrics 23(S2), pp. S25-S31, 2011

[2] F.J. Bonte, T.S. Harris, C.A. Roney and L.S. Hynan, Differential Diagnosis Between Alzheimer's and Frontotemporal Disease by the Posterior Cingulate Sign, J Nucl Med 45(5), pp. 771-774, 2004

[3] J.-F. Horn, M.-O. Habert, A. Kas, Z. Malek, P. Maksud, L. Lacomblez, A. Giron and B. Fertil, Differential automatic diagnosis between Alzheimer's disease and frontotemporal dementia based on perfusion SPECT images, Artificial Intelligence in Medicine 47(2), pp. 147-158, 2009

[4] D. Merhof, P.J. Markiewicz, G. Platsch, J. Declerck, M. Weih, J. Kornhuber, T. Kuwert, J.C. Matthews and K. Herholz, Optimized data preprocessing for multivariate analysis applied to 99mTc-ECD SPECT data sets of Alzheimer's patients and asymptomatic controls, J Cereb Blood Flow Metab 31(1), pp. 371-383, 2011

[5] C. Habeck, N.L. Foster, R. Perneczky, A. Kurz, P. Alexopoulos, R.A. Koeppe, A. Drzezga and Y. Stern, Multivariate and univariate neuroimaging biomarkers of Alzheimer's disease, NeuroImage 40(4), pp. 1503-1515, 2008

[6] J. Ashburner and K.J. Friston, Nonlinear spatial normalization using basis functions, Human Brain Mapping 7, pp. 254-266, 1999

[7] A. Krishnan, L.J. Williams, A.R. McIntosh and H. Abdi, Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review, NeuroImage 56(2), pp. 455-475, 2011

[8] R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification, John Wiley & Sons, Inc., 2001

[9] G.M. D'Angelo, D. Rao and C.C. Gu, Combining least absolute shrinkage and selection operator (LASSO) and principal-components analysis for detection of gene-gene interactions in genome-wide association studies, BMC Proc 3(Suppl 7): S62, 2009

[10] H. Zou, T. Hastie and R. Tibshirani, Sparse Principal Component Analysis, J Comput Graph Stat 15(2), pp. 265-286, 2006

[11] J. Friedman, T. Hastie and R. Tibshirani, Regularization Paths for Generalized Linear Models via Coordinate Descent, J Stat Softw 33(1), pp. 1-22, 2010

[12] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, ACM Trans. Intell. Syst. Technol. 2(3), pp. 1-27, 2011

[13] P.J. Markiewicz, J.C. Matthews, J. Declerck and K. Herholz, Robustness of multivariate image analysis assessed by resampling techniques and applied to FDG-PET scans of patients with Alzheimer's disease, NeuroImage 46(2), pp. 472-485, 2009

[14] B. Efron and R.J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall/CRC, 1993

[15] P. Golland, F. Liang, S. Mukherjee and D. Panchenko, Permutation Tests for Classification, in: Learning Theory (Eds. P. Auer and R. Meir), Springer Berlin Heidelberg, 2005