Page 1: School of Electronics and Physical Sciences, University of

Detection of structural differences between the brains of schizophrenic patients and controls

Vassili A Kovalev∗1, Maria Petrou∗ and John Suckling+

(*) School of Electronics and Physical Sciences, University of Surrey, Guildford GU2 7XH, United Kingdom

(+) Institute of Psychiatry, Guy’s, King’s and St Thomas’ Medical School, London, United Kingdom

Corresponding author:

Prof M Petrou,

School of Electronics

and Physical Sciences,

University of Surrey,

Guildford GU2 7XH,

United Kingdom

Tel: +44 1483 689801,

Fax: +44 1483 686031

email: [email protected]

1Present address: Max-Planck Institute of Cognitive Neuroscience, Stephanstrasse 1A, D-04103 Leipzig, Germany

Abstract

This paper investigates the validity of the null hypothesis: there are no structural differences between the brains of schizophrenics and normal controls that manifest themselves in MRI-T2 data and distinguish the two populations in a statistically significant way. The data used refer to 19 schizophrenic patients and 21 normal controls, matched for age, sex and social background. The methodology is based on 3-dimensional texture analysis, which is used to quantify anisotropy in the data at scales of the order of a few millimetres. These data reject the null hypothesis. In addition, this paper attempts to identify the regions of the brain which are responsible for the morphological characteristics that distinguish the two populations. For this purpose, it utilises a second texture analysis method that, in spite of being a global method, allows one to trace back to the data the origin of the features that most clearly distinguish the two populations. This method indicates that the features that distinguish the two populations with p values smaller than $10^{-6}$ are located in the most inferior part of the brain, and in particular in the tissue that makes up the sulci. It is stressed that, in order to preserve the integrity of the data for texture calculations, no registration of anatomical structures is performed; the most inferior part of the brain is identified as those slices of the scans that visually correspond to slices 1–12 of the Talairach and Tournoux brain atlas.

Key words: schizophrenia, magnetic resonance imaging, brain morphology

1 Introduction

Although the aetiology of schizophrenia is not fully understood, it is becoming increasingly clear that

the symptoms of schizophrenia have their origins in disordered brain chemistry and diffuse cortical

abnormalities and that the condition is associated with certain morphological brain characteristics

(eg Okazaki 1998, Pfefferbaum et al 1999, Supprian et al 1997).

Prior to modern imaging techniques, anatomic studies of schizophrenia were limited to post-mortem measurements. Computerised tomography (CT) and, more recently, magnetic resonance imaging (MRI) of the brain have caused a renaissance in the investigation of in-vivo cortical anatomy.

Analysis has commonly proceeded using region-of-interest (ROI) investigations involving the manual

delineation of a particular structure hypothesised a-priori to be statistically different between a

group of patients and a matched group of control subjects. Results from such studies indicate that

whole brain volume is reduced in patients relative to controls by around 3% (Lawrie and Abukmeil

1998) and that many specific areas including the temporal and frontal lobes, hippocampus and

thalamus, are similarly affected (for example Nelson et al 1998, Gur et al 1998). The ROI approach

has a number of potential problems. First, inter- and intra-operator reproducibility can affect the

power to detect differences. Second, neuroanatomic boundaries are often indistinct and radiological

definitions of structures may vary making cross-study comparisons difficult. Finally, and perhaps

most importantly, the need to initially specify which areas are of interest may preclude assessment of

other regions involved in the condition. In contrast, voxel-based analyses can encompass the entire

brain negating the necessity for prior hypotheses. Furthermore, if the algorithms are automated,

reproducibility and compatibility are ensured. An example of a voxel-based study which pointed

out structural changes in the brains of schizophrenics was that of Sigmundsson et al, 2001, which

was based on segmenting the MRI data into component tissue maps. An earlier example was that

of Gaser et al, 1999, which measured the deformation needed to register different MRI brain scans

to a reference brain.

In addition, many studies based on magnetic resonance imaging have been concerned with the differences in relaxation times between the two groups. Several other studies were concerned with the qualitative and quantitative study of signal hyperintensities (eg Sanchev et al, 1999 and Sanchev and Brodaty, 1999). In this study we quantify signal hyperintensities making use of some recently developed texture analysis techniques in the field of image processing, appropriate for the analysis of 3D data (Kovalev et al, 1996, and Kovalev and Petrou, 2000). In particular,

we are concerned with the quantification of textural morphological anisotropies in the brains of

schizophrenic patients, and the identification of the regions of the brain where these characteristics

differ from those of the normal controls. Previous studies of anisotropy (eg Pfefferbaum et al, 1999)

have concentrated on anisotropies at scales of a few microns. Our study refers to the scale of the

imaging resolution, ie to scales of the order of millimetres. Pioneering work of texture analysis for

MRI data has been reported by Freeborough and Fox 1998. Their study, however, was restricted to

2D analysis, while the analysis performed here is fully 3D.

The null hypothesis which this study is testing is: There are not any structural differences

between the brains of schizophrenics and normal controls that manifest themselves in MRI-T2 data

and distinguish the two populations in a statistically significant way.

2 Materials and Methods

2.1 Data Description

All subjects were scanned with a 1.5-T GE Signa (GE Medical Systems, Milwaukee) at the Maudsley Hospital, London. Proton density and T2 weighted images were acquired with a dual-echo fast spin-echo sequence (TR = 4000 ms, TE = 20, 85 ms). Contiguous interleaved images were calculated with 3 mm slice thickness and 0.856 mm × 0.856 mm in-plane pixel size in the axial plane parallel to the intercommissural line. The subjects were 21 normal controls and 19 schizophrenics. The

two groups were matched for age and social class as defined by the occupation of the head of the

household at the time of birth. (Mean age for controls 31.5, with a standard deviation of 5.9. Mean

age for patients 33.7 with a standard deviation of 6.9. Social class 1–3, controls 18/21 and patients

16/19.) The mean premorbid IQs for both groups were estimated by the National Adult Reading

Test with the normal controls having slightly higher IQ. (Mean IQ of schizophrenics 101.5 with

a standard deviation 11.3 and mean IQ for normal controls 107.4 with a standard deviation 9.6.)

The schizophrenics had an average of 13.1 education years with standard deviation 6.0, while the

normal controls had an average of 14.3 education years with standard deviation 2.5. All patients had

been taking typical antipsychotic medication for more than 10 years. All subjects were male and

right-handed. All subjects satisfied the following criteria: no history of alcohol or drug dependence,

no history of head injury causing loss of consciousness for one hour or more and no history of

neurological or systemic illness. The normal controls in addition satisfied the criterion of having no

DSM-IV axis I disorder.

The dual-echo sequence used for image acquisition is that commonly acquired as part of neurological examinations at the Maudsley Hospital, London, UK. The sequence parameters were optimised to achieve maximum separation of clusters representing the tissues of the brain in the feature space formed by the voxel intensities of the two echoes (Simmons et al., 1994).

In all cases the images were preprocessed so that only the brain parenchyma was extracted for

further analysis (Suckling et al, 1999).

Each scan consists of 32 slices. Initially all slices referring to the same subject were analysed

together, thus resulting in global brain characterisation for each subject. Separate analyses were

performed using the slices that refer to the inferior half and the inferior quarter of each brain.

The registration of the data with a brain atlas was avoided in order to prevent any distortion caused by the re-sampling and interpolation that any such registration necessarily entails, especially since the brains of the different subjects differed in size and we are interested in quantifying the micro-structural properties of the data. So, the inferior 50% or the inferior 25% of each brain was identified by visually identifying the slice which is most anatomically similar to slice 24 or slice 12, respectively, of the Talairach and Tournoux (1988) anatomical atlas.

2.2 Data analysis

We use two techniques to study the textural properties of such data: 3D orientation histograms

as described in Kovalev and Petrou 2000, and generalised co-occurrence matrices as described in

Kovalev and Petrou 1996. Both techniques will be used to test the null hypothesis stated in

the introduction. Both techniques are global techniques, ie they construct features from the whole

region of interest. However, the second technique has the extra advantage that it allows one to

identify the localities in the data which are responsible for the measurements that characterise each

group.

Method I

According to the first method, one first has to compute the gradient vector of the T2 intensity

at each voxel position by convolving the data with an appropriate kernel. The Zucker and Hummel

(1981) filter can be used for this purpose. This filter is 3× 3× 3 in size, and is represented here by

its three cross-sections orthogonal to the direction of convolution:

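$$
\begin{pmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\
\frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}}
\end{pmatrix}
\quad
\begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\quad
\begin{pmatrix}
-\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} \\
-\frac{1}{\sqrt{2}} & -1 & -\frac{1}{\sqrt{2}} \\
-\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}}
\end{pmatrix}
\qquad (1)
$$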
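As an illustration only, the mask of equation (1) can be applied at a single interior voxel as sketched below. The weight of each neighbour along a given axis is its offset along that axis divided by its distance from the mask centre, which reproduces the values 1, 1/√2 and 1/√3 of equation (1). The function name `zucker_hummel_gradient`, the `vol[z][y][x]` indexing and the sign convention (positive weights on the side of increasing coordinate) are assumptions made for this sketch, not details taken from the paper.

```python
import math

def zucker_hummel_gradient(vol, z, y, x):
    """Gradient (gx, gy, gz) at an interior voxel of vol[z][y][x].

    Each of the 26 neighbours contributes its intensity times
    (offset along the axis) / (distance from the centre), which gives
    the weights 1, 1/sqrt(2), 1/sqrt(3) of equation (1).
    Sign convention and indexing are assumptions of this sketch.
    """
    gx = gy = gz = 0.0
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                r = math.sqrt(dx * dx + dy * dy + dz * dz)
                if r == 0.0:
                    continue  # the centre voxel has zero weight
                v = vol[z + dz][y + dy][x + dx]
                gx += v * dx / r
                gy += v * dy / r
                gz += v * dz / r
    return gx, gy, gz

# Usage: an intensity ramp along x has a gradient pointing along +x only.
ramp = [[[float(x) for x in range(3)] for _ in range(3)] for _ in range(3)]
gx, gy, gz = zucker_hummel_gradient(ramp, 1, 1, 1)
```

The mask is applied here as a correlation rather than a flipped convolution; the two differ only in the sign of the result.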

The convolution of the image with this mask along the three axes produces the three components

of the gradient vector at each voxel. Next, one has to consider equal solid angles in all directions in

3D and count how many voxels have gradient vectors in each solid angle. These solid angles are the

so-called “bins” of the 3D orientation histogram. To define these solid angles one considers a sphere with unit radius (unit sphere). If the surface of the sphere is divided into patches of equal area, all solid angles subtended at the centre of the sphere by these patches are equal. One way to create patches of equal area on the surface of the unit sphere is to divide the azimuthal angle φ (measured along the equator of the unit sphere) and the height z above or below the equatorial plane of the unit sphere into equal segments. The division of 0 ≤ φ < 360° into M equal intervals and the division of −1 ≤ z ≤ 1 into N equal segments results in (N − 2) × M spherical quadrangles and 2M spherical triangles, all subtending the same solid angle of 4π/(NM). Then, a gradient

vector (a, b, c) belongs to bin (i, j) if the following two conditions are met:

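$$\frac{360^\circ}{M}\, i \le \phi < \frac{360^\circ}{M}\,(i+1), \quad \text{where } \sin\phi = \frac{b}{\sqrt{a^2+b^2}} \text{ and } \cos\phi = \frac{a}{\sqrt{a^2+b^2}} \qquad (2)$$

$$-1 + \frac{2}{N}\, j \le \bar{c} < -1 + \frac{2}{N}\,(j+1), \quad \text{where } \bar{c} = \frac{c}{\sqrt{a^2+b^2+c^2}} \qquad (3)$$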
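The two binning conditions can be sketched in code as follows. This is a minimal illustration under stated assumptions: bins are numbered from 0, a vector pointing exactly along +z (where the strict inequality of equation (3) has no bin) is placed in the top bin, and the names `orientation_bin` and `orientation_histogram` are hypothetical. The defaults M = N = 13 are the values quoted in the Appendix.

```python
import math

def orientation_bin(a, b, c, M=13, N=13):
    """Return the 0-based bin (i, j) of gradient vector (a, b, c).

    i indexes the azimuthal angle phi (M intervals of 360/M degrees),
    j indexes the normalised height c_bar in [-1, 1] (N intervals of 2/N).
    """
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0.0:
        raise ValueError("a zero gradient has no orientation")
    phi = math.degrees(math.atan2(b, a)) % 360.0   # 0 <= phi < 360
    i = min(int(phi / (360.0 / M)), M - 1)
    c_bar = c / norm
    j = min(int((c_bar + 1.0) * N / 2.0), N - 1)   # c_bar = 1 goes to the top bin (assumption)
    return i, j

def orientation_histogram(gradients, M=13, N=13):
    """Count gradient vectors per bin; hist[j][i] with j along z and i along phi."""
    hist = [[0] * M for _ in range(N)]
    for a, b, c in gradients:
        i, j = orientation_bin(a, b, c, M, N)
        hist[j][i] += 1
    return hist

# Usage: two vectors along +x fall into the same equatorial bin.
hist = orientation_histogram([(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
```

Using `atan2(b, a)` keeps φ consistent with the definitions of sin φ and cos φ in equation (2) in all four quadrants.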

This way the 3D orientation histogram of the data is created. It can be visualised by being

plotted as a 3D graph, where along each direction the radius is proportional to the number of

gradient vectors identified in that direction. If the data are totally isotropic, this 3D structure is

expected to be a sphere. The way this structure differs from a sphere depends on the anisotropy of

the data. By quantifying the shape of this structure, one may quantify the anisotropy in the data.

To compare the two populations, one then has to compare the distributions of these numbers over

the two population samples. The three numbers (features) defined below are some of many that

may be used to characterise the shape of the orientation histogram (Kovalev and Petrou, 2000):

• Anisotropy coefficient:

$$F_1 = \frac{H_{\max}}{H_{\min}}, \qquad (4)$$

where $H_{\min} \neq 0$ and $H_{\max}$ correspond to the minimum and the maximum values of the orientation histogram, respectively.

• Integral anisotropy measure:

$$F_2 = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M}\left(H(i,j) - H_m\right)^2}{NM}, \qquad (5)$$

where $H_m$ is the mean value of the histogram, $H(i,j)$ is the value at orientation $(i,j)$ and $N \times M$ is the total number of distinct orientations considered. This integral feature can be considered as the standard deviation of the distribution of the gradient vectors over the various orientations.

• Local mean curvature:

$$F_3 = \frac{\sum_{i=1}^{N}\sum_{j=2}^{M-1}\left(H(i,j) - \frac{1}{4}\left(H(i-1,j) + H(i+1,j) + H(i,j-1) + H(i,j+1)\right)\right)^2}{N(M-2)} \qquad (6)$$

This local mean curvature is expressed by the average value of the Laplacian calculated for all orientations of the histogram.
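The three features can be computed from a histogram stored as a list of N rows of M values, as in the sketch below. Two points are assumptions of this sketch rather than statements from the paper: the first index is treated as periodic, so that the terms H(i − 1, j) and H(i + 1, j) of equation (6) are defined for i = 1 and i = N, and the function name `anisotropy_features` is hypothetical. Note also that F2 is computed exactly as written in equation (5), that is, as a mean squared deviation without a square root.

```python
def anisotropy_features(H):
    """Return (F1, F2, F3) of equations (4)-(6) for H given as N rows of M bin values."""
    N, M = len(H), len(H[0])
    values = [v for row in H for v in row]

    h_min, h_max = min(values), max(values)
    if h_min == 0:
        raise ValueError("F1 is undefined when H_min = 0")
    f1 = h_max / h_min                                         # equation (4)

    h_mean = sum(values) / (N * M)
    f2 = sum((v - h_mean) ** 2 for v in values) / (N * M)      # equation (5)

    total = 0.0
    for i in range(N):
        for j in range(1, M - 1):                              # j = 2 .. M-1 in 1-based terms
            neighbours = (H[(i - 1) % N][j] + H[(i + 1) % N][j]  # wrap-around in i (assumption)
                          + H[i][j - 1] + H[i][j + 1])
            total += (H[i][j] - neighbours / 4.0) ** 2
    f3 = total / (N * (M - 2))                                 # equation (6)
    return f1, f2, f3

# Usage: a perfectly flat (isotropic) histogram gives F1 = 1 and F2 = F3 = 0.
flat = anisotropy_features([[2.0] * 4 for _ in range(3)])
peaked = anisotropy_features([[1, 1, 1], [1, 5, 1], [1, 1, 1]])
```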

Each of these numbers was tested using the package STATISTICA for its ability to discriminate

between the two populations with a Student’s t-test.
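For readers without access to STATISTICA, the test statistic can be sketched in a few lines. The pooled-variance (equal variance) form is assumed here, since the text does not say which variant was used; the function name `student_t` is hypothetical, and the p value, which requires the cumulative t distribution, is not computed in this sketch.

```python
import math

def student_t(x, y):
    """Two-sample Student's t statistic with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)      # sum of squared deviations, group 1
    ss2 = sum((v - m2) ** 2 for v in y)      # sum of squared deviations, group 2
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    t = (m1 - m2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df

# Usage with made-up feature values for two small groups.
t, df = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With 21 controls and 19 patients, this form of the test has 38 degrees of freedom.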

Method II

The second method used counts the frequency with which a certain combination of characteristic

values appears in the data in the same relative position. In particular, one counts the frequency

with which a particular combination of intensity gradient magnitudes appears with the same relative

orientation and at the same relative distance from each other. In image processing terminology,

this is a generalised co-occurrence matrix, denoted as

w[g(i), g(j), a(i, j), d(i, j)] (7)

where w is the frequency of occurrence of a voxel pair (i, j) with gradient magnitudes g(i) and g(j) respectively, with an angle a(i, j) between their gradient vectors, and at distance d(i, j) from each other.

More details for the particular implementation of both these methods can be found in the

Appendix.

3 Results

Several preliminary experiments (Kovalev and Petrou, 2001) were conducted which are not reported

in detail here. From those preliminary experiments we were guided to narrow our reported results

as follows:

• Best results could be obtained from MRI-T2 images and not from MRI-PD images, so we do

not present any results from the MRI-PD data.

• The two classes could not be discriminated well if all gradient vectors with reliable orientation

estimates were used. Some thresholding of the gradient magnitudes was necessary. Empirically, the best results were obtained when only gradient vectors stronger than 75 units were retained. (The range of gradient values is [0, 600].) This is a very significant conclusion, because such strong gradients roughly correspond to the cortical surface.

• The two groups could not be discriminated using features computed from the slices of the most superior 25% of the brain, ie slices of the scans that correspond to slices 25 and above of the Talairach and Tournoux brain atlas.

As a consequence of the above preliminary conclusions, a series of three experiments was performed using orientation histograms. In the first experiment whole brains were analysed. In the second experiment the inferior half of the brain was used, and in the third only the inferior quarter (25%).

Results of Method I

In all cases the strongest gradient vectors were used, with magnitude greater than 75 units. In

all cases the orientation histogram was constructed according to the details given in section 2 and

the Appendix.

Table 1 gives the mean and standard deviation value of each of the three features for each of the

experiments performed (namely whole brain, half brain and most inferior part of the brain) and for

each of the two groups of subjects. The t and p values for each feature are also given in this table.

The p values of the features which can be used as class discriminators with confidence level higher

than 95% are highlighted in the table. It can be seen that all three features are good discriminators

when they are computed from the orientation histogram that corresponds to the inferior part of the

brain. The box and whiskers graphs of population discrimination with the help of these features

are shown in figure 1.

These results indicate that the null hypothesis may be refuted. However, they do not identify the

particular parts of the brain which are morphologically different in the two populations, apart from

the fact that the discrimination is statistically more significant when the data slices that correspond

to slices 1–12 of the Talairach and Tournoux brain atlas are used.

Having refuted the null hypothesis for this set of data, one may use the second method to identify which exact localities of each brain most characterise the classification of the brain in either of the two populations.

Results of Method II

Various experiments were conducted using the co-occurrence matrix w described in section 2.

In each experiment each element of the co-occurrence matrix was tested as a class discriminator

according to the t-test. Feature selection was performed by thresholding the t values using various

thresholds.

The best results were obtained when the region of interest was restricted to be that which

corresponds to the first 12 slices of the Talairach and Tournoux brain atlas (“most inferior part”).

Table 2 shows the best features from these experiments. They are features with t > 7.5. The p

value for all these features has its first significant figure beyond the 6th decimal point. As explained

in the Appendix, the measured quantities used to construct matrix w are quantised in a few bins.

The columns of the table identified by g(i) and g(j) give the quantised gradient values of the pairs

of voxels that make up each feature. As the maximum number of bins used for gradient strength is

8, all these numbers are out of 8. The column marked a(i, j) gives the relative angle between the

gradient vectors of the voxels of each pair measured in units of 15° (since that is the width of the

quantisation bins used). The column marked d(i, j) gives the relative distance between the voxels

of each pair measured in units of 0.856mm which is the width of each quantisation bin used.

To understand the significance of these features one may consider the example shown in figure 2.

Three voxels are identified there, as one zooms in some part of the image. Each voxel has associated

with it a white arrow, representing its gradient vector. Let us consider the two voxels marked with

letter “a” as forming a pair. The relative orientation a(i, j) of their associated gradient vectors

is small, as these vectors are almost parallel, and so the pair will have a(i, j) = 1. The gradient

magnitude of the voxel on the left is g(i) = 4, and that of the voxel on the right is g(j) = 2. The

distance of the two voxels is d(i, j) = 4 (ie 4 × 0.856 = 3.4mm). So, this pair will be counted as

a pair contributing to feature number 3 of table 2, since it has g(i) = 4, g(j) = 2, a(i, j) = 1 and

d(i, j) = 4. One may also combine the voxels marked with “b” to form a pair. These two voxels

have their gradient vectors in almost opposite directions, so their relative angle is large, and it turns out to be in the range 135°–150°, which places this pair into the quantisation bin with a(i, j) = 10. The gradient magnitude of the voxel at the top is g(i) = 6, and the distance of these two voxels is d(i, j) = 6 (ie 6 × 0.856 = 5.1 mm). This pair therefore contributes to feature number 26 of table 2.

The features measured by this method are the relative abundances of all possible pairs of this

kind.
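The counting of such pairs can be sketched as follows, using the bin widths quoted in the text (75 units for gradient magnitude, 15° for relative angle, 0.856 mm for distance). Several details are assumptions of this sketch: bins are numbered from 1 (inferred from the worked example, where 135°–150° maps to a(i, j) = 10), distances are rounded to the nearest bin, the larger gradient bin is listed first to exploit the symmetry of the matrix, and the names `quantise_pair` and `cooccurrence` are hypothetical. The voxels passed in are assumed to have already been thresholded, so no gradient is zero.

```python
import math
from collections import Counter

VOXEL_MM = 0.856   # in-plane voxel size, also the distance bin width
SLICE_MM = 3.0     # slice thickness

def magnitude(v):
    return math.sqrt(sum(c * c for c in v))

def angle_deg(u, v):
    """Angle in degrees between two non-zero vectors."""
    cos_a = sum(p * q for p, q in zip(u, v)) / (magnitude(u) * magnitude(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

def quantise_pair(pos_i, grad_i, pos_j, grad_j,
                  g_width=75.0, g_bins=8, a_width=15.0, a_bins=12):
    """Return the bin (g_i, g_j, a, d) of one voxel pair; positions are voxel indices (x, y, z)."""
    dx, dy, dz = (p - q for p, q in zip(pos_i, pos_j))
    dz *= SLICE_MM / VOXEL_MM                      # anisotropic sampling along Z
    d = int(round(math.sqrt(dx * dx + dy * dy + dz * dz)))   # rounding is an assumption
    g_i = min(int(magnitude(grad_i) / g_width) + 1, g_bins)
    g_j = min(int(magnitude(grad_j) / g_width) + 1, g_bins)
    a = min(int(angle_deg(grad_i, grad_j) / a_width) + 1, a_bins)
    return max(g_i, g_j), min(g_i, g_j), a, d      # order of g bins does not matter

def cooccurrence(voxels, d_max=6):
    """Count all pairs of (position, gradient) voxels whose distance bin lies in [1, d_max]."""
    w = Counter()
    for p in range(len(voxels)):
        for q in range(p + 1, len(voxels)):
            key = quantise_pair(*voxels[p], *voxels[q])
            if 1 <= key[3] <= d_max:
                w[key] += 1
    return w

# Usage: a pair resembling the voxels marked "a" (nearly parallel gradients, 4 voxels apart).
key_a = quantise_pair((10, 10, 5), (250.0, 0.0, 0.0), (14, 10, 5), (100.0, 5.0, 0.0))
w = cooccurrence([((10, 10, 5), (250.0, 0.0, 0.0)),
                  ((14, 10, 5), (100.0, 5.0, 0.0)),
                  ((10, 10, 9), (250.0, 0.0, 0.0))])
```

The exhaustive double loop is quadratic in the number of voxels and is only meant to make the definition concrete; a practical implementation would restrict the search to neighbours within the maximum distance.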

One can use these features to perform classification of all the subjects using the tree clustering

method of package STATISTICA. In addition, this package allows the projection of these clusters

in a 2D space for visualisation. The tree of figure 3 and the scattergram of figure 4

indicate that only 2 subjects were misclassified.

It seems therefore, that this method also rejects the null hypothesis. One may work backwards

now, to identify which voxels contributed to pairs that were counted when constructing the elements

of the co-occurrence matrix that could act as such good population discriminators. This way, one

may identify the localities of the brain that particularly characterise the schizophrenic subjects

in this study. All one has to do is to keep the identity of the pairs of voxels that were counted

in each bin of matrix w (Kovalev and Petrou, 1996). Figure 5 shows some slices of the brains

of schizophrenic subjects with the parts of the brain that contribute to the characterisation as

schizophrenic brains highlighted in green.

One important point must be examined with respect to these results:

It is well known from statistics that if one has at one’s disposal hundreds of features that can

be used as discriminators and only 40 cases to discriminate into two classes, some of these features

may discriminate the cases into the right classes by chance. It is necessary, therefore, to examine

the validity of the identified features as genuine features. There are two ways by which the validity

of the presented results might be tested:

• Use of the “leave-one-out” method for testing classification consistency: The best features were chosen on the basis of 39 cases, and the 40th case was classified using the pre-defined features. In all cases the best 28 classifying features were selected, ranked according to their t values. The identity of the selected features was compared with that of the features selected when all cases were used, and the number of features in common was counted. The process was repeated for every single case separately. Table 3 shows these results. In the first column the case which is left out is identified. It is highlighted if it was misclassified. One can see that the average classification error is 4 out of 40 cases, ie 10%. The second column gives the number of features among those selected that were also selected when all 40 cases were considered together. This number is out of 28 and the third column gives it as a percentage. The mean value of stable persisting features from experiment to experiment is 86.9%. The selection of features in each case was done on the basis of a fixed number, and it was not optimised in order to reduce the misclassified cases in the training set. So, apart from the case that is used for testing, some other cases may also be misclassified. The 4th column identifies the cases that were classified as schizophrenic while they were normal, and the 5th column the cases that were schizophrenic and were misclassified as normal. In almost all experiments the two outlier cases were subjects N20 and S38.

• If the identified features are true, they must refer to real structures in the brain, and therefore

they must be strongly correlated, and robust: perturbing the computational parameters should

not change the results. The robustness of identification of the characteristic brain localities

was tested by varying the criterion with which the discriminating features were selected. The

results of this analysis remained the same even when the threshold value of t used to identify

the features was changed by 1 to 3 units. Correlation analysis showed that the elements of the

co-occurrence matrix constructed are strongly correlated and they form four groups. Change

of threshold value of t simply selects a smaller number of representative features from each of

the four groups, without affecting the final result. The correlation coefficient between pairs of

features was found to vary between 0.79 and 0.99, with a mean 0.92 and a standard deviation

of 0.04. The column marked with C in table 2 indicates to which cluster of features each

feature belongs. Note that these clusters of features are not anatomically coherent. They

rather represent groups of features that can be characterised qualitatively by expressions

like: “Pairs of voxels that have strong gradient values and large relative orientations and are

at (relatively) large distances from each other are more abundant in schizophrenics than in

normal controls.” This sentence, for example, describes all features of class C=4, which, as can be seen from table 2, have higher mean values in schizophrenics than in controls.
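The feature-stability part of the leave-one-out check described in the first item can be sketched as below. This is an illustration under stated assumptions: features are ranked by the absolute value of a pooled-variance t statistic, the classifier applied to the left-out case is omitted because the text does not specify it, and the names `t_value`, `top_features` and `leave_one_out_overlap` are hypothetical. In the study the number of retained features k was 28.

```python
import math

def t_value(x, y):
    """Pooled-variance two-sample t statistic (the pooling is an assumption)."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    pooled = ss / (n1 + n2 - 2)
    if pooled == 0.0:
        return 0.0 if m1 == m2 else math.copysign(math.inf, m1 - m2)
    return (m1 - m2) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))

def top_features(rows, labels, k):
    """Indices of the k features with the largest |t| between label 0 and label 1."""
    scored = []
    for f in range(len(rows[0])):
        g0 = [r[f] for r, lab in zip(rows, labels) if lab == 0]
        g1 = [r[f] for r, lab in zip(rows, labels) if lab == 1]
        scored.append((abs(t_value(g0, g1)), f))
    scored.sort(reverse=True)
    return {f for _, f in scored[:k]}

def leave_one_out_overlap(rows, labels, k):
    """For each left-out case, the fraction of its k selected features
    that were also selected when all cases were used."""
    reference = top_features(rows, labels, k)
    overlaps = []
    for out in range(len(rows)):
        kept_rows = rows[:out] + rows[out + 1:]
        kept_labels = labels[:out] + labels[out + 1:]
        selected = top_features(kept_rows, kept_labels, k)
        overlaps.append(len(selected & reference) / k)
    return overlaps

# Usage with made-up data: 6 cases, 3 features, only feature 0 separates the groups.
rows = [[1.0, 5.0, 2.0], [1.2, 4.0, 2.1], [0.9, 6.0, 1.9],
        [3.0, 5.5, 2.0], [3.1, 4.5, 2.2], [2.9, 5.0, 1.8]]
labels = [0, 0, 0, 1, 1, 1]
overlaps = leave_one_out_overlap(rows, labels, k=1)
```

Averaging the returned fractions gives a figure comparable in kind to the 86.9% of stable features reported above.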

4 Conclusions and Discussion

The results presented in the previous section reject the null hypothesis, namely they support the

conclusion that there are morphological differences between the brains of the schizophrenic subjects

and the normal controls, which manifest themselves as different types of texture anisotropy at scales

of a few millimetres in MRI-T2 scans, and in particular in the parts of the scans that correspond to

slices 1–12 of the Talairach and Tournoux brain atlas. In addition, the second analysis method used

could identify the regions of difference in the parts of the brain that correspond to slices 1–12, and

localised them in the cortical surface, in the volume of the tissue that makes up the sulcal windings.

The T2 weighted images that we have analysed essentially discriminate between CSF and brain

parenchyma. Although we avoided gradients that correspond to the sulcal and ventral surface, and

looked only at gradients in the volume of the sulcal tissue, we are still picking up the effects of the

anatomy of that surface. The features which seem to characterise the group of schizophrenics are localised in the vicinity of that surface. The analysis of the nature of these features, reported in table 2 and exemplified in figure 3, shows that the excess of pairs of voxels with strong gradient values and large relative orientations in the patients relative to the controls (higher mean values of these features in the schizophrenics than in the normals) indicates an enlargement of the volume of these structures. The enlargement of ventricular volumes in schizophrenic patients (Johnstone

et al., 1976) was the earliest finding from a neuroimaging study. Our findings are consistent with

this and lend some face validity to the techniques described. Associated sulcal CSF increases have

been reported much less frequently (Nopoulos et al., 1997; Cannon et al., 1998), probably due to

the labour intensive methods employed. The origin of these increases could be attributed to either

sulcal widening or generalised atrophy; our results appear to suggest the former, although they

do not exclude the latter. So, our results concur with the general hypothesis that schizophrenia

manifests itself neurologically as generalised changes to a wide number of brain regions and circuits.

We do not consider our method to be a new diagnostic tool for schizophrenia as there is overlap

with results of normal population and the specificity of our results to schizophrenia among other

psychiatric disorders has not been established. Besides, there is no need of an MRI scan to diagnose

schizophrenia. The important thing about this and other similar studies is to show that there are

neurological differences between patient and control populations. Only recently, as neuroimaging

techniques have become available, has the extent and range of differences in the brain structure

become apparent. It now appears that schizophrenia is a multi-system problem manifesting itself

not only in volume reductions of specific areas of the brain (McCarley et al., 1999; Lieberman

et al., 2001) and changes in sulcal anatomy, but also in sulcal tissue structure, as we appear to demonstrate here. Neurophysiological and neuroanatomical indices like the ones used in studies of

this type could contribute to the investigations of the aetiology of schizophrenia and the development

of vulnerability markers for the condition (Kasai et al., 2002).

Acknowledgements: This work was made possible with the help of INTAS grant 96-785 from

the European Union.

Appendix: Technical details

The anisotropic sampling of the data along the inter-slice direction (Z axis) is handled by

appropriate scaling of the metric used to compute distances. The scaling factor in our case is

3/0.856 = 3.5. This scaling factor is used in all experiments that involve metric relationships

without appearing explicitly in the equations.
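One way to read this scaling, assuming a Euclidean metric (the text does not name the metric, so this is an illustrative assumption): for two voxels whose indices differ by $(\Delta x, \Delta y, \Delta z)$, the distance in units of the in-plane voxel size would be

$$d = \sqrt{\Delta x^2 + \Delta y^2 + (s\,\Delta z)^2}, \qquad s = \frac{3}{0.856} \approx 3.5,$$

so that two voxels at the same in-plane position in adjacent slices are about 3.5 units (3 mm) apart.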

The orientation changes between the scans of different subjects do not affect the results. In

the first method, the orientation bins used to quantise the different orientations were chosen so

that the inter-subject orientation difference was less than half the bin width, so that the effect

of such variations was eliminated. Besides, the comparisons made refer to comparisons between

features computed from the whole texture representations, and as such they are independent of

global orientation effects (Kovalev and Petrou 1999).

The values of the elements of the co-occurrence matrix constructed are independent of rotation

and translation of the data, and so they are robust to the exact positioning of the subject in the

device used to capture the data (Kovalev and Petrou 1996).

In Method I the cones of equal solid angle needed were constructed by dividing the diameter of

the unit sphere along the Z axis into 13 (N = 13) equally sized intervals, and the azimuthal angle φ

into 13 (M = 13) equally sized intervals. All orientation histograms were normalised so that their bin values summed to MN = 169.

In Method II, in order to be able to use the elements of matrix w to characterise an object by the relative abundance of certain structures, one must first normalise the matrices. As the data

are distributed over a finite volume, within the same object, certain relative distances will be more

common than others. To take account of this effect, matrix w is normalised so that all its elements

that refer to the same distance value d sum up to 1. This way the elements of the matrix express

the relative abundance of certain characteristics that occur at a fixed distance apart.
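A minimal sketch of this per-distance normalisation is given below. It assumes, purely for illustration, that the matrix is stored as a dictionary mapping an index tuple (g_i, g_j, a, d) to a count; this storage layout and the function name are assumptions, not the original implementation.

```python
from collections import defaultdict


def normalise_per_distance(w):
    """Return a copy of w in which all elements sharing the same d sum to 1.

    w maps (g_i, g_j, a, d) -> count, where d is the quantised distance.
    Distance values whose counts total zero are dropped.
    """
    totals = defaultdict(float)
    for (_, _, _, d), count in w.items():
        totals[d] += count
    return {
        key: count / totals[key[3]]
        for key, count in w.items()
        if totals[key[3]] > 0
    }
```

For example, counts of 2 and 6 at d = 1 become 0.25 and 0.75, regardless of how many pairs were found at other distances.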

The distance values d used are all the relative distances present in the data, up to a maximum

distance. They are quantised to a small number of values. The numbers of discrete quantisation

levels used for all quantities needed in the construction of matrix w are the main parameters of

the method. For the gradient magnitude, we experimented with numbers of bins in the range [5, 16]. The results did not change significantly with the number of bins, so the value of 8 was used in all the experiments reported. The width of each bin was 75 units. The relative distance of the paired voxels was restricted to the range [1, 6], ie from 0.856mm to 5.136mm. Note that as the inter-slice distance is 3mm, distances smaller than this value examine only co-occurrences within the same slice, and inter-slice correlations are not used. Again, experiments showed that pairing voxels at relative distances greater than 5.136mm created very “noisy” bin values as far as the t-test was concerned. The relative orientation between the

gradient vectors of the paired voxels was quantised into 12 bins, each 15° wide. This way the co-occurrence matrix consisted of 2592 different elements: it is symmetric with respect to the gradient magnitude (because it does not matter in what order we treat the gradient magnitudes of the paired voxels), so that from its total number of 4608 elements we have $\frac{4608 - 6 \times 12 \times 8}{2} + 6 \times 12 \times 8 = 2592$ independent elements.
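The count can be checked with a few lines of arithmetic. The sketch below uses the bin numbers quoted in the text (8 gradient magnitude bins, 12 angle bins, 6 distance values); the variable names are illustrative only. The brute-force count enumerates unordered pairs of gradient magnitude bins.

```python
G_BINS = 8    # gradient magnitude bins
A_BINS = 12   # relative angle bins
D_BINS = 6    # quantised distance values

total = G_BINS * G_BINS * A_BINS * D_BINS   # all elements of the matrix
diagonal = G_BINS * A_BINS * D_BINS         # elements with g(i) == g(j)
independent = (total - diagonal) // 2 + diagonal

# Brute-force check: count index tuples with g(i) <= g(j).
brute_force = sum(
    1
    for gi in range(G_BINS)
    for gj in range(gi, G_BINS)
    for _a in range(A_BINS)
    for _d in range(D_BINS)
)
```

Off-diagonal elements are counted once per unordered pair, while the 6 × 12 × 8 elements with equal gradient magnitudes have no mirror image and are kept in full.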

References

Cannon, T. D., van Erp, T. G. M., Huttunen, M., Lonnqvist, J., Salonen, O., Valanne, L.,

Poutanen, V. P., Standertskjold-Nordenstam, C. G., Gur, R. E. and Yan, M., 1998. Regional gray

matter, white matter and cerebrospinal fluid distributions in schizophrenic patients, their siblings

and controls. Archives of General Psychiatry, 55, 1084–1091.

Freeborough, P. A. and Fox, N. C., 1998. MR image texture analysis applied to the diagnosis

and tracking of Alzheimer’s disease. IEEE Transactions on Medical Imaging, 17, 475–479.

Gaser, C., Volz, H.-P., Kiebel, S., Riehemann, S. and Sauer, H., 1999. Detecting structural

changes in whole brain based on nonlinear deformation-application to schizophrenia research. NeuroImage, 10, 107–113.

Gur, R. E., Maany, V., Mozley, P. D., Swanson, C., Bilker, W. and Gur, R. C., 1998. Subcortical

MRI volumes in neuroleptic-naive and treated patients with schizophrenia. Am. J. Psych., 155,

1711–1717.

Johnstone, E. C., Crow, T. J., Frith, C. D., Husband, J. and Kreel, L., 1976. Cerebral ventricular size and cognitive impairment in chronic schizophrenia. Lancet, 2, 924–926.

Kasai, K., Iwanami, A., Yamasue, H., Kuroki, N., Nakagome, K. and Fukuda, M., 2002. Neuroanatomy and Neurophysiology in Schizophrenia. Neuroscience Research, 43, 93–110.

Kovalev, V. A. and Petrou, M., 1996. Multidimensional co-occurrence matrices for object recognition and matching. Graphical Models and Image Processing, 58, 187–197.

Kovalev, V. A., Petrou, M. and Bondar, Y. S., 1999. Texture anisotropy in 3D images. IEEE

Transactions on Image Processing, 8, 346–360.

Kovalev, V. A. and Petrou, M., 2000. Chapter 15: Texture analysis in Three Dimensions as a

cue to Medical Diagnosis, in Handbook of Medical Imaging, Processing and Analysis, I Bankman,

editor, Academic Press, ISBN 0-12-077790-8, 231–247.

Kovalev, V. A. and Petrou, M., 2001. Analysis and classification of 3D MR Images of Schizophrenia and Norm. Technical Report, VSSP-TR-3/2001. University of Surrey.

Lawrie, S. M. and Abukmeil, S. S., 1998. Brain abnormality in schizophrenia. A systematic and

quantitative review of volumetric magnetic resonance imaging studies. Br. J. Psych., 160, 179–186.

Lieberman, J., Chakos, M., Wu, H., Alvir, J., Hoffman, E., Robinson, D. and Bilder, R., 2001.

Longitudinal Study of Brain Morphology in First Episode Schizophrenia. Biological Psychiatry, 49,

487–499.

McCarley, R. W., Wible, C. G., Frumin, M., Hirayasu, Y., Levitt, J. J., Fischer, I. A. and

Shenton, M. E., 1999. MRI Anatomy of Schizophrenia. Biological Psychiatry, 45, 1099–1119.

Nelson, M. D., Saykin, A. J., Flashman, L. A. and Riordan, H. J., 1998. Hippocampal volume reduction in schizophrenia as assessed by magnetic resonance imaging: A meta-analytic study. Arch. Gen. Psych., 55, 433–440.

Nopoulos, P., Flaum, M. and Andreasen, N. C., 1997. Sex differences in brain morphology and

schizophrenia. American Journal of Psychiatry, 154, 1648–1654.

Okazaki, Y., 1998. Morphological brain imaging studies on major psychoses. Psychiatry and

Clinical Neurosciences, 52 (Suppl), S215–S218.

Pfefferbaum, A., Sullivan, E. V., Hedehus, M., Moseley, M. and Lim, K. O., 1999. Brain gray

and white matter transverse relaxation time in schizophrenia. Psychiatry Research: Neuroimaging,

91, 93–100.

Sachdev, P., Cathcart, S., Shnier, R., Wen, W. and Brodaty, H., 1999. Reliability and validity of ratings of signal hyperintensities on MRI by visual inspection and computerised measurement. Psychiatry Res, 92, 103–115.

Sachdev, P. and Brodaty, H., 1999. Quantitative study of signal hyperintensities on T2-weighted magnetic resonance imaging in late onset schizophrenia. American Journal of Psychiatry, 156, 1958–1967.

Sigmundsson, T., Suckling, J., Maier, M., Williams, S., Bullmore, E. T., Greenwood, K. E., Fukuda, R., Ron, M. A. and Toone, B. K., 2001. Structural abnormalities in frontal, temporal and limbic regions and interconnecting white matter tracts in schizophrenic patients with prominent negative symptoms. American Journal of Psychiatry, 158, 234–243.

Simmons, A., Arridge, S. R., Barker, G. J., Cluckie, A. J. and Tofts, P. S., 1994. Improvements

in the quality of MRI cluster analysis. Magnetic Resonance Imaging, 12, 1191–1204.

Suckling, J., Brammer, M. J., Lingford-Hughes, A. and Bullmore, E. T., 1999. Removal of extracerebral tissues in dual-echo magnetic resonance images via linear scale-space features. Magnetic Resonance Imaging, 17, 247–256.

Supprian, T., Hofmann, E., Warmuth-Metz, M., Franzek, E. and Becker, T., 1997. MRI T2 relaxation times of brain regions in schizophrenic patients and control subjects. Psychiatry Research: Neuroimaging, 75, 173–182.

Talairach, J. and Tournoux, P., 1988. Co-planar stereotaxic atlas of the human brain. Thieme,

New York.

Zucker, S. W. and Hummel, R. A., 1981. A 3D Edge Operator. IEEE Trans on Pattern Analysis

and Machine Intelligence, 3, 324–331.

Figure 1:

Figure 2:

Figure 3:

Figure 4:

(a) (b) (c)

Figure 5:

Exp          Feature  mean-NRM  mean-SCH  STD-NRM   STD-SCH   t-value   p-value
All slices   F1        4.84900   4.81167  0.515792  0.450872   0.23633  0.814513
All slices   F2        3.95925   3.84172  0.161860  0.243005   1.77120  0.084994
All slices   F3        5.33906   5.49373  0.359787  0.341258  -1.35567  0.183647
Slices 1–24  F1       10.42900   8.61667  2.891739  1.625184   2.34459  0.024682
Slices 1–24  F2        5.00575   4.58356  0.433552  0.430297   3.00795  0.004777
Slices 1–24  F3        8.65335   8.60354  0.894389  0.891148   0.17170  0.864638
Slices 1–12  F1       13.86350  10.70611  3.975026  2.733707   2.82093  0.007743
Slices 1–12  F2        5.86100   5.26967  0.585009  0.510092   3.30383  0.002164
Slices 1–12  F3       11.36615  10.57330  1.336450  0.830297   2.16698  0.036931

Table 1:

No  g(i)  g(j)  a(i, j)  d(i, j)  C  mean-NRM  mean-SCH  t-value   p-value
 1   6     3       1        1     2  0.221586  0.493311  -8.11809  < 10^-6
 2   5     5      10        2     3  0.061619  0.117016  -7.62078  < 10^-6
 3   4     2       1        4     1  0.619648  1.041732  -7.80472  < 10^-6
 4   5     2      11        4     4  0.162662  0.286242  -7.63650  < 10^-6
 5   5     3      10        4     4  0.162690  0.299405  -7.84881  < 10^-6
 6   5     3      11        4     4  0.129267  0.251737  -7.79190  < 10^-6
 7   6     2      10        4     3  0.085971  0.160721  -7.81694  < 10^-6
 8   5     2       8        5     2  0.294752  0.520095  -7.51294  < 10^-6
 9   5     2      10        5     2  0.289333  0.523337  -7.95262  < 10^-6
10   5     2      11        5     2  0.239105  0.441684  -7.78903  < 10^-6
11   5     3      10        5     4  0.186329  0.361711  -7.50794  < 10^-6
12   5     3      11        5     4  0.155143  0.321284  -8.13435  < 10^-6
13   6     2      10        5     4  0.127776  0.236332  -7.85166  < 10^-6
14   6     2      11        5     4  0.104767  0.205500  -7.64787  < 10^-6
15   6     2      12        5     3  0.042171  0.089474  -7.77267  < 10^-6
16   4     2      11        6     1  0.715381  1.220326  -8.08146  < 10^-6
17   4     3      11        6     1  0.291562  0.586668  -7.50107  < 10^-6
18   5     2       8        6     2  0.335852  0.626668  -7.53221  < 10^-6
19   5     2       9        6     2  0.343976  0.656074  -7.96115  < 10^-6
20   5     2      10        6     2  0.340214  0.671805  -7.76601  < 10^-6
21   5     2      11        6     2  0.279033  0.578511  -7.99355  < 10^-6
22   5     2      12        6     4  0.114148  0.249426  -8.11884  < 10^-6
23   5     3      11        6     4  0.129376  0.301711  -7.62095  < 10^-6
24   6     2       8        6     4  0.148014  0.289363  -8.05803  < 10^-6
25   6     2       9        6     4  0.156486  0.300000  -7.77688  < 10^-6
26   6     2      10        6     4  0.152143  0.311047  -8.07683  < 10^-6
27   6     2      11        6     4  0.120957  0.271668  -7.57933  < 10^-6
28   6     2      12        6     3  0.049629  0.118158  -8.21823  < 10^-6

Table 2:

Scan      Common features   Misclassified cases
left out  number  %         NRM to SCH  SCH to NRM
none      28      100       N20         S38
N01       28      100.0     N20         S38
N02       25      89.3      N20         S38
N03       25      89.3      N20         S38
N04       26      92.9      N20         S38
N05       24      85.8      N20         S38
N06       25      89.3      N20         S38
N07       27      96.4      N20         S38
N08       24      85.7      N20         S26,S38
N09       24      85.7      N20         S38
N10       24      85.7      N20         S38
N11       27      96.4      N20         S38
N12       25      89.3      N20         S38
N13       25      89.3      N20         S38
N14       22      78.6      N20         S38
N15       26      92.9      N20         S38
N16       27      96.4      N20         S38
N17       28      100.0     N20         S38
N18       23      82.1      N20         S38
N19       25      89.3      N20         S38
N20       18      64.3      N20         S38
N21       27      96.4      N20         S38
S22       25      89.3      N20         S38
S23       26      92.9      N20         S38
S24       25      89.3      N20         S38
S25       19      67.7      N20         S25,S36,S38
S26       24      85.7      N20         S26,S36,S38
S27       22      78.6      N20         S38
S28       27      96.4      N20         S38
S29       24      85.7      N20         S38
S30       24      85.7      N20         S38
S31       22      78.6      N20         S38
S32       27      96.4      N20         S38
S33       25      89.3      N20         S38
S34       24      85.7      N20         S38
S35       27      96.4      N20         S38
S36       23      82.1      N20         S38
S37       25      89.3      N20         S38
S38       19      67.9      N20         S38
S39       24      85.7      N20         S38
S40       16      57.1      N20         S38
Mean              86.9

Table 3:

Figure Captions

Figure 1: Box and whiskers plots of the features computed from slices of the data that corre-

spond to slices 1–12 of the Talairach and Tournoux brain atlas.

Figure 2: An example of two characteristic voxel pairs that correspond to features with |t| > 7.5

(see table 2). Pair “a” illustrates feature number 3 of the table, with g(i) = 4, g(j) = 2, a(i, j) = 1

(ie in the range 0–15 degrees), d(i, j) = 4 (ie 3.4mm). Pair “b” would have contributed to feature 26,

as it has g(i) = 6, g(j) = 2, a(i, j) = 10 (ie in the range 135–150 degrees), and d(i, j) = 6 (ie 5.1mm).

Figure 3: Tree classification using the elements of the co-occurrence matrix w with |t| value

greater than 7.5, constructed from the data that correspond to slices 1–12 of the Talairach and

Tournoux brain atlas. Along the horizontal axis the subjects (N=normal, S=schizophrenic) are

arranged in the clusters created by the classifier.

Figure 4: Scattergram in 2D that corresponds to the classification of figure 3. The vertical

and horizontal axes are chosen automatically so that the distances between the points plotted are

preserved as much as possible when the points are projected from the high dimensionality space to

the 2D space.

Figure 5: (a): Some original slices from a schizophrenic scan. (b): The localities from which

the characteristic features arose. They are mainly concentrated in the sulci windings. (c): The

characteristic localities superimposed on the original slices.

Captions of tables

Table 1: Features F1, F2 and F3 computed from the orientation histogram for three experiments

where data referring to the whole brain, inferior half of the brain, or inferior quarter of the brain

are used. The data that refer to “the inferior half” of the brain are identified as the slices that

correspond to slices 1–24 of the Talairach and Tournoux brain atlas, while the “inferior quarter”

refers to the slices that correspond to slices 1–12 of the same atlas. Alongside the t and p values

the mean and standard deviations of the distribution of values for each population are also given.

NRM=normal, SCH=schizophrenic, STD=standard deviation.

Table 2: The mean values of certain elements of matrix w over the two populations, and their

corresponding t and p values when used as class discriminators. The number in the first column

is just an identity number of the matrix element with no other meaning. g(i) and g(j) are the

gradient strengths of the pair of voxels considered, in quantisation units from 0 to 8 (maximum).

a(i, j) is the relative angle between the two vectors, measured in units of 15°. d(i, j) is the distance

between the two voxels, measured in units of 0.856mm. C is the cluster of features to which the

particular feature belongs. NRM=normal, SCH=schizophrenic.

Table 3: Results of the leave-one-out classification. First column: the subject that is left out.

In bold the cases when this subject was misclassified on the basis of the best 28 features computed

from the remaining 39 subjects. Second column: the number of features (out of the 28 best) that

are also among the best 28 features when no subject is left out. Third column: the percentage of

the 28 best features that are also among the best 28 features when no subject is left out. Fourth

column: Normal subjects misclassified as schizophrenic when the best 28 features, computed from

the 39 subjects, are used. Fifth column: Schizophrenic subjects misclassified as normal when the best 28 features,

computed from the 39 subjects, are used. NRM=normal, SCH=schizophrenic.