Detection of structural differences between the brains of schizophrenic patients and
controls
Vassili A Kovalev∗1, Maria Petrou∗ and John Suckling+
(*)School of Electronics
and Physical Sciences,
University of Surrey,
Guildford GU2 7XH, United Kingdom
(+)Institute of Psychiatry, Guy’s, King’s and St Thomas’ Medical School,
London, United Kingdom
Corresponding author:
Prof M Petrou,
School of Electronics
and Physical Sciences,
University of Surrey,
Guildford GU2 7XH,
United Kingdom
Tel: +44 1483 689801,
Fax: +44 1483 686031
email: [email protected]
1Present address: Max-Planck Institute of Cognitive Neuroscience, Stephanstrasse 1A, D-04103 Leipzig, Germany
Abstract
This paper investigates the validity of the null hypothesis:
There are no structural differences between the brains of schizophrenics and normal controls that
manifest themselves in MRI-T2 data and distinguish the two populations in a statistically significant
way. The data used refer to 19 schizophrenic patients and 21 normal controls, matched for age, sex
and social background. The methodology used is based on 3-dimensional texture analysis which is
used to quantify anisotropy in the data at scales of the order of a few millimetres. These data reject
the null hypothesis. In addition, this paper attempts to identify the regions of the brain which
are responsible for the morphological characteristics that distinguish the two populations. For this
purpose, it utilises a second texture analysis method that, in spite of being a global method, allows
one to trace back to the data the origin of the features that most clearly separate the two
populations. This method indicates that the features that distinguish the two populations with p
values smaller than $10^{-6}$ are located in the most inferior part of the brain and in particular in the
tissue that makes up the sulci. It is stressed that in order to preserve the integrity of the data for
texture calculations, no registration of anatomical structures is performed, and the most inferior
part of the brain is identified as referring to those slices of the scans that visually correspond to
slices 1–12 of the Talairach and Tournoux brain atlas.
Key words: schizophrenia, magnetic resonance imaging, brain morphology
1 Introduction
Although the aetiology of schizophrenia is not fully understood, it is becoming increasingly clear that
the symptoms of schizophrenia have their origins in disordered brain chemistry and diffuse cortical
abnormalities and that the condition is associated with certain morphological brain characteristics
(eg Okazaki 1998, Pfefferbaum et al 1999, Supprian et al 1997).
Prior to modern imaging techniques, anatomic studies of schizophrenia were limited to post-mortem
measurements. Computerised tomography (CT) and, more recently, magnetic resonance
imaging (MRI) of the brain have caused a renaissance in the investigation of in-vivo cortical anatomy.
Analysis has commonly proceeded using region-of-interest (ROI) investigations involving the manual
delineation of a particular structure hypothesised a-priori to be statistically different between a
group of patients and a matched group of control subjects. Results from such studies indicate that
whole brain volume is reduced in patients relative to controls by around 3% (Lawrie and Abukmeil
1998) and that many specific areas including the temporal and frontal lobes, hippocampus and
thalamus, are similarly affected (for example Nelson et al 1998, Gur et al 1998). The ROI approach
has a number of potential problems. First, inter- and intra-operator reproducibility can affect the
power to detect differences. Second, neuroanatomic boundaries are often indistinct and radiological
definitions of structures may vary, making cross-study comparisons difficult. Finally, and perhaps
most importantly, the need to specify in advance which areas are of interest may preclude assessment of
other regions involved in the condition. In contrast, voxel-based analyses can encompass the entire
brain, removing the need for prior hypotheses. Furthermore, if the algorithms are automated,
reproducibility and compatibility are ensured. An example of a voxel-based study which pointed
out structural changes in the brains of schizophrenics was that of Sigmundsson et al, 2001, which
was based on segmenting the MRI data into component tissue maps. An earlier example was that
of Gaser et al, 1999, which measured the deformation needed to register different MRI brain scans
to a reference brain.
In addition, many studies based on magnetic resonance imaging have been concerned
with the differences in relaxation times between the two groups. Several other studies were
concerned with the qualitative and quantitative study of signal hyperintensities (eg Sanchev et al,
1999 and Sanchev and Brodaty, 1999). In this study we quantify signal hyperintensities making use
of some recently developed texture analysis techniques in the field of image processing, appropriate
for the analysis of 3D data (Kovalev et al, 1996, and Kovalev and Petrou, 2000.) In particular,
we are concerned with the quantification of textural morphological anisotropies in the brains of
schizophrenic patients, and the identification of the regions of the brain where these characteristics
differ from those of the normal controls. Previous studies of anisotropy (eg Pfefferbaum et al, 1999)
have concentrated on anisotropies at scales of a few microns. Our study refers to the scale of the
imaging resolution, ie to scales of the order of millimetres. Pioneering work on texture analysis of
MRI data was reported by Freeborough and Fox (1998). Their study, however, was restricted to
2D analysis, whereas the analysis performed here is fully 3D.
The null hypothesis which this study is testing is: There are not any structural differences
between the brains of schizophrenics and normal controls that manifest themselves in MRI-T2 data
and distinguish the two populations in a statistically significant way.
2 Materials and Methods
2.1 Data Description
All subjects were scanned with a 1.5-T GE Signa scanner (GE Medical Systems, Milwaukee) at the Maudsley
Hospital, London. Proton density and T2 weighted images were acquired with a dual-echo fast spin-
echo sequence (TR = 4000ms, TE = 20 and 85ms). Contiguous interleaved images were calculated
with 3mm slice thickness and 0.856mm × 0.856mm in-plane pixel size in the axial plane parallel
to the intercommissural line. The subjects were 21 normal controls and 19 schizophrenics. The
two groups were matched for age and social class as defined by the occupation of the head of the
household at the time of birth. (Mean age for controls 31.5, with a standard deviation of 5.9. Mean
age for patients 33.7 with a standard deviation of 6.9. Social class 1–3, controls 18/21 and patients
16/19.) The mean premorbid IQs for both groups were estimated by the National Adult Reading
Test with the normal controls having slightly higher IQ. (Mean IQ of schizophrenics 101.5 with
a standard deviation 11.3 and mean IQ for normal controls 107.4 with a standard deviation 9.6.)
The schizophrenics had an average of 13.1 education years with standard deviation 6.0, while the
normal controls had an average of 14.3 education years with standard deviation 2.5. All patients had
been taking typical antipsychotic medication for more than 10 years. All subjects were male and
right-handed. All subjects satisfied the following criteria: no history of alcohol or drug dependence,
no history of head injury causing loss of consciousness for one hour or more and no history of
neurological or systemic illness. The normal controls in addition satisfied the criterion of having no
DSM-IV axis I disorder.
The dual-echo sequence used for image acquisition is that routinely acquired as part of neurological
examinations at the Maudsley Hospital, London, UK. The sequence parameters were optimised
to achieve maximum separation of the clusters representing the tissues of the brain in the feature
space formed by the voxel intensities of the two echoes (Simmons et al., 1994).
In all cases the images were preprocessed so that only the brain parenchyma was extracted for
further analysis (Suckling et al, 1999).
Each scan consists of 32 slices. Initially all slices referring to the same subject were analysed
together, thus resulting in global brain characterisation for each subject. Separate analyses were
performed using the slices that refer to the inferior half and the inferior quarter of each brain.
Registration of the data with a brain atlas was avoided, in order to prevent the distortion
that the re-sampling and interpolation of any such registration necessarily entail. This matters
especially because the brains of the different subjects differ in size and we are interested in quantifying the
micro-structural properties of the data. So, the inferior 50% or the inferior 25% of each brain was
identified by visually selecting the two slices which are most anatomically similar to slice 24 and
slice 12, respectively, of the Talairach and Tournoux (1988) anatomical atlas.
2.2 Data analysis
We use two techniques to study the textural properties of such data: 3D orientation histograms
as described in Kovalev and Petrou 2000, and generalised co-occurrence matrices as described in
Kovalev and Petrou 1996. Both techniques will be used to test for the null hypothesis made in
the introduction. Both techniques are global techniques, ie they construct features from the whole
region of interest. However, the second technique has the extra advantage that it allows one to
identify the localities in the data which are responsible for the measurements that characterise each
group.
Method I
According to the first method, one first has to compute the gradient vector of the T2 intensity
at each voxel position by convolving the data with an appropriate kernel. The Zucker and Hummel
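(1981) filter can be used for this purpose. This filter is 3 × 3 × 3 in size, and is represented here by
its three cross-sections orthogonal to the direction of convolution:
\[
\begin{pmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\
\frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}}
\end{pmatrix}
\qquad
\begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\qquad
\begin{pmatrix}
-\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} \\
-\frac{1}{\sqrt{2}} & -1 & -\frac{1}{\sqrt{2}} \\
-\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}}
\end{pmatrix}
\tag{1}
\]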
The convolution of the image with this mask along the three axes produces the three components
of the gradient vector at each voxel. Next, one has to consider equal solid angles in all directions in
3D and count how many voxels have gradient vectors in each solid angle. These solid angles are the
so-called “bins” of the 3D orientation histogram. To define these solid angles one considers a sphere
with unit radius (unit sphere). If the surface of the sphere is divided into patches of equal area, all
solid angles subtended at the centre of the sphere by these patches are equal. One way
to create patches of equal area on the surface of the unit sphere is to divide the azimuthal angle
φ (measured along the equator of the unit sphere) and the height z above or below the equatorial
plane of the unit sphere into equal segments; this works because the area of a zone of the unit
sphere depends only on its height. The division of $0 \le \phi < 360^\circ$ into M equal intervals
and the division of $-1 \le z \le 1$ into N equal segments results in (N − 2) × M spherical quadrangles
and 2M spherical triangles, all subtending the same solid angle of 4π/(NM). Then, a gradient
vector (a, b, c) belongs to bin (i, j) if the following two conditions are met:
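\[
\frac{2\pi}{M}\, i \le \phi < \frac{2\pi}{M}\,(i + 1), \quad \text{where} \quad
\sin\phi = \frac{b}{\sqrt{a^2 + b^2}} \quad \text{and} \quad \cos\phi = \frac{a}{\sqrt{a^2 + b^2}}
\tag{2}
\]
\[
-1 + \frac{2}{N}\, j \le \bar{c} < -1 + \frac{2}{N}\,(j + 1), \quad \text{where} \quad
\bar{c} = \frac{c}{\sqrt{a^2 + b^2 + c^2}}
\tag{3}
\]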
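The gradient computation and the binning rule of equations (2) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the study: the function names are hypothetical, the volume is assumed to be a nested list indexed as vol[z][y][x], the sign convention of the mask (positive weights on the positive side of each axis) is an assumption, and bins are numbered from 0. Clamping the last bin so that the pole z = 1 is included is also a choice of this sketch.

```python
import math


def zucker_hummel_weight(dx, dy, dz, axis):
    """Weight of the 3x3x3 mask at offset (dx, dy, dz) for the gradient along `axis`.

    Reproduces the values 1, 1/sqrt(2), 1/sqrt(3) of equation (1); the central
    cross-section (offset 0 along `axis`) is zero.
    """
    d = (dx, dy, dz)
    if d[axis] == 0:
        return 0.0
    return d[axis] / math.sqrt(dx * dx + dy * dy + dz * dz)


def gradient_at(vol, x, y, z):
    """Gradient vector (a, b, c) at an interior voxel of vol[z][y][x]."""
    g = [0.0, 0.0, 0.0]
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                value = vol[z + dz][y + dy][x + dx]
                for axis in range(3):
                    g[axis] += zucker_hummel_weight(dx, dy, dz, axis) * value
    return tuple(g)


def orientation_bin(a, b, c, M, N):
    """Bin (i, j) of a non-zero gradient vector, following equations (2) and (3)."""
    phi = math.atan2(b, a) % (2.0 * math.pi)           # azimuth in [0, 2*pi)
    c_bar = c / math.sqrt(a * a + b * b + c * c)       # normalised height in [-1, 1]
    i = min(int(phi / (2.0 * math.pi / M)), M - 1)
    j = min(int((c_bar + 1.0) / (2.0 / N)), N - 1)     # clamp so that c_bar = 1 is binned
    return i, j


def orientation_histogram(gradients, M=13, N=13, min_magnitude=75.0):
    """Count gradient vectors stronger than `min_magnitude` in each of the M x N bins."""
    hist = [[0] * N for _ in range(M)]
    for a, b, c in gradients:
        if math.sqrt(a * a + b * b + c * c) > min_magnitude:
            i, j = orientation_bin(a, b, c, M, N)
            hist[i][j] += 1
    return hist
```

The defaults M = N = 13 and the magnitude threshold of 75 units follow the values quoted in the Results and in the Appendix; everything else is an assumption made for illustration.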
This way the 3D orientation histogram of the data is created. It can be visualised by being
plotted as a 3D graph, where along each direction the radius is proportional to the number of
gradient vectors identified in that direction. If the data are totally isotropic, this 3D structure is
expected to be a sphere. The way this structure differs from a sphere depends on the anisotropy of
the data. By quantifying the shape of this structure with a few numbers, one may quantify the anisotropy in the data.
To compare the two populations, one then compares the distributions of these numbers over
the two population samples. The three numbers (features) defined below are some of many that
may be used to characterise the shape of the orientation histogram (Kovalev and Petrou, 2000):
• Anisotropy coefficient:
\[
F_1 = \frac{H_{max}}{H_{min}}, \tag{4}
\]
where $H_{min} \neq 0$ and $H_{max}$ correspond to the minimum and the maximum values of the
orientation histogram, respectively.
• Integral anisotropy measure:
\[
F_2 = \sqrt{\frac{\sum_{i=1}^{N} \sum_{j=1}^{M} \left( H(i, j) - H_m \right)^2}{NM}}, \tag{5}
\]
where $H_m$ is the mean value of the histogram, H(i, j) is the value at orientation (i, j) and
N × M is the total number of distinct orientations considered. This integral feature can
be considered as the standard deviation of the distribution of the gradient vectors over the
various orientations.
• Local mean curvature:
\[
F_3 = \sqrt{\frac{\sum_{i=1}^{N} \sum_{j=2}^{M-1} \left( H(i, j) - \frac{1}{4} \left( H(i-1, j) + H(i+1, j) + H(i, j-1) + H(i, j+1) \right) \right)^2}{N(M-2)}} \tag{6}
\]
This local mean curvature is expressed by the average value of the Laplacian calculated for
all orientations of the histogram.
Each of these numbers was tested using the package STATISTICA for its ability to discriminate
between the two populations with a Student's t-test.
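As an illustration of equations (4) to (6) and of the t-test applied to each feature, a minimal Python sketch is given here. It is not the STATISTICA procedure used in the study. The function names are hypothetical, the histogram is assumed to be a nested list H[i][j] following the index ranges of equations (5) and (6), and wrap-around of the first index is assumed for the neighbours H(i − 1, j) and H(i + 1, j), which equation (6) leaves unspecified at the ends of the range. The t statistic is the classical pooled-variance two-sample form, which is an assumption about the variant that was used.

```python
import math
from statistics import mean, variance


def anisotropy_features(H):
    """Return (F1, F2, F3) for a histogram H[i][j], i = 0..N-1, j = 0..M-1."""
    N, M = len(H), len(H[0])
    values = [h for row in H for h in row]
    h_min, h_max = min(values), max(values)
    if h_min == 0:
        raise ValueError("F1 is undefined when the smallest bin is empty")
    f1 = h_max / h_min                                   # equation (4)
    h_mean = sum(values) / (N * M)
    f2 = math.sqrt(sum((h - h_mean) ** 2 for h in values) / (N * M))   # equation (5)
    total = 0.0
    for i in range(N):
        for j in range(1, M - 1):
            # Assumption: the first index wraps around at both ends.
            neighbours = (H[i - 1][j] + H[(i + 1) % N][j]
                          + H[i][j - 1] + H[i][j + 1])
            total += (H[i][j] - neighbours / 4.0) ** 2
    f3 = math.sqrt(total / (N * (M - 2)))                # equation (6)
    return f1, f2, f3


def student_t(x, y):
    """Pooled-variance two-sample t statistic for two lists of feature values."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
```

In use, one would compute the three features for every subject and pass the values of one feature for the two groups to student_t; the p value then follows from the t distribution with n1 + n2 − 2 degrees of freedom.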
Method II
The second method used counts the frequency with which a certain combination of characteristic
values appears in the data in the same relative position. In particular, one counts the frequency
with which a particular combination of intensity gradient magnitudes appears with the same relative
orientation and at the same relative distance from each other. In image processing terminology,
this is a generalised co-occurrence matrix, denoted as
w[g(i), g(j), a(i, j), d(i, j)] (7)
where w is the frequency of occurrence of a voxel pair (i, j) with gradient magnitude g(i) and g(j)
respectively, an angle a(i, j) between their gradient vectors and at distance d(i, j) from each other.
More details for the particular implementation of both these methods can be found in the
Appendix.
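To make the construction of matrix w concrete, a brute-force Python sketch is given here. It is a sketch under stated assumptions rather than the authors' implementation: the function name and the input layout are hypothetical, the bin widths, numbers of bins, distance range and Z scaling are taken from the Appendix, distances are rounded to the nearest unit, bins are numbered from 1, and the two gradient magnitudes of a pair are stored in descending order to reflect the symmetry of the matrix. The normalisation makes all elements with the same distance value sum to 1, as described in the Appendix.

```python
import math
from collections import defaultdict
from itertools import combinations


def cooccurrence(voxels, g_width=75.0, g_bins=8, a_width=15.0, a_bins=12,
                 d_max=6, z_scale=3.0 / 0.856):
    """Return {(g_i, g_j, a_ij, d_ij): relative frequency}.

    voxels: list of ((x, y, z), (a, b, c)), positions in voxel indices and
    (a, b, c) the gradient vector at that voxel.
    """
    counts = defaultdict(float)
    for (p, u), (q, v) in combinations(voxels, 2):
        mag_u = math.sqrt(sum(t * t for t in u))
        mag_v = math.sqrt(sum(t * t for t in v))
        if mag_u == 0.0 or mag_v == 0.0:
            continue                      # relative angle is undefined
        # Distance in units of the in-plane voxel size, with the Z axis rescaled.
        dist = math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                         + (z_scale * (p[2] - q[2])) ** 2)
        d = int(dist + 0.5)               # assumption: round to the nearest unit
        if d < 1 or d > d_max:
            continue
        cos_angle = sum(s * t for s, t in zip(u, v)) / (mag_u * mag_v)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        a = min(int(angle / a_width) + 1, a_bins)
        g_u = min(int(mag_u / g_width) + 1, g_bins)
        g_v = min(int(mag_v / g_width) + 1, g_bins)
        counts[(max(g_u, g_v), min(g_u, g_v), a, d)] += 1.0
    # Normalise so that the elements sharing a distance value sum to 1.
    totals = defaultdict(float)
    for key, n in counts.items():
        totals[key[3]] += n
    return {key: n / totals[key[3]] for key, n in counts.items()}
```

The loop over all pairs is quadratic in the number of voxels, so a practical implementation would visit only neighbours within the maximum distance; the exhaustive version is kept here for clarity.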
3 Results
Several preliminary experiments (Kovalev and Petrou, 2001) were conducted which are not reported
in detail here. From those preliminary experiments we were guided to narrow our reported results
as follows:
• Best results could be obtained from MRI-T2 images and not from MRI-PD images, so we do
not present any results from the MRI-PD data.
• The two classes could not be discriminated well if all gradient vectors with reliable orientation
estimates were used. Some thresholding of the gradient magnitudes was necessary. Empirically,
the best results were obtained when only gradient vectors stronger than 75 units were
retained. (The range of gradient values is [0, 600].) This is a significant observation,
because such strong gradients roughly correspond to the cortical surface.
• The two groups could not be discriminated using features computed from the slices from the
most superior 25% of the brain, ie slices of the scans that correspond to slices 25 and above
of the Talairach and Tournoux brain atlas.
As a consequence of the above preliminary conclusions, a series of three experiments was performed
using orientation histograms. In the first experiment whole brains were analysed. In the
second experiment the inferior half of the brain was used, and in the third only the inferior quarter
(25%).
Results of Method I
In all cases the strongest gradient vectors were used, with magnitude greater than 75 units. In
all cases the orientation histogram was constructed according to the details given in section 2 and
the Appendix.
Table 1 gives the mean and standard deviation value of each of the three features for each of the
experiments performed (namely whole brain, half brain and most inferior part of the brain) and for
each of the two groups of subjects. The t and p values for each feature are also given in this table.
The p values of the features which can be used as class discriminators with confidence level higher
than 95% are highlighted in the table. It can be seen that all three features are good discriminators
when they are computed from the orientation histogram that corresponds to the inferior part of the
brain. The box and whiskers graphs of population discrimination with the help of these features
are shown in figure 1.
These results indicate that the null hypothesis may be refuted. However, they do not identify the
particular parts of the brain which are morphologically different in the two populations, apart from
the fact that the discrimination is statistically more significant when the data slices that correspond
to slices 1–12 of the Talairach and Tournoux brain atlas are used.
Having refuted the null hypothesis for this set of data, one may use the second method to identify
which localities of each brain contribute most to its classification into one of
the two populations.
Results of Method II
Various experiments were conducted using the co-occurrence matrix w described in section 2.
In each experiment each element of the co-occurrence matrix was tested as a class discriminator
according to the t-test. Feature selection was performed by thresholding the t values using various
thresholds.
The best results were obtained when the region of interest was restricted to be that which
corresponds to the first 12 slices of the Talairach and Tournoux brain atlas (“most inferior part”).
Table 2 shows the best features from these experiments. They are features with t > 7.5. The p
value for all these features has its first significant figure beyond the 6th decimal point. As explained
in the Appendix, the measured quantities used to construct matrix w are quantised in a few bins.
The columns of the table identified by g(i) and g(j) give the quantised gradient values of the pairs
of voxels that make up each feature. As the maximum number of bins used for gradient strength is
8, all these numbers are out of 8. The column marked a(i, j) gives the relative angle between the
gradient vectors of the voxels of each pair measured in units of 15° (since that is the width of the
quantisation bins used). The column marked d(i, j) gives the relative distance between the voxels
of each pair measured in units of 0.856mm, which is the width of each quantisation bin used.
To understand the significance of these features one may consider the example shown in figure 2.
Three voxels are identified there, as one zooms in some part of the image. Each voxel has associated
with it a white arrow, representing its gradient vector. Let us consider the two voxels marked with
letter “a” as forming a pair. The relative orientation a(i, j) of their associated gradient vectors
is small, as these vectors are almost parallel, and so the pair will have a(i, j) = 1. The gradient
magnitude of the voxel on the left is g(i) = 4, and that of the voxel on the right is g(j) = 2. The
distance of the two voxels is d(i, j) = 4 (ie 4 × 0.856 = 3.4mm). So, this pair will be counted as
a pair contributing to feature number 3 of table 2, since it has g(i) = 4, g(j) = 2, a(i, j) = 1 and
d(i, j) = 4. One may also combine the voxels marked with “b” to form a pair. These two voxels
have their gradient vectors in almost opposite directions, so their relative angle is large, and it turns
out to be in the range 135°–150°, which places this pair into the quantisation bin with a(i, j) = 10.
The gradient magnitude of the voxel at the top is g(i) = 6, and the distance of these two voxels is
d(i, j) = 6 (ie 6 × 0.856 = 5.1mm). This pair therefore contributes to feature number 26 of table 2.
The features measured by this method are the relative abundances of all possible pairs of this
kind.
One can use these features to perform classification of all the subjects using the tree clustering
method of package STATISTICA. In addition, this package allows the projection of these clusters
into a 2D space for visualisation. The tree of figure 3 and the scattergram of figure 4
indicate that only 2 subjects were misclassified.
It seems therefore, that this method also rejects the null hypothesis. One may work backwards
now, to identify which voxels contributed to pairs that were counted when constructing the elements
of the co-occurrence matrix that could act as such good population discriminators. This way, one
may identify the localities of the brain that particularly characterise the schizophrenic subjects
in this study. All one has to do is to keep the identity of the pairs of voxels that were counted
in each bin of matrix w (Kovalev and Petrou, 1996). Figure 5 shows some slices of the brains
of schizophrenic subjects with the parts of the brain that contribute to the characterisation as
schizophrenic brains highlighted in green.
One important point must be examined with respect to these results:
It is well known from statistics that if one has at one’s disposal hundreds of features that can
be used as discriminators and only 40 cases to discriminate into two classes, some of these features
may discriminate the cases into the right classes by chance. It is necessary, therefore, to examine
the validity of the identified features as genuine features. There are two ways by which the validity
of the presented results might be tested:
• Use of the “leave-one-out” method for testing classification consistency: The best features
were chosen on the basis of the 39 cases, and the 40th case was classified using the pre-defined
feature. In all cases the best 28 classifying features were selected ranked according to their
t values. The identity of the selected features was compared with the features selected when
all cases were used, and the number of identical features was counted. The process was
repeated for every single case separately. Table 3 shows these results. In the first column
the case which is left out is identified. It is highlighted if it was misclassified. One can see
that the average classification error is 4 out of 40 cases, ie 10%. The second column gives
the number of features among those selected that were also selected when all 40 cases were
considered together. This number is out of 28 and the third column gives their percentage
out of 100. The mean value of stable persisting features from experiment to experiment is
86.9%. The selection of features in each case was done on the basis of a fixed number, and
it was not optimised in order to reduce the misclassified cases in the training set. So, apart
from the case that is used for testing, some other cases may also be misclassified. The 4th
column identifies the cases that were classified as schizophrenic while they were normal, and
the 5th column the cases that were schizophrenic and were misclassified as normal. In almost
all experiments the two outlier cases were subjects N20 and S38.
• If the identified features are genuine, they must refer to real structures in the brain, and therefore
they must be strongly correlated and robust: perturbing the computational parameters should
not change the results. The robustness of identification of the characteristic brain localities
was tested by varying the criterion with which the discriminating features were selected. The
results of this analysis remained the same even when the threshold value of t used to identify
the features was changed by 1 to 3 units. Correlation analysis showed that the elements of the
co-occurrence matrix constructed are strongly correlated and they form four groups. Change
of threshold value of t simply selects a smaller number of representative features from each of
the four groups, without affecting the final result. The correlation coefficient between pairs of
features was found to vary between 0.79 and 0.99, with a mean 0.92 and a standard deviation
of 0.04. The column marked with C in table 2 indicates to which cluster of features each
feature belongs. Note that these clusters of features are not anatomically coherent. They
rather represent groups of features that can be characterised qualitatively by expressions
like: “Pairs of voxels that have strong gradient values and large relative orientations and are
at (relatively) large distances from each other are more abundant in schizophrenics than in
normal controls.” This sentence for example, describes all features of class C=4, which as it
can be seen from table 2, have higher mean value in schizophrenics than in controls.
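The leave-one-out check described in the first of the two tests above can be sketched as follows. This is an illustrative sketch only. The text does not specify how the left-out case was classified, so a nearest-class-mean rule on the selected features is assumed here; the function names and the data layout (a list of feature rows X with labels 0 for controls and 1 for patients) are hypothetical. In the study the number of selected features was 28.

```python
import math
from statistics import mean, variance


def t_value(x, y):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    if pooled == 0.0:
        return 0.0 if mean(x) == mean(y) else math.inf
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))


def select_features(X, labels, k):
    """Indices of the k features with the largest |t| between the two classes."""
    scores = []
    for f in range(len(X[0])):
        controls = [row[f] for row, lab in zip(X, labels) if lab == 0]
        patients = [row[f] for row, lab in zip(X, labels) if lab == 1]
        scores.append((abs(t_value(controls, patients)), f))
    return sorted(f for _, f in sorted(scores, reverse=True)[:k])


def leave_one_out(X, labels, k):
    """Return (number of misclassified left-out cases, per-case feature overlap)."""
    reference = set(select_features(X, labels, k))    # selection using all cases
    errors, overlaps = 0, []
    for held in range(len(X)):
        X_train = X[:held] + X[held + 1:]
        y_train = labels[:held] + labels[held + 1:]
        feats = select_features(X_train, y_train, k)
        overlaps.append(len(reference & set(feats)))
        # Assumption: classify the left-out case by the nearest class mean.
        dist = []
        for cls in (0, 1):
            centre = [mean(r[f] for r, lab in zip(X_train, y_train) if lab == cls)
                      for f in feats]
            dist.append(sum((X[held][f] - c) ** 2 for f, c in zip(feats, centre)))
        predicted = 0 if dist[0] <= dist[1] else 1
        errors += int(predicted != labels[held])
    return errors, overlaps
```

The second returned value corresponds to the count reported in the second column of table 3, namely how many of the features selected without the left-out case were also selected when all cases were used.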
4 Conclusions and Discussion
The results presented in the previous section reject the null hypothesis, namely they support the
conclusion that there are morphological differences between the brains of the schizophrenic subjects
and the normal controls, which manifest themselves as different types of texture anisotropy at scales
of a few millimetres in MRI-T2 scans, and in particular in the parts of the scans that correspond to
slices 1–12 of the Talairach and Tournoux brain atlas. In addition, the second analysis method used
could identify the regions of difference in the parts of the brain that correspond to slices 1–12, and
localised them in the cortical surface, in the volume of the tissue that makes up the sulcal windings.
The T2 weighted images that we have analysed essentially discriminate between CSF and brain
parenchyma. Although we avoided gradients that correspond to the sulcal and ventral surface, and
looked only at gradients in the volume of the sulcal tissue, we are still picking up the effects of the
anatomy of that surface. The features which seem to characterise the group of schizophrenics are
localised in the vicinity of that surface. The nature of these features, reported in
table 2 and exemplified in figure 2, is an excess in the patients relative to the controls of pairs of
voxels with strong gradient values and large relative orientations (higher mean values of
these features in the schizophrenics than in the normals), which indicates an enlargement of the volume
of these structures. The enlargement of ventricular volumes in schizophrenic patients (Johnstone
et al., 1976) was the earliest finding from a neuroimaging study. Our findings are consistent with
this and lend some face validity to the techniques described. Associated sulcal CSF increases have
been reported much less frequently (Nopoulos et al., 1997; Cannon et al., 1998), probably due to
the labour intensive methods employed. The origin of these increases could be attributed to either
sulcal widening or generalised atrophy; our results appear to suggest the former, although they
do not exclude the latter. So, our results concur with the general hypothesis that schizophrenia
manifests itself neurologically as generalised changes to a wide number of brain regions and circuits.
We do not consider our method to be a new diagnostic tool for schizophrenia, as there is overlap
with the results for the normal population and the specificity of our results to schizophrenia among other
psychiatric disorders has not been established. Besides, an MRI scan is not needed to diagnose
schizophrenia. The important contribution of this and other similar studies is to show that there are
neurological differences between patient and control populations. Only recently, as neuroimaging
techniques have become available, has the extent and range of differences in brain structure
become apparent. It now appears that schizophrenia is a multi-system problem manifesting itself
not only in volume reductions of specific areas of the brain (McCarley et al., 1999; Lieberman
et al., 2001) and changes in sulcal anatomy, but also in differences in sulcal tissue structure, as we appear to
demonstrate here. Neurophysiological and neuroanatomical indices like the ones used in studies of
this type could contribute to investigations of the aetiology of schizophrenia and the development
of vulnerability markers for the condition (Kasai et al., 2002).
Acknowledgements: This work was made possible with the help of INTAS grant 96-785 from
the European Union.
Appendix: Technical details
The anisotropic sampling of the data along the inter-slice direction (Z axis) is handled by
appropriate scaling of the metric used to compute distances. The scaling factor in our case is
3/0.856 ≈ 3.5. This scaling factor is used in all experiments that involve metric relationships
without appearing explicitly in the equations.
The orientation changes between the scans of different subjects do not affect the results. In
the first method, the orientation bins used to quantise the different orientations were chosen so
that the inter-subject orientation difference was less than half the bin width, so that the effect
of such variations was eliminated. Besides, the comparisons made refer to comparisons between
features computed from the whole texture representations, and as such they are independent of
global orientation effects (Kovalev and Petrou 1999).
The values of the elements of the co-occurrence matrix constructed are independent of rotation
and translation of the data, and so they are robust to the exact positioning of the subject in the
device used to capture the data (Kovalev and Petrou 1996).
In Method I the cones of equal solid angle needed were constructed by dividing the diameter of
the unit sphere along the Z axis into 13 (N = 13) equally sized intervals, and the azimuthal angle φ
into 13 (M = 13) equally sized intervals. All orientation histograms were normalised so that their
bin values sum to MN = 169.
In Method II, in order to be able to use the elements of matrix w to characterise an object by
the relative abundance of certain structures, one must first normalise the matrices. As the data
are distributed over a finite volume, within the same object, certain relative distances will be more
common than others. To take account of this effect, matrix w is normalised so that all its elements
that refer to the same distance value d sum up to 1. This way the elements of the matrix express
the relative abundance of certain characteristics that occur at a fixed distance apart.
The distance values d used are all the relative distances present in the data, up to a maximum
distance. They are quantised to a small number of values. The numbers of discrete quantisation
levels used for all quantities needed in the construction of matrix w are the main parameters of
the method. For the gradient magnitude, we experimented with the number of bins varying in the
range [5, 16]. It was found that no significant difference in the results was observed as the number
of bins changed. So, in all the experiments reported the value of 8 was used. The width of each bin
was 75 units. The relative distance of the paired voxels was restricted in the range [1, 6], ie from
0.856mm to 5.136mm. Note that as the inter-slice distance is 3mm, distances smaller than this value
examine only co-occurrences within the same slice and inter-slice correlations are not used. Again,
experiments showed that pairing voxels at relative distances greater than 5.136mm was creating
very “noisy” bin values as far as the t-test was concerned. The relative orientation between the
gradient vectors of the paired voxels was quantised into 12 bins, each 15° wide. This way the
co-occurrence matrix consisted of 2592 different elements: it is symmetric with respect to the gradient
magnitude (because the order in which we treat the gradient magnitudes of the
paired voxels does not matter), so that from its total number of 4608 elements, we have
\[
\frac{4608 - 6 \times 12 \times 8}{2} + 6 \times 12 \times 8 = 2592
\]
independent elements.
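The count of independent elements can be checked with a short Python sketch, which evaluates the expression above and also enumerates the unordered pairs of gradient-magnitude bins directly. The function name is hypothetical; the bin counts (8 for gradient magnitude, 12 for relative angle, 6 for distance) are those stated in this Appendix.

```python
def independent_elements(g_bins=8, a_bins=12, d_bins=6):
    """Return (total, count by the formula, count by direct enumeration)."""
    total = g_bins * g_bins * a_bins * d_bins
    diagonal = g_bins * a_bins * d_bins            # elements with g(i) == g(j)
    by_formula = (total - diagonal) // 2 + diagonal
    # Unordered pairs of magnitude bins, times the angle and distance bins.
    by_enumeration = sum(a_bins * d_bins
                         for gi in range(g_bins)
                         for gj in range(gi, g_bins))
    return total, by_formula, by_enumeration
```

Both routes give 36 unordered magnitude pairs times 72 angle and distance combinations.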
References
Cannon, T. D., van Erp, T. G. M., Huttunen, M., Lonnqvist, J., Salonen, O., Valanne, L.,
Poutanen, V. P., Standertskjold-Nordenstam, C. G., Gur, R. E. and Yan M., 1998. Regional gray
matter, white matter and cerebrospinal fluid distributions in schizophrenic patients, their siblings
and controls. Archives of General Psychiatry, 55, 1084–1091.
Freeborough, P. A. and Fox, N. C., 1998. MR image texture analysis applied to the diagnosis
and tracking of Alzheimer’s disease. IEEE Transactions on Medical Imaging, 17, 475–479.
Gaser, C., Volz, H.-P., Kiebel, S., Riehemann, S. and Sauer, H., 1999. Detecting structural
changes in whole brain based on nonlinear deformation-application to schizophrenia research.
Neuroimage, 10, 107–113.
Gur, R. E., Maany, V., Mozley, P. D., Swanson, C., Bilker, W. and Gur, R. C., 1998. Subcortical
MRI volumes in neuroleptic-naive and treated patients with schizophrenia. Am. J. Psych., 155,
1711–1717.
Johnstone, E. C., Crow, T. J., Frith, C. D., Husband, J. and Kreel, L., 1976. Cerebral ventric-
ular size and cognitive impairment in chronic schizophrenia. Lancet, 2, 924–926.
Kasai, K., Iwanami, A., Yamasue, H., Kuroki, N., Nakagome, K. and Fukuda, M., 2002. Neu-
roanatomy and Neurophysiology in Schizophrenia. Neuroscience Research, 43, 93–110.
Kovalev, V. A. and Petrou, M., 1996. Multidimensional co-occurrence matrices for object
recognition and matching. Graphical Models and Image Processing, 58, 187–197.
Kovalev, V. A., Petrou, M. and Bondar, Y. S., 1999. Texture anisotropy in 3D images. IEEE
Transactions on Image Processing, 8, 346–360.
Kovalev, V. A. and Petrou, M., 2000. Chapter 15: Texture analysis in Three Dimensions as a
cue to Medical Diagnosis, in Handbook of Medical Imaging, Processing and Analysis, I Bankman,
editor, Academic Press, ISBN 0-12-077790-8, 231–247.
Kovalev, V. A. and Petrou, M., 2001. Analysis and classification of 3D MR Images of Schizophre-
nia and Norm. Technical Report, VSSP-TR-3/2001. University of Surrey.
Lawrie, S. M. and Abukmeil, S. S., 1998. Brain abnormality in schizophrenia. A systematic and
quantitative review of volumetric magnetic resonance imaging studies. Br. J. Psych., 160, 179–186.
Lieberman, J., Chakos, M., Wu, H., Alvir, J., Hoffman, E., Robinson, D., Bilder, R., 2001.
Longitudinal Study of Brain Morphology in First Episode Schizophrenia. Biological Psychiatry, 49,
487–499.
McCarley, R. W., Wible, C. G., Frumin, M., Hirayasu, Y., Levitt, J. J., Fischer, I. A. and
Shenton, M. E., 1999. MRI Anatomy of Schizophrenia. Biological Psychiatry, 45, 1099–1119.
Nelson, M. D., Saykin, A. J., Flashman, L. A. and Riordan, H. J., 1998. Hippocampal volume
reduction in schizophrenia as assessed by magnetic resonance imaging: A meta-analytic study.
Arch. Gen. Psych., 55, 433–440.
Nopoulos, P., Flaum, M. and Andreasen, N. C., 1997. Sex differences in brain morphology and
schizophrenia. American Journal of Psychiatry, 154, 1648–1654.
Okazaki, Y., 1998. Morphological brain imaging studies on major psychoses. Psychiatry and
Clinical Neurosciences, 52 (Suppl), S215–S218.
Pfefferbaum, A., Sullivan, E. V., Hedehus, M., Moseley, M. and Lim, K. O., 1999. Brain gray
and white matter transverse relaxation time in schizophrenia. Psychiatry Research: Neuroimaging,
91, 93–100.
Sachdev, P., Cathcart, S., Shnier, R., Wen, W. and Brodaty, H., 1999. Reliability and validity
of ratings of signal hyperintensities on MRI by visual inspection and computerised measurement.
Psychiatry Res, 92, 103–115.
Sachdev, P. and Brodaty, H., 1999. Quantitative study of signal hyperintensities on T2-weighted
magnetic resonance imaging in late onset schizophrenia. American Journal of Psychiatry, 156, 1958–
1967.
Sigmundsson, T., Suckling, J., Maier, M., Williams, S., Bullmore, E. T., Greenwood, K. E., Fukuda,
R., Ron, M. A. and Toone, B. K., 2001. Structural abnormalities in frontal, temporal and limbic
regions and interconnecting white matter tracts in schizophrenic patients with prominent negative
symptoms. American Journal of Psychiatry, 158, 234–243.
Simmons, A., Arridge, S. R., Barker, G. J., Cluckie, A. J. and Tofts, P. S., 1994. Improvements
in the quality of MRI cluster analysis. Magnetic Resonance Imaging, 12, 1191–1204.
Suckling, J., Brammer, M. J., Lingford-Hughes, A. and Bullmore, E. T., 1999. Removal of
extracerebral tissues in dual-echo magnetic resonance images via linear scale-space features. Magnetic
Resonance Imaging, 17, 247–256.
Supprian, T., Hofmann, E., Warmuth-Metz, M., Franzek, E. and Becker, T., 1997. MRI T2 re-
laxation times of brain regions in schizophrenic patients and control subjects. Psychiatry Research:
Neuroimaging, 75, 173–182.
Talairach, J. and Tournoux, P., 1988. Co-planar stereotaxic atlas of the human brain. Thieme,
New York.
Zucker, S. W. and Hummel, R. A., 1981. A 3D Edge Operator. IEEE Trans on Pattern Analysis
and Machine Intelligence, 3, 324–331.
Figure 1:
Figure 2:
Figure 3:
Figure 4:
(a) (b) (c)
Figure 5:
Exp          Feature  mean-NRM  mean-SCH  STD-NRM   STD-SCH   t-value   p-value
All slices   F1        4.84900   4.81167  0.515792  0.450872   0.23633  0.814513
             F2        3.95925   3.84172  0.161860  0.243005   1.77120  0.084994
             F3        5.33906   5.49373  0.359787  0.341258  -1.35567  0.183647
Slices 1–24  F1       10.42900   8.61667  2.891739  1.625184   2.34459  0.024682
             F2        5.00575   4.58356  0.433552  0.430297   3.00795  0.004777
             F3        8.65335   8.60354  0.894389  0.891148   0.17170  0.864638
Slices 1–12  F1       13.86350  10.70611  3.975026  2.733707   2.82093  0.007743
             F2        5.86100   5.26967  0.585009  0.510092   3.30383  0.002164
             F3       11.36615  10.57330  1.336450  0.830297   2.16698  0.036931
Table 1:
No  g(i)  g(j)  a(i, j)  d(i, j)  C  mean-NRM  mean-SCH  t-value   p-value
 1   6     3       1        1     2  0.221586  0.493311  -8.11809  < 10^-6
 2   5     5      10        2     3  0.061619  0.117016  -7.62078  < 10^-6
 3   4     2       1        4     1  0.619648  1.041732  -7.80472  < 10^-6
 4   5     2      11        4     4  0.162662  0.286242  -7.63650  < 10^-6
 5   5     3      10        4     4  0.162690  0.299405  -7.84881  < 10^-6
 6   5     3      11        4     4  0.129267  0.251737  -7.79190  < 10^-6
 7   6     2      10        4     3  0.085971  0.160721  -7.81694  < 10^-6
 8   5     2       8        5     2  0.294752  0.520095  -7.51294  < 10^-6
 9   5     2      10        5     2  0.289333  0.523337  -7.95262  < 10^-6
10   5     2      11        5     2  0.239105  0.441684  -7.78903  < 10^-6
11   5     3      10        5     4  0.186329  0.361711  -7.50794  < 10^-6
12   5     3      11        5     4  0.155143  0.321284  -8.13435  < 10^-6
13   6     2      10        5     4  0.127776  0.236332  -7.85166  < 10^-6
14   6     2      11        5     4  0.104767  0.205500  -7.64787  < 10^-6
15   6     2      12        5     3  0.042171  0.089474  -7.77267  < 10^-6
16   4     2      11        6     1  0.715381  1.220326  -8.08146  < 10^-6
17   4     3      11        6     1  0.291562  0.586668  -7.50107  < 10^-6
18   5     2       8        6     2  0.335852  0.626668  -7.53221  < 10^-6
19   5     2       9        6     2  0.343976  0.656074  -7.96115  < 10^-6
20   5     2      10        6     2  0.340214  0.671805  -7.76601  < 10^-6
21   5     2      11        6     2  0.279033  0.578511  -7.99355  < 10^-6
22   5     2      12        6     4  0.114148  0.249426  -8.11884  < 10^-6
23   5     3      11        6     4  0.129376  0.301711  -7.62095  < 10^-6
24   6     2       8        6     4  0.148014  0.289363  -8.05803  < 10^-6
25   6     2       9        6     4  0.156486  0.300000  -7.77688  < 10^-6
26   6     2      10        6     4  0.152143  0.311047  -8.07683  < 10^-6
27   6     2      11        6     4  0.120957  0.271668  -7.57933  < 10^-6
28   6     2      12        6     3  0.049629  0.118158  -8.21823  < 10^-6
Table 2:
Scan      Common features   Misclassified cases
left out  number  %         NRM to SCH  SCH to NRM
none      28      100       N20         S38
N01       28      100.0     N20         S38
N02       25      89.3      N20         S38
N03       25      89.3      N20         S38
N04       26      92.9      N20         S38
N05       24      85.8      N20         S38
N06       25      89.3      N20         S38
N07       27      96.4      N20         S38
N08       24      85.7      N20         S26,S38
N09       24      85.7      N20         S38
N10       24      85.7      N20         S38
N11       27      96.4      N20         S38
N12       25      89.3      N20         S38
N13       25      89.3      N20         S38
N14       22      78.6      N20         S38
N15       26      92.9      N20         S38
N16       27      96.4      N20         S38
N17       28      100.0     N20         S38
N18       23      82.1      N20         S38
N19       25      89.3      N20         S38
N20       18      64.3      N20         S38
N21       27      96.4      N20         S38
S22       25      89.3      N20         S38
S23       26      92.9      N20         S38
S24       25      89.3      N20         S38
S25       19      67.7      N20         S25,S36,S38
S26       24      85.7      N20         S26,S36,S38
S27       22      78.6      N20         S38
S28       27      96.4      N20         S38
S29       24      85.7      N20         S38
S30       24      85.7      N20         S38
S31       22      78.6      N20         S38
S32       27      96.4      N20         S38
S33       25      89.3      N20         S38
S34       24      85.7      N20         S38
S35       27      96.4      N20         S38
S36       23      82.1      N20         S38
S37       25      89.3      N20         S38
S38       19      67.9      N20         S38
S39       24      85.7      N20         S38
S40       16      57.1      N20         S38
Mean              86.9
Table 3:
Figure Captions
Figure 1: Box and whiskers plots of the features computed from slices of the data that corre-
spond to slices 1–12 of the Talairach and Tournoux brain atlas.
Figure 2: An example of two characteristic voxel pairs that correspond to features with |t| > 7.5
(see table 2). Pair “a” illustrates feature number 3 of the table, with g(i) = 4, g(j) = 2, a(i, j) = 1
(ie in the range 0–15 degrees), d(i, j) = 4 (ie 3.4mm). Pair “b” would have contributed to feature 26,
as it has g(i) = 6, g(j) = 2, a(i, j) = 10 (ie in the range 135–150 degrees), and d(i, j) = 6 (ie 5.1mm).
Figure 3: Tree classification using the elements of the co-occurrence matrix w with |t| value
greater than 7.5, constructed from the data that correspond to slices 1–12 of the Talairach and
Tournoux brain atlas. Along the horizontal axis the subjects (N=normal, S=schizophrenic) are
arranged in the clusters created by the classifier.
Figure 4: Scattergram in 2D that corresponds to the classification of figure 3. The vertical
and horizontal axes are chosen automatically so that the distances between the points plotted are
preserved as much as possible when the points are projected from the high dimensionality space to
the 2D space.
Figure 5: (a): Some original slices from a schizophrenic scan. (b): The localities from which
the characteristic features arose. They are mainly concentrated in the sulci windings. (c): The
characteristic localities superimposed on the original slices.
Captions of tables
Table 1: Features F1, F2 and F3 computed from the orientation histogram for three experiments
where data referring to the whole brain, inferior half of the brain, or inferior quarter of the brain
are used. The data that refer to “the inferior half” of the brain are identified as the slices that
correspond to slices 1–24 of the Talairach and Tournoux brain atlas, while the “inferior quarter”
refers to the slices that correspond to slices 1–12 of the same atlas. Alongside the t and p values
the mean and standard deviations of the distribution of values for each population are also given.
NRM=normal, SCH=schizophrenic, STD=standard deviation.
Table 2: The mean values of certain elements of matrix w over the two populations, and their
corresponding t and p values when used as class discriminators. The number in the first column
is just an identity number of the matrix element with no other meaning. g(i) and g(j) are the
gradient strengths of the pair of voxels considered, in quantisation units from 0 to 8 (maximum).
a(i, j) is the relative angle between the two vectors, measured in units of 15°. d(i, j) is the distance
between the two voxels, measured in units of 0.856mm. C is the cluster of features to which the
particular feature belongs. NRM=normal, SCH=schizophrenic.
Table 3: Results of the leave-one-out classification. First column: the subject that is left out.
In bold the cases when this subject was misclassified on the basis of the best 28 features computed
from the remaining 39 subjects. Second column: the number of features (out of the 28 best) that
are also among the best 28 features when no subject is left out. Third column: the percentage of
the 28 best features that are also among the best 28 features when no subject is left out. Fourth
column: normal subjects misclassified as schizophrenic when the best 28 features, computed from
the 39 subjects, are used. Fifth column: schizophrenic subjects misclassified as normal when the
best 28 features, computed from the 39 subjects, are used. NRM=normal, SCH=schizophrenic.
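The t values listed in Tables 1 and 2 are two-sample statistics computed from the two populations. A minimal Python sketch of such a computation from group means and standard deviations is given here, assuming a pooled-variance Student t-test; the captions do not state which variant was used, and the function name and the numbers in the usage line are purely illustrative, not values from the tables.

```python
import math

def pooled_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Two-sample Student t statistic with pooled variance.

    Returns (t, degrees_of_freedom). The pooled-variance form is an
    assumption of this sketch, not a detail confirmed by the text.
    """
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * std_a**2 + (n_b - 1) * std_b**2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, dof

# Illustrative numbers only: two groups of 10 with unit standard deviation.
t_value, dof = pooled_t(5.0, 1.0, 10, 4.0, 1.0, 10)
```

With this sign convention a negative t, as in Table 2, corresponds to a feature whose mean is larger in the second group passed to the function.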