
Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders and their Broader Phenotype:

Profile, Primacy, and Independence

Dana Wong, B.Sc. (Hons.)

This thesis is presented in partial fulfilment of the degree of

Doctor of Philosophy/Master of Psychology (Clinical Neuropsychology)

of the University of Western Australia

School of Psychology, 2004

ABSTRACT

Impairments in both theory of mind (ToM; the ability to attribute mental states to

oneself and others) and executive function (EF; a group of high-level cognitive

functions which help guide and control goal-directed behaviour) have been

demonstrated in individuals with autism spectrum disorders (ASDs). Both deficits have

been proposed by different groups of researchers as being the single primary cognitive

deficit of autism, which can subsume the other deficit as secondary or artefactual.

However, few studies have examined the nature of the relationship between ToM and

EF in ASDs or conducted a systematic investigation of their relative primacy. This

research principally sought to establish the primacy and independence of impairments in

ToM and EF in ASDs and thereby evaluate the validity of single versus multiple

primary deficit models of autism.

These aims were addressed in two studies, both broad in scope. The first study

was an investigation of the profile, primacy, and independence of ToM and EF

impairments in individuals with ASDs. The sample included 46 participants with ASDs

and 48 control participants matched on age and non-verbal ability. The profile of

impairments was examined by measuring ToM and a range of EF components using

tasks employing, wherever possible, process-pure indices of performance. Primacy was

measured by focussing on i) whether or not the deficits observed were universal among

individuals with ASDs; ii) whether the deficits were able to discriminate individuals

with ASDs from matched controls (i.e., predict group membership); and iii) the ability

of ToM and EF deficits to explain the full range of autistic symptomatology, as

measured by correlating cognitive performances with behavioural indices. The

relationship between ToM and EF impairments was investigated by conducting

correlations between ToM and EF variables as well as analysing the incidence of

dissociations between impairments in the two domains. The ASD group was found to

demonstrate significant impairments in ToM and several components of EF including

planning, verbal inhibition, working memory (in a context where inhibitory control was

required), and both verbal and non-verbal generativity. However, neither ToM nor EF

impairments were able to meet all of the criteria for a primary deficit in ASDs. EF

deficits were found to be more primary, but could not account for ToM as a secondary

deficit, as ToM and EF were found to be independent (i.e., uncorrelated and dissociable)

deficits in the ASD group. This pattern of results suggested that a multiple deficits

model involving at least two independent impairments appeared to best characterise

ASDs, but the data were compatible with several variants of such a model (e.g.,

involving distinct subtypes versus a multidimensional spectrum).

The second study was an investigation of ToM and EF impairments in siblings

of individuals with ASDs, who have previously been found to demonstrate a subclinical

“broad autism phenotype”. The main aims of this study were i) to identify whether

ToM or EF deficits could meet criteria for an “endophenotype” or vulnerability marker

for the autism genotype in unaffected relatives, which would have further implications

about the primacy of ToM and EF in ASDs; and ii) to further investigate the validity of

various multiple deficits models of ASDs by examining the pattern of ToM and EF

performance in those showing the broad phenotype. Participants were 108 siblings of

individuals with ASDs and 67 siblings of controls, tested on the same ToM and EF

tasks used in the first study. Confirming the superior primacy of EF deficits found in

Study One, there was no significant difference in ToM performance between ASD and

control siblings, but ASD siblings showed weaknesses on two measures of EF.

Furthermore, there appeared to be different subgroups of siblings demonstrating

different cognitive profiles, consistent with the heterogeneity evident in the first study.

This research indicated that ASDs cannot be explained by a single primary

cognitive deficit. These findings hold important theoretical and empirical implications

and highlight further questions about which type of multiple deficits model might best

explain ASDs.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER 1. General Introduction: Explaining Autism
  1.1 Autism: Diagnosis and epidemiology
  1.2 Explaining autism: The cognitive level of explanation
  1.3 Overview of the thesis
    1.3.1 Rationale and aims
    1.3.2 Thesis structure

CHAPTER 2. Literature Review: Theory of Mind and Executive Function in Typical Development and in Autism
  2.1 Theory of mind (ToM)
    2.1.1 Defining and measuring ToM
    2.1.2 Models of ToM and its development
    2.1.3 ToM in autism
  2.2 Executive function (EF)
    2.2.1 Defining and measuring EF
    2.2.2 Models of EF and its development
    2.2.3 EF in autism
  2.3 The ToM-EF relationship
    2.3.1 Models of the ToM-EF relationship
      2.3.1.1 Expression accounts
      2.3.1.2 Common conceptual requirements of ToM and EF
      2.3.1.3 Emergence accounts
      2.3.1.4 Common neuroanatomical bases for ToM and EF
    2.3.2 The ToM-EF relationship in autism

CHAPTER 3. Selection and Description of Measures
  3.1 Diagnostic measures
    3.1.1 Autism Screening Questionnaire
    3.1.2 Autism Diagnostic Interview – Revised
  3.2 IQ measures
  3.3 ToM measures
    3.3.1 Simple false belief task
    3.3.2 First-order false belief task
    3.3.3 Second-order false belief task
    3.3.4 Dewey stories
  3.4 EF measures
    3.4.1 Tower of London
    3.4.2 Intra-dimensional, Extra-dimensional Set-shifting task
    3.4.3 Response Inhibition and Load task
    3.4.4 Opposite Worlds
    3.4.5 Relational Complexity
    3.4.6 Pattern Meanings
    3.4.7 Uses of Objects
    3.4.8 Stamps task
  3.5 Behavioural measures
    3.5.1 Measures of repetitive behaviour
      3.5.1.1 Repetitive Behaviours Questionnaire
      3.5.1.2 Repetitive Behaviours Interview
    3.5.2 Measures of social behaviour and communication
      3.5.2.1 Social Behaviour Questionnaire
      3.5.2.2 Social and communication ADI-R domains

CHAPTER 4. Study One: Profile, Primacy, and Independence of Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders
  4.1 Introduction
    4.1.1 Aims
    4.1.2 Hypotheses
  4.2 Method
    4.2.1 Participants
    4.2.2 Procedure
  4.3 Results
    4.3.1 Data screening
    4.3.2 Group comparisons on ToM and EF tasks
      4.3.2.1 False belief tasks
      4.3.2.2 Dewey Stories
      4.3.2.3 Tower of London
      4.3.2.4 IDED set-shifting task
      4.3.2.5 Response Inhibition and Load task
      4.3.2.6 Opposite Worlds task
      4.3.2.7 Relational Complexity
      4.3.2.8 Pattern Meanings
      4.3.2.9 Uses of Objects
      4.3.2.10 Stamps task
      4.3.2.11 Summary and effect sizes of group comparisons
    4.3.3 Universality of ToM and EF deficits
    4.3.4 Ability of ToM and EF variables to predict group membership
    4.3.5 Behavioural measures: Group comparisons and derivation of indices used in correlational analyses
      4.3.5.1 Repetitive Behaviours Interview
      4.3.5.2 Social and communicative functioning
    4.3.6 Correlations between ToM/EF and behavioural measures
    4.3.7 Relationship between ToM and EF
      4.3.7.1 Correlations between ToM and EF
      4.3.7.2 Dissociations between ToM and EF
  4.4 Discussion
    4.4.1 Profile of ToM and EF deficits
    4.4.2 Primacy of ToM and EF deficits
    4.4.3 Independence of ToM and EF deficits
    4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs

CHAPTER 5. Literature Review: The Broad Autism Phenotype
  5.1 Autism as a genetic disorder
  5.2 The broad phenotype
    5.2.1 The behavioural phenotype
    5.2.2 The cognitive phenotype
      5.2.2.1 General intellectual ability
      5.2.2.2 Specific cognitive deficits

CHAPTER 6. Study Two: Theory of Mind and Executive Function in Siblings of Individuals with Autism Spectrum Disorders
  6.1 Introduction
    6.1.1 Aims
    6.1.2 Hypotheses
  6.2 Method
    6.2.1 Participants
    6.2.2 Procedure
  6.3 Results
    6.3.1 Sibling group comparisons on ToM and EF tasks
      6.3.1.1 False belief tasks
      6.3.1.2 Tower of London
      6.3.1.3 IDED Set-shifting task
      6.3.1.4 Response Inhibition and Load task
      6.3.1.5 Opposite Worlds task
      6.3.1.6 Pattern Meanings
      6.3.1.7 Uses of Objects
      6.3.1.8 Stamps task
      6.3.1.9 Summary of sibling group comparisons
    6.3.2 Comparisons between ASD siblings and ASD probands
    6.3.3 Ability of cognitive variables to predict sibling group membership
    6.3.4 Proband-sibling relationships within the ASD families
      6.3.4.1 Correlations between proband IQ and siblings’ cognitive performances
      6.3.4.2 Correlations between probands’ and siblings’ cognitive performances
    6.3.5 Prevalence of deficits in ASD siblings
    6.3.6 Correlations between ToM and EF
    6.3.7 Dissociations between ToM and EF
    6.3.8 Results from behavioural measures
  6.4 Discussion
    6.4.1 Endophenotype status of ToM and EF impairments
    6.4.2 Differentiating the multiple deficits models

CHAPTER 7. General Discussion: Constructing an Explanatory Model for ASDs
  7.1 Summary of the findings
  7.2 Methodological strengths and limitations
  7.3 Conclusions on constructing an explanatory model for ASDs
  7.4 Future directions

REFERENCES
APPENDIX A. Repetitive Behaviours Interview – Current Version
APPENDIX B. Correlations between EF task variables in the control group (Study One)
APPENDIX C. Separate ToM-EF correlations for young and old age subgroups within the control sample (Study One)
APPENDIX D. Separate group comparisons for young and old age subgroups on EF tasks (Study One)

LIST OF TABLES

Table
1. The five scores computed for each item on the Tower of London
2. Demographic characteristics of the samples
3. Order of test battery and age range for each test
4. False belief task results: Percentage of participants in each group with perfect scores [or high scores in the case of the alternative aggregate score] on belief questions, and significance of group comparisons
5. IDED Set-shifting task results: Percentage of low error scorers in each group for each stage of each task condition, and significance of group comparisons
6. RIL task results: Mean (and SD) of each group, and significance of group comparisons, for error and RT difference scores and the shape error score
7. Opposite Worlds results: Mean (and SD) of each group for error/time scores in each condition and difference scores, and significance of group comparisons
8. Pattern Meanings results: Mean (and SD) of each subgroup [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons
9. Uses of Objects results: Mean (and SD) of each group [or the percentage of low error scorers for dichotomous variables], and significance of group comparisons
10. Stamps task results: Mean (and SD) of each group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons
11. Summary and effect sizes of significant group differences
12. Universality of ToM and EF deficits in the ASD group
13. Logistic regression analysis of group membership as a function of VIQ, ToM and EF variables
14. Median (and range) of RBI severity summary scores for the ASD and control groups
15. Factor loadings of RBI severity summary scores
16. Raw and partial correlations between cognitive measures and behavioural factors within the ASD group
17. Raw and partial correlations between cognitive measures and RBI composite scores within the ASD group
18. Raw and partial correlations between ToM and EF measures within the control group
19. Raw and partial correlations between ToM and EF measures within the ASD group
20. Summary of significant partial correlations between ToM and EF variables in the control and ASD groups
21. The incidence of ToM-EF dissociations in the ASD group
22. Demographic characteristics of the sibling samples
23. False belief task results: Percentage of siblings in each group with perfect scores [or high scores for the alternative aggregate] on belief questions, and significance of group comparisons
24. IDED Set-shifting task results: Percentage of low error scorers in each sibling group for each stage of each task condition, and significance of group comparisons
25. RIL task results: Mean (and SD) of each sibling group, and significance of group comparisons, for error and RT difference scores and the shape error score
26. Opposite Worlds results: Mean (and SD) and significance of group comparisons for each sibling group for error/time scores in each condition and difference scores, and for each gender for time scores
27. Uses of Objects results: Mean (and SD) of each sibling group, and significance of group comparisons
28. Stamps task results: Mean (and SD) of each sibling group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons
29. Effect sizes, r (and d), of significant group differences between sibling groups and between proband groups
30. Results of logistic regression analysis of sibling group membership
31. Raw and partial correlations between proband PIQ and VIQ and siblings’ scores on ToM and EF measures
32. Raw and partial correlations between ToM and EF variables within control siblings
33. Raw and partial correlations between ToM and EF variables within ASD siblings
34. Summary of partial correlations between ToM and EF variables in the control and ASD probands and siblings
35. The incidence of ToM-EF dissociations in the ASD siblings

LIST OF FIGURES

Figure
1. A single primary cognitive deficit model of autism
2. A multiple cognitive deficits model of autism, in which each cognitive deficit underlies a different domain of symptomatology
3. An example of a Dewey Story
4. The starting configuration for the Tower of London stimuli
5. Stimuli for the Perseveration condition of the IDED set-shifting task
6. Stimuli for the Learned Irrelevance condition of the IDED set-shifting task
7. Example of a Relational Complexity item with 1 relational change
8. Example of a Relational Complexity item with 4 relational changes
9. Example of a more difficult Relational Complexity item without consistent relational changes
10. One of the five test stimuli for the Pattern Meanings task
11. The practice stimulus for the Pattern Meanings task

ACKNOWLEDGEMENTS

First and foremost credit clearly goes to my principal supervisor Murray Maybery, who

is a rare treasure in putting his students’ needs before his own. He is unfailingly patient,

encouraging, logical, and sensible. Thanks also to my co-supervisor Joachim

Hallmayer, whose expertise in autism and genetics and constructive feedback on a draft

improved the clarity, accuracy, and coherence of the thesis.

This PhD research formed part of a larger project on the broad autism

phenotype, the Western Australia Family Study of Autistic Spectrum Disorders

(WAFSASD), which was funded by a National Health and Medical Research Council

grant. Alana Maley, research assistant extraordinaire on the WAFSASD, put in

countless hours of recruiting families, driving to opposite ends of the city and state, and

interviewing and testing a seemingly endless number of participants. My hugest

appreciation for all that you contributed. Dorothy Bishop, one of the WAFSASD’s

chief investigators, offered expert guidance throughout the project. Wayne Hill put

together a monstrous database as well as doing a number of the ADI-Rs. Sarah

Davenport, Isabel Fernandez, Kate Fitzpatrick, Elise Mengler, Sarra Miller, Bronny

Morgan, Nicole Petterson and Keira Thomson all helped with testing and/or data entry

for the WAFSASD. Valued assistance was also provided by Matt Huitson, whose task

programming skills saved me a lot of time and frustration, and Herb Jurkiewicz, who

helped with some of the stimuli.

Liz Pellicano shared with me the questions, ideas, and bafflement that go along

with doing autism research, and in doing so managed to help rekindle my enthusiasm

for not only my own research but also research in general, right when it was needed.

My officemates, co-whingers, and distractors Kate Harwood and Mark Woodman

served proficiently as my credibility meters (as well as keeping me up to date on world

affairs). Opinions, grievances, ridicule, coffee, and gossip were also shared with Kate

Frencham, Keira Thomson, Flavie Waters, and Allyson Browne. I was kept fed and

financed by my generous family, particularly in the later stages after my scholarship had

run out. My gorgeous Glen saw me through to the finishing line with a constant supply

of comfort, silliness, and (bad) humour.

Finally, my humble and sincere gratitude to the participants of this research – the

kids both with and without ASDs, their brothers and sisters, and mums and dads – who

gave their time and effort so generously. May this thesis be a step forward in

understanding the puzzle of autism.

CHAPTER 1

General Introduction: Explaining Autism

1.1 Autism: Diagnosis and epidemiology

1.2 Explaining autism: The cognitive level of explanation

1.3 Overview of the thesis

1.3.1 Rationale and aims

1.3.2 Thesis structure

1.1 Autism: Diagnosis and epidemiology

Autism is classified as a pervasive developmental disorder and is defined and diagnosed

by its clinical symptomatology, rather than biological markers or aetiology. Current

diagnostic criteria, as specified by the Diagnostic and Statistical Manual of Mental

Disorders, 4th edition (DSM-IV; APA, 1994) and the International Classification of

Diseases, 10th edition (ICD-10; WHO, 1992) require the presence of symptoms in three

categories: i) impairment in social interactions, ii) abnormal development of language

and nonverbal communication, and iii) restricted and repetitive patterns of behaviour,

interests and activities. Examples of specific symptoms listed in DSM-IV in the social

domain include impaired use of nonverbal behaviours such as eye contact, facial

expressions and gestures, failure to develop appropriate relationships with peers, and a

lack of spontaneous seeking to share enjoyment and interests; the communication

domain lists features such as a delay in or total lack of language development,

pragmatic difficulties, stereotyped use of language, and impaired pretend play and

imitation; and examples of repetitive behaviours include intense preoccupations, rigid

adherence to routines and rituals, and stereotyped motor mannerisms such as hand

flapping. The DSM-IV criteria specify that six of the twelve symptoms listed must be

present, with at least two from the social domain and one from each of the other two

domains. Delayed or abnormal functioning in at least one of the three domains must

also have been present prior to the age of three years. While it is possible for autism to be

identified as young as 18 months (Baron-Cohen, Allen, & Gillberg, 1992; Johnson,

Siddons, Frith, & Morton, 1992), it is more commonly and reliably diagnosed at around

the age of three years or older.
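The counting rule summarised above can be written out as a short check. The following Python sketch is purely illustrative and is not a diagnostic instrument: the names `SymptomCounts` and `meets_symptom_rule` are hypothetical, and the function encodes only the thresholds stated in this section (at least six symptoms in total, at least two social, at least one in each of the other two domains, and onset before three years).

```python
from dataclasses import dataclass


@dataclass
class SymptomCounts:
    """Number of listed symptoms judged present in each domain (hypothetical container)."""
    social: int
    communication: int
    repetitive: int


def meets_symptom_rule(counts: SymptomCounts, early_onset: bool) -> bool:
    """Apply the counting rule as summarised in the text above.

    early_onset: delayed or abnormal functioning in at least one of the
    three domains was present before the age of three years.
    """
    total = counts.social + counts.communication + counts.repetitive
    return (
        total >= 6                       # six of the twelve listed symptoms
        and counts.social >= 2           # at least two from the social domain
        and counts.communication >= 1    # at least one from communication
        and counts.repetitive >= 1       # at least one from repetitive behaviour
        and early_onset
    )
```

For instance, a profile with three social, two communication, and one repetitive symptom satisfies the rule when onset was early, whereas a profile with one social, three communication, and two repetitive symptoms does not, because the social-domain minimum is not met even though the total reaches six.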

Other pervasive developmental disorders¹ such as Asperger syndrome

(individuals with autistic symptomatology who have normal intelligence and adaptive

skills and no delay in the onset of speech) and Pervasive Developmental Disorder Not

Otherwise Specified (PDDNOS; individuals who show significant symptomatology but

who do not meet full criteria for a specific PDD) are generally considered related but

distinct entities on the autism spectrum, although the boundaries and validity of each

diagnosis remain a matter of current debate (Bishop, 2000; Macintosh & Dissanayake,

2004; Miller & Ozonoff, 2000; Ozonoff, South, & Miller, 2000; Rapin, 1997). One of

the characteristics of autism is its variability, with symptom severity, intellectual ability,

and degree of language impairment varying widely across individuals. Most studies

estimate that around 70% of individuals with autism are mentally retarded – that is,

have an IQ below 70 (see Fombonne, 2003). When individuals with more broadly

defined autism spectrum disorders (ASDs) are included, the proportion of affected

individuals with comorbid mental retardation decreases substantially; for example,

Chakrabarti and Fombonne (2001) found that less than half of children with ASDs have

Performance IQs less than 70.

¹ The term “pervasive developmental disorder” refers to the DSM-IV/ICD-10 category which includes autism, Asperger syndrome, Pervasive Developmental Disorder Not Otherwise Specified, Rett’s disorder, and Childhood Disintegrative disorder. Throughout this thesis, the term “autism spectrum disorder” will be used to refer to the first three of these diagnoses.

Conservative prevalence estimates for autism currently stand at 10/10,000, with

estimates for Asperger syndrome at 2.5/10,000, and at 15/10,000 for PDDNOS, making

a combined prevalence for all ASDs of 27.5/10,000 (Fombonne, 2003). The prevalence

of ASDs has reportedly increased in recent years, with three of the latest surveys

providing estimates around twice as high as the above figures (Baird et al., 2000;

Bertrand et al., 2001; Chakrabarti & Fombonne, 2001). Fombonne (2003) reports that

the median prevalence rate for autism in 16 surveys published between 1966 and 1991

was 4.4/10,000, whereas the median rate for 16 surveys published in the period

1992-2001 was 12.7/10,000. While this apparent increase has led some to propose various

environmental aetiologies for autism, other possible contributing factors include

changes in diagnostic practice, increased awareness, “diagnostic substitution” (e.g.,

choosing a diagnosis of autism instead of mental retardation for the purposes of

educational placement or funding), earlier diagnosis, and methodological issues (see

Volkmar, Lord, Bailey, Schultz, & Klin, 2004).
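
The arithmetic behind the prevalence figures quoted above can be checked with a short sketch. The rates are those cited from Fombonne (2003); the variable and function names are hypothetical, and simple summation assumes that the three diagnoses are mutually exclusive.

```python
# Prevalence estimates per 10,000, as quoted in the text from Fombonne (2003).
PREVALENCE_PER_10K = {
    "autism": 10.0,
    "Asperger syndrome": 2.5,
    "PDDNOS": 15.0,
}

# Median autism prevalence per 10,000 in the two survey periods cited above.
MEDIAN_1966_1991 = 4.4
MEDIAN_1992_2001 = 12.7


def combined_prevalence(rates_per_10k):
    """Sum diagnosis-specific rates, assuming the diagnoses are mutually exclusive."""
    return sum(rates_per_10k.values())


def fold_change(earlier, later):
    """Ratio of a later rate to an earlier one."""
    return later / earlier


combined = combined_prevalence(PREVALENCE_PER_10K)
increase = fold_change(MEDIAN_1966_1991, MEDIAN_1992_2001)

print(f"Combined ASD prevalence: {combined}/10,000 ({combined / 100:.3f}%)")
print(f"Median autism rate rose about {increase:.1f}-fold between survey periods")
```

On these figures the median autism rate rose nearly threefold between the two survey periods, which is the apparent increase that the alternative explanations listed above are intended to account for.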

Autism is more common in boys than in girls, with a mean sex ratio of 4.3:1

across epidemiological studies; the ratio is higher for non-retarded individuals with

autism, with a median of 5.75:1 across studies (Fombonne, 2003). High socioeconomic

status and immigrant status have been associated with higher rates of autism in some

small samples, but larger, well-designed studies have not supported these associations

(Fombonne, 2003). A number of comorbid medical conditions have also been

commonly associated with autism, with the most prevalent being epilepsy (Fombonne’s

review estimates that 16.8% of individuals with autism also have epilepsy, but this may

be an underestimate given that the median age of the samples is lower than the usual age

of onset of seizures in autism). Proposed associations with other conditions such as

Fragile X, tuberous sclerosis, neurofibromatosis, and phenylketonuria (PKU) are less

well established as many studies do not provide evidence that the prevalence is higher

than predicted by chance (Fombonne, 2003; Volkmar et al., 2004).

1.2 Explaining autism: The cognitive level of explanation

The construction of a causal model of autism (and ASDs more broadly) has proven an

extremely complex and challenging task at all levels of explanation: genetics,

neurobiology, cognition, and behaviour2. While we are now confident that autism has a

genetic basis (see Chapter 5), the genetic mechanisms and specific genes involved are

still not understood and non-genetic factors have also been implicated. Attempts to

identify key neuroanatomical and neurobiological abnormalities have resulted in a

variable array of inconsistent findings, with almost all areas of the brain proposed as

being abnormal in autism at one time or another. At the level of behaviour, it remains

unclear whether autism is best conceived of as a unitary syndrome, a set of related but

distinct subtypes, or a continuum or spectrum of abnormalities (Boucher, 1996; this is

discussed further below).

Paralleling this search for convergence at the genetic, neurobiological, and

behavioural levels of explanation has been the pursuit of a core marker or single

primary deficit at the level of cognition. In the absence of a unique biological marker

for autism, the identification of a primary cognitive deficit could help to both define the

boundaries of the disorder and highlight possible neurobiological substrates. The notion

of a single primary cognitive deficit is attractive because it is parsimonious and provides

unity – that is, it is a way of explaining the regular co-occurrence of the triad of

impairments which characterise autism and it justifies the use of a single label,

“autism”. For these reasons, Morton and Frith (1995, 2001; see also Frith, Morton, &

Leslie, 1991) have argued strongly that autism may be explained by a single primary

cognitive deficit which underlies the whole range of autistic symptomatology. The basic

structure of this kind of model is presented in Figure 1.

The notion of a primary or core deficit has been crucial in guiding and

constraining cognitive theories of autism. Michael Rutter was one of the first autism

researchers to promote the idea of a primary deficit, with his treatment of the term

implying that he considered universal manifestation, early appearance, prognostic

significance, and ability to account for performance on a range of tasks to be important

signs of primacy (Rutter, 1968). More recently, a primary cognitive deficit has been

defined as “universal, specific, and necessary and sufficient to cause the symptoms of

2 This distinction of four broad levels of explanation follows Pennington and Welsh (1995), among others, and should be considered provisional. Other divisions are possible; for example, Morton and Frith (1995) collapse genetic and neurobiological factors under one heading, “biological”. Finer divisions are also possible, for example within the level of neurobiology.

the disorder...in other words,...the proximal cognitive cause of the behavioural

symptoms of the disorder” (Pennington & Ozonoff, 1996, p. 57). These three criteria of

universality, uniqueness to autism, and ability to explain the behavioural symptoms of

autism have consistently recurred in recent definitions of primacy (e.g., Hughes, 2001;

Ozonoff & McEvoy, 1994; Turner, 1997). An additional criterion commonly used to

assess primacy is that of causal precedence, or the ability of the proposed deficit to

predate and explain the earliest symptoms of autism (Boucher, 1996; Happé, 1994b;

Pennington & Ozonoff, 1991; Pennington & Welsh, 1995; Tager-Flusberg, 2001). Four

key criteria3 for judging the primacy of a cognitive deficit in autism may therefore be

identified as:

1. Its universality among individuals with autism;

2. Its uniqueness to individuals with autism;

3. Its causal precedence, or ability to account for the earliest symptoms of autism; and

4. Its explanatory value, or ability to explain the full range of autistic symptomatology.

[Figure 1 diagram; box labels: Genetic liability, Non-genetic factors, Brain abnormalities, Cognitive deficit, Behavioural symptom, Behavioural symptom.]

Figure 1. A single primary cognitive deficit model of autism.

3 These four criteria will be used to evaluate primacy throughout this thesis, although the list is not claimed to be comprehensive or definitive. Other features frequently cited as signifying a primary deficit include persistence or stability throughout development (e.g., Ozonoff & McEvoy, 1994; Pennington & Welsh, 1995; Rutter, 1983) and existence in the broad phenotype of autism (Bailey, Phillips, & Rutter, 1996; Hughes, 2001; see Chapter 5).

It could be argued that these criteria for primacy are too stringent, due to the phenotypic

variability which exists in any syndrome (Tager-Flusberg, 1999a) and the possibility of

subgroups within the autism spectrum. However, any single primary cognitive deficit

model of autism should theoretically be able to meet the criterion of universality and be

able to account for the range of symptoms displayed by individuals with autism

(multiple deficits models are discussed further below).

Over the years, cognitive theories of autism have adopted many different forms.

The pioneering work of Hermelin and O’Connor (1970) demonstrated that neither

general mental retardation nor peripheral (i.e., sensory or motor) processing could

explain the specific pattern of impairments displayed by individuals with autism, instead

finding evidence of abnormal “central” processes such as sequencing, concept

formation, and abstraction. Around the same time, Rutter (1968) proposed that

language or “coding” deficits were primary to autism. Subsequent hypotheses regarding

the nature of the primary impairment in autism have included aberrant sensory

processing (Ornitz, 1969, 1988), deficits in arousal modulation and attention (Dawson,

1991; Dawson & Lewy, 1989; Hutt & Hutt, 1968), impaired complex information

processing (Minshew, Goldstein, Muenz, & Payton, 1992; Minshew, Johnson, & Luna,

2001), lack of socio-affective or interpersonal relatedness (Hobson, 1989, 1993), and

abnormal social responsiveness or orienting to social information (Klin & Volkmar,

1993; Mundy & Neal, 2001; Mundy & Sigman, 1989). However, difficulties meeting

the various criteria for primacy (particularly the criterion of explanatory value for the

full range of symptoms) have meant that none of these theories has established itself as

a widely accepted candidate for a single primary deficit. Current research is dominated

by three main theories of the primary cognitive deficit in autism: i) lack of theory of

mind (inability to attribute mental states to oneself and others), ii) executive dysfunction

(impairment in high-level cognitive functions which guide and control behaviour toward

attainment of a goal), and iii) weak central coherence (tendency for piecemeal or local

information processing). Significant impairments in these areas have been established

in numerous studies of individuals with ASDs4. Proponents of these theories, in

particular the former two, have strongly asserted that the impairment in question is the

single primary cognitive deficit in autism. Additional impairments in other domains are

usually accounted for as secondary, correlated, or artefactual consequences of the single

primary deficit.

4 Studies of theory of mind and executive function in ASDs are reviewed extensively in Chapter 2. Central coherence studies are briefly discussed in Section 4.4.4 of Chapter 4.

The idea of a single primary deficit has been subjected to increasing criticism,

however. Goodman (1989) is often cited as an advocate of the multiple primary deficits

approach, arguing that genetic and environmental insults may act upon several distinct

neural systems which share in common a vulnerability to those insults. These multiple

neurological abnormalities then create simultaneous impairments in several cognitive

domains, and “synergistic interactions” between these impairments result in a distinct

syndrome. In this model, the shared vulnerability of several neural systems (e.g.,

through shared blood supply or neurotransmitters) is the unifying factor in creating a

unitary syndrome. In a similar vein, Pennington et al. (1997) proposed that the unifying

explanation may occur at the level of neurochemistry (e.g., a dopaminergic deficit),

which would result in multiple cognitive impairments that were not necessarily

connected at a cognitive level.

Others in favour of multiple primary cognitive deficits have argued against the

notion that autism is a unitary syndrome which requires a single unifying level of

explanation. As mentioned earlier, at least two alternative conceptions are possible,

which also incorporate ASDs besides autism. One is the notion of related but distinct

subgroups, or a “categorical” system of subtyping. Categorical systems are “intended to

divide populations into subgroups that share a common aetiology, symptom

presentation, and course that is distinct from those of other subgroups” (Beglinger &

Smith, 2001, p. 412). Subgroup divisions in ASDs could be defined in a number of

different ways, such as according to PDD subtype (i.e., autism, Asperger syndrome,

PDDNOS), the domains in which symptoms are present, symptom severity, or level of

(intellectual/adaptive) functioning, with the last of these appearing to hold the best

discriminative and predictive validity in studies employing cluster analysis (Fein et al.,

1999; Prior et al., 1998; Stevens et al., 2000). If ASDs are conceptualised as a group of

distinct subtypes, then a single primary deficit model would not be plausible, and

instead there would need to be as many primary deficits as there were subgroups (unless

more extreme or severe subgroups were characterised by a larger number of primary

deficits and other milder subgroups were characterised by fewer primary deficits).

Therefore, across ASDs as a whole, primary deficits would not meet the criteria of

universality or explanatory value (although they should meet these criteria within the

relevant subgroup).

The other major alternative model of ASDs is that of a multidimensional

spectrum, where dimensions such as symptom severity or level of functioning are

conceptualised as a continuum ranging from “normal” to severe or extreme, rather than

forming discrete subgroups. The idea of autism as a unitary syndrome is also

compatible with the notion of a spectrum, but in that case it would be unidimensional in

nature. In a multiple primary deficits model, there would be more than one cognitive

deficit, each underlying a different dimension. Again, the various dimensions could be

defined in different ways; for example, each symptom domain could be a dimension, or

there could be one dimension for symptom number and severity, and another for level

of functioning (Szatmari et al., 2002). In the version where the dimensions are

symptom domains, there would need to be as many primary deficits as there were

symptom domains5 (thus, a minimum of three independent cognitive deficits of varying

severity would need to underlie the triad of impairments in autism, whereas individuals

with PDDs showing symptoms in only two domains would show two primary deficits).

Therefore, the criteria of universality and explanatory value across all individuals with

ASDs would not be met by primary deficits in this model either (although these criteria

should be met by anyone displaying the relevant symptom, with differing degrees of

impairment according to the severity of the symptomatology). Figure 2 presents an

example of a multiple primary cognitive deficit model of autism based on the concept of

autism as a continuum with three dimensions, with each dimension corresponding to a

symptom domain.

[Figure 2 diagram; box labels: Genetic origins, Non-genetic factors, Brain abnormalities, Cognitive deficit (three boxes), Behavioural symptom (three boxes).]

Figure 2. A multiple cognitive deficits model of autism, in which each cognitive deficit underlies a different domain of symptomatology.

5 This assumes that the symptom domains are dissociable, such that each symptom could potentially be displayed in isolation.

While these multiple cognitive deficits models of ASDs represent plausible alternatives

to the notion of autism as a unitary syndrome with a single primary cognitive deficit,

strong claims about singular primacy are still being made by proponents of the major

current cognitive hypotheses. The validity of these claims rests not only on how well

the proposed primary deficit can meet the four criteria for primacy, but also on whether

the deficit can explain or subsume the other cognitive impairments which characterise

ASDs. The construction of an integrated explanatory model of ASDs requires

identification of which cognitive processes are the most primary in ASDs and how they

relate both to each other and to the genetic, neurobiological, and behavioural levels of

explanation6.

1.3 Overview of the thesis

1.3.1 Rationale and aims

The overarching aim of the current research is to contribute to an explanatory model of

ASDs, primarily by investigating the structure of the cognitive level of explanation, but

also by examining its relationships with other levels of explanation (mainly the

behavioural, but also the genetic in an indirect sense) – and thereby to evaluate the

validity of a single versus multiple primary cognitive deficit model of ASDs. More

specifically, this thesis focusses on two of the major current cognitive theories of

primary deficits in ASDs: lack of theory of mind and executive dysfunction. These two

theories represent the most fertile ground for debate regarding the primacy of and

relationship between cognitive deficits in ASDs. This is firstly because proponents of

these theories have made the strongest claims, as well as presenting the most convincing

yet controversial evidence, about the deficit in question being the single primary deficit

in autism (whereas those arguing for weak central coherence have more often tended to

present it as one of multiple deficits); and secondly because the relationship between

theory of mind (ToM) and executive function (EF) has been the subject of considerable

theoretical and empirical scrutiny in typical development, but has been less well studied

6 Of course, this assumes that the cognitive level of analysis is necessary and/or useful in explaining autism. The importance of cognition in constructing causal models for developmental disorders has been justified persuasively by Morton and Frith (2001) and Tager-Flusberg (1999a), who argue that cognition is necessary to bridge the gap between brain and behaviour in a parsimonious and theory-driven manner. Postulating areas of strength and weakness at the mediating level of cognition allows us to form sensible, coherent interpretations of apparently unrelated behavioural and biological observations.

in ASDs (although several claims and assumptions have been made about their

relatedness in ASDs). This lack of empirical attention is somewhat surprising, as any

proponent of a single primary deficit model must show that the primary deficit (e.g., in

ToM) causes any other deficit (e.g., in EF) demonstrated by individuals with ASDs.

Moreover, most multiple deficits models would need to show that ToM and EF were

independent impairments (either characterising different subgroups or underlying

different dimensions of ASDs).

The current research consists of two studies, both broad in scope. The first study

examined the profile, primacy, and independence of ToM and EF impairments in

individuals with ASDs. This is only the second study to examine these issues together

in one large investigation, with the first (Ozonoff, Pennington, & Rogers, 1991)

containing several limitations which were addressed in this study (see Chapter 4). The

three central aims of Study One were to determine i) the specific profile of ToM and EF

deficits which characterises ASDs (as a necessary first step before further examining

primacy and independence); ii) whether impairments in ToM and/or EF can meet the

criteria for a primary cognitive deficit in ASDs (as assessed by its universality,

uniqueness, and explanatory value), and, should no impairment meet the criteria fully,

which appears to be the most primary; and iii) whether or not ToM and EF impairments

are related in ASDs, and if so, what the nature of that relationship might be. Several

competing hypotheses about the relative primacy of and relationship between ToM and

EF were tested, with each having different implications for which type of single or

multiple deficit model could best explain ASDs. These aims and hypotheses and the

way in which they were addressed are elaborated in Chapter 4.

The second study attempted to confirm and extend the results of Study One by

investigating ToM and EF impairments in siblings of individuals with ASDs. As ASDs

have a genetic basis (see Chapter 5), examining cognitive weaknesses in relatives of

individuals with ASDs can be a useful method of identifying potential markers of

genetic vulnerability as well as testing models of primary deficits in ASDs. The main

aims of Study Two were i) to identify whether ToM or EF performance can meet

criteria for an “endophenotype” or vulnerability marker for the autism genotype, and

thereby seek confirmation of the results of Study One regarding the relative primacy of

ToM and EF in ASDs; and ii) to further investigate the validity of various

single/multiple deficits models of ASDs by examining the pattern of ToM and EF

performance in individuals showing the broad phenotype. Again, the aims and

hypotheses of this second study and its extensions to previous research are further

discussed in Chapter 6.

1.3.2 Thesis structure

In Chapter 2, the constructs of ToM and EF are reviewed with reference to both typical

development and autism. Each ability is defined; its methods of measurement are

discussed; relevant models of its structure and typical development are presented; and

evidence for its impairment in and primacy to autism is critically reviewed. Next, the

various hypotheses about the nature of the relationship between ToM and EF in typical

development are considered, and these hypotheses are then re-examined with regard to

the relationship in autism. This critical analysis of previous research on the nature,

primacy, and independence of ToM and EF in typical development and autism provides

the context for the thesis and for Study One in particular.

A large range of diagnostic, IQ, cognitive, and behavioural measures were used

in both of the studies in the thesis. Chapter 3 is devoted to the description and rationale

for selection of these measures. For each questionnaire, interview, and task, the basis

for its inclusion in the research and a thorough description are both provided. This

reflects a general emphasis on the use of appropriate assessment tools, particularly in

the area of EF, which has suffered from a history of poor measurement precision.

Chapter 4 contains the major study of the thesis. The main aims of Study One

were outlined in the previous section. The broader phenotype of autism is reviewed in

Chapter 5, as background for the second study of the thesis. This briefer review covers

the genetic basis for autism and the behavioural and cognitive characteristics of first-

degree relatives of individuals with autism. Chapter 6 contains Study Two, the central

aims of which were also described in the previous section.

In the General Discussion in Chapter 7, the results of both studies are

summarised and their implications for conceptual models of ASDs are discussed. The

importance of integration between the various levels of explanation is highlighted and

emphasis is placed on the need to consider the process of development in constructing

explanatory models of developmental disorders.

CHAPTER 2

Literature Review: Theory of Mind and Executive Function in Typical Development and in Autism

2.1 Theory of mind (ToM)

2.1.1 Defining and measuring ToM

2.1.2 Models of ToM and its development

2.1.3 ToM in autism

2.2 Executive function (EF)

2.2.1 Defining and measuring EF

2.2.2 Models of EF and its development

2.2.3 EF in autism

2.3 The ToM-EF relationship

2.3.1 Models of the ToM-EF relationship

2.3.1.1 Expression accounts

2.3.1.2 Common conceptual requirements of ToM and EF

2.3.1.3 Emergence accounts

2.3.1.4 Common neuroanatomical bases for ToM and EF

2.3.2 The ToM-EF relationship in autism

This chapter reviews previous research on the constructs of ToM and EF both in typical

development and in autism, providing a context for three of the central concerns of the

current research – the profile, primacy, and independence of ToM and EF impairments

in ASDs. ToM is discussed in the first section, followed by EF in the second section.

Each of these sections contains i) a solid background on how the construct is defined

and measured, reflecting a general emphasis on measurement precision, particularly in

the area of EF; ii) a review of relevant models of the typical development of ToM/EF,

in order to provide a theoretical context within which both evidence of the impairment

of ToM/EF in ASDs and models of the ToM-EF relationship may be evaluated; and iii)

a review of evidence for the impairment of ToM/EF in autism and the specific profile of

that impairment, followed by a critical analysis of evidence for the primacy of the

ToM/EF impairment to autism. The third section of the review addresses the

relationship between ToM and EF, covering both i) theories of the nature of the

relationship in typical development, which are outlined in detail as each makes different

predictions about the ToM-EF relationship in autism; and ii) evidence for the nature of

the relationship in autism, which not only has implications for the validity of theories of

the relationship based on typical development, but more importantly is relevant for the

question of primacy (i.e., can a primary deficit in ToM explain or subsume a secondary

deficit in EF, or vice versa?). This review of methodology, theory, and evidence in the

fields of ToM and EF is therefore intended as a backdrop against which findings from

the current research may be appraised and interpreted.

2.1 Theory of mind (ToM)

2.1.1 Defining and measuring ToM

The term “theory of mind” refers to the ability to attribute mental states, such as

desires, beliefs, and intentions, to oneself and others in order to explain and predict

actions. The phrase was first used by Premack and Woodruff (1978), who stated that:

In saying that an individual has a theory of mind, we mean that the individual imputes mental

states to himself and to others...A system of inferences of this kind is properly viewed as a

theory, first, because such states are not directly observable, and second, because the system can

be used to make predictions, specifically about the behavior of other organisms (p. 515).

After noting flaws in the methodology used by Premack and Woodruff (1978) to

examine whether or not chimpanzees have a ToM, Dennett (1978) pointed out that ToM

could be demonstrated conclusively only by predicting the way another person will

behave on the basis of a false belief (otherwise the actual situation, habitual or regular

aspects of the other person’s behaviour, or the subject’s own true beliefs could be used

to predict the person’s actions, without the need to appeal to mental states). This

proposal was first employed with humans by Wimmer and Perner (1983), who tested

typically developing children on what has now become a classic false belief task,

sometimes called the “unexpected transfer” test. A scenario is presented in which a

story character, Maxi, puts a chocolate in cupboard A before he goes out to play. While

he is gone, his mother moves the chocolate to cupboard B. Maxi then returns, and

participants are asked, “Where will Maxi look for the chocolate?”. Two control

questions testing knowledge of the chocolate’s original and current location ensure that

the child recalls the story and has followed the sequence of events. In order to answer the

belief question correctly (that Maxi will look in cupboard A), the child requires an

understanding that Maxi holds a false belief, which is different from the child’s own

knowledge of the actual situation, and which will lead Maxi to behave in a way which

contradicts the actual situation (i.e., Maxi’s behaviour is a product of what he believes

to be true rather than what is really true). An accurate answer on the belief question

therefore suggests that the participant appreciates the distinction between mind (the

internal and mental) and world (events, situations or behaviours).
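
The scoring logic of the unexpected transfer task described above can be sketched as follows. This is a minimal illustration only: the class and function names are hypothetical, and the rule that a trial is treated as unscorable when either control question is answered incorrectly is an assumption made here for illustration, not a description of the exact procedure used by Wimmer and Perner (1983).

```python
from dataclasses import dataclass


@dataclass
class TransferTrial:
    """One unexpected transfer scenario (hypothetical representation)."""
    original_location: str  # where Maxi put the chocolate (cupboard A)
    current_location: str   # where his mother moved it (cupboard B)


@dataclass
class Responses:
    """A participant's answers to the three questions."""
    belief: str           # "Where will Maxi look for the chocolate?"
    memory_control: str   # where the chocolate was originally
    reality_control: str  # where the chocolate is now


def score_trial(trial, responses):
    """Return 'pass', 'fail', or 'unscorable' for one trial.

    Assumption: the belief question is only interpreted when both
    control questions are answered correctly.
    """
    controls_correct = (
        responses.memory_control == trial.original_location
        and responses.reality_control == trial.current_location
    )
    if not controls_correct:
        return "unscorable"
    # A correct answer names where Maxi falsely believes the chocolate to be.
    if responses.belief == trial.original_location:
        return "pass"
    return "fail"


maxi = TransferTrial(original_location="A", current_location="B")
print(score_trial(maxi, Responses(belief="A", memory_control="A", reality_control="B")))
print(score_trial(maxi, Responses(belief="B", memory_control="A", reality_control="B")))
```

In this sketch, a participant who answers from Maxi's false belief (cupboard A) passes, while one who answers from the actual location (cupboard B) fails despite answering both control questions correctly.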

In another commonly used false belief task, variously called the “Smarties task”,

the “unexpected contents” task, or the “deceptive box test” (Perner, Leekam, &

Wimmer, 1987), the child is shown a box of Smarties and asked what s/he thinks is

inside. After responding “Smarties”, the child is shown that the box actually contains a

pencil. The pencil is then put back in the box and the box is closed. The child is asked

a control question about the actual content of the box (i.e., a pencil). The child’s ability

to attribute false beliefs to others is then assessed by asking what another child (or

family member) would think was in the box. In some versions, the child is also asked

about his/her own previous belief about the content of the box, when s/he was first

shown it. Gopnik and Astington (1988) found that children who fail the false belief

question also incorrectly answer that they thought the box contained pencils when they

themselves first saw it, consistent with the view that the development of ToM pertains

to knowledge of one’s own mind as well as the minds of others.

A variation on the Smarties task is a test of the “appearance-reality distinction”,

or the “unexpected identity” task (Flavell, Flavell, & Green, 1983). In this task, the

child is shown an object with a deceptive identity, such as a sponge that looks like a

rock, and is asked what it looks like. The real nature of the object is then demonstrated

to the child (e.g., by squeezing the sponge), and the child is asked what it really is. The

subsequent two questions follow the same structure as the Smarties task, with the child

being asked what s/he thought the object was when first shown it, and what another

child would think the object was.

Such false belief tasks are now central to current developmental research on

social cognition, serving as a marker for ToM in both typically developing and

disordered populations (Wellman, Cross, & Watson, 2001). However, ToM has been

measured in many other ways, some of which also exploit the false belief concept, and

others of which measure other types of mentalistic understanding. Baron-Cohen (2000)

reviews 20 kinds of tasks which are purported to measure ToM, including tests of

deception, the mental-physical distinction, recognition and expression of mental-state

words, decoding mental states from the eyes, and understanding the mental functions of

the brain.

2.1.2 Models of ToM and its development

The timing and mechanisms of the normal development of ToM have been studied

extensively (see Astington, Harris, & Olson, 1988; Carruthers & Smith, 1996; Lewis &

Mitchell, 1994; Mitchell & Riggs, 2000; Perner, 1991; Wellman, 1990; Wellman et al.,

2001; Whiten, 1991). The large majority of studies demonstrate a clear

improvement in performance on false belief (and other ToM) tasks between the ages of

3 and 5 years, with 3-year-olds consistently making errors suggesting that they are

unable to separate belief from reality (e.g., in the unexpected transfer task they will

assert that Maxi will look for the chocolate in cupboard B, its actual location). In their

meta-analysis of 178 studies measuring young children’s performance on false belief

tasks, Wellman et al. (2001) found that average false belief performance changes

rapidly between 3 and 4.5 years from significantly incorrect (i.e., below chance) to

significantly correct (above chance). Although ToM development through middle

childhood to adolescence is much less well studied, evidence suggests that advances in

these years include “an understanding that people’s mental states...are often consistent

across situations in the form of personality traits, a greater appreciation of the mind as

an active constructor and interpreter of knowledge, and a growing awareness of the

presence, influence, and sources of ongoing thoughts – that is, active mental ideation”

(Wellman & Lagattuta, 2000, p. 31).

Theoretical accounts of ToM development tend to focus on the crucial period

between 3 and 5 years, centring on the debate as to whether young children fail false

belief tasks because they lack the conceptual understanding required to respond

correctly (“competence”), or because the insufficient development of other cognitive

capacities (e.g., inhibitory control, linguistic comprehension) masks the access to or

expression of understanding (“performance”). Explication of these models is useful

because of their in-depth analysis of what is involved in successful performance on false

belief tasks. This is important both for understanding what underlies the ToM

impairment in autism, and for analysing the relationship between ToM and EF. An

overview of the major relevant accounts of ToM development therefore follows.

Competence accounts vary in their postulated mechanisms of ToM development,

but they share in common the idea that ToM matures continuously in a series of

successive stages of discovery, each of which is developmentally related to the next (as

opposed to an innate, modular ability which comes on-line early). Gopnik, Wellman

and colleagues (Gopnik, 1993; Gopnik & Meltzoff, 1997; Gopnik & Wellman, 1994;

Wellman & Gelman, 1998) favour the “theory theory”, which proposes that children’s

early conceptions of the mind are theory-like, sharing key features with scientific

theories: they are abstract (i.e., framed in a different vocabulary from empirical

observations), hold explanatory and predictive power, lead to distinctive interpretations

of evidence, and are open to revision based on counterevidence. The theory theory

holds that ToM development is a gradual transition from one view of the mind to

another, rather than being a simple all-or-none acquisition of “a” theory of mind.

In line with the theory theory, Wellman and colleagues (Bartsch & Wellman,

1989, 1995; Gopnik & Wellman, 1994; Wellman, 1990; Wellman & Woolley, 1990)

propose a specific developmental sequence in which children’s understanding of the

motivational forces behind people’s actions advances from a “simple desire”

psychology to a more adult “belief-desire” psychology. In Wellman’s framework, 2-

year-olds hold a simplified understanding of desire and perception (where others are

attributed internal dispositions toward or against certain actions or objects), but fail to

understand that people have internal mental representations1 of the world such as

beliefs; in an intermediate phase, 3-year-olds develop a nonrepresentational

understanding of belief while beginning to comprehend representational aspects of

desire and perception; and at around age four, children begin to realise that individuals’

beliefs (i.e., their representations of reality rather than reality itself) determine their

actions. According to the theory theorists, 2- and 3-year-olds fail the standard false

belief (unexpected transfer) task because as simple desire psychologists, they do not

attribute a belief to Maxi, but rather they predict that Maxi will act to fulfil his desire for

chocolate and will therefore look where the chocolate actually is.

Perner (1991, 1993, 1995, 2000) has articulated an alternative competence

account of ToM development which, like that of Wellman, Gopnik and colleagues, focuses on

children’s understanding of mental states as representations. While Perner considers

himself a theory theorist, he states that the use of the word “theory” is meant to signal

the notion that conceptual understanding unfolds as a result of the growth of

interdependent concepts, rather than suggesting that children’s intellectual growth is

analogous to scientists making new discoveries (which is explicitly proposed by Gopnik

& Wellman, 1994). He argues that young children begin with a nonrepresentational

conception of mind, and that it is only when children acquire a general theory of

representations (a Representational Theory of Mind; RTM) that they are able to solve

false belief tasks. In Perner’s model, an RTM involves comprehending that

propositions (e.g., beliefs) are semantically evaluable as being true or false. That is,

propositions are “about” a world against which their truth is evaluated (Perner, 2000).

In claiming that young children do not understand representations, his contention is that

they do not understand that a proposition can be evaluated by someone else as having a

different truth value than the one it has in reality (or the one assigned to it by the child

him/herself). Thus, Perner’s (1995) explanation of young children’s failure on the

unexpected transfer task is that:

...they cannot distinguish between the state of the world that the belief is about and how the

believer conceives of that state of the world, or in other words, children cannot conceive of

belief as misrepresenting where the chocolate really is as being in location A. Without this

understanding children cannot understand why an agent who wants to find the chocolate in its

real world location (B) would act as if the chocolate were in A. (p. 251).

1 Here, a “representation” may be defined as an entity in the mind which represents a state of affairs in the world, like a “picture-in-the-head” (Leslie & Thaiss, 1992). This is distinct from Leslie’s (1987, 1994a) concept of “metarepresentation”, which is discussed later.

However, understanding propositions as evaluable as true or false is not enough to

successfully solve false belief tasks (Perner, 2000). The child must additionally realise

that the belief has causal power – it takes precedence (over the world itself) in

determining behaviour. Thus, a false belief about the chocolate’s location, rather than

the chocolate’s actual location, makes Maxi look in the wrong place.

Another distinction between the positions outlined by Perner and Wellman is the

specific nature of the successive theories of mind that children are said to discover, with

Perner rejecting Wellman’s simple desire psychology in favour of his concept of

“prelief”. He argues that young children’s appreciation of pretence (i.e., their ability for

pretend play, which is first employed between 18 and 24 months) implies a realisation

that people do not always act in a way that satisfies their desires objectively. However,

since the young child cannot differentiate between actions based on a false belief which

is held as true and actions based on pretence where what is being pretended is not held

as true, s/he understands these states of belief and pretence as the amalgamated

protoconcept of prelief, which s/he conceptualises as “behaving as if” (Perner, Baker &

Hutton, 1994). Children eventually develop a more adult stage of understanding of

“behaving as is”, whereby people behave according to the beliefs they hold as being

true (this stage in Perner’s model is not distinguishable in obvious ways from the 4-

year-old stage of understanding outlined by Wellman and colleagues).

In contrast to competence accounts such as those of Wellman and Perner,

performance accounts hold that young children fail false belief tasks because processing

limitations2 mask their true ability, as evidenced by demonstrations of earlier

competence when the testing procedure is modified – such as by asking the child

“Where will Maxi look first for his chocolate?” in the unexpected transfer task (e.g.,

Chandler, Fritz, & Hala, 1989; Freeman & Lacohée, 1995; Mitchell & Lacohée, 1991;

Roth & Leslie, 1998; Siegal & Beattie, 1991). This position has been articulated most

thoroughly by Leslie and colleagues, who argue that ToM arises from an attentional

mechanism specialised for selectively attending to mental states (the Theory of Mind

Mechanism; ToMM) which is innate, domain-specific, operates spontaneously from

very early in life without formal instruction, and can be dissociably damaged – in other

words, it is modular (German & Leslie, 2000; Leslie, 1987, 1991, 1994a, 1994b; Leslie

& Roth, 1993; Leslie & Thaiss, 1992; Roth & Leslie, 1998; Scholl & Leslie, 1999,

2001; Surian & Leslie, 1999). For Leslie, very young children’s apparent appreciation

for something as abstract and unobservable as others’ mental states is best explained by

an innately specified module.

2 These include executive functions such as inhibition and working memory. Accounts of ToM development based around advances in EF are reviewed in Section 2.3.1.

As in Perner’s account, the very early appearance of pretend play is an important factor in

Leslie’s model; however, for Leslie it marks an early capacity for metarepresentation,

which also underlies the concept of belief and indicates the early presence of a ToMM.

Leslie (1987, 1994a), based on Pylyshyn (1978), distinguishes between primary

representations, which are direct, literal representations about a state of affairs in the

world; and metarepresentations3, which may be described as representations of

representations. A metarepresentation describes an agent’s (e.g., mother’s, self’s)

mental state, or provides an “agent-centred” description of a situation, which is

“decoupled” from the primary representation and processed as if it were a copy or report

of the primary representation (Leslie & Roth, 1993). It does this by specifying an

“informational relation” (or “propositional attitude”; e.g., DESIRING, PRETENDING)

between the agent, an aspect of reality (described by a primary representation) and an

imaginary situation (described by the “decoupled” representation). For example, the

metarepresentation mother PRETENDS [of] this banana [that] “it is a telephone”

allows the child to make sense of his/her mother’s behaviour of talking to a banana by

reference to his/her mother’s mental state (i.e., her attitude of pretence towards the

banana), without making the real-world inference that “bananas are telephones”.

Leslie’s assumption is that there is a small core set of innate informational

relations available to the ToMM early on, such as BELIEVING, DESIRING, and

PRETENDING. As these attitudes are all deployed within the same

metarepresentational structure, Leslie’s explanation for why young children are able to

demonstrate understanding of pretence and desire, but not belief, rests on an additional

component of his model termed the “Selection Processor” (SP; Leslie & Thaiss, 1992).

The SP is an inhibitory mechanism which allows the child to select the specific relevant

information that is required for the belief content inference, while disregarding

prepotent competing information (e.g., in the unexpected transfer task, inferring the

correct content of Maxi’s belief requires selecting the situation which Maxi was

exposed to at the beginning of the scenario from memory and resisting basing the

inference on the current situation - a tendency which is prepotent because beliefs are

usually true representations of current reality; Leslie, 1994a). According to Leslie, 3-

and 4-year-old children do not differ fundamentally in their conception of belief, but 3-

year-olds fail the false belief task because the SP is poorly developed. Hence, task

manipulations which decrease the load on the SP often result in an improvement in

young children’s performance on false belief tasks.

3 Perner (1991) has criticised Leslie’s use of the term metarepresentation as suggesting that the young child has a conscious “theory of representation”. However, Leslie has specified (Leslie & Thaiss, 1992; Leslie & Roth, 1993) that he does not intend it in this sense, but rather intends it to denote a kind of data structure computed by our cognitive system, or an information processing mechanism which helps to create conceptual knowledge. He does not mean to imply that children have a conscious theory that mental states are representations in the head (as Perner does to some degree when he proposes the Representational Theory of Mind).

A volley of criticisms directed at both theoretical foundations and

methodological approaches continues to shoot back and forth between competence and

performance theorists4 (see, for example, German & Leslie, 2000; Perner, 2000; Roth &

Leslie, 1998; Wellman et al., 2001). Theory theorists have accused modularity accounts

of being “antidevelopmental” (Gopnik & Wellman, 1994; for a defence see Scholl &

Leslie, 1999), while modularity theorists argue that theory theories are purely

descriptive (lacking a specification of the cognitive architecture and mechanisms

underlying theory development), and require that the young child develop explicit

theories about impossibly abstract concepts (Roth & Leslie, 1998; but see Perner, 2000).

Competence theorists claim that studies purporting to show improved performance of 3-

year-olds on simplified false belief tasks have not been consistently replicated and are

open to alternative interpretations (Perner, 2000), while performance theorists maintain

that 3-year-olds’ failure on standard false belief tasks is a false negative (Leslie, 1994a).

While it is not possible to do full justice to these arguments here, placing ToM in at

least a broad theoretical context is helpful both in evaluating models of the ToM-EF

relationship and in conceptualising the impairment of ToM in autism – the latter of

which we turn to now.

4 For other major accounts of ToM development which have not been reviewed here (e.g., simulation theory, counterfactuality), the reader is referred to Mitchell and Riggs (2000).

2.1.3 ToM in autism

Complementing the vast literature addressing the typical development of ToM, an

equally large, if not larger, number of studies have investigated ToM in children with

autism. This body of work began with a seminal paper by Baron-Cohen, Leslie and

Frith (1985), in which children with autism were tested on a variation of Wimmer and

Perner’s (1983) unexpected transfer task. Baron-Cohen et al.’s “Sally-Anne” scenario,

which has become the most frequently used version of the unexpected transfer test in

the autism literature, involves two doll protagonists, Sally and Anne. Sally places a

marble into her basket, then leaves the scene. Anne takes the marble and hides it in her

box. When Sally returns, the child is asked “Where will Sally look for her marble?”

(the Belief Question). Two control questions probe knowledge of the current location

of the marble (the Reality Question) and the marble’s initial location (the Memory

Question). Baron-Cohen et al. found that while all autistic and control children were

able to correctly answer the control questions, only 20% of children with autism passed

the Belief Question, compared with 85% of typically developing children and 86% of

children with Down’s syndrome (suggesting that the poor performance in children with

autism was not attributable to intellectual disability). They interpreted this result as

evidence for a metarepresentational deficit specific to autism (based on Leslie’s (1987)

theory of ToM), which had the potential to explain autistic symptoms such as social

impairment and lack of pretend play.

Impaired performance of individuals with autism on false belief tasks has since

been replicated in numerous studies (although failures to replicate have also occurred,

as discussed later). These studies have included a range of task variations such as using

real people instead of puppets and a “think” question rather than a “look” question (i.e.,

“Where does Sally think the marble is?”), as well as using alternative false belief

paradigms such as the deceptive box (“Smarties”) test and the unexpected identity

(appearance-reality) task (Baron-Cohen, 1989a; Charman & Baron-Cohen, 1992;

Eisenmajer & Prior, 1991; Leekam & Perner, 1991; Leslie & Frith, 1988; Leslie &

Thaiss, 1992; Perner, Frith, Leslie & Leekam, 1989; Ozonoff et al., 1991; Reed &

Peterson, 1990; Surian & Leslie, 1999). Individuals with autism have also

demonstrated significantly poorer performance than controls on various other tasks

tapping mentalising ability5, such as sequencing of mentalistic picture stories (Baron-

Cohen, Leslie & Frith, 1986); tests of the mental-physical distinction (Baron-Cohen,

1989a; Ozonoff et al., 1991); describing the mental functions of the brain (Baron-

Cohen, 1989a; Ozonoff et al., 1991); recognition, comprehension and expression of

mental state terms (Baron-Cohen et al., 1994; Tager-Flusberg, 1992; Ziatas, Durkin &

Pratt, 1998); inferring the mentalistic significance of the eyes (Baron-Cohen, Campbell,

Karmiloff-Smith, Grant, & Walker, 1995; Baron-Cohen et al., 1999a; Baron-Cohen,

Jolliffe, Mortimore, & Robertson, 1997; Baron-Cohen, Wheelwright, Hill, Raste, &

Plumb, 2001a); attribution of mental states to animated shapes (Castelli, Frith, Happé &

Frith, 2002); conceptual perspective-taking (Dawson & Fernald, 1987); tests of

deception (Baron-Cohen, 1992; Russell, Mauthner, Sharpe, & Tidswell, 1991; Sodian &

Frith, 1992); understanding that “seeing-leads-to-knowing” (Baron-Cohen & Goodhart,

1994; Leslie & Frith, 1988); understanding that beliefs cause emotions (Baron-Cohen,

1991a); and understanding of intentions (Phillips, Baron-Cohen & Rutter, 1998).

5 The term “mentalising ability” is intended as a synonym for ToM (i.e., the ability to make inferences about mental states).

On the basis of this kind of evidence, some researchers have proposed that the

whole range of autistic symptomatology may be explained by a single, primary,

cognitive deficit in ToM (e.g., Baron-Cohen, 1988, 1991c; Frith et al., 1991; Leslie,

1987, 1991). Furthermore, the same authors have argued that the apparent domain

specificity of the ToM impairment in autism is an existence proof that ToM is a modular

capacity. For example, Leslie (1987, 1991; Leslie & Thaiss, 1992) has argued that the

modular Theory of Mind Mechanism (ToMM), which automatically leads us to interpret

behaviour in terms of an agent’s mental states, is specifically impaired in autistic

individuals and can explain their social and communicative impairments and lack of

pretend play. Baron-Cohen (1994, 1995, 1998) outlined an alternative view whereby

ToMM does not come fully prepackaged as an innate module, but rather is preceded by

several lower-level modular mechanisms which extract relevant social information and

provide critical inputs to the development of ToM. These mechanisms include an Eye

Direction Detector (EDD) which alerts the infant to the eye region and thereby provides

opportunities to learn the mentalistic significance of eye gaze; an Intentionality Detector

(ID) that directs attention to animate actions, enabling the infant to learn about goal-

directedness; and a Shared Attention Mechanism (SAM), which uses inputs from the

other two mechanisms to allow the infant to work out if s/he and another person are

jointly attending to the same thing. In this model, ToMM is conceptualised either as

a more mature development of SAM or as being triggered by SAM.

Tests of the validity of the ToM hypothesis of autism (the view that autism may

be explained by a primary deficit in a ToM module) have focused on the central criteria

required to uphold the position (outlined in Chapter 1, Section 1.2): that i) a ToM

impairment is universal among individuals with autism; ii) a ToM impairment is unique

to individuals with autism; iii) a ToM impairment can explain the earliest signs of

autism in infants (causal precedence); and iv) a ToM impairment can account for the

entire range of symptoms displayed by individuals with autism (explanatory value). In

addition, the modular ToM hypothesis must meet criterion v), that failure on ToM tasks

is best explained by a domain-specific ToM impairment, and cannot be accounted for in

terms of other cognitive constructs. The evidence for each of these claims is reviewed

below.

i) Universality. From the first study of ToM in autism (Baron-Cohen et al.,

1985), it was evident that a proportion of autistic individuals were able to pass false

belief tasks. The percentage of autistic individuals found to pass standard false belief

(unexpected transfer or “Sally-Anne”) tasks in subsequent studies has varied from 15%

(Reed & Peterson, 1990) to 55% (Prior, Dahlstrom, & Squires, 1990), with 90% of

participants with autism passing in one study (Dahlgren & Trillingsgaard, 1996).

Although in most cases the proportion of passers with autism is significantly smaller

than the proportion of successful control participants (usually matched on verbal mental

age), the finding that any child with autism passes false belief tasks poses a challenge to

the ToM hypothesis of autism (although random responding could result in a correct

response). Baron-Cohen (1989b) responded to this challenge with a study

demonstrating that individuals with autism who pass standard first-order false belief

tasks are still unable to make more complex second-order false belief attributions (i.e.,

of the form “Mary thinks that John thinks the ice-cream van is in the park”; Perner &

Wimmer, 1985). He proposed that autism is characterised by a specific developmental

delay in ToM, such that older and more able participants with autism are able to pass

first-order false belief tasks which are usually mastered by the age of four, but still fail

on more difficult tasks which are usually only passed by the age of six or seven.

However, Baron-Cohen’s (1989b) finding has not been replicated in a number of

subsequent studies, which have found that a subset of participants with high-functioning

autism or with Asperger syndrome also pass second-order false belief tasks (Bauminger

& Kasari, 1999; Bowler, 1992; Dahlgren & Trillingsgaard, 1996; Leekam & Prior,

1994; Ozonoff et al., 1991; Sparrevohn & Howie, 1995). Ozonoff et al. (1991) found

that EF deficits were more universal than ToM impairment among high-functioning

autistic individuals (see Section 2.2.3 for further discussion of this finding).

Furthermore, Tager-Flusberg and Sullivan (1994b) showed that both autistic and control

first-order task passers were able to pass a shorter and less complex second-order task

than the task used in previous studies, suggesting that their failure on traditional second-

order tasks was more likely to be due to the high information processing load than a

lack of conceptual understanding.

Nevertheless, it has been argued that performance on false belief tasks is not the

only measure of ToM (e.g., Tager-Flusberg, 2001). Studies using higher-

level tests of mentalising ability have found that first- and second-order false belief task

passers still demonstrate evidence of impairment in ToM. Happé (1994a) found that

individuals with autism who passed both first- and second-order false belief tasks were

significantly poorer than mentally handicapped and normal children and adults at

providing context-appropriate mental state explanations for nonliteral utterances made

by story characters, which she argues is a more “advanced”, naturalistic test of ToM.

First- and second-order passers performed more poorly than controls on a test requiring

inference of complex mental states from expression of the eyes (Baron-Cohen et al.,

1997), and high functioning adults with autism who passed a first-order false belief task

performed significantly worse than controls on tests measuring attribution of mental

states to voices and eyes (Kleinman, Marciano & Ault, 2001). Frith, Happé and

Siddons (1994) also found that first-order passers still showed impairments in everyday

social behaviours which require mentalising.

Studies examining the characteristics of autistic false belief passers have tended

to find that a high verbal mental age or verbal IQ is a necessary but not sufficient

condition for passing false belief tasks (Charman & Baron-Cohen, 1992; Eisenmajer &

Prior, 1991; Leekam & Perner, 1991; Prior et al., 1990; Sparrevohn & Howie, 1995). In

a review of the literature, Happé (1995) found that children with autism require a verbal

mental age more than twice as high as control participants in order to pass false belief

tasks. Other studies have found that chronological age is a significant factor, either in

addition to or instead of verbal mental age (Baron-Cohen, 1992; Prior et al., 1990),

while still others have found no relationship between age and ability variables and false

belief task performance (Baron-Cohen et al., 1985; Perner et al., 1989). The general

finding that passers tend to be of higher verbal ability is consistent with the idea, put

forward by a number of authors, that individuals with autism who pass false belief tasks

do so not by the usual use of ToM, but by using an alternative compensatory route to

success (Eisenmajer & Prior, 1991; Frith et al., 1991; Happé, 1995; Holroyd & Baron-

Cohen, 1993; Ozonoff et al., 1991). For example, Frith et al. (1991) suggested that able

autistic individuals may have learned or extracted explicit rules about certain social

situations, such as “When something in the world changes, people who just happen not

to have seen the change occur behave (for some reason) as if they do not know about

these changes” (p. 436). A study by Happé et al. (1996) provides some support for this

idea, finding that adults with Asperger syndrome showed activation of different areas of

the prefrontal cortex from controls when listening to mentalistic stories. However, more

direct evidence confirming that false belief task passers are using compensatory

strategies to deduce their solution is yet to be obtained.

ii) Uniqueness. Proponents of the ToM hypothesis argue that a ToM impairment

is unique to autism, citing evidence that control groups of children with either Down’s

syndrome (e.g., Baron-Cohen et al., 1985), other kinds of mental retardation (e.g.,

Charman & Baron-Cohen, 1992), or specific language impairment (Leslie & Frith,

1988) do not show impaired performance on false belief tasks in comparison with

children with autism. However, these findings have been challenged in a number of

other studies which have either failed to replicate significantly poorer performance of

children with autism on various ToM tasks compared with controls (Carpenter,

Pennington & Rogers, 2001; Charman & Lynggaard, 1998; Dahlgren & Trillingsgaard,

1996; Oswald & Ollendick, 1989; Prior et al., 1990; Tager-Flusberg & Sullivan, 1994a),

or have found ToM impairments in other clinical populations. It has become apparent

that mentally retarded, non-autistic individuals perform more poorly on false belief

tasks than would be expected given their chronological and mental age (Benson,

Abbeduto, Short, Bibler-Nuccio, & Maas, 1993; Yirmiya, Erel, Shaked, & Solomonica-

Levi, 1998; Yirmiya & Shulman, 1996; Yirmiya, Solomonica-Levi, Shulman, &

Pilowsky, 1996; Zelazo, Burack, Benedetto, & Frye, 1996a). Yirmiya et al.’s (1998)

meta-analysis comparing the ToM abilities of individuals with autism, mental

retardation (MR), and typically developing individuals showed that although autistic

individuals were the most severely impaired on ToM tasks, individuals with MR also

performed significantly more poorly than typically developing individuals. This result

led them to conclude that it may be the severity of ToM impairment rather than the

impairment itself that is unique to autism. They also found that the aetiology of the MR

was an important factor, with individuals with Down’s syndrome performing better than

individuals with MR of unknown aetiology.

In addition to MR, impairments in ToM have been found in deaf children (de

Villiers, 2000; Peterson, 2002; Peterson & Siegal, 1995), blind children (Brown,

Hobson, Lee, & Stevenson, 1997; Minter, Hobson, & Bishop, 1998), and individuals

with schizophrenia (Corcoran, Mercer, & Frith, 1995; Mazza, De Risio, Surian,

Roncone, & Casacchia, 2001; Pilowsky, Yirmiya, Arbelle, & Mozes, 2000), bipolar

affective disorder (Kerr, Dunbar, & Bentall, 2003), borderline personality disorder

(Fonagy et al., 1995), non-verbal learning disorder (Buitelaar, Swaab, van der Wees,

Wildschut, & van der Gaag, 1996), Parkinson’s disease (Mengelberg & Siegert, 2003;

Saltzman, Strauss, Hunter, & Archibald, 2000), and frontotemporal dementia (Gregory

et al., 2002; Lough & Hodges, 2002). Contrary to the findings of Leslie and Frith

(1988), other studies have found ToM deficits in children with specific language

impairment and other communicative disabilities (e.g., Dahlgren, Dahlgren Sandberg, &

Hjelmquist, 2003). These challenges to the uniqueness criterion of the ToM hypothesis

have been countered by claims that these non-autistic clinical groups do not show as severe

an impairment on ToM tasks as individuals with autism, and that they fail ToM tasks

for different reasons than individuals with autism (i.e., their failure is not due to a

genuine metarepresentational deficit). For example, individuals with MR may fail

because of poor general cognitive and linguistic skills, deaf and blind children may fail

because they lack the necessary perceptual input, such as access to language or facial

information, and individuals with borderline personality disorder may fail because

parental neglect and abuse prevented the normal development of ToM (Baron-Cohen,

2000; Corcoran, 2000; Tager-Flusberg, 2001). However, these claims are yet to be

confirmed empirically. In addition, as discussed further in the domain specificity

section, it must be demonstrated that children with autism do not also fail ToM tasks

because of domain-general cognitive or linguistic difficulties.

iii) Causal precedence. Much of the evidence for the ToM hypothesis has

focused on performance on false belief tasks, on which successful performance

normally develops at around the age of four and is interpreted as evidence of a

metarepresentational capacity (Leslie, 1987) or representational understanding of mind

(Perner, 1991). However, in most cases autism is apparent at a much younger age, with

deficits in social responsiveness and reciprocity, symbolic play, gaze behaviour, joint

attention, and imitation often noticed during infancy or when the child is a toddler (e.g.,

Dawson & Adams, 1984; Klin, Volkmar & Sparrow, 1992; Mundy & Sigman, 1989;

Volkmar et al., 1987). Klin et al. (1992) pointed out that, because the crux of the ToM

hypothesis of autism is the inability to represent others’ mental states, the resulting

prediction would be that social impairment in autism should only become apparent at

the age at which metarepresentational skills appear in typically developing children. It

is unclear exactly when this is, with Leslie’s (1987) original thesis being that pretend

play may be the earliest manifestation at around 18 months, and later authors suggesting

that earlier behaviours such as protodeclarative pointing (11-12 months) and joint

attention (8-12 months) may be the earliest signs (Baron-Cohen, 1989d, 1991b),

although these latter abilities are proposed as “precursors” to ToM rather than signs of

an early ToM itself6. Regardless, Klin et al. (1992) found that six types of social

behaviour from the Vineland Adaptive Behavior Scales which emerged prior to the age

of eight months successfully discriminated autistic children from controls. The

nonrepresentational nature of these behaviours, such as “shows anticipation of being

picked up by a caregiver” and “reaches for familiar person”, was taken to indicate an

early pre-mentalising social impairment in autism.

6 Leslie and Happé (1989) have, however, argued that joint attention may also indicate the emergence of an ability to represent mental states, as these behaviours convey the intention to communicate.

These kinds of findings of early, apparently non-mentalistic social impairment in

autism have been interpreted as evidence in favour of a more primary affective,

emotional, or intersubjective impairment in autism (e.g., Hobson, 1993; Klin &

Volkmar, 1993; Mundy, Sigman, & Kasari, 1993). However, the very early recognition

of autism in Klin et al.’s (1992) participants is not typical, with most other studies

finding that it is not possible to reliably detect autism until at least the age of 18 months

(e.g., Johnson et al., 1992). In addition, some of Klin et al.’s autistic participants did

show typical social behaviours, raising the possibility of different subgroups within the

autism spectrum. The question of whether the ToM hypothesis can meet the criterion of

causal precedence remains a matter of debate (see Charman, 2000).

iv) Explanatory value. The strongest form of the ToM hypothesis asserts that

impairment in the ToM module can explain the entire range of symptoms displayed by

individuals with autism (Frith et al., 1991), although original accounts of the ToM

hypothesis focussed mainly on the social and communicative impairments characteristic

of autism. For example, Baron-Cohen (1988) proposed that the ToM hypothesis would

predict impairments in social skills requiring an ability to represent mental states, as

well as pragmatic language skills, as conversing requires that the speaker be aware of

the listener’s mental state. While the relationship between ToM and real-life social

skills appears to make intuitive sense, it has not been directly investigated in many

studies. Dawson and Fernald (1987) reported a significant correlation between autistic

children’s conceptual perspective-taking ability and their teachers’ ratings of social

skills. Frith et al. (1994) found that individuals with autism who passed false belief

tasks were more likely to show evidence of “mind-reading” in their everyday social

behaviour and had better communicative abilities, while those who failed false belief

tasks showed few social behaviours requiring understanding of mental states. In their

sample of young French preschoolers with autism or PDDNOS, Hughes, Soares-

Boucaud, Hochmann, and Frith (1997) found significant differences between ToM

“passers” and “failers” in ratings of everyday social behaviours requiring mentalising

abilities, but only when the teacher rather than the parent was the informant. However,

neither Prior et al. (1990) nor Sparrevohn and Howie (1995) found a significant

correlation between false belief performance and social skills, as rated by parents and

teachers respectively.

The literature examining the relationship between ToM and language abilities in

autism is much larger, with most studies confirming Baron-Cohen’s (1988) prediction

of a relationship between ToM and pragmatic language skills (e.g., Capps, Kehres, &

Sigman, 1998; Tager-Flusberg & Sullivan, 1995). However, it has also become clear

that individuals with autism show non-pragmatic language impairments (e.g., in

lexical and grammatical knowledge) which are not likely to be the result of a ToM

deficit (Tager-Flusberg, 1999b), but which do correlate with false belief task

performance (e.g., Happé, 1995; Sparrevohn & Howie, 1995). This raises the question

of the direction of the causal relationship between language and ToM, the answer to

which is “likely to be complex” (Tager-Flusberg, 2000). While longitudinal studies

have shown that joint attention behaviours (arguably “precursors” to ToM) in toddlers

with autism predicted language gains several years later (Sigman & Ruskin, 1999),

suggesting that ToM ability is necessary for adequate language development, the

reverse has also been demonstrated - that structural language skills play a key role in

ToM development (de Villiers & de Villiers, 1999). Regardless, it is clear that there is a

close relationship between ToM and language in autism.

The same cannot be said, however, for the relationship between ToM and the

much-neglected third feature of the autistic triad, the repetitive behaviours and restricted

interests which are part of the DSM-IV criteria for autism. While the ToM hypothesis is

able to account for the lack of pretend play displayed by autistic children, it is less

obvious how it might explain other aspects of the third feature of the triad, such as

obsessional interests or repetitive arm-flapping or toe-walking. Baron-Cohen (1989c)

and Carruthers (1996) both attempted to explain repetitive behaviours in autism by

proposing that they develop as a strategy to cope with and gain control over the

unpredictable and frightening social world that surrounds the child who is unable to

understand others’ mental states. This account predicts that the frequency of repetitive

activities should be higher in social settings, especially those which lack a predictable

structure. However, most studies have reported the converse finding, that rates of

stereotyped behaviour are lowest during periods of social interaction and highest during

periods where no interpersonal demands are made (Clark & Rutter, 1981; Dadds,

Schwartz, Adams, & Rose, 1988; Donnellan, Anderson, & Mesaros, 1984). In the only

study reported in the literature so far to directly investigate the relationship between

ToM and repetitive behaviours in autism, Turner (1996, 1997) found no relationship

between false belief task performance and the incidence or severity of a large range of

repetitive behaviours.

Similarly, the ToM hypothesis faces difficulty explaining so-called “non-triad”

features of autism, which appear frequently but are not part of the diagnostic criteria

(Frith & Happé, 1994; Tager-Flusberg, 2001). These include savant abilities,

exceptional visuospatial and visuoperceptual skills, over-selective attention, and

heightened sensory sensitivities. These aspects of autism do not bear an obvious

relation to ToM ability, and may be better explained by the local processing style that

appears to be characteristic of autistic individuals (Frith & Happé, 1994; Happé, 1997,

1999; Plaisted, 2000, 2001). The inability thus far of the ToM hypothesis to adequately

meet the criterion of explanatory value could arguably be considered one of the most

substantial problems to have faced it.

v) Domain specificity. The criterion of domain specificity results from the claim

that ToM reflects an innate module that develops separately from other cognitive

capacities, and is independently impaired in autism (Leslie, 1987, 1991; Baron-Cohen,

1991c). This assertion has been defended by citing evidence that individuals with

autism are able to pass tasks which have equivalent structure and demands to false

belief (or other ToM) tasks, but do not have mentalistic content. This approach of

comparing autistic assets and deficits on tasks which require mentalising and those

which do not has been dubbed the “fine cuts” technique by Frith and Happé (Frith &

Happé, 1994; Happé & Frith, 1995). For example, it has been found that while

individuals with autism fail false belief tasks, they are able to pass tests involving false

photographs, drawings, and models7 (Charman & Baron-Cohen, 1992, 1995; Leekam &

Perner, 1991; Leslie & Thaiss, 1992). Similarly, autistic individuals demonstrate

understanding of behavioural but not mentalistic picture sequences (Baron-Cohen et al.,

1986), understand “see” but not “know” (Perner et al., 1989), and engage in physical

sabotage but not deception (Sodian & Frith, 1992). Additionally, Baron-Cohen (1991c)

found that participants with autism were not impaired in domains of social cognition

which do not require a ToM.

7 The false photograph paradigm, for example, runs as follows: a horse puppet takes a photograph of a cat puppet. The cat then moves from the chair to the bed. The child is asked, “In the photograph, where is the cat sitting?”, as well as two control questions probing knowledge of where the cat was when the horse took the photograph and where the cat is now. It is argued that this task is identical in structure to the false belief task, but requires reasoning about outdated physical representations instead of mental representations.

However, the claim of domain specificity has come under increasing criticism,

with recent evidence suggesting that impaired performance on false belief tasks may

still be explained, and even better accounted for, by deficits in more domain general

processes such as EF (see Section 2.3) or language impairment (Bruner & Feldman,

1993; Tager-Flusberg, 2000). Zelazo and colleagues (Zelazo et al., 1996a; Zelazo,

Burack, Boseovski, Jacques, & Frye, 2001) have argued that failure to explicitly test

and meet the assumption that two tasks (such as the false belief and false photograph

tasks) are of the same underlying complexity is problematic, and weakens any

arguments for domain specificity and modularity (this argument, and further criticisms

of the false photograph task, are discussed in Section 2.3). They have found that

children with autism are impaired on another control task without mental content, which

is matched for underlying complexity to the false belief task (e.g., Zelazo et al., 1996a).

Frye (2000) also provides a number of cogent a priori arguments for why ToM should

not be considered a domain specific function. For example, given that ToM refers to

understanding one’s own beliefs as well as others’ (with research confirming self-other

equivalence on the Smarties and appearance-reality tasks), Frye questions where the

domain boundaries in our own beliefs would be, as our beliefs can be about non-

mentalistic things such as physics or biology. He also provides a critique of Baron-

Cohen’s (1994) Intentionality Detector module, pointing out that assigning intention on

the basis of the direction of movements will tend to over-ascribe intentionality to every

change in direction we happen to take, and under-ascribe intentionality to acts which do

not involve movement, such as not preventing something from happening.

While proponents of the ToM hypothesis have constructed some fairly plausible

defences against several of the attacks on its claims of universality, uniqueness, causal

precedence, explanatory value, and domain specificity, converging counterargument

and evidence have resulted in a general retreat from the strong version of the hypothesis -

that autism may be explained by a single primary cognitive deficit in a ToM module.

While some authors still largely adhere to the original strong version of the ToM

hypothesis (e.g., Surian & Leslie, 1999), many of its original proponents now advocate

a weaker version in which ToM is conceptualised as one of multiple cognitive

impairments in autism (Baron-Cohen & Swettenham, 1997; Frith & Happé, 1994;

Happé & Frith, 1996), and/or is not necessarily considered to be a unitary module, but

rather a more multidimensional ability which emerges gradually during development

(Tager-Flusberg, 2001).

2.2 Executive function (EF)

2.2.1 Defining and measuring EF

Because of its complex and theoretical nature, defining and operationalising “executive

function” has proven to be a persistent problem. To some extent, the chosen definition

of EF is dependent upon the author’s favoured model of its underlying structure.

However, EF is generally understood to be an umbrella term covering a number of

related but distinct high-level cognitive capacities which help guide and control

purposeful behaviour towards attainment of a goal (e.g., Lezak, 1993; Luria, 1966;

Stuss & Benson, 1986; Welsh & Pennington, 1988). These capacities include planning,

set-shifting (also known as attentional switching or cognitive flexibility), strategy

formation, inhibition, working memory, generativity8, decision-making, and self-

monitoring. In his overview of issues in EF assessment, Rabbitt (1997) proposed that:

“executive control is necessary to deal with novel tasks that require us to formulate a goal, to

plan, and to choose between alternative sequences of behaviour to reach this goal, to compare

these plans in respect of their relative probabilities of success and their relative efficiency in

attaining the chosen goal, to initiate the plan selected and to carry it through, amending it as

necessary, until it is successful or until impending failure is recognised.” (p. 3)

Compounding the difficulty with the precise definition of EF is the regular tendency to

use the term “frontal” (or more precisely, “prefrontal”) as a synonym for “executive”,

thereby confusing neuropsychological and neuroanatomical concepts. This confusion

has arisen because the cognitive construct of EF was originally posed in response to

observations of patients with frontal lobe damage, whose disorganised and disinhibited

behaviours were hypothesised to have their origins in executive dysfunction. This led to

a situation whereby some authors have considered any operation performed by the

frontal lobes (or any behavioural symptom of frontal lobe damage) to be an EF,

including constructs or functions such as emotion regulation and affective

responsiveness, social behaviour and personality, insight, humour appreciation, and

self-awareness. In the view of Zelazo and Müller (2002), EF includes both “cool”,

cognitive aspects and “hot”, affective aspects. However, it is important to make it clear

that in this thesis, EF will be considered a purely cognitive (“cool”) construct, as

defined (albeit broadly) above9, independent from any neuroanatomical basis. While

there is strong support for the notion that EFs are at least partially subserved by frontal

regions of the brain, mounting evidence indicates that the relationship is certainly not

well defined, and many EF measures lack both sensitivity and specificity to frontal

lesions (Reitan & Wolfson, 1994; Stuss & Alexander, 2000; Tranel, Anderson &

Benton, 1994).

8 The term ‘fluency’ is often used for this concept; however, ‘generativity’ is preferred within this thesis.

Understandably, the measurement of EF in both adults and children has been

just as problematic as its definition. The difficulty with EF measurement was predicted

by Fodor (1983), who proposed the existence of domain-general, non-modular “central

processes” which would be “bad candidates for scientific study” (p. 127). Unlike ToM,

where the false belief paradigm has (arguably) become a “gold standard” for its

measurement, a similar gold standard for assessing EF has proven elusive – and perhaps

unfeasible. This is not only because of EF’s complexity, but also because EF is a

theoretical rather than an operational term (Burgess, 1997). To borrow Burgess’

example, one can clearly call a patient dyscalculic if s/he shows impaired performance

on calculation tasks (or prosopagnosic if s/he shows impaired performance on face

recognition tasks), but there is no equivalent way of determining whether or not an

individual may be diagnosed as dysexecutive: there is no prototypical screening

measure. This has meant that unlike any other cognitive domain, the validity of an EF

test is not typically evaluated on psychological grounds, but rather in terms of whether

or not patients with frontal lesions show impaired performance on it. However, the

loose correspondence between the psychological and the anatomical makes this inference

problematic. For example, one of the most widely used tests of EF is the Wisconsin

Card Sorting Test (WCST; Grant & Berg, 1948), in which participants must work out

rules for sorting cards by certain categories, and then adapt their responses according to

feedback when the rules unexpectedly change. Patients with frontal lobe lesions have

previously been found to achieve fewer categories and make more perseverative errors

than patients with posterior lesions (Drewe, 1974; Milner, 1963). However, more recent

evidence has shown that non-frontal or diffuse brain damage can produce similar

deficits (e.g., Anderson, Damasio, Jones, & Tranel, 1991; Anderson, Bigler, & Blatter,

1995), and that adequate WCST performance does not exclude frontal pathology (e.g.,

Eslinger & Damasio, 1985).

9 Clearly, affective factors influence and interact with cognitive processes; however, it is possible to distinguish the two conceptually, methodologically and neuroanatomically. Furthermore, in examining the relationship between ToM and EF, considering affective and/or social factors to be part of the EF domain unhelpfully clouds the issue - for example, Zelazo and Müller (2002) actually deem false belief tasks to be tests of EF.
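
As a rough illustration of the two WCST indices referred to above (categories achieved and perseverative errors), the following Python sketch scores a highly simplified version of the task. The fixed rule order, the default run of 10 consecutive correct sorts before the rule changes, and the definition of a perseverative error as an error that matches the previously correct rule are all assumptions made here for illustration only; they are not the published administration or scoring rules of the WCST, and the function name is hypothetical.

```python
# Simplified, illustrative WCST scoring (not the published scoring procedure).
# Each trial is the set of dimensions on which the participant's chosen pile
# matched the card being sorted, e.g. {"colour"} or {"colour", "shape"}.

DIMENSIONS = ("colour", "shape", "number")


def score_wcst(trials, rule_order=DIMENSIONS, run_length=10):
    """Return (categories_achieved, perseverative_errors).

    Assumptions (hypothetical simplifications):
    - the sorting rule cycles through `rule_order`;
    - the rule changes after `run_length` consecutive correct sorts;
    - an error is perseverative if it matches the previously correct rule.
    """
    rule_index = 0          # position of the current rule in rule_order
    previous_rule = None    # rule in force before the most recent change
    streak = 0              # consecutive correct sorts under the current rule
    categories = 0
    perseverative = 0

    for matched in trials:
        current_rule = rule_order[rule_index % len(rule_order)]
        if current_rule in matched:
            streak += 1
            if streak == run_length:
                # Category achieved: the rule changes without warning.
                categories += 1
                previous_rule = current_rule
                rule_index += 1
                streak = 0
        else:
            streak = 0
            if previous_rule is not None and previous_rule in matched:
                perseverative += 1

    return categories, perseverative


# Usage: three correct colour sorts, two perseverative colour sorts after the
# rule changes to shape, one non-perseverative error, then three shape sorts.
example = [{"colour"}] * 5 + [{"number"}] + [{"shape"}] * 3
print(score_wcst(example, run_length=3))  # (2, 2)
```

Even in this toy version, a single error count cannot say why a participant erred (failure to inhibit the old rule, to hold the new rule in mind, or to use the feedback), which is consistent with the point that performance on the test does not map cleanly onto one underlying process.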

There has, however, been some attempt to delineate purely cognitive criteria for

a test being a test of EF. For example, Phillips (1997) proposed that any test

hypothesised to measure EF should have the following characteristics: i) the test should

be novel, in order to tap goal identification and strategic planning (as well-practiced

tasks can be performed using previously formulated strategies); ii) the test should be

effortful in terms of task planning and execution, requiring inhibitory control and

monitoring; and iii) the test may involve working memory, in order to coordinate

concurrent processing requirements. Similarly, Walsh (1978) proposed that EF tasks

require novelty, complexity, and the need to integrate information. However, these

criteria clearly apply only to multifactorial EF tasks – other more specific tests of

particular EF components may not involve all of these features (the notion of EF

components is discussed further in the next section).

Besides the difficulty in ascertaining the construct validity of EF tests, there are

several other reasons why measuring EF is challenging. Firstly, the psychometrics of

EF tests are notoriously poor. By its very nature, EF is required in novel situations; yet

because tests can only be novel once, the test-retest reliability of EF tasks is

consequently low (Rabbitt, 1997). Secondly, most EF tests lack purity – that is, they tap

multiple underlying processes, making it difficult to discern specific reasons for failure.

For example, the WCST has been commonly described as a test of abstraction and

flexibility, yet it also requires selective attention to relevant dimensions of the stimuli,

generation of a sorting rule, working memory to hold the sorting principle in mind, and

inhibition of the prepotent response to sort the cards according to the rule just used, as

well as non-EF processes such as the use of verbal feedback provided by the examiner,

and appreciation of the category of number (Ozonoff, 1995a; Pennington & Ozonoff,

1996). Furthermore, attempts to control task demands in order to isolate the relevant

abilities may not always be successful, as some EF tasks specifically require the

simultaneous co-ordination of a variety of different processes (Kimberg & Farah, 1993).

A third difficulty with EF measurement is that of low “process-behaviour

correspondence” (Burgess, 1997). In contrast to most other cognitive domains which

are only manifest in circumscribed situations (e.g., calculation abilities are manifest

when one is required to perform a calculation, or face recognition abilities when

presented with a face), EFs manifest themselves across a range of different situations.

As a consequence, there is an imprecise correspondence between behaviour and the

underlying process: a specific EF impairment can result in a variety of behaviours, and a

specific behaviour may be caused by a variety of EF (or other cognitive) impairments.

Furthermore, the same behavioural sequence is likely to require fewer EFs over time,

even within one short period, as it becomes more practiced. A final problem with

assessment of EF is that testing situations are for the most part structured and guided by

the examiner, removing much of the load on EF for the examinee. This results in poor

ecological validity of EF tests (Cripe, 1996).

Do these issues with EF assessment apply equally to children and adults? Until

recently, the large majority of EF research has focused on adult populations, and

measuring EF in childhood has only lately become a more popular topic of interest10 as

it becomes clear that EF develops much earlier than previously thought (see Section

2.2.2). Hughes and Graham (2002) argue that while EF measurement in children has its

own set of problems, conversely there are actually some difficulties with adult EF

assessment which are not so problematic in childhood. For example, children may

perceive a new task as novel for longer, possibly leading to greater stability in

underlying processes and overall performance (and therefore improved test reliability

and validity). In addition, as EF tasks need to be simplified in order to be

developmentally appropriate, the problem of task impurity is likely to be reduced.

However, there are also difficulties associated with assessing EF in children, the most

obvious of which, according to Hughes and Graham (2002), is children’s limited

language skills. This leads to a number of problems: i) complex task instructions tax

verbal comprehension, which may influence task performance for non-EF reasons; ii)

because fluent literacy is not an automatic skill until late in development, many adult EF

tasks which depend on written language being over-learned are not appropriate for

children (e.g., the Stroop test, in which reading a colour-word such as “red” is assumed

to be a prepotent response, making it difficult to instead say the different colour that the

word is printed in); and iii) language itself may play a role in EF, by both guiding

behaviour through internal self-talk and by enabling the use of verbal working memory

to rehearse strategies. Clearly, the development of appropriate assessment tools for EF

in children is an important focus for ongoing research.

10 For lists of commonly used EF tests in children, see Anderson (1998) or Zelazo and Müller (2002).

While reviewing the literature on the definition and measurement of EF appears

to paint a rather negative picture, it should be emphasised that EF has still managed to

retain its utility, relevance and validity as a measurable construct. Tranel et al. (1994)

argue that despite the difficulties in studying EF, the term provides a useful heuristic or

shorthand for denoting a relatively well-agreed upon set of capacities with several

unique characteristics: i) they are the highest level of human cognition; ii) they are

difficult to operationalise and therefore hard to measure quantitatively; iii) they are

closely intertwined with personality and consciousness; and iv) they have intimate

connections with the prefrontal cortex. In addition, recent research has focused on

improving the measurement of EF by using more fine-grained tests with several

performance measures and multiple control tasks in order to isolate specific components

of EF which may be impaired (Delis, Squire, Bihrle, & Massman, 1992; Godefroy,

Cabaret, Petit-Chenal, Pruvo, & Rousseaux, 1999; Ozonoff, Strayer, McMahon &

Filloux, 1994) as well as attempting to make the task paradigms more ecologically valid

(Manly et al., 2001; Wilson, Evans, Emslie, Alderman, & Burgess, 1998) and child-

friendly (Espy, Kaufmann, Glisky, & McDiarmid, 2001; Gerstadt, Hong, & Diamond,

1994; Hughes, 1998a). It is nevertheless important to acknowledge the difficulties with

the definition and measurement of EF, as it will become evident that these concerns

were influential both in selecting appropriate EF tasks for the current research, and in

interpreting the results gleaned from those tasks.

2.2.2 Models of EF and its development

A burgeoning literature has produced a sizeable number of alternative theoretical

frameworks for conceptualising EF, and, concurrently, the functions of the prefrontal

cortex (see Eslinger, 1996; Grafman, 1994; Stuss & Knight, 2002). As is the case for

EF measurement, models of EF have mostly been based on adults, with theories of EF

development borrowing heavily from adult concepts. Several adult-based models have

utilised the classic distinction between automatic and controlled actions from traditional

cognitive psychology (Atkinson & Shiffrin, 1968; Schneider & Shiffrin, 1977), where

unlike automatic actions, controlled actions involve conscious, effortful processing and

are required in novel, non-routine situations. Based on a similar routine/non-routine

dichotomy, an influential model of EF by Norman and Shallice (Norman & Shallice,

1980, 1986; Shallice, 1988) included two mechanisms for regulating behaviour: the

Contention Scheduler, which operates in routine or overlearned situations via automatic

priming of stored knowledge (analogous to scripts or schemas), which are cued either

by environmental stimuli or conceptual thought; and the Supervisory Attentional System

(SAS), which is activated in non-routine (novel, complex, difficult, and/or conflicting)

situations and in which conscious internal knowledge states can override the contention

scheduling mechanism and set the priority for action by creating new action schemata.

This model held an intuitive appeal and accounted for data on attention and action

failures in patients with prefrontal lesions, who were proposed to have intact Contention

Schedulers but an impaired SAS (Shallice & Burgess, 1991). Shallice (2002; Shallice

& Burgess, 1996) recently elaborated upon his model, outlining a number of different

components to the SAS including schema selection (which can occur via three different

methods), schema implementation, and schema checking and monitoring.

While Shallice (1984, 2002) contends that central control processes consist of

multiple components, others have argued for a more unitary control structure or

mechanism. Duncan (Duncan, Burgess, & Emslie, 1995; Duncan, Emslie, Williams,

Johnson, & Freer, 1996) to some extent represented this position when he claimed that

EF is largely synonymous with Spearman’s g, or fluid intelligence. Others have

proposed that the range of EF failures may be attributed to a single process, such as

inhibition (e.g., Dempster, 1992, 1993) or working memory (e.g., Case, 1985;

Goldman-Rakic, 1995). The idea of a single EF system or mechanism has become

increasingly unpopular, however, as evidence and opinion have converged upon the

notion that EF consists of multiple separable components (Baddeley, 1996, 2002;

Godefroy et al., 1999; Miyake, Friedman, Emerson, Witzki, & Howerter, 2000;

Pennington, 1997; Stuss & Alexander, 2000). This conceptualisation accounts better for

data showing weak correlations between various EF tasks (e.g., Boone, Ponton,

Gorsuch, Gonzalez, & Miller, 1998; Hughes, Russell, & Robbins, 1994; Miyake et al.,

2000) and differential impairment on various EF tasks in patients with lesions in

different parts of the prefrontal cortex (see Stuss & Alexander, 2000). In addition, it

allows us to distinguish between the various clinical groups in which executive

dysfunction is found, by examining qualitative differences in profiles of performance on

EF components (reviewed in the next section).

However, there has been little agreement on the appropriate taxonomy for the

components of EF. Lezak (1995) proposed four EF components: i) volition, ii)

planning, iii) purposive action, and iv) effective performance. In their problem-solving

framework of EF, Zelazo, Carter, Reznick and Frye (1997) also outlined four

components: i) problem representation, ii) planning, iii) execution (rule use) and iv)

evaluation (error detection/correction). While these two frameworks present four

sequential stages to the problem-solving process, other conceptualisations focus more

on concurrent or non-time-dependent executive processes. For example, Anderson

(1998; Anderson, Levin, & Jacobs, 2002) proposes three EF components: i) attentional

control (including selective and sustained attention and response inhibition), ii) goal

setting (incorporating initiation, planning, and problem solving), and iii) cognitive

flexibility (including working memory, attentional shifting and self-monitoring). One

model which has been particularly influential in the developmental literature is Roberts

and Pennington’s (1996) interactive framework, in which performance on EF tasks is

held to be a product of two separate but interdependent processes: working memory (to

hold the task demands or rules in mind) and inhibitory control (to guide behaviour

according to those rules). These two components compete for limited executive

resources. Support for this model comes from studies showing that on EF tasks,

prepotent response errors increase (indicating poorer inhibitory capacity) as the working

memory demands increase (Roberts, Hager & Heron, 1994).

A number of factor analytic studies have unfortunately produced a range of

different results, without clearly indicating any particular model as superior to others

(see Royall et al., 2002, for a review). For example, Burgess, Alderman, Evans, Emslie,

and Wilson (1998) found three factors which they termed Inhibition, Intentionality and

Executive Memory, whereas Collette, van der Linden and Salmon (1999) found two

factors, Inhibition and Working Memory, while Boone et al. (1998) found only one

Cognitive Flexibility factor (although in the latter study the EF tasks were only

modestly correlated and the authors concluded that the tests tapped somewhat different

abilities). In several of these studies, variables from the same EF task often load on

different factors, and in addition, the same task may load on different factors in different

studies depending upon which tests are included in the analysis. These inconsistencies

are hardly surprising given the “impurity” of EF tasks and their questionable

psychometric properties.

As argued by Hughes and Graham (2002), factor analysis studies using children

may be more fruitful, as more simple, “pure” tests may still tax EF in children and test

performances may be more reliable. Support for the fractionation of EF in children has

indeed been provided by several studies which have generally revealed three or four

distinct EF factors (Espy, Kaufmann, McDiarmid, & Glisky, 1999; Hughes, 1998a;

Levin et al., 1991; Luciana & Nelson, 1998; Pennington, 1997; Welsh, Pennington &

Groisser, 1991). Although named differently by different authors, these factors have

consistently included cognitive flexibility or set-shifting, inhibition, and working

memory, with the addition or substitution of a planning component in some studies. For

example, Hughes (1998a) identified Attentional Flexibility, Inhibitory Control and

Working Memory factors, and Pennington (1997) similarly found Set Shifting or

Cognitive Flexibility, Motor Inhibition, and Verbal Working Memory factors; while

Welsh et al. (1991) named their factors Planning, Hypothesis Testing & Impulse

Control, and Fluid & Speeded Response. However, inconsistencies also appear in this

developmental research, where the same task may be clustered with different tasks or be

part of different factors across studies, although it is difficult to tell how much of this

variability is attributable to different performance indices being used in the various

studies.

The stages of EF development have only relatively recently become the subject

of systematic research, as evidence accumulates in opposition to the early influential

notion that the prefrontal cortex was not functional at all until adolescence and did not

reach maturity until around the age of 24 (Golden, 1981). Behavioural and

electroencephalogram (EEG) data as well as case studies of children with early frontal

lesions all now refute this view, indicating prefrontal activity even in infancy. For

example, Diamond and Goldman-Rakic (1989; Diamond, 1985) found that by 12

months of age, human infants achieved errorless performance on classic delayed

response and A-not-B tasks, performance on which they argue requires working

memory and inhibition, and is sensitive to frontal lesions in monkeys. Bell and Fox

(1992) demonstrated changes in frontal EEG recordings during the first year of life

which correlated with improved performance on the A-not-B task. A number of case

studies (Anderson, Bechara, Damasio, Tranel, & Damasio, 1999; Eslinger, Biddle, &

Grattan, 1997; Marlowe, 1992; Price, Daffner, Stowe, & Mesulam, 1990) have

also demonstrated that very early prefrontal lesions result in immediately noticeable

consequences as well as EF deficits and impaired social and moral behaviour later in

life. Nevertheless, it is clear that although the prefrontal cortex is certainly not “silent”

in infancy, both its physiological and its functional development follow a particularly

protracted course. Investigations of synaptic density, dendritic growth,

myelination, interhemispheric connectivity, metabolic activity, and electrical (EEG)

activity all show that the prefrontal cortex continues to develop through middle

childhood and adolescence (Diamond, 2002; Huttenlocher & Dabholkar, 1997;

Schwartz, 1997; Thatcher, 1997).

Cognitive studies of the development of EF have focused on mapping

developmental trajectories for the various EF components. An early study by Passler,

Isaac, and Hynd (1985) found that EF development was a multistage process, with a

spurt of development between the ages of 6 and 8 and mastery evident by the age of 12.

Similarly, Chelune and Baer (1986) found that WCST performance improved between 6

and 10 years of age, with adult performance achieved by 12 years. More recent studies

incorporating a larger range of EF measures have extended the age range and more

thoroughly articulated the multidimensional nature of EF development. A study by

Levin et al. (1991) supported previous findings that tests of concept formation,

set-shifting and inhibition appear to be mastered by the age of 12; however, they also found

additional gains in their 13- to 15-year-old adolescents on measures of generativity and

planning. Welsh et al. (1991) found evidence for three distinct developmental stages,

the first beginning at around 6 years, a second commencing at around the age of 10, and

a third during adolescence. Consistent with Levin et al. (1991), they found that some

components of EF (e.g., the ability to resist distraction, impulse control or inhibition)

matured earlier than others (e.g., generativity, planning skills). An investigation of EF

development in late childhood and adolescence by Anderson, Anderson, Northam,

Jacobs, and Catroppa (2001) found that while the developmental trajectory for EFs in this

period was generally flatter than during early and middle childhood, differential

developmental trends were observed within the different EF domains, with attentional

control and planning showing the greatest improvements during adolescence, while

cognitive flexibility was already matured by the age of 12. At the other end of the age

spectrum, studies by Zelazo and colleagues (Zelazo & Reznick, 1991; Zelazo, Frye, &

Rapus, 1996b) have demonstrated developments in rule use (the third stage in their

problem-solving framework of EF) between the ages of 2 and 5. Several studies have

also found significant improvements in inhibitory control between the ages of 3 and 6

years (Diamond & Taylor, 1996; Gerstadt et al., 1994; Kochanska, Murray, & Coy,

1997).

Thus, descriptive studies mapping the development of EF have shown fairly

consistently that i) the first emergence of EF occurs early in life, probably around the

end of the first year; ii) EF development appears to follow a multistage process, with

important changes occurring between the ages of 2-5 and 6-10, with adult performance

levels reached by the age of 12 in several domains, and performance in other domains

continuing to develop through adolescence; and iii) the various components of EF

follow different developmental trajectories, with cognitive flexibility and inhibition

tending to develop first and planning and generativity maturing later (see Anderson,

2002, for a slightly different mapping of EF development).

Attempts to characterise the development of EF within a theoretical framework

have tended to emphasise either one or two central constructs which account for EF

development as a whole. One view is that age-related changes in EF may be explained

by the construct of inhibition, such that children become increasingly able to resist

interference and keep task-irrelevant information out of working memory (Bjorklund &

Harnishfeger, 1990; Dempster, 1992, 1993; Harnishfeger & Bjorklund, 1993). As for

adult models of the structure of EF, this account is limited by its unidimensionality,

defaulting to the explanation that children find some tasks more difficult than others

simply because they require more inhibition, and being unable to explain developments

in EF tasks with minimal inhibitory requirements (Zelazo et al., 1997; Zelazo & Müller,

2002). A more popular approach has been to argue that EF changes result from both

working memory and inhibition, either as potentially separable components (Diamond,

2002; Diamond & Taylor, 1996; Gerstadt et al., 1994) or interacting processes (Roberts

et al., 1994; Roberts & Pennington, 1996). Results of a recent well-designed study by

Beveridge, Jarrold, and Pettit (2002) favoured the view of inhibition and working

memory as independent and additive rather than interacting components of EF. While

they found that increasing the working memory load of an inhibition task did have a

detrimental effect on performance (consistent with Roberts et al., 1994), by using tests

with multiple levels of both inhibitory and working memory requirements they found

that interactions between the two processes were non-significant in both 6- and 8-year-olds.

A recent alternative theory of EF development is Zelazo and Frye’s Cognitive

Complexity and Control (CCC) theory (e.g., Frye, Zelazo, & Palfai, 1995; Zelazo, 2000;

Zelazo & Frye, 1998), which, according to Frye (2000), is most relevant to the EFs of

planning and deliberative action. This account focuses on development in the preschool

years, proposing that within this period there are increases in the complexity of

children’s rule systems (plans formulated in potentially silent self-directed speech, e.g.,

“If I see a mailbox, then I need to mail this letter”). Complexity is measured by the

number of levels of embedding in these rule systems. Embedded rules establish a

hierarchy in which rules are arranged beneath setting conditions (which select or restrict

the application of a rule), and have the form “if s1, then if a1, then c1” in which s is a

setting condition, a is an antecedent, and c is a consequent. Zelazo and Frye (Frye et al.,

1995; Zelazo & Frye, 1998; Zelazo & Reznick, 1991) have shown that 3-year-olds

readily integrate two “if-then” rules (e.g., in the Dimensional Change Card Sorting

(DCCS) Task, modelled on the WCST, they are able to comprehend “If the test card is

red then place it here; if blue then there”) but have difficulty representing a

higher-order “if-if-then” rule that allows them to switch flexibly between incompatible pairs of

rules (e.g., “If sorting by colour, then if red then here, if blue then there. If sorting by

shape, then if car then here, if flower then there”). The CCC account is reminiscent of

Halford’s proposal that cognitive development is characterised by the developing ability

to represent increasingly complex relations between items in parallel (Halford, 1993;

Halford, Wilson, & Phillips, 1998), but differs from Halford in the central importance

placed on embedded or hierarchical rule structures (Frye & Zelazo, 1998).
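
The contrast between a pair of simple “if-then” rules and a higher-order “if-if-then” rule can be made concrete with a small sketch. The following is only an illustration of the rule structures described above, not an implementation taken from the CCC literature: the names COLOUR_RULES, SHAPE_RULES, EMBEDDED_RULES, sort_card and rule_levels are hypothetical, and counting dictionary nesting is used here merely as a rough proxy for the number of levels in a rule system.

```python
# Illustrative encoding of the DCCS rule systems described in the text.
# All names are hypothetical and chosen for this sketch only.

# A pair of simple "if a, then c" rules: antecedent -> consequent.
COLOUR_RULES = {"red": "here", "blue": "there"}
SHAPE_RULES = {"car": "here", "flower": "there"}

# A higher-order "if s, then if a, then c" system: the setting condition
# (the current sorting dimension) selects which pair of rules applies.
EMBEDDED_RULES = {"colour": COLOUR_RULES, "shape": SHAPE_RULES}


def sort_card(setting, card):
    """Return where a card goes under the given setting condition.

    `card` is a dict with 'colour' and 'shape' keys.
    """
    rules = EMBEDDED_RULES[setting]   # setting condition s
    antecedent = card[setting]        # antecedent a
    return rules[antecedent]          # consequent c


def rule_levels(rules):
    """Count nesting levels: 1 for a flat rule pair, 2 for if-if-then."""
    levels = 0
    while isinstance(rules, dict):
        levels += 1
        rules = next(iter(rules.values()))
    return levels


# A red flower is sorted to different places depending on the setting.
card = {"colour": "red", "shape": "flower"}
print(sort_card("colour", card))  # here
print(sort_card("shape", card))   # there
```

The red flower is the critical case: the two rule pairs give incompatible answers for it, so correct sorting depends on consulting the setting condition first, which is the step 3-year-olds are reported to find difficult.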

While both the inhibition-working memory accounts and the CCC theory of EF

development are able to account reasonably well for data within their specified task and

age domains, it is doubtful whether either of them could account for the range of

findings from descriptive studies mapping developmental trajectories of EF. Neither

theory explicitly accounts for the multi-stage process of development from infancy to

adulthood or the differential rate of development for the various components of EF. To

some extent, these limitations stem from the widely acknowledged problems with the

definition and measurement of EF, which make it difficult to agree on which

component processes underlie each EF task and which aspects are central to EF

development. Nevertheless, this important cognitive domain remains critical to the

explanation of a wide range of clinical disorders - including autism.

2.2.3 EF in autism

Research on executive dysfunction in autism has gathered momentum in a more gradual

manner than the ToM literature, beginning with early case reports of autistic individuals

documenting what would now be called EF deficits (Scheerer, Rothmann, & Goldstein,

1945; Steel, Gorman, & Flexman, 1984). Using tests of spontaneous colour and tone

sequence production, Frith (1972) found what might be interpreted as a generativity

impairment in children with autism, with the autistic sample producing more rigid,

restricted, and less unique patterns. In 1978, Damasio and Maurer published an

influential paper noting behavioural similarities between individuals with autism and

patients with frontal lobe damage, such as ritualistic and compulsive behaviours and

concreteness in thought and language. They proposed a neurological model of autism

involving the frontal lobes and parts of the temporal lobes, basal ganglia and thalamus.

Following this, Rumsey and colleagues (Rumsey, 1985; Rumsey & Hamburger, 1988)

tested a group of high-functioning male adults with autism on executive and non-

executive neuropsychological tasks. They found that the autistic men performed

significantly more poorly than controls on the WCST as well as measures of cognitive

flexibility and problem solving, but showed intact or only mildly impaired performance

in other cognitive domains. This finding was followed up in autistic adolescents by

Prior and Hoffman (1990), who found impaired performance on the WCST and a maze

test; and in individuals with Asperger syndrome by Szatmari, Tuff, Finlayson, and

Bartolucci (1990), who found impaired performance on the WCST.

The current era of EF research in autism was launched by a study by Ozonoff et

al. (1991), which compared the primacy of ToM and EF impairments in a group of

high-functioning children with autism. Contrary to their expectation, Ozonoff et al.

found that in their autism group, EF deficits (as measured by the WCST and the Tower

of Hanoi, a measure of planning) were more universal than ToM deficits and were

better predictors of autism group membership. Such findings have since been

consolidated in a number of studies showing impairment of individuals with autism

compared with age and IQ-matched controls on tasks tapping a range of EF

components, including cognitive flexibility or attentional shifting (Ciesielski & Harris,

1997; Courchesne et al., 1994; Goldstein, Johnson, & Minshew, 2001; Hughes &

Russell, 1993; Hughes et al., 1994; Minshew et al., 1992; Ozonoff & Jensen, 1999;

Ozonoff & McEvoy, 1994; Ozonoff et al., 1994), planning (Hughes, 1996a; Hughes et

al., 1994; Ozonoff & Jensen, 1999; Ozonoff & McEvoy, 1994), and generativity

(Boucher, 1988; Craig & Baron-Cohen, 1999; Lewis & Boucher, 1991; Turner, 1999;

Williams, Moss, Bradshaw, & Rinehart, 2002). In their review of studies on EF in

autism, Pennington and Ozonoff (1996) calculated the average effect size of group

differences on EF tasks to be 0.98 (a large effect according to Cohen, 1988), and as high

as 2.07 on the Tower of Hanoi. These EF deficits do not appear to be attributable to

impairments in more basic attentional processes, such as sustained or selective attention

or basic attentional capacity (Bryson, Landry, & Wainwright, 1997; Garcia-Villamisar

& Della Sala, 2002; Garretson, Fein, & Waterhouse, 1990; Goldstein et al., 2001;

Minshew et al., 1992).
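
For readers unfamiliar with effect sizes, the sketch below shows how a standardised mean difference of this kind is typically computed. It assumes that the effect sizes in question are of the Cohen's d type (difference between group means divided by the pooled standard deviation), which the text does not state explicitly; the function name cohens_d and the scores are hypothetical and are not data from any study cited here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical task scores (higher = better), for illustration only.
control_scores = [10, 12, 14]
autism_scores = [8, 10, 12]
print(cohens_d(control_scores, autism_scores))  # 1.0
```

On this metric a value of 0.98 means that the group means differ by almost one pooled standard deviation.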

These findings resulted in the hypothesis that EF deficits may be primary in

autism (e.g., Hughes & Russell, 1993; Ozonoff et al., 1991; Russell, 1997a).

Furthermore, prefrontal dysfunction has been posited to be the underlying

neuroanatomical basis for EF impairment in autism (Ozonoff, 1995a; Ozonoff et al.,

1991). In a test of the prefrontal hypothesis, Bennetto, Pennington, and Rogers (1996)

examined the pattern of performance displayed by individuals with autism on various

memory tasks, and found that it was consistent with that typically displayed by frontal

lobe patients. Minshew, Luna, and Sweeney (1999) found that the pattern of

performance of autistic individuals on oculomotor tasks suggested a disturbance in

prefrontal circuitry. Consistent with the notion that autistic symptomatology may have

its basis in frontal dysfunction, impairments in social interaction, spontaneous speech

and pragmatic communication, and the production of novel, goal-directed behaviours,

are also displayed by patients with frontal lobe damage, including children who have

sustained early damage to the prefrontal cortex (e.g., Alexander, 2002; Ames,

Cummings, Wirshing, Quinn, & Mahler, 1999; Anderson et al., 2002; Eslinger et al.,

1997; Stuss & Benson, 1984, 1986; Tranel, 2002).

Neuropathological and neuroimaging studies have also provided some evidence

of prefrontal abnormalities in autism, although so far no gross abnormalities have been

consistently identified. Casanova, Buxhoeveden, Switala, and Roy (2002) discovered

minicolumnar abnormalities in the frontal and temporal lobes of children with autism,

and Piven et al. (1990a) found evidence of abnormal neural migration in the frontal

lobes of three autistic individuals, although this was only one-fifth of their sample.

Piven et al. (1995) also found that the frontal lobes were small in comparison with other

cortical areas in subjects with autism. Zilbovicius et al. (1995) demonstrated evidence

that maturation of the frontal cortex (as measured by regional cerebral blood flow,

rCBF) was delayed in autistic individuals. Decreased rCBF in frontal areas has also

been found in a number of other studies (George, Costa, Kouris, Ring, & Ell, 1992;

Ohnishi et al., 2000; Sherman, Nass, & Shapiro, 1984). Functional neuroimaging

studies have shown reduced dorsolateral prefrontal activation during spatial working

memory tasks in autism (Luna et al., 2002) as well as differences in the pattern of

activation in the prefrontal cortex and other brain regions during ToM tasks (Castelli et

al., 2002; Happé et al., 1996).

However, prefrontal changes are only one of many brain abnormalities which

have been documented in autism (see Bauman, 1999; Deb & Thompson, 1998; Koenig,

Tsatsanis, & Volkmar, 2001), with other areas of significance including the cerebellum

(see Courchesne, 1997), corpus callosum (Piven, Bailey, Ranson, & Arndt, 1997a), and

limbic or medial temporal structures (Bachevalier, 1994; Bauman & Kemper, 1994).

Most neurobiological theories of autism in fact do not give prominence to the prefrontal

cortex (e.g., Akshoomoff, Pierce, & Courchesne, 2002; Waterhouse, Fein, & Modahl,

1996). Concluding that prefrontal abnormalities are the most significant in causing the

symptoms of autism would therefore be premature, particularly given the inconsistency

of results across neurobiological studies of autism in general (Ozonoff, 2001). In

addition, a major difficulty with the prefrontal hypothesis is that children who sustain

early lesions to the prefrontal cortex do not actually develop autism, but more often

display a syndrome resembling psychopathy or conduct disorder (Anderson et al., 1999;

Eslinger, Grattan, Damasio, & Damasio, 1992; Eslinger et al., 1997). While the

behavioural impairments displayed by children with frontal lesions may be broadly or

categorically similar to those displayed by children with autism, there are obvious

qualitative differences and it would be difficult to mistake one for the other in a clinical

setting. However, it may be that prefrontal dysfunction is a necessary but not sufficient

criterion for the development of autism (Ozonoff, 1995a), or that the timing of the insult

is a crucial variable in behavioural outcome. Nevertheless, it is not clear that prefrontal

dysfunction is the neurobiological basis for EF impairment in autism11.

The central concern of this section, however, is the hypothesis that EF

impairment, regardless of its neuroanatomical underpinnings, may be the primary

cognitive impairment in autism (Hughes & Russell, 1993; Ozonoff et al., 1991;

Ozonoff, 1995a; Pennington et al., 1997; Russell, 1997a). The associated claim of this

hypothesis is that a primary impairment in EF may also explain the ToM deficit

observed in individuals with autism (a claim which is examined in Section 2.3.2). As

for ToM, the EF hypothesis of autism has undergone a number of tests of whether or not

it meets the criteria for primacy, as reviewed below. Unlike ToM, the EF hypothesis is

not required to prove its domain specificity, as EF is not claimed to be modular.

i) Universality. The evaluation of whether or not EF deficits are universal in

autism has not received nearly as much attention as the equivalent question in the ToM

literature. This is probably because of the lack of a “gold standard” of EF performance

on which failure can be unequivocally evaluated – unlike false belief tasks, performance

on EF tasks is usually not a matter of pass or fail. Many studies of EF in autism

interpret the presence of a group difference as evidence of an EF deficit in autism, but

do not look further at what proportion of the autism group showed such a deficit. Those

researchers who have examined the universality of deficits have tended to choose an

arbitrary criterion for what comprises a “fail” or what defines a “deficit”. Ozonoff et al.

(1991) used the proportion of participants scoring below the mean of the control group

as their index of the universality of deficits, and found that 96% of their autism group

showed an EF deficit using this criterion. This finding was (and still is) cited as

evidence that EF deficits were almost universal among individuals with autism.

However, defining any score below the mean as a deficit is a very lenient criterion – in

most rating systems for impairment severity, a score needs to be at least 1 standard

deviation (SD) below the mean to be considered as even mildly impaired (Heaton,

Grant, & Matthews, 1991; Lezak, 1995).

11 As Pennington et al. (1997) point out, EF impairment can also result from diffuse structural or metabolic differences in brain development or diffuse brain lesions in adult patients (as discussed in Section 2.2.1), indicating either that disrupting the connectivity of the whole brain may mimic the effects of a focal frontal lesion, or that the neuroanatomical basis of EF tasks is not specific to the prefrontal cortex.
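
The practical difference between the two criteria can be illustrated with a short sketch. The function names and scores below are hypothetical and serve only to show how the estimated proportion of “impaired” individuals depends on the cut-off chosen; they do not reproduce the data or scoring of any study cited here, and higher scores are assumed to indicate better performance.

```python
from statistics import mean, stdev


def proportion_below_control_mean(clinical, control):
    """Lenient criterion: any score below the control group mean."""
    cutoff = mean(control)
    return sum(score < cutoff for score in clinical) / len(clinical)


def proportion_sd_below(clinical, control, n_sd=1.0):
    """Stricter criterion: at least n_sd control SDs below the control mean."""
    cutoff = mean(control) - n_sd * stdev(control)
    return sum(score <= cutoff for score in clinical) / len(clinical)


# Hypothetical scores, for illustration only.
control = [8, 10, 12]           # mean = 10, SD = 2
clinical = [5, 7, 9, 9.5, 11]

print(proportion_below_control_mean(clinical, control))  # 0.8
print(proportion_sd_below(clinical, control))            # 0.4
```

In this made-up sample the same scores yield a deficit rate of 80% under the lenient criterion but only 40% under the 1 SD criterion.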

Other studies have been more equivocal than Ozonoff et al.’s (1991) study in

their findings on the universality of EF impairments in autism, usually finding lower

proportions of individuals with autism demonstrating difficulties. Liss et al. (2001)

found that 57% of their group of high-functioning autistic adolescents scored within 1

SD of the mean of the control group on the WCST, and 29% performed better than the

control mean. Teunisse, Cools, van Spaendonck, Aerts, and Berger (2001) found that

only 46% of their high-functioning adolescents with autism showed poor cognitive

shifting, defined using the lenient criterion of any positive z score on the sum of two

variables measuring the number of trials required for successful performance. Hughes

et al. (1994) reported that 67% of their autism group failed both the Tower of London

(ToL) and the Intradimensional/Extradimensional (IDED) set-shifting task (using an

arbitrary criterion of failure), with 92% failing the ToL and 75% the IDED task.

Ozonoff and McEvoy (1994) report that 41% of their autistic sample performed within

the normal range on the WCST (but none on the Tower of Hanoi); and Ozonoff and

Jensen (1999) found that up to 36% of their sample scored above the control mean on at

least one EF task, with only half the sample performing below the control mean on all

three tasks used.

So, while universality has been assessed using a variety of different methods

making it difficult to compare results across studies, it is apparent that EF deficits are

not universal among individuals with autism. Those who have examined the

characteristics of individuals who perform in the normal range on EF tasks have usually

found, as for ToM, that they are older (Ozonoff & Jensen, 1999) and/or have higher

verbal IQ (Liss et al., 2001; Ozonoff & McEvoy, 1994). However, unlike ToM, it

would be difficult to argue that these older and higher-functioning individuals have

developed some kind of compensatory strategy to aid their EF performance, as the very

nature of EF tasks is that they are novel and would not have been encountered before,

making strategies difficult to develop in advance (although it is conceivable that general

strategies may have developed or been learned to offset recognised limitations in certain

cognitive abilities, for example by using visual imagery to compensate for a verbal

working memory deficit). An alternative argument might be that these EF “passers”

could show difficulties on tasks tapping other components of EF which may not have

been measured in the studies reviewed (e.g., generativity). Hughes (2001) has

suggested that low- and high-functioning individuals with autism may show distinct

types of EF impairment, as indicated by the different types of repetitive behaviour

displayed by these two groups (Turner, 1997), and therefore that group heterogeneity

may prevent universal EF characteristics from being discovered. In addition, it may be

that different aspects of EF are impaired at different points in the development of

children with autism in comparison with age-matched controls, as EF components show

different developmental trajectories. These possibilities await empirical investigation.

ii) Uniqueness. A major challenge to the EF hypothesis of autism (which has

been acknowledged and discussed by all its proponents) is that EF impairments are

displayed in a number of other disorders, including ADHD (e.g., Grodzinsky &

Diamond, 1992; Oosterlaan, Logan, & Sergeant, 1998; Pennington, Groisser, & Welsh,

1993; Shallice et al., 2002), schizophrenia (e.g., Elliott, McKenna, Robbins, & Sahakian,

1995; Pantelis et al., 1997; see Hoff & Kremen, 2003), Tourette’s syndrome (Baron-

Cohen & Robertson, 1995; Channon, Flynn, & Robertson, 1992), obsessive-compulsive

disorder (Christensen, Kim, Dysken, & Hoover, 1992; Cox, Fedio, & Rapoport, 1989;

Head, Bolton, & Hymas, 1989; Veale, Sahakian, Owen, & Marks, 1996), and early-

treated phenylketonuria (Diamond, Prevor, Callender, & Druin, 1997; Smith, Klim, &

Hanley, 2000; Welsh, Pennington, Ozonoff, Rouse, & McCabe, 1990), not to mention

neurological disorders affecting frontal lobe functions such as frontotemporal dementia

(e.g., Razani, Boone, Miller, Lee, & Sherman, 2001; see Grossman, 2002), Parkinson’s

disease (e.g., Owen et al., 1993), and traumatic brain injury (e.g., Anderson et al., 2002;

Levine et al., 1998). The question of how these symptomatically different disorders

could all share the same cognitive basis has been dubbed the “discriminant validity

problem” (Pennington & Ozonoff, 1996).

Proponents of the EF hypothesis of autism have attempted to circumvent this

difficulty in meeting the uniqueness criterion by proposing that different disorders may be

characterised by different severity and age of onset of EF impairment, and different

profiles of impairment on the various components of EF (Ozonoff, 1997b; Ozonoff et

al., 1994; Pennington & Ozonoff, 1996). In particular, Ozonoff (1997b; Ozonoff &

Jensen, 1999) has suggested that autism is characterised by deficits in cognitive

flexibility and planning but spared inhibitory capacity. Studies by Ozonoff et al. (1994)

and Ozonoff and Jensen (1999) demonstrated evidence of differentiation among the EF

profiles of children with autism, ADHD and Tourette’s syndrome, with autistic children

showing poor performance on tests of cognitive flexibility and planning but not

inhibition, the ADHD sample characterised by a specific impairment in inhibitory

control, and individuals with Tourette’s showing little evidence of any EF impairment.

The notion of intact inhibition in autism has received support in other studies using a

negative priming paradigm (Brian, Tipper, Weaver, & Bryson, 2003; Ozonoff &

Strayer, 1997). A few studies have suggested that working memory may also be spared

in autism (Ozonoff & Strayer, 2001; Russell, Jarrold, & Henry, 1996).

However, while these studies appear to paint a relatively clean picture of spared

and impaired EFs in autism and other disorders, other findings have not been so

consistent. Some have argued for and/or found evidence of impairments in inhibition

(Hughes, 1996b; Rinehart, Bradshaw, Tonge, Brereton, & Bellgrove, 2002; Williams et

al., 2002) and in working memory (Bennetto et al., 1996; Pennington et al., 1997) in

individuals with autism, although it has been argued that these impairments emerge only

if the task involves both inhibitory and working memory requirements (Russell, 1997b).

Others have found set-shifting to be intact in high-functioning individuals with autism

and/or Asperger syndrome (Ozonoff et al., 2000; Rinehart, Bradshaw, Moss, Brereton,

& Tonge, 2001; Turner, 1997). The role of generativity impairments (e.g., Turner,

1999) has also been under-emphasised. In her review of EF in autism, Hughes (2001)

was able to say only that EF deficits in autism were “high-level and non-spatial” (p.

258), a characterisation which clearly lacks the desirable specificity. Research on EF in

other disorders is characterised by similar inconsistencies. For example, in their review

of EF in ADHD, Sergeant, Geurts, and Oosterlaan (2002) concluded that the pattern of

EF deficits in ADHD was not consistent across studies and did not appear to be specific

to ADHD.

A number of factors may have contributed to the lack of consistency in

identifying unique EF profiles across studies of EF in autism and other disorders.

Firstly, the usual concerns about the measurement precision of EF tasks will inevitably

affect the reliability and interpretation of results (particularly given that most studies

have used multifactorial tasks such as the WCST). Secondly, different research groups

have focused on different aspects of EF depending upon their theoretical framework

(Hughes, 2001), with very few studies including tasks measuring the full range of EF

components. Thirdly, the task modality and/or mode of response (i.e., verbal or non-

verbal) has varied across studies, which may be important as spatial ability is superior to

verbal ability in autism (Happé & Frith, 1996), and this superior spatial ability may

boost performance on non-verbal EF tasks as compared with verbal tasks (Hughes,

2001). Fourthly, it has been suggested that some of the inconsistencies across studies

may be related to whether or not the tasks were computerised, and therefore whether

performance may have been affected by the need for social interaction; and also,

whether or not feedback about task performance was given verbally by the examiner or was

provided automatically by the task (Ozonoff, 1995b, 2001). Finally, as discussed

earlier, variability in the age and level of functioning (i.e., intellectual ability) of the

sample may also influence the pattern of results12. Clearly, studies using a wide range

of well-defined EF tasks, using both verbal and non-verbal response modes, and which

may be broken down into component processes, are needed to determine whether

autism is associated with a unique EF profile.

iii) Causal precedence. This criterion of primacy has also posed difficulties for

the EF hypothesis. The first study to test EF in young children with autism (McEvoy,

Rogers, & Pennington, 1993) found that autistic preschoolers (mean age = 5.1 years)

made significantly more perseverative errors on a spatial reversal task tapping inhibition

and set-shifting (but not A-not-B, Delayed Response or Delayed Alternation tasks,

which showed ceiling and floor effects). Similarly, Dawson, Meltzoff, Osterling, and

Rinaldi (1998) found that young children with autism (mean age = 5.4 years) performed

more poorly than controls on a version of the A-not-B task. However, no other studies

have managed to replicate these findings. Using a younger sample than McEvoy et al.

(1993), Wehner and Rogers (1994, cited in Pennington et al., 1997) found no difference

between their autism and control groups on the same spatial reversal task. To examine

whether this may be because perseverative behaviour increases as children with autism

grow older (contrary to typical development), Griffith, Pennington, Wehner, and Rogers

(1999) conducted a longitudinal study investigating the development of EF over the

course of a year. Besides finding no difference in performance between young children

with autism (mean age = 4.3) and developmentally delayed controls on a range of age-

appropriate EF tasks (measuring inhibition, set-shifting, spatial and object working

memory, and action monitoring) upon initial testing, they found that performance on the

spatial reversal task had not changed significantly by the second testing period,

suggesting that young children with autism do not exhibit a deficit on this task at either

4 or 5 years of age. More recent studies by Dawson et al. (2002a), Stahl and Pry (2002),

and Rutherford and Rogers (2003) have also failed to find EF deficits in young children

with autism using the spatial reversal task and other measures of inhibition, working

memory, and set-shifting, although Rutherford and Rogers (2003) reported marginal

differences on their measure of generativity.

12 Besides the possibility that low- and high-functioning children with autism may actually exhibit different profiles of EF impairment, Russell (1997b) also points out that different findings in these two groups may be due to the different comparison groups used. If the control group consists of developmentally delayed or mentally retarded children who also display EF impairments (in comparison to typically developing children), then low-functioning children with autism may not be impaired compared with their IQ-matched control group. However, high-functioning children with autism may display an impairment in comparison with their more typically developing controls.

The apparently intact EF abilities of young children with autism suggest that

executive dysfunction cannot account for the earliest symptoms of autism. However,

two possible explanations for these null findings have been proposed (Griffith et al.,

1999; Pennington et al., 1997). Firstly, most of the above studies have used children

with developmental delays as controls, and have found that these children perform more

poorly on the EF tasks than typically developing children. Hence, it may be that young

children with autism are impaired on the EF tasks, but this impairment is not specific to

autism. A second explanation, which is more favourable to the EF hypothesis of

autism, is that EF impairments have been missed because the studies have not

incorporated the full range of EF components or task modalities in their test batteries.

Because of the age of the children, tasks have been primarily non-verbal (which may

favour children with autism); and in addition, measures of planning and generativity

have not been included, probably because of the lack of age-appropriate measures of

these abilities. Rutherford and Rogers’ (2003) finding of marginal differences in

generativity in young children with autism indicates that this explanation may hold

promise. In addition, Russell and colleagues (Biro & Russell, 2001; Russell, 1997b;

Russell, Jarrold, & Hood, 1999) have proposed that children with autism are only

challenged by EF tasks if they contain arbitrary rules (which require the online rehearsal

of novel strategies), which most EF tests used with very young children (such as the A-

not-B task) do not contain. As usual, however, further studies are needed to investigate

all of these possibilities.

iv) Explanatory value. Unlike the ToM hypothesis, the ability of the EF

hypothesis to account for the range of autistic symptomatology is perhaps its greatest

strength. While executive dysfunction was initially proposed largely to explain the

repetitive behaviours and restricted interests characteristic of autism, it has also fairly

consistently demonstrated links with social and communicative impairments. Two

studies by Berger and colleagues (Berger, Aerts, van Spaendonck, Cools, & Teunisse,

2003; Berger et al., 1993) found that set-shifting performance was a significant

predictor of social understanding and social competence in high-functioning adolescents

and young adults with autism, although the same group was not able to replicate that

result in a third study (Teunisse et al., 2001). Gilotty, Kenworthy, Sirian, Black, and

Wagner (2002) found significant correlations in their autistic sample between parental

reports of everyday executive abilities (using the Behavior Rating Inventory of

Executive Function) and social and communication skills as measured by the Vineland

Adaptive Behavior Scales, such that impaired EF was associated with poorer adaptive

skills. McEvoy et al. (1993) also found a significant correlation between EF and early

social and communication skills in young children with autism. Liss et al. (2001) found

that relationships between EF and adaptive functioning were no longer significant when

VIQ was partialled out; however, their study was also inconsistent with other studies in

finding that autism versus control group differences on EF tasks disappeared when VIQ

was accounted for.
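The logic of "partialling out" a variable such as VIQ can be made concrete with a short sketch. The Python code below is illustrative only: it uses simulated scores rather than data from any of the studies cited, and the function and variable names (`residuals`, `partial_corr`, `viq`, `ef`, `adaptive`) are hypothetical. It assumes a linear relationship with the covariate and computes the partial correlation by regressing each measure on the covariate and correlating the residuals, which is one standard way of obtaining this statistic.

```python
import numpy as np

def residuals(y, covariate):
    """Residuals of y after a least-squares regression on the covariate plus an intercept."""
    X = np.column_stack([np.ones(len(y)), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_corr(x, y, covariate):
    """Correlation between x and y with the linear effect of the covariate removed from both."""
    rx = residuals(x, covariate)
    ry = residuals(y, covariate)
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated (not real) scores: both measures depend on VIQ but not on each other.
rng = np.random.default_rng(0)
n = 200
viq = rng.normal(100, 15, n)
ef = 0.1 * viq + rng.normal(0, 1, n)
adaptive = 0.1 * viq + rng.normal(0, 1, n)

raw_r = float(np.corrcoef(ef, adaptive)[0, 1])
partial_r = partial_corr(ef, adaptive, viq)
print(f"raw r = {raw_r:.2f}, partial r (VIQ removed) = {partial_r:.2f}")
```

In this simulation the two measures are related only through their shared dependence on VIQ, so the raw correlation is substantial while the partial correlation should fall close to zero. This is the pattern that would be expected if a relationship between EF and adaptive functioning were carried largely by verbal ability.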

A number of studies have examined the relationship between EF deficits and

joint attention, one of the earliest signs of social impairment in young children with

autism13. It has been proposed that difficulties in shifting attention, rather than

mentalising impairment, may underlie problems with joint attention in young autistic

children (Burack, 1994; Courchesne et al., 1994). This proposal received support in

McEvoy et al.’s (1993) study, which found that the frequency of joint attention

behaviours was significantly correlated with cognitive flexibility. However, those

studies which failed to find EF impairments in young children with autism have

generally also not found any correlation between EF and joint attention. Stahl and Pry

(2002) and Rutherford and Rogers (2003) both found no relationship between EF

measures and joint attention in their young autistic sample. Dawson and colleagues

(Dawson et al., 1998, 2002a) found that performance on tasks purportedly tapping

ventromedial prefrontal and medial temporal function (e.g., the delayed nonmatching to

sample task) correlated with joint attention, whereas performance on tasks tapping

dorsolateral prefrontal function (i.e., more classic EF tasks such as the A-not-B and

spatial reversal tasks) did not. These findings of no relationship between EF and early signs of

social impairment are likely to relate to the null findings of group differences in EF at

that young age. If young children with autism do not show impairment on EF tasks, one

can hardly argue that EF deficits underlie their abnormal joint attention behaviours.

Indeed, Swettenham et al. (1998) found that infants with autism (mean age = 20

months) showed more difficulty shifting attention between people than between objects,

suggesting that their impairment may lie in social orientation rather than attentional

shifting. Nevertheless, the extent to which EF and joint attention may be related in

autism remains a matter of debate (Hughes, 2001).

13 While joint attention is clearly a social behaviour, it is also often interpreted as a precursor to or marker of ToM (Baron-Cohen, 1995). The relationship between joint attention and EF is therefore also relevant to the upcoming section on the relationship between ToM and EF in autism. Similarly, pretend play is thought to reflect metarepresentative capacity and so its relationship with EF is also of relevance to that section.

Several studies have also examined the link between EF and the absence of

spontaneous pretend play in autism. As for joint attention, these investigations have

been spurred by suggestions that a lack of pretend play may have its basis in EF deficits,

specifically an impairment in generativity (Jarrold, Boucher, & Smith, 1994a, 1996;

Lewis & Boucher, 1995) or the inability to disengage attention from salient external

stimuli to access internal, hypothetical play schemas (Harris, 1993; Harris & Leevers,

2000) rather than an inability to mentalise. Observations that children with autism have

intact comprehension of pretend acts (Jarrold, Smith, Boucher, & Harris, 1994b) and

that they are able to produce structured, elicited, or instructed pretence (Lewis &

Boucher, 1988) are consistent with this view. In a series of studies, Jarrold et al. (1996)

showed that children with autism have the capacity to engage in the mechanics of

pretence, but that they produced significantly less pretence than controls in spontaneous

and weakly structured conditions, suggesting that their difficulty lay in the production

or generation of pretence. Similarly, Lewis and Boucher (1995) found that the

generation of original actions in the play of autistic children was more consistent with a

generativity hypothesis than with a metarepresentational deficit. In a more recent study,

Rutherford and Rogers (2003) found that the performance of children with autism on a

generativity task was a significant predictor of pretend play.

Surprisingly, although the usual behavioural consequences of EF or prefrontal

impairment correspond closely with the repetitive behaviours displayed by individuals

with autism, only one published study has directly examined the association between EF

and repetitive behaviours in autism. Turner (1997) found significant correlations in

children with autism between measures of inhibition, set-shifting, and generativity and

the incidence and severity of various aspects of repetitive behaviour (e.g., repetitive

movements, circumscribed interests) as measured by a parental interview. Furthermore,

specific EF components appeared to underlie different types of repetitive behaviour; for

example, repetitive movements were associated with performance on a test of

inhibition, whereas sameness behaviour was correlated with measures of generativity.

The EF hypothesis has therefore demonstrated good explanatory value in terms

of its ability to account for both social and communicative impairments and repetitive

behaviours in autism, with the exception of early signs of social impairment such as

joint attention. Like the ToM hypothesis, it is less able to account for non-triad features

of autism such as savant abilities and heightened visuospatial and visuoperceptual skills,

a fact which has been somewhat overlooked in the EF literature. The causal direction of

correlations between EF and behavioural symptoms is another issue to consider, as it

may be that executive dysfunction is a consequence of fewer social interactions or

engagement in high rates of repetitive, restricted activities, rather than vice versa.

However, evidence does not appear to support this possibility. In a number of studies,

increasing the structure of the environment has been found to result in less stereotypic

and more social behaviours (Clark & Rutter, 1981; Dadds et al., 1988), indicating that

reducing EF demands facilitates social interaction and reduces repetitive behaviour,

which would not be expected if EF impairment were caused by the behavioural

symptoms. Also, the results of longitudinal studies have shown that EF performance

predicts later social understanding (Berger et al., 1993, 2003).

So how does the EF hypothesis fare? Like the ToM hypothesis, while it has defended some of its

weaknesses fairly successfully, it does not convincingly meet all of the criteria for a

single primary cognitive deficit of autism. Although it holds good explanatory value for

most of the symptoms of autism (with the exception of some early symptomatology and

non-triad features), the evidence collected so far suggests it lacks causal precedence and

that EF deficits are not universal among individuals with autism. In addition, the

variability of findings among studies of EF in autism is problematic. Several methodological

issues have clouded the definition of the universality, specific profile, and developmental course

of EF deficits in autism: the measurement precision of EF tasks; the different developmental

trajectories of the various EF components; the difficulty in designing age-appropriate EF tasks for

young children which tap the range of EF abilities; the variability among studies in the

age and level of functioning of the sample; and variations in the modality, arbitrariness

of the rules, and mode of presentation, response, and feedback of the tasks used.

While it is fairly clear that autism is characterised by

significant EF deficits, these methodological issues need to be addressed in order to

determine how primary those deficits may be to autism.

2.3 The ToM-EF relationship

2.3.1 Models of the ToM-EF relationship

On the surface, there is no particular reason to propose a link between the constructs of

ToM and EF: why should the ability to attribute mental states to oneself and others

relate to cognitive capacities which aid the control of action? Nevertheless, a growing

number of recent studies have consistently demonstrated an empirical relationship between the two in

typical development; for example, there are strong correlations between various types of

ToM and EF tasks which remain significant when age and IQ variables are partialled

out (Carlson & Moses, 2001; Carlson, Moses, & Breton, 2002; Davis & Pratt, 1995;

Frye et al., 1995; Gordon & Olson, 1998; Hala, Hug, & Henderson, 2003; Hughes,

1998a, 1998b; Lang & Perner, 2002; Russell et al., 1991; see also Perner & Lang, 1999,

who report a large average effect size of 1.08 across the studies conducted up until

then). Marked improvements in ToM and in EF (particularly inhibitory control) both

occur around the same age, in the preschool period between 3 and 5 years of age (e.g.,

Gerstadt et al., 1994; Kochanska et al., 1997; Wellman et al., 2001; Zelazo et al.,

1996b). The co-occurrence of ToM and EF deficits not only in autism but also in

schizophrenia (e.g., Corcoran et al., 1995; Elliott et al., 1995), frontal lobe pathologies

(Bach et al., 1998; Channon & Crawford, 2000; Gregory et al., 2002; Rowe, Bullock,

Polkey, & Morris, 2001; Saltzman et al., 2000), and possibly Fragile X syndrome

(Garner, Callias, & Turk, 1999) is also suggestive of a meaningful relationship. A range

of explanations for this observed relationship have been proposed by various authors,

including links based on i) the EF requirements of ToM tasks (“expression” accounts),

ii) a third common conceptual requirement, iii) functional dependence during

development (“emergence” accounts), and iv) shared neuroanatomical bases. These

classes of explanation each have different implications for the nature of the relationship

between ToM and EF impairments in autism (these implications are reviewed in Section

2.3.2 and are important in the interpretation of the results of analyses of the ToM-EF

relationship in Study One). As such, a review of each follows.

2.3.1.1 Expression accounts14

This type of account holds that the relationships observed between performances on

ToM and EF tasks are (at least partly) due to the executive requirements of ToM tasks,

and therefore that failure on ToM tasks may be caused by impaired or underdeveloped

EF rather than (or in addition to) poor mentalising ability. In other words, EF might

affect the expression of ToM capacity. Proponents of this account have emphasised

either inhibitory control (Carlson & Moses, 2001; Carlson, Moses, & Hix, 1998; Hala &

Russell, 2001; Leslie & Polizzi, 1998; Roth & Leslie, 1998; Russell et al., 1991),

working memory (Davis & Pratt, 1995; Gordon & Olson, 1998; Keenan, 1998), or a

combination of both inhibition and working memory (Carlson et al., 2002; Hala et al.,

2003) as the crucial EF factors affecting ToM performance. These ideas have been

tested both by manipulating the EF requirements of various ToM tasks and by

examining correlations or predictive relationships between the relevant EF components

and ToM variables.

The idea that ToM tasks require inhibitory control has been advanced by several

authors, although there has been some disagreement regarding exactly what it is that

needs to be inhibited. Across a series of studies, Russell and colleagues (Hala &

Russell, 2001; Russell et al., 1991; Russell, Jarrold, & Potel, 1994) have argued that

knowledge of current physical reality is more salient than knowledge of mental reality,

and that tests of both deception and false belief require children to suppress or inhibit

responding on the basis of their physical knowledge in favour of their less salient mental

knowledge15. For example, it is proposed that in the standard false belief (unexpected

transfer) task, the child is required to disengage from (and inhibit the prepotent response

to report) his/her knowledge about where the object is currently located, and instead

refer to an empty location.

14 Perner and Lang (1999, 2000) label this class of explanation slightly differently, as “Executive component in ToM tests”. As the notion of common task requirements refers to the idea that common underlying processes are required in both tasks, the term “expression account” is preferred here, as this more accurately encompasses the notion that EFs may influence the expression of ToM capacity in everyday life (due to the executive requirements of perceiving and inferring others’ mental states) as well as on structured tasks.

15 It should be noted that while Russell (1996, 1997b) argues that most ToM tasks confound EF and mentalising demands, his view of the ToM-EF relationship is not actually that it may be explained entirely by common performance requirements. His main theory is reviewed in this section under the heading of “emergence accounts”.

This hypothesis has been tested mainly by using a measure of strategic deception

called the “windows task”. Russell et al. (1991) found that despite prior training (using

two opaque boxes) on the rules of the task whereby they had to point to an empty

location to prevent their opponent from winning a chocolate, 3-year-olds typically

pointed to the true location of the chocolate on test trials, where they were able to see

the chocolate but the opponent could not. Furthermore, the majority of the 3-year-olds

persisted in revealing its true location across a series of 20 trials, suggesting that a

failure of inhibition or inability to disengage from a salient stimulus was underlying

their difficulty, rather than a conceptual deficit with deception (and thus ToM). This

interpretation of the results was supported in two further studies. Russell et al. (1994)

found that removing the opponent (and therefore the requirement to deceive) did not

affect 3-year-olds’ performance on the windows task. Conversely, Hala and Russell

(2001) found that the performance of 3-year-olds improved when the inhibitory

demands of the task were reduced, such as by removing the requirement to directly

point to the chocolate and instead using a mechanical pointer to indicate the appropriate

response (as pointing correctly to true locations is likely to be a well-practiced,

reinforced and therefore prepotent response). Using a different approach, Moore et al.

(1995) found that when their own desires were particularly strong or salient, 3-year-olds

performed as poorly on a conflicting desire task as on a false belief task. This suggests

that even though desire is purportedly easier for young children to understand because

of its non-representational nature, 3-year-olds have difficulty judging others’ desires

when EF demands are high. Together, these results suggest that the failure of young

children on ToM tasks may be at least partially attributable to inadequate inhibitory

control rather than (or in addition to) a poorly developed ToM.

Carlson, Moses and colleagues (Carlson & Moses, 2001; Carlson et al., 1998)

have outlined a similar account of the role of inhibition in ToM performance. Like

Russell, while they allow for genuine development in the understanding of mental

concepts, they argue that the inhibitory requirements of ToM tasks affect the expression

of ToM ability in 3-year-olds16. Using a deception paradigm similar to that of Hala and

Russell (2001), Carlson et al. (1998) found that 3-year-olds showed improved

performance under conditions requiring low inhibitory control (i.e., when pictorial cues

or arrows were used to mislead the opponent rather than pointing), and that they were

equally successful in using arrows to point whether the opponent was present or not. In

a correlative study, Carlson and Moses (2001) showed that the link with inhibition

extended beyond deception: performance on a battery of ToM tasks was

significantly correlated with a number of indices of inhibitory control, and these

correlations remained robust after the effects of age, gender, number of siblings, verbal

ability, and a number of other cognitive abilities were removed. Other studies have also

indicated that, as with deception, young children’s performance on the standard false belief

task improves when inhibitory demands are reduced by using a response mode that is

not influenced by a prepotent response history or by reducing the salience of the desired

object’s current location. Examples include tracing out the path a naive character would

take in searching for their desired object (Freeman, Lewis, & Doherty, 1991), giving an

explanation for a protagonist’s wrong search in an empty location (Bartsch & Wellman,

1989), and indicating which of two twin boys, one searching in the actual location and

one in the empty location, had been absent during the transfer of the object (Robinson &

Mitchell, 1995).

16 Carlson and Moses (2001) also state that their results are equally compatible with an expression account (i.e., that inhibitory dysfunction impedes the expression of ToM ability) as with Russell’s (1996, 1997b) emergence account, to be reviewed later as mentioned in the previous footnote.

In line with their ToMM-SP model (described in Section 2.1.2), a more

unequivocal expression account of the relationship between inhibitory processes and

ToM has been proposed by Leslie and colleagues (Leslie & Polizzi, 1998; Roth &

Leslie, 1998). This account differs from the above two accounts in the level of ToM

ability attributed to 3-year-olds: Leslie argues that the ToM module is fully active by

this age but that its abilities are usually masked by the processing requirements of ToM

tasks, whereas Russell and Carlson and colleagues favour the view that some conceptual

ToM development does take place between the ages of 3 and 4. Leslie and his

colleagues also have a different view of what creates the salience-related difficulty for

3-year-olds on the standard false belief task. They argue that because beliefs are

typically true, there is a default (or prepotent) assumption that beliefs are true, and

therefore the attribution of a non-default (false) belief requires inhibition of this default

assumption (Leslie, 1994a; Leslie & Polizzi, 1998). Thus, they identify the competition

as being between two belief contents (one of which represents physical reality) rather

than between physical and mental realities. Notably, Leslie and colleagues do not offer

an analysis of the inhibitory requirements of other ToM tasks such as tests of

deception. In support of their hypothesis, Leslie and colleagues conducted a series of

cleverly designed experiments which corroborated previous findings that reducing the

inhibitory demands of false belief tasks improves the performance of 3-year-olds (Roth

& Leslie, 1998), but also showed that increasing the inhibitory requirements of the false

belief task had a significant detrimental effect on the performance of 4-year-olds (Leslie

& Polizzi, 1998).

Inhibitory control is not the only executive process that has been implicated in

ToM task performance. Olson (1989) argued that developments in children’s capacity

for holding complex representations in mind may support or underlie their

understanding of false belief. Similarly, Halford (1993) proposed that working memory

capacity may limit young children’s success in situations which require the

simultaneous integration of two representations of a situation (i.e., reality and belief).

In a test of this hypothesis, Davis and Pratt (1995) found that backward digit span

performance significantly predicted scores on the unexpected contents and appearance-

reality tasks over and above age and verbal ability (accounting for around 6% of the

variance), but forward digit span did not. They interpreted this as suggesting that

development in the central executive, but not articulatory loop, component of Baddeley

and Hitch’s (1974) working memory model was a small but significant determinant of

false belief task performance. Using an additional false belief task (the unexpected

transfer test) and a more age-appropriate working memory measure involving dual-task

performance, Keenan, Olson, and Marini (1998) also found that after controlling for

age, working memory capacity was a significant predictor of false belief performance

(accounting for 7.4% of the variance). The influence of working memory capacity on

the expression of ToM ability is supported in other studies showing that reducing the

usual memory demands of ToM tasks has a facilitative effect on performance (e.g.,

Chandler & Hala, 1994; Freeman & Lacohée, 1995; Mitchell & Lacohée, 1991;

although see Hala et al., 2003; Robinson, Riggs, & Samuels, 1996).
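Claims that a working memory measure predicts false belief performance "over and above" age and verbal ability are commonly assessed with hierarchical regression, in which the increase in explained variance is examined when the measure is added to a model that already contains the control variables. The sketch below illustrates that calculation under stated assumptions: it uses ordinary least squares, the scores are simulated rather than taken from any study cited here, and the names `r_squared` and `incremental_r2` are hypothetical helpers.

```python
import numpy as np

def r_squared(y, predictors):
    """R^2 from an ordinary least-squares fit of y on the predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def incremental_r2(y, base, added):
    """Increase in R^2 when the 'added' predictors join a model containing the 'base' predictors."""
    return r_squared(y, list(base) + list(added)) - r_squared(y, base)

# Simulated (not real) scores for a preschool sample.
rng = np.random.default_rng(1)
n = 120
age = rng.normal(48, 6, n)                      # age in months
verbal = 0.5 * age + rng.normal(0, 5, n)        # verbal ability
working_memory = 0.1 * age + rng.normal(0, 1, n)
false_belief = (0.05 * age + 0.02 * verbal
                + 0.3 * working_memory + rng.normal(0, 1, n))

delta = incremental_r2(false_belief, [age, verbal], [working_memory])
print(f"Variance explained over and above age and verbal ability: {delta:.3f}")
```

The quantity printed corresponds to the kind of figure reported as "accounting for around 6% of the variance": it is the share of variance in the outcome explained by the added predictor after the control variables have been entered, not the predictor's simple squared correlation with the outcome.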

While these studies permit the relatively uncontroversial conclusion that working

memory plays some limiting role in the expression of mentalistic concepts that may

already exist, Gordon and Olson (1998) considered the more contentious possibility that

increasing computational resources may actually allow the formation of those concepts.

They argued that the key capacity required for false belief understanding is the ability to

hold in mind and then update a previously created representation when a new

representation is created by the current perceptual situation. They used two working

memory tasks, both of which required children to perform concurrent mental activities,

but only one of which required them to hold the product of such activity in mind such

that it could be updated on the basis of some new perceptual information. While both

their tasks showed strong correlations with false belief performance after controlling for

age (accounting for up to 40% of the variance), the more complex working memory task

contributed a significant amount of variance to false belief performance over and above

the other more simple working memory task. Gordon and Olson concluded that while

primitive concepts such as self, true, and real may be available earlier, “their co-

ordination into a higher-order structure depends upon increased computational

resources” and thus that “conceptual content and conceptual complexity combine not

only in the performance on theory of mind tasks but also for the formation of the

understanding itself” (1998, p. 81)17.

Two studies found that the relationship between ToM and working memory no

longer held after age and verbal ability were controlled for (Hughes, 1998a; Jenkins &

Astington, 1996). Hala et al. (2003) proposed that this discrepancy may be attributable

to the lack of significant inhibitory demands in the working memory tasks used in these

two studies (i.e., they were simple tests of maintenance of information in working

memory over time and did not require dual-task performance or the simultaneous

activation of two concurrent activities). This interpretation is supported by Davis and

Pratt’s (1995) finding that forward digit span was not a significant predictor of false

belief performance, in contrast to backward digit span (which arguably involves not

only rehearsing the sequence of numbers but also inhibiting the tendency to report them

in the order heard).

The idea that EF tasks involving both working memory and inhibitory

components may show the strongest relationship with ToM had been raised earlier by

Carlson and colleagues (Carlson & Moses, 2001; Carlson et al., 2002). They argued

that false belief tasks require both working memory and inhibition in that the child must

hold in mind two representations simultaneously as well as make a response based on

the representation which directly conflicts with his/her own salient perspective.

Although this group earlier favoured the view that the ToM-EF relationship was based

purely on the inhibitory requirements of ToM tasks, their shift in view was prompted by

two separate studies in which they found that of two types of inhibition task, the type

which involved a heavier working memory load (whereby two conflicting alternatives

needed to be held in mind) was the more powerful predictor of ToM performance, and

added extra variance over and above the low working memory load inhibition task

(Carlson & Moses, 2001; Carlson et al., 2002). In addition, Carlson et al. (2002) found

that a working memory task with no inhibitory requirement did not predict false belief

performance independently of age and both verbal and non-verbal intelligence. Perner,

Lang, and Kloo (2002b) also failed to find a significant relationship between ToM and

inhibition when a simple go-nogo inhibition task with low working memory load was

used. Consistent with these results, a recent study by Hala et al. (2003) found that

“pure” measures of inhibition and working memory did not predict false belief

performance individually, but tasks combining inhibitory and working memory

requirements were strongly predictive of false belief performance after age and verbal

ability were controlled.

17 Again, this view is therefore probably best conceived of as a combination of expression and emergence accounts of the ToM-EF relationship.

Only a few studies have examined the contribution of EF components other than

inhibition and working memory to ToM performance, with mixed results. Hughes

(1998a) found a significant correlation between tests of attentional flexibility (or set-

shifting) and deception after age and verbal and non-verbal intelligence were partialled

out. However, flexibility did not correlate with false belief performance and the

correlations with deception were not as robust as those with performance on inhibition

tasks. Harris (1993) argued that in both ToM tasks and tests of planning, children must

envisage a hypothetical state of affairs and respond or make a prediction in accordance

with that hypothetical situation, which typically contradicts the response which is

suggested by the true or current state of affairs. According to Harris, children will

exhibit difficulty on both types of task if they are unable to shift or disengage from their

current context to a hypothetical and conflicting context. However, he does not present

any evidence specifically examining this hypothesis. Using a planning test developed

for use with young children, Bischof-Köhler (1998, cited in Perner & Lang, 2000) found

a relationship between planning ability and false belief performance, but the direction of

the relationship was such that false belief understanding appeared to be necessary for

planning success. However, the effect of age or verbal ability on this relationship was

not reported. Moses and Carlson (2000, cited in Carlson et al., 2002) did not find a

significant relationship between planning ability and ToM after age and verbal ability

were partialled out. It therefore remains unclear whether EF components other than

inhibition and working memory are correlated with ToM, and if so, what might underlie

these relationships.

Overall, though, the evidence appears to be consistent with the idea that the EF

requirements of ToM tasks or abilities are a significant factor influencing the

developmental relationship between ToM and EF. However, a number of criticisms of

expression accounts have been advanced by Perner and colleagues (Perner, 1995, 2000;

Perner & Lang, 2000; Perner et al., 2002b), whose central contention is that ToM-EF

correlations are not solely attributable to task requirements (and, relatedly, that the

preschool development in ToM is not solely attributable to developments in EF)18.

While they more readily accept the evidence suggesting that tests of deceptive pointing

include a significant executive component, Perner and colleagues have questioned the

methodology of several of the studies purporting to demonstrate earlier competence on

false belief tasks when the EF demands are reduced. For example, Perner (1995; Perner

et al., 2002b) argued that in Bartsch and Wellman’s (1989) study, as many children

passed the explanation version of the false belief task (which does not include

an obvious inhibitory requirement) as passed the standard prediction task, and that it was only

after receiving an overly helpful prompt that they gave additional correct answers

on the explanation version. Other studies using alternative explanation paradigms have

found that explanation tasks are as difficult as prediction tasks (Hughes, 1998a;

Moses & Flavell, 1990; Perner et al., 2002b; Wimmer & Mayringer, 1998), although

Russell, Hill, and Franco (2001) have pointed out that the mean age in these studies was

around 4 years (compared with 3 years in Bartsch & Wellman’s study), which may have

masked differences by boosting scores on the prediction task. Perner (1995) also argued

that Robinson and Mitchell’s (1995) finding of significantly improved performance on

their identical twin explanation paradigm compared with a standard prediction version

may be explained by a difference in the baseline performance expected for children with

no understanding, and that there is no difference in difficulty once the data are adjusted

for correct guesses – a post-hoc analysis which was then confirmed by the pattern of

results obtained by Perner et al. (2002b).
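The effect of differing baselines on a comparison of this kind can be illustrated with the classical correction for guessing, in which an observed proportion correct is rescaled by the proportion expected from children with no understanding. The sketch below is an assumption-laden illustration: the source does not state which adjustment Perner applied, the function name `guess_corrected` is hypothetical, and the proportions and baselines are invented for the example rather than taken from any study.

```python
def guess_corrected(p_observed, p_guess):
    """Classical correction for guessing: (p_observed - p_guess) / (1 - p_guess).

    p_observed: observed proportion of correct responses.
    p_guess: proportion correct expected from children with no understanding.
    """
    if not 0.0 <= p_guess < 1.0:
        raise ValueError("p_guess must lie in [0, 1)")
    return (p_observed - p_guess) / (1.0 - p_guess)

# Hypothetical figures, for illustration only.
# Assume a child with no understanding picks one of two twins at random (baseline 0.5),
# but systematically gives the reality-based answer on a prediction task (baseline 0.0).
twin_task = guess_corrected(p_observed=0.70, p_guess=0.5)
prediction_task = guess_corrected(p_observed=0.40, p_guess=0.0)
print(f"twin task: {twin_task:.2f}, prediction task: {prediction_task:.2f}")
```

With these invented figures, an apparent advantage of 30 percentage points for the twin task disappears after adjustment, since both corrected values come to approximately 0.40. This is the general form of argument by which a difference in raw pass rates need not indicate a difference in task difficulty.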

While these findings suggest that modified false belief (“explanation”) tasks

which purportedly remove the EF component may be just as difficult as standard false

belief tasks (see also Robinson & Beck, 2000), even more pertinent for Perner are

studies showing that performance on explanation versions correlates just as strongly

with EF scores as performance on standard prediction versions. Hughes (1998a) found

that performance on her explanation version correlated as strongly as performance on a

standard false belief task with scores on tests of inhibitory control, and Perner et al.

(2002b) found that performance on their explanation version correlated as strongly as

performance on a prediction version with scores on the dimensional change card sorting

task (which arguably requires set-shifting and inhibition).

18 Perner’s own theory of the ToM-EF relationship is reviewed in Section 2.3.1.3, under the heading of “Emergence accounts”.

These results suggest that the ToM-EF relationship is not solely due to the EF

requirements of ToM tasks. However, this conclusion rests on the assumption that

explanation versions of false belief tasks do not incorporate any EF requirements.

Russell et al. (2001) argue that tasks requiring a linguistic explanation for why an empty

location was visited still require children to set aside or inhibit their knowledge of the

actual location. It could also be argued that other explanation versions still require the

child to hold in mind two conflicting perspectives, thereby taxing working memory.

For example, in the identical twin versions used by Robinson and Mitchell (1995) and

Perner et al. (2002b), the child is still required to hold in mind the sequence of events

that has occurred and consider the different experiences of both twins simultaneously in

order to work out why one twin looks in the wrong location. In addition, in defence of

the expression accounts, it should be noted that the majority of authors who argue that

EF abilities constrain performance on ToM tasks do not hold that the ToM-EF

relationship is solely due to performance-based factors, that there is no deeper

conceptual link, or that the development in mentalistic understanding in the preschool

period is attributable only to increasing EF capacity without any additional conceptual

development (Leslie and colleagues are an obvious exception). Certainly, none of the

authors have made the claim (sometimes attributed to them) that mentalising ability

does not exist and that ToM tasks are simply EF tasks. On the basis of the evidence as a

whole, it seems reasonable to accept that successful performance on some ToM tasks

requires a certain level of capacity in EF (particularly inhibition and working memory)

and that executive difficulties may impact upon ToM performance, and therefore that

the correlations observed between ToM and EF are partially due to performance-based

commonalities.

2.3.1.2 Common conceptual requirements of ToM and EF

Rather than positing that the ToM-EF relationship arises from the EF requirements for

successful ToM performance, this account contends that both ToM and EF share a third

common underlying conceptual requirement. The main account falling in this category

is the CCC theory (see Section 2.2.2; Frye, Zelazo, & Burack, 1998; Frye et al., 1995),

which proposes that false belief and EF tasks both require the use of embedded

conditionals, or if-if-then rules (an example of a task structure involving embedded

conditionals, the Dimensional Change Card Sorting (DCCS) task, is described in

Section 2.2.2). Frye’s (2000) analysis of the logical structure of the standard false belief

task (where the child must predict where Maxi will look for his chocolate) in terms of

if-if-then embedded conditionals runs as follows:

IF me (s1), IF looking for chocolate (a1), THEN here (c1).

IF Maxi (s2), IF looking for chocolate (a1), THEN there (c2).

Frye (1999) proposed that while ToM and EF are distinct and neither underlies the

other, they are related in that they depend on different applications of the same set of

reasoning rules, and the development in this reasoning ability underlies the development

of ToM and EF at the same age. The same embedded rules “guide the inferences

necessary for theory of mind and allow the formulation of action that results in

improved executive control” (Frye, 1999, pp. 121-122).
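The embedded structure in Frye’s analysis can be made concrete with a minimal sketch. The nested lookup table and the function name below are hypothetical illustrations introduced here, not part of the CCC theory’s own formalism; the sketch assumes only that a setting condition (s) selects which antecedent-consequent (a, c) pairing applies.

```python
# Illustrative sketch only: a hypothetical encoding of the if-if-then analysis
# of the standard false belief task, not code or notation from Frye (2000).

# Outer key = setting condition (s1, s2); inner key = antecedent (a1);
# inner value = consequent (c1, c2).
FALSE_BELIEF_RULES = {
    "me": {"looking for chocolate": "here"},
    "Maxi": {"looking for chocolate": "there"},
}


def apply_embedded_rule(rules, setting, antecedent):
    """Resolve an if-if-then rule: the setting is checked first, then the antecedent."""
    return rules[setting][antecedent]


# The same antecedent yields a different consequent depending on the setting.
own_answer = apply_embedded_rule(FALSE_BELIEF_RULES, "me", "looking for chocolate")
maxi_answer = apply_embedded_rule(FALSE_BELIEF_RULES, "Maxi", "looking for chocolate")
```

On this encoding, a correct prediction requires consulting the setting condition before the antecedent, which is the step that the CCC account holds to be difficult for 3-year-olds.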

Frye et al. (1995) tested the CCC account by comparing preschoolers’

performance on three false belief tasks and two reasoning tasks with an embedded

conditional structure: the DCCS task and a physical causality task where a marble was

rolled down a covered ramp either to a hole directly below its entry point or across to

the opposite side, depending on the setting condition, and children had to predict where

the marble would be found. Frye et al. found similar age-related improvements

(between 3 and 5 years of age) across both types of task. In a further study, Frye,

Zelazo, Brooks, and Samuels (1996) showed that 3-year-olds were able to successfully

perform a simplified version of the physical causality task with a simple if-then

structure. Frye et al. (1995) also found that scores on the reasoning and ToM tasks were

significantly correlated with age partialled out. Furthermore, ToM performance only

correlated with performance on reasoning tasks with an embedded rule structure, and

not with performance on tasks with simple if-then structures.
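For readers unfamiliar with the phrase “with age partialled out”, the following minimal sketch shows the standard first-order partial correlation computed from three pairwise correlations. The function name and the input values are assumptions chosen purely for illustration; they are not data from Frye et al. (1995).

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the linear effect of z removed from both.

    Standard formula: (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values: x = ToM score, y = reasoning score, z = age.
example = partial_correlation(r_xy=0.5, r_xz=0.5, r_yz=0.5)
```

If the raw ToM-reasoning correlation were entirely a by-product of both scores increasing with age, the partial correlation would fall to around zero; a significant residual correlation is what licenses the claim of a relationship beyond shared age-related change.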

A number of criticisms of CCC theory and its explanation of the ToM-EF

relationship have been advanced. Carlson and colleagues (Carlson et al., 1998; Carlson

& Moses, 2001) argue that Frye et al.’s (1995) data are also consistent with an

inhibitory control interpretation, as, for example, the DCCS task requires children to

inhibit their previous way of responding and shift to a new response. Zelazo and Frye

(1998) refute the inhibition interpretation by pointing to data showing that 3-year-olds

are able to effectively inhibit a previous way of responding when the task conforms to a

simple if-then structure (Marcovitch, Zelazo, Boseovski, & Cohen, 1997, cited in

Zelazo & Frye, 1998) and that on a task with an embedded rule structure, 3-year-olds

still performed poorly when evaluating the sorting of a puppet – that is, when they were

not themselves required to inhibit a previous response (Jacques, Zelazo, Kirkham, &

Semcesen, 1999). However, conversely, Carlson et al. (1998) highlight the fact that the

deceptive pointing and arrow tasks used in their study had identical rule structures, but

3-year-olds’ performance was significantly better on the arrow task (which had a lower

inhibitory requirement). Perner and Lang (2002) also found that 3-year-olds were able

to perform well on variations of the DCCS task which had an embedded rule structure,

but which did not include an extradimensional shift (i.e., the rule reversed rather than

changed dimensions from colour to shape) and did not involve a visual clash between

target and test cards (i.e., had reduced inhibitory requirements). Moreover, Perner

(2000) calls attention to the fact that go-nogo tasks (which require a simple pair of

rules) are as difficult for 3-year-olds as other inhibition and conditional reasoning tasks.

Perhaps even more pertinent is Carlson and Moses’ (2001) finding that one of their

inhibition measures which had a simple if-then structure was one of the strongest

predictors of ToM performance. Similarly, Sabbagh, Moses, and Shiverick (2001, cited

in Carlson & Moses, 2001) found that false belief performance was strongly correlated

with inhibition, but performance on the false photograph task (which has an identical

rule structure) was not.

These data suggest that similarities in the rule structure of ToM and EF tasks

cannot account entirely for the ToM-EF relationship. Perner, Stummer, and Lang

(1999) also present a more a priori argument against the CCC account’s analysis of the

standard false belief task. They point out that in the DCCS and physical causality tasks,

the conditional structures describe rules which the child must know in order to solve the

task. However, in the case of the false belief task, the conditional rules (e.g., “If Maxi,

if looking for chocolate, then here”) are not part of the task’s instructions and cannot be

those explicitly used by the child in solving the task, as such a rule would only be

possible if the child was repeatedly exposed to Maxi going to the empty location.

Perner and colleagues argue that this highlights the arbitrary nature of the rules chosen

to describe the false belief task. They offer the following analysis:

IF I am looking for the chocolate (a1), THEN here (c1).

IF Maxi is looking for the chocolate (a2), THEN there (c2).

This plausible alternative reduces the task to one with a pair of simple if-then rules,

which 3-year-olds should be capable of performing successfully. Zelazo, Jacques,

Burack, and Frye (2002) attempted to refute these criticisms by arguing that their claim

is not that people must learn the rules, but that they must formulate them in an impromptu

manner in order to solve the task. They argue that their analysis of the rule structure of

the false belief task is not logically necessary, but is an empirical claim which is

confirmed by the correlations observed between ToM and rule-based reasoning tasks.

However, this would mean that any task showing correlations with the rule-based

reasoning tasks could then be considered to have an embedded conditional structure -

surely a circular and ad hoc argument. Hence, difficulties with the logical defence of

the conditional structure of false belief tasks, as well as an inability to account for data

regarding both the abilities of young children and correlations between ToM and EF

tasks with simple rule structures, present a significant challenge to the CCC account of

the ToM-EF relationship.

An alternative “common conceptual requirements” account has been presented

by Halford and colleagues (Halford, 1993; Halford et al., 1998), although this has not

been as thoroughly investigated or discussed as the CCC account. Halford’s theory

states that processing capacity (or working memory) is limited by the complexity of the

relations (i.e., the number of related dimensions or sources of variation) that may be

processed in parallel, and that as processing capacity develops, children should be able

to represent concepts of increasingly higher relational complexity. He proposes that

young children’s difficulty on standard false belief and appearance-reality tasks may be

explained by their inability to represent “ternary” relations, or problems with three

related dimensions. His analysis of the standard false belief task is that it requires

representing the relation between an object and two different representations of its

location: one based on knowledge of its actual location and the other on a false belief of

its location. He expresses this situation as the ternary relation:

Find-object (<known-event>, <actual-location>, <believed-location>),

instances of which are:

Find-object (<saw-moved>, <object-in-location-A>, <believe-object-in-location-A>)

Find-object (<not-seen-moved>, <object-in-location-A>, <believe-object-in-location-

Halford argues that young children are able to understand any of the component binary

relations (e.g., Find-object (<not-seen-moved>, <object-in-location-A>)), but that they

cannot integrate two object-percept relations into a single ternary relation. This also

explains their poor performance on other kinds of task, including EF tests such as the

Tower of London, which require the same or a higher degree of relational complexity

(the Tower of London is described in the next chapter, Section 3.4.1).
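The contrast between binary and ternary relations can be sketched as follows. The class and function names are hypothetical, and treating relational complexity as the number of arguments in a relation is a simplifying assumption made for illustration; the argument values are copied from the instances given above.

```python
from typing import NamedTuple


class FindObjectBinary(NamedTuple):
    """A component relation linking two dimensions."""
    known_event: str
    actual_location: str


class FindObjectTernary(NamedTuple):
    """The integrated relation linking three dimensions."""
    known_event: str
    actual_location: str
    believed_location: str


def relational_complexity(relation):
    """Simplifying assumption: complexity equals the number of related arguments."""
    return len(relation)


binary = FindObjectBinary("not-seen-moved", "object-in-location-A")
ternary = FindObjectTernary(
    "saw-moved", "object-in-location-A", "believe-object-in-location-A"
)
```

Under this reading, each component relation has a complexity of two, whereas the false belief judgement requires all three arguments to be held in a single representation at once.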

Halford’s view is therefore similar to the CCC account, but differs in that it

emphasises the number of relations between pieces of information rather than the

presence of a hierarchical or embedded conditional structure. Because of this, Halford’s

proposal may be more resistant to some of the criticisms levelled against the CCC

account on the basis of the purported embedded rule structure of the false belief task.

However, his account has yet to be directly tested, although evidence of a relationship

between working memory and ToM (Davis & Pratt, 1995; Keenan et al., 1998) is

consistent with it. In addition, the relational complexity of the EF tasks (for example, of

inhibitory control) which are mastered at the same time as false belief tasks remains to

be determined.

2.3.1.3 Emergence accounts19

In this category fall two main theories of the ToM-EF relationship, one claiming that

EFs are a prerequisite for the development of ToM, and the other claiming that ToM is

necessary for EF to develop.

i) EF is required for ToM. This position is represented mainly by Russell

(1996, 1997b), whose argument is essentially that a sense of “agency” underlies self-

awareness, which in turn underlies the development of ToM. According to Russell,

agency has four main features: i) action-monitoring (the process through which changes

in experience are perceived to have been caused by the self and not the world), ii)

instigation (the ability of agents to determine their own perceptual sequences), iii) non-

observational knowledge of actions (the phenomenon whereby if an agent is in control

of his/her actions, s/he does not have to consciously observe them to know what they

are), and iv) privileged knowledge of goals (whereby in acting in a goal-directed fashion

we know incorrigibly what the goal is, whereas a third person does not). Russell

considers EF to be equivalent to action-monitoring and instigation (or at least that these

are the fundamental aspects of EF), and it is in this sense that he views EF as underlying

the development of ToM20. He asserts that these features allow a sense of ownership –

the perception of experiences as one’s own, and not determined by the world – and

therefore a self-awareness which he calls “pre-theoretical” (i.e., bodily-based and

immediate, requiring no comprehension of psychological concepts). This pre-

theoretical self-awareness is a necessary condition for the development of ToM - a

form of self-awareness which does depend on mental concepts.

19 Perner and Lang (1999, 2000) label these “functional dependence” accounts. The term “emergence” accounts is preferred here as this more directly refers to the developmental aspect of this class of explanation – that is, the notion that one ability depends on the other to develop or “emerge”.

20 Russell (1997b) recognises that EFs include other components such as inhibition, cognitive flexibility and working memory, and goes on to say how monitoring and instigation relate to these components. For example, he says that instigation (defined as the capacity to take actions not driven by habit or the external world) is analogous to the concept of generativity, but also requires inhibition; and flexibility requires adequate monitoring of the outcome of an incorrect response and instigation of a new strategy.

Empirical tests of Russell’s (1997b) theory have mostly been conducted on

children with autism (who are purported to have inadequate action-monitoring or

instigation), and these are reviewed in Section 2.3.2. However, a study by Hughes

(1998b) is also relevant. She found that preschoolers’ early EF performance,

particularly on a test of goal-directed action and inhibition, predicted ToM scores one

year later; but that early ToM scores did not predict later EF performance. Although

this study did not directly measure monitoring and instigation, it provides general

support for the notion that EF is required for ToM rather than vice versa.

Perner and Lang (1999, 2000) have critiqued Russell’s theory on conceptual

grounds, claiming that while it can explain how early action-monitoring may

fundamentally enable the early and later development of ToM, it does not adequately

explain why developments in false belief and inhibition in particular should occur at the

same age (around 4 years, later than the development of action-monitoring) or why

ToM and EF should be correlated at this age. Russell does not specifically address this

issue in his writings, tending to focus on expression or performance-based explanations

for the ToM-EF relationship during the preschool period, without relating later

inhibition to earlier action-monitoring and instigation.

Also posing a challenge for Russell’s theory is the existence of disorders where

EF is impaired but ToM is intact. If EF is a prerequisite for ToM development, one

would not expect children with impaired EF to show typical ToM capacity. Three

studies have now shown that children with ADHD or at risk of ADHD have impaired

EF (particularly inhibitory control) but intact performance on ToM tasks (Charman,

Carroll, & Sturge, 2001; Hughes, Dunn, & White, 1998; Perner, Kain, & Barchfeld,

2002a). A study by Tager-Flusberg, Sullivan, and Boshart (1997) demonstrated a

similar dissociation in children with Prader-Willi syndrome and Williams syndrome,

who showed impaired EF and intact ToM with no correlation between EF and ToM

performance. In addition, six children failed both EF tasks but passed both ToM tasks,

while no children passed both EF tasks if they failed both ToM tasks21, inconsistent

with the notion that intact EF is a prerequisite for ToM. Baron-Cohen and Robertson

(1995) reported a case of a child with Tourette’s syndrome who passed all ToM tasks

but failed two of three tests of inhibition, although this study comes with the usual

caveats of a single-case design.

21 This additional data was not contained in the original publication but was reported by Perner and Lang (2000).

Although Russell has not directly responded to these challenges to his theory, he

has implied that he subscribes to a multi-componential view of EF (Russell, 1997b) and

therefore may argue that individuals showing a ToM-EF dissociation have intact action-

monitoring and instigation but impairments in other aspects of EF. However, this

would contradict his assertion that monitoring and instigation are the fundamental basis

for EF, underlying its other components. Another important issue in evaluating

evidence of dissociations, highlighted by Perner and Lang (2000), is that the criterion

for failure of EF tasks is arbitrary in many cases, and so evaluating the relative pass/fail

rates of ToM and EF tasks suffers from the absence of absolute standards of

performance. Although Perner’s view takes the opposite form, he concludes that while

Russell’s theory is in need of greater specification regarding the aspects of ToM and EF

it incorporates, there is no firm evidence against it (Perner, 2000; Perner & Lang, 1999,

2000).

ii) ToM is required for EF. This account was first alluded to by Wimmer

(1989, cited in Perner, 1991), Frith (1992) in reference to schizophrenia, and Carruthers

(1996), and was then developed further by Perner (1998; Perner & Lang, 1999, 2000).

The essence of this position is that the metarepresentational capacity which

(purportedly) underlies ToM is necessary for volitional control over action. Wimmer’s

initial idea (as described by Perner) was that a better understanding of our own mind

and mental concepts allows better control over our mental processes and behaviour.

Carruthers (1996) developed this notion by positing that normal human reasoning

routinely involves second-order evaluation of first-order thoughts, beliefs and desires

(e.g., how strong is my desire to do x as opposed to y?), a kind of reflexive,

introspective access to our recent conscious mental events. He argues that the operation

of a ToM module underlies this meta-access to our own beliefs and thoughts, and in

turn, that this meta-access is a necessary condition for the evaluation of recent problem-

solving strategies such as is required on many EF tasks.

Perner (1998) elaborated upon this idea by specifically delineating the

metarepresentational requirements of the contention scheduling and SAS aspects of

Norman and Shallice’s (1980, 1986) model of EF (reviewed in Section 2.2.2). He

argues that contention scheduling does not require “meta-intentional” understanding

(i.e., a declarative, conscious representation of one’s goal), because at this level action

schemas control each other automatically by mutual inhibition and activation of

competing behavioural sequences (such as in trial-and-error learning). On the other

hand, intentional actions such as following verbal instructions or planning a future

action sequence (which require control by the SAS) demand a declarative representation

of a goal as desired or intended (i.e., “something the examiner wants me to do” or

“something I want to do”), so that the correct novel action schema can be boosted22.

These representations are called meta-intentional because they involve representing the

intended action sequence as intended. In certain situations, though, boosting the desired

action sequence is not sufficient – the inhibition of competing action schemas is also

required. In these cases, Perner argues, one needs to understand why the particular

competing action schema in question needs to be inhibited, and to do so one needs to

consciously conceptualise the action sequence as a tendency one has – that is, meta-

represent the schema as a representational vehicle (a representation which is not

specified by its content, such as a procedural action sequence). Thus, it is only in

situations where a competing action schema needs to be inhibited that we require

metarepresentational (not just meta-intentional) capacity. The developmental

relationship between tests of inhibitory control and false belief understanding therefore

occurs because they both require metarepresentational capacity - both require the ability

to represent representational vehicles (either action sequences or “pictures-in-the-head”)

which have causal efficacy (i.e., make people act in a certain way). In a sense, then,

Perner’s account is one of a common conceptual requirement of ToM and EF (i.e., they

both rely on metarepresentational capacity) rather than ToM itself being a prerequisite

for EF (although Perner himself places his account under the “functional dependence”

heading, equating metarepresentational ability with ToM and then saying that EF tasks

are applied ToM tasks)23.

22 Perner (1998) characterises this distinction as being that contention scheduling occurs at the level of the representational vehicle, and the SAS exerts control at the level of representational content (see Perner, 1995 for further explanation).

23 It also resembles a “common conceptual requirements” account in that Perner implies that ToM and EF both depend on metarepresentational capacity throughout the lifespan, and not just during development. However, it was classified as an emergence or functional dependence account here both because that is how it is classified by Perner himself, and because the hypothesis does emphasise that ToM (or metarepresentational capacity) is necessary for EF to develop.

The main piece of direct evidence for Perner’s theory comes from a study by

Lang and Perner (2002) which examined the relationship between early EF (as

measured by the DCCS and Luria’s hand game, which requires inhibitory control), false

belief and a knee-jerk reflex task. This latter task required the child to identify whether

or not they intended to move their leg after a reflex movement was elicited by the

examiner. Lang and Perner argued that, like the false belief and EF tasks, this requires an

understanding of mental states as representations which are causally responsible for

actions (as the child needs to differentiate between intentional and accidental

movements). Consistent with their predictions, they found that the three types of task

were significantly correlated with age and verbal ability partialled out, implying that all

three abilities depend upon a common developmental factor. Perner and Lang (2000)

also report further relevant results from what was presumably a preliminary version of

the study, which showed that the knee-jerk reflex task still explained a significant

amount of variance in false belief performance beyond the EF tasks, suggesting that the

relationship between false belief and knee-jerk reflex understanding could not be

explained by any executive component in the knee-jerk task.

Perner’s account of the ToM-EF relationship has, like all the preceding

accounts, been subjected to a range of critiques. Russell (1997b; Russell et al., 2001)

has argued that the assumption that any behaviour with a second-order character (i.e.,

where the subject is required to represent to itself what it is doing and what needs to be

done) necessarily involves ToM is an unjustified over-stretching of the ToM concept.

He asserts further that action schemas or tendencies are not representations in any useful

sense, or at least, do not necessarily require metarepresentational understanding.

Russell has also criticised Perner’s interpretation of his result with the knee-jerk task,

arguing that the task could be considered to require inhibition of an answer based on

perceived outcome; and that the reason why it explains variance in the false belief task

beyond that explained by the EF tasks is that the response to be inhibited is verbal,

rather than a motor act as in Luria’s hand game (Russell et al., 2001; Russell, Hala, &

Hill, 2003). It could also be argued that the correct rejection of the reflex movement as

intentional requires action-monitoring (i.e., the ability to perceive the difference

between changes in experience caused by the self and the world), and therefore that the

results are also consistent with Russell’s agency theory.

A number of empirically based criticisms of Perner’s theory have been advanced

by Hughes (1998b) and Carlson and colleagues (Carlson & Moses, 2001; Carlson et al.,

2002). Firstly, Hughes’ (1998b) finding that early EF predicted later ToM but not vice

versa is inconsistent with the notion that ToM is a prerequisite for EF. Perner and Lang

(1999, 2000) attempt to reinterpret this finding in their favour by arguing that EF tasks

assess the understanding of mental states as causally efficacious as much as ToM tasks,

and that Hughes’ data can be explained by assuming that this metarepresentational

understanding occurs in reference to one’s own actions first, and in reference to others’

actions later. However, this is not consistent with findings that on the unexpected

contents (Smarties) and unexpected identity (appearance-reality) tasks, correct reporting

of one’s own previous belief develops at the same time as, or even after, the correct

prediction of others’ beliefs (e.g., Gopnik & Astington, 1988). Also, this interpretation

would mean that the findings of impaired EF and intact ToM in certain disorders, the

evidence used by Perner against Russell’s theory, would pose an equally difficult

problem for Perner: is it plausible that children could be impaired in a developmentally

precedent ability in comparison with intact performance on an ability which develops

later?

Secondly, Hughes (1998b) points out that significant improvements in inhibition

and goal-directed behaviour occur during infancy (see Section 2.2.2), long before

children acquire Perner’s representational theory of mind. Perner et al. (1999) provide a

more solid defence against this problem by distinguishing between “automatic inhibition”

(when a more highly activated schema naturally inhibits less activated competitors, or

relatively automatic suppression of motor or cognitive responses), which they argue is

tapped by the A-not-B task and other EF tasks used with infants, and “executive

inhibition” (when no alternative schema is automatically activated and a response must

be actively inhibited, or when there is deliberate suppression of a response to achieve an

internally represented goal), which is what is tapped by EF tasks mastered around the

age of 4, and which requires metarepresentational understanding.

A third empirical problem for Perner was discovered by Carlson and colleagues

(Carlson & Moses, 2001; Carlson et al., 2002), who found that their two types of

inhibition task showed differential relationships with ToM, but both required executive

inhibition and therefore according to Perner should have been equally related to ToM.

Perner et al. (2002b) themselves also found that performance on a go-nogo task which

required executive inhibition was not significantly correlated with false belief prediction

or card sorting performance, contrary to their predictions. Fourthly, as highlighted by

Hughes (1998b), young children may correctly verbalise their understanding of the rules

of a task but nevertheless demonstrate perseveration of the incorrect response (Zelazo et

al., 1996b), indicating that meta-intentional ability is not sufficient for strategic

performance on an EF task.

The existence of dissociations between ToM and EF whereby ToM is impaired

but EF is spared poses another difficulty for Perner, although, consistent with Perner’s

account, these cases appear to be rarer than cases of the reverse dissociation. A number

of studies of individuals with brain injuries have demonstrated a ToM-EF dissociation

such that ToM is impaired and EF intact (reviewed in the next section). In addition,

deaf children displaying intact EF performance still demonstrate a ToM impairment

(Remmel, 2003). However, it could be argued that the abnormal development of ToM

in deaf children has its basis in a different process to other conditions and is therefore a

poor example of this dissociation, as it appears to be impoverished language

development which underlies the delay in ToM rather than a metarepresentational

deficit (de Villiers, 2000).

Overall, then, while Perner’s theory of the ToM-EF relationship has an

interesting and well-developed conceptual basis, it has not as yet been able to

adequately refute conceptually grounded critiques or account for all the available data.

Perner and Lang (1999) suggest that both emergence accounts may be correct, such that

ToM and EF are interdependent: “an understanding of mental states as causally

efficacious is required for executive inhibition, and executive inhibition is a main

exercise ground for a theory of mind at this stage of development” (p. 342). However,

while both of the emergence accounts are strengthened by evidence suggesting a deep

link between ToM and EF during conceptual development, they are equally weakened

by evidence that ToM and EF may be dissociably impaired (this is discussed further

later).

2.3.1.4 Common neuroanatomical bases for ToM and EF

This category of explanation holds that correlations between ToM and EF may be

coincidental or accidental, occurring because both abilities depend upon the same or

proximal brain regions (Bach et al., 1998; Ozonoff, 1995a; Ozonoff et al., 1991;

Pennington et al., 1997). On this account, the concurrent developments in ToM and EF

which occur around the age of 4 are due to late maturation of these common brain

structures. It can also explain the frequent co-occurrence of ToM and EF impairments,

on the basis that proximal neuroanatomical structures will often be damaged together.

While the notion of common underlying brain regions is not inconsistent with any of the

other theories of the ToM-EF relationship, the idea that this is the only link between the

constructs is a possibility unique to this hypothesis. This account therefore allows for

dissociations between ToM and EF, although of course only if the brain regions in

question are not absolutely identical.

So what are the brain regions in question? Areas within the prefrontal cortex are

obvious suspects. In Sections 2.2.1 and 2.2.2, we saw that while EF tasks are not

necessarily sensitive or specific to the functioning of the prefrontal cortex, the view that

EFs rely upon the prefrontal cortex is fairly well established (e.g., Owen, Downes,

Sahakian, Polkey, & Robbins, 1990; see Stuss & Knight, 2002). Neuroimaging studies

and investigations of brain-damaged and psychiatric patients have converged on the

notion that the dorsolateral prefrontal cortex in particular is important for working

memory, problem-solving, attentional flexibility and planning (Burgess, 2000; Cabeza

& Nyberg, 2000; Collette & van der Linden, 2002; Dagher, Owen, Boecker, & Brooks,

1999; Fuster, 2000; Goldman-Rakic & Leung, 2002; Grattan, Bloomer, Archambault, &

Eslinger, 1994; Kane & Engle, 2002; Mega & Cummings, 1994; Weinberger, 2002).

An increasing number of recent neuroimaging studies of the brain regions

involved in ToM (which are generally adult studies comparing activation during

advanced ToM tasks with structurally similar tasks containing no mentalistic content)

have implicated a network of structures including the medial prefrontal cortex, anterior

cingulate cortex, superior aspects of the temporal lobes, and the temporal poles (Baron-

Cohen et al., 1994, 1999a; Brunet, Sarfati, Hardy-Baylé, & Decety, 2000; Castelli et al.,

2002; Fletcher et al., 1995; Gallagher et al., 2000; Goel, Grafman, Sadato, & Hallett,

1995; for reviews, see Abu-Akel, 2003; Frith & Frith, 2000; Gallagher & Frith, 2003;

Kain & Perner, 2003). In their review of neuroimaging studies of ToM, Gallagher and

Frith (2003) argue that the anterior paracingulate cortex (which is part of the medial

frontal cortex, and lies just anterior to the anterior cingulate cortex proper) is the crucial

region dedicated specifically to processing mental states, and the temporal regions

which are commonly activated in ToM tasks have more secondary functions such as the

interpretation of biological motion (which may be necessary to ascribe intentionality to

others) and episodic memory (which may be required to imagine ourselves in the

situation of another person). The orbitofrontal cortex and amygdala have also been

proposed to have a role in social cognition (Baron-Cohen & Ring, 1994; Brothers,

1996); however, activation in these areas is not seen in the majority of neuroimaging

studies of ToM. Gallagher and Frith (2003) conclude that while these areas may form

part of the social brain in general (with the amygdala involved in the automatic response

to socially salient stimuli as well as playing a key role in emotion, and the orbitofrontal

cortex in the processing of affective, particularly aversive, social stimuli), they are

unlikely to be directly involved in ToM. However, neuroimaging studies are of limited

utility in investigating the role of the orbitofrontal cortex in ToM as it is difficult to

obtain reliable activation maps for this region (Gregory et al., 2002).

The importance of the prefrontal cortex in ToM has also emerged in studies of

neurological patients, which have demonstrated significant ToM impairments in patients

with prefrontal damage (Bach et al., 1998; Channon & Crawford, 2000; Gregory et al.,

2002; Happé, Malhi, & Checkley, 2001; Lough, Gregory, & Hodges, 2001; Lough &

Hodges, 2002; Rowe et al., 2001; Stone, Baron-Cohen, & Knight, 1998; Stuss, Gallup,

& Alexander, 2001). However, the regions of the prefrontal cortex implicated in these

studies have been somewhat more ambiguous than those indicated in neuroimaging

studies of normal adults²⁴. Rowe et al. (2001) found no effect of the site of lesion

within the prefrontal cortex on ToM performance. Some studies have muddied the

issue by defining damage as being in the “orbitomedial” region, using the terms

ventromedial and orbitofrontal interchangeably, or focussing on the area of overlap

between the orbitofrontal and ventromedial regions (Gregory et al., 2002; Lough et al.,

2001; Lough & Hodges, 2002; Stuss et al., 2001). Studies by Happé et al. (2001) and

Stone et al. (1998) found that ToM is impaired following orbitofrontal lesions, contrary

to the findings of neuroimaging studies. Cicerone and Tanenbaum (1997) also describe

a patient with traumatic orbitofrontal injury who performed poorly on ToM-like tasks

requiring interpretation of social situations. Drawing on additional evidence that

patients with orbitofrontal damage commonly show marked changes in social

behaviour, Stone (2000) concludes that the orbitofrontal cortex is the most crucial

region for ToM. However, Bach, Happé, Fleminger, and Powell (2000) report a case of

an adult male with orbitofrontal damage who showed intact performance on ToM tasks

even though he displayed a disturbance in social behaviour. Eslinger (1998) also

reviews evidence showing that patients with orbitofrontal lesions have impaired

emotional or affective empathic processing, but intact performance on cognitive aspects

of empathic processing²⁵. Importantly, though, studies of neurological patients rarely

implicate the dorsolateral prefrontal cortex in ToM (although see Price et al., 1990).

²⁴ The laterality of ToM representation in the brain is also unclear, although it appears that patients with right hemisphere damage show ToM impairments more consistently than patients with left hemisphere damage (see Kain & Perner, 2003).

²⁵ Interestingly, consistent with the notion of a distinction between cognitive and affective aspects of empathy, Blair et al. (1996) found that psychopaths show intact performance on ToM tasks but do not show typical affective responses (as measured by physiological arousal) to images of individuals in distress.

Regardless of whether ToM relies more heavily on orbitofrontal or medial

frontal regions, more relevant are studies addressing the notion that ToM and EF are

related because they rely on proximal brain regions. Consistent with this hypothesis, a

number of studies have reported co-existing ToM and EF impairments in patients with

prefrontal damage not limited to specific dorsolateral, ventromedial or orbitofrontal

regions (Bach et al., 1998; Channon & Crawford, 2000; Gregory et al., 2002; Rowe et

al., 2001; Saltzman et al., 2000). Gregory et al. and Rowe et al. both found that while

their frontal patients scored poorly on both ToM and EF tasks, these deficits were not

significantly correlated, consistent with the idea that the two types of task may rely on

different aspects of the prefrontal cortex. This specialisation within the prefrontal

cortex has also been supported by studies demonstrating dissociations between ToM

and EF in patients with damage to specific prefrontal regions. Case and group studies

have demonstrated specific impairments in ToM in the face of intact EF in patients with

frontotemporal dementia (Lough et al., 2001; Lough & Hodges, 2002). In addition,

Fine, Lumsden, and Blair (2001) reported a case of a patient with congenital amygdala

damage who demonstrated impaired ToM but intact EF. The reverse dissociation, of

intact ToM with impaired EF, was also reported by Bach et al. (2000). A double

dissociation of sorts was demonstrated by Stone et al. (1998), who found that while

patients with orbitofrontal damage failed ToM tasks regardless of their working

memory load, dorsolateral prefrontal patients displayed impaired ToM performance

only under conditions where the working memory load was high (moreover, under these

conditions they made errors on control questions as well as belief questions).

This account of the ToM-EF relationship has therefore been fairly resistant to

criticism, as it is able to explain both the relationships and the dissociations between

ToM and EF. Of course, in addition to the lack of consistency in defining what

constitutes an impairment, the problem of equating ToM and EF tasks for difficulty

should be noted as a caveat in interpreting the dissociations observed in neurological

patients (and any other clinical samples or individuals), although most studies are

careful to note the absence of any floor or ceiling effects. Aside from this, Perner and

Lang (2000) have argued that the theory lacks strong predictive value, as any kind of

task association or dissociation is compatible with it. Perner (2000) adds that while it

accounts for a general developmental relationship between ToM and EF based on

common timing of the maturation of prefrontal structures, it does not specifically

predict that false belief tasks and “executive inhibition” should be mastered at the same

time. He also points out that environmental factors such as number of older siblings

influence the age at which false belief tasks are mastered (Perner, Ruffman, & Leekam,

1994; Ruffman, Perner, Naito, Parkin, & Clements, 1998). However, although a number of

other extraneous factors also influence ToM development (e.g., language, visual

perception), this does not preclude the notion that ToM has a specific neuroanatomical

basis which is proximal to the structures involved in EF.

So, what can we conclude overall about the relative strength of the various models of

the ToM-EF relationship? As we have seen, no account has eluded criticism. Leaving

aside the problem of defining “impairment” and equating ToM and EF tasks for

difficulty, one recurrent theme is that the explanation must account for the observed

correlations between ToM and EF and the frequent co-occurrence of deficits in the two

areas, as well as allowing for dissociable impairments in either direction. The

“emergence” accounts in particular are weakened by evidence of dissociations, as they

both imply a fundamental conceptual dependence between the two constructs during

development. Similarly, dissociations are problematic for the “common conceptual

requirements” accounts particularly if the tasks on which dissociations are demonstrated

are both purported to rely on the same third underlying mechanism. While the

“common neuroanatomical bases” explanation accounts best for these data, it lacks

specificity in its predictions regarding the components of EF and types of ToM task

which should be related. Also, it does not specifically account for data showing that

performance on false belief and deception tasks improves when the EF requirements are

reduced. The “expression” accounts do not strictly predict ToM-EF dissociations as one

would expect that performance on ToM tasks would be affected if EF is impaired

(although it would be acceptable for ToM to be impaired while EF is intact), but if the

account is limited to specific aspects of EF such as inhibition or working memory, then

dissociations with other EF components would be allowable. Although this account has

been criticised on the basis of evidence suggesting that the ToM-EF link extends

beyond the level of the EF requirements of ToM tasks, those findings do not exclude the

possibility that there may be both performance-based and deeper conceptual or

functional (and neuroanatomical) links.

It is also possible that ToM and EF may be dependent on each other for their

initial development but then become separable processes when matured (linked only by

the EF requirements of ToM tasks and/or their common neuroanatomical substrates) –

in which case emergence accounts would not be inconsistent with the existence of

dissociations in adults or children over the age of five (in whom ToM and EF had

previously developed normally). Perner et al. (2002a) do not appear to concur with this

possibility, implying that functional dependence should extend across the life span²⁶.

However, Karmiloff-Smith (1992; Karmiloff-Smith, Scerif & Ansari, 2003; Thomas &

Karmiloff-Smith, 2002; see also Bishop, 1997) has argued persuasively that processes

which are dissociable in adulthood cannot be assumed to be so during development and

that “a difference in performance....at any point in development does not permit the

inference of a stable double dissociation at a later or earlier time” (Karmiloff-Smith et

al., 2003, p. 162). This argument implies that double dissociations between

ToM and EF observed during middle childhood and adulthood do not necessarily mean

that the two abilities were not interrelated during earlier stages of development²⁷. It is

plausible that performance-based, conceptual, functional, and

neuroanatomical factors interact and combine to produce the observed relationships

between ToM and EF during different stages of development and in different disorders.

This remains a speculative proposition, however – the nature or existence of the

relationship between ToM and EF beyond the preschool period has been largely

overlooked by all of the main accounts, which have focused in particular on the

relationship between false belief and certain aspects of EF (inhibition, working memory,

conditional reasoning, monitoring) between the ages of 3 and 5, without generating or

testing specific predictions about the ToM-EF relationship in later childhood,

adolescence and adulthood (or addressing the role of components of EF such as

generativity and flexibility, which are still developing during late childhood and

adolescence).

²⁶ If Perner’s account is viewed as a “common conceptual requirements” account, this claim becomes somewhat more defensible. However, he suggests that this is the case for both his and Russell’s accounts.

²⁷ Another example of this concept is that visuospatial skills are functionally dependent upon basic vision for their typical development (e.g., Vecchi, 1998), but in adult disorders it is possible for visuospatial processing to be impaired without disruption to vision itself, such as in cases of spatial neglect (see Heilman, Watson, & Valenstein, 1993).

Only three studies provide separate data on correlations between ToM and EF

for non-clinical controls older than 5 years, with inconsistent results. Perner et al.

(2002a) found a number of significant correlations between a second-order false belief

task and a range of EF measures in typically developing 4.5- to 6.5-year-olds, while

Charman et al. (2001) did not find any significant correlations between advanced ToM

stories and measures of inhibition and planning in their 8- to 10-year-old typically

developing controls. The only available adult data are from Channon and Crawford

(2000), who report significant correlations for their healthy adults (with a mean age of

43 years) between advanced ToM stories and two measures of generativity, but not

other EF measures of flexibility, inhibition and planning. These data suggest that the

nature of the ToM-EF relationship may not be the same for older children and adults as

for young children, and therefore that ToM-EF dissociations in these age groups may

not be easily interpreted in terms of theories based on the preschool period. Notably,

almost all reported ToM-EF dissociations have occurred in individuals or samples older

than 5 years. The only exception is Hughes et al.’s (1998) study on “hard-to-manage”

preschoolers, which found that these children showed largely intact performance on

ToM tasks in comparison with impaired performance on EF tasks, but nevertheless that

ToM and EF were correlated in this group. Evidently, further studies on older age

groups are necessary to delineate the nature of the ToM-EF relationship beyond the

early years and the meaning and implications of ToM-EF dissociations for the various

accounts of the ToM-EF relationship. This is particularly important for the

interpretation of studies on the ToM-EF relationship in autism, which have been

conducted largely on older children.

2.3.2 The ToM-EF relationship in autism

As a developmental disorder characterised by both ToM and EF impairments, autism

provides an interesting test case for the various accounts of the ToM-EF relationship,

each of which generates different predictions about the relationship between ToM and

EF deficits in autism. The nature of the relationship is highly relevant to the evaluation

of hypotheses of autism which propose a primary deficit in either ToM or EF. A single

primary cognitive deficit account of autism needs to demonstrate that one impairment

subsumes or explains the other; and conversely, a multiple cognitive deficits account

would be consistent with evidence suggesting that ToM and EF are (at least partially)

independent deficits in autism.

Surprisingly, only a few studies have directly measured correlations between

ToM and EF in autism, with many authors relying on the mere existence of both ToM

and EF impairments in autism, other indirect evidence, or the theories and evidence

generated from the study of typically developing children to argue for their position.

Those who claim that ToM and EF are related deficits in autism rely heavily on Ozonoff

et al.’s (1991) finding that ToM and EF were correlated in their sample of high-

functioning individuals with autism. However, this was based on single ToM and EF

composite scores, therefore obscuring the specific nature of the relationship; and

furthermore, age was not partialled out of the correlation, leaving open the possibility

that the correlation may have been mediated by age.
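The concern about age can be made concrete with a first-order partial correlation. The following is a minimal sketch in Python, not an analysis from any of the studies cited: the scores are invented purely for illustration, and the names `pearson` and `partial_corr` are hypothetical helpers. In this constructed example both ToM and EF scores rise with age, so the raw ToM-EF correlation is substantial even though little association remains once age is partialled out.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y with a single covariate z partialled out."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented illustrative scores (assumption: not real participant data).
# Each score is age plus a small disturbance; the two disturbances are
# uncorrelated with each other.
age = [6, 7, 8, 9, 10, 11, 12, 13]
tom = [7, 6, 9, 8, 11, 10, 13, 12]
ef = [7, 8, 7, 8, 11, 12, 11, 12]

print("raw ToM-EF correlation:", round(pearson(tom, ef), 2))
print("ToM-EF correlation, age partialled out:", round(partial_corr(tom, ef, age), 2))
```

A composite-score correlation of the kind described above corresponds to the first printed value; the second shows how much of it can disappear when a shared covariate is controlled.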

The evidence regarding the nature of the ToM-EF relationship in autism will be

reviewed by examining the predictions of each of the accounts reviewed in the previous

section for the ToM-EF relationship in autism, and how well these predictions fit with

the available data.

i) Expression accounts. The idea that the failure of children with autism on

ToM tasks may be at least partially due to difficulties with their EF requirements has

been most overtly advocated by Russell and colleagues (Hughes & Russell, 1993;

Russell, 1997b; Russell, Saltmarsh, & Hill, 1999). In favour of this, Hughes and

Russell (1993) found that participants with autism continued to fail a test of strategic

deception (the windows task) even when there was no opponent present. Russell et al. (1999)

found that children with autism demonstrated significantly poorer performance than

controls on the conflicting desire task used by Moore et al. (1995), suggesting that their

difficulty with the false belief task is not restricted to a lack of understanding of the

representational nature of belief. Charman and Lynggaard (1998) also found that the

performance of children with autism on the Smarties task was enhanced by the

provision of a photographic cue which (arguably) reduced the working memory and

inhibitory demands of the task, although Bowler and Briskman (2000) were not able to

replicate this effect using the standard Sally-Anne false belief task.

Although this evidence indicates only that children with autism show

impairments on tasks with both ToM and EF requirements as well as on tasks with only

EF requirements, some proponents of the EF hypothesis of autism have nevertheless

suggested that children with autism may fail ToM tasks because of their EF

requirements (e.g., Ozonoff, 1997a; Russell et al., 1999). One problem with this

account, which has been overlooked by all of its critics, is that children with autism do

not tend to demonstrate impairments on tests of inhibitory control or working memory

(see Section 2.2.3), the main EF components implicated in expression accounts of the

ToM-EF relationship. Although Russell and his colleagues do not directly address this,

they implicitly sidestep it by interpreting their findings on the strategic deception and

conflicting desire tasks in terms of a difficulty with mental disengagement rather than

emphasising inhibition, an interpretation which is a little more consistent with the

attentional shifting difficulties more consistently displayed by individuals with autism.

Also, no studies have directly examined the performance of children with autism on

tests combining inhibitory and working memory requirements (in comparison with tests

tapping one or the other), which, as we have seen, appears to be more relevant to false

belief performance.

The most common argument advanced by critics of the view that EF

impairments may explain the poor performance of children with autism on ToM tasks,

however, is that these children are able to pass the “false photograph” task (described in Section

2.1.3). The claim is that the false photograph task has an identical task structure to the

false belief task, and therefore that their failure on false belief tasks cannot be due to

their EF requirements (Baron-Cohen & Swettenham, 1997; Leslie & Roth, 1993; Leslie

& Thaiss, 1992). However, a number of criticisms of the false photograph task have

been put forth in return. Pennington et al. (1997) argued that the “false” photograph is

not actually false: it does not misrepresent current reality because the nature of

photographs is that they do not portray current reality, and therefore the adequate

performance of children with autism could be explained by their intact understanding of

cameras. Pennington et al. and Hughes et al. (1994) both also maintain that the camera

and photograph are perceptually salient, available and enduring to participants in a way

that inferred beliefs are not. Similarly, Russell (1997b) claims that the inhibitory

demands made by the false photograph task are far weaker than those made by the false

belief task, as the participant is required only to inhibit their current perception of a

three-dimensional representation (i.e., a toy) in order to refer to what is known about a

two-dimensional representation (i.e., a photograph of a different toy). This claim was

tested by Russell et al. (1999) by using a modified version of the false photograph task

in which the initial photograph was taken of a blank wall, designed to increase the

relative salience of the current representation (a three-dimensional doll) in comparison

to the old one (where nothing was present). They found that while children with autism

were able to pass the standard false photograph task, they demonstrated impaired

performance on the modified version, indicating that when the inhibitory demands of

the task more closely matched those of the false belief task, children with

autism could not sustain intact performance. Russell (1997b; Russell et al., 1999) has

nevertheless made it clear that he is not of the view that the ToM deficits displayed by

individuals with autism are entirely due to EF impairment, or that if the EF demands of

ToM tasks were removed, then autistic individuals would show normal performance.

Although Leslie and colleagues present an expression account of the ToM-EF

relationship in typical development, they of course do not subscribe to this view of ToM

failures in autism. While they propose that 3-year-olds fail false belief tasks because of

an impaired Selection Processor (SP), they argue that children with autism have an

intact SP but instead fail false belief tasks because of impaired metarepresentational

capacity or ToMM (Leslie & Thaiss, 1992; Leslie & Roth, 1993). In support of their

view, Leslie and colleagues have reported evidence that children with autism do not

benefit from helpful task modifications as 3-year-olds do, and that while 3-year-olds

will attribute beliefs to others even if they are incorrect, children with autism will not

attribute any beliefs at all (Leslie & Roth, 1993; Roth & Leslie, 1998; Surian & Leslie,

1999). Leslie and colleagues acknowledge the presence of EF impairments in autism,

but in their model, these are independent from the SP and from ToMM. They subscribe

to a view of EF as a fractionated system whereby children with autism are impaired in

some EF components but not those involved in the SP (Leslie & Roth, 1993). If the SP

is considered to be an inhibitory mechanism, this in fact fits quite well with the

literature suggesting that inhibition is intact in autism.

ii) Common conceptual requirements of ToM and EF. In discussing the

applications of their CCC theory for autism, Zelazo, Frye and colleagues have strongly

advocated the role of domain general processes (such as rule-based reasoning) in the

cognitive aetiology of autism and argued against the conception that autism is

characterised by a domain specific impairment in a theory of mind module (Frye et al.,

1998; Zelazo et al., 1996a, 2001). Zelazo et al. (2002) tested the hypothesis that

individuals with autism may fail ToM tasks because of domain-general difficulties in

rule use by examining correlations between performances on two false belief tasks, the

physical causality task, and the DCCS task. They found that the correlation between

ToM and rule use tasks was not significant for severely impaired individuals with

autism (due to floor effects on most tasks), but was significant for their mildly impaired

group. They interpreted this result as indicative of the lack of domain-specificity of

ToM impairments in autism, and furthermore, argued that ToM deficits in autism may

be accounted for by rule-based reasoning impairment.

Besides the study’s small sample size (with only 10 mildly impaired

participants) and the lack of a control group (necessary to ensure that any difficulties

displayed are connected to autism; Colvert, Custance, & Swettenham, 2002), an

important limitation of Zelazo et al.’s (2002) study is that they did not partial out age or

IQ variables in their correlations. A study reported by Colvert et al. (2002) which

addressed these limitations nevertheless replicated the result with 20 high-functioning

children with autism, finding significant correlations between false belief and DCCS

performance with age, verbal and non-verbal ability partialled out. However, as Colvert

et al. point out, further research is needed to investigate what other factors (e.g.,

inhibition, salience of the switch of setting conditions) might account for these

correlations, particularly in light of the criticisms of CCC theory outlined in the

previous section’s review. In addition, as for dissociations observed in other disorders,

the presence of ToM-EF dissociations in autism (reviewed below) challenges the notion

that the two abilities depend upon a common conceptual ability.

Halford has not discussed or tested the implications of his relational complexity

account (Halford, 1993; Halford et al., 1998) for autism. His proposal implies that

individuals with autism may demonstrate limited relational complexity, a prediction

awaiting empirical confirmation.

iii) Emergence accounts. In his account of why a sense of internal agency is

a necessary prerequisite for the development of ToM, Russell (1997b) specifically

posited that autism may be a disorder characterised by impaired action monitoring and

instigation, and therefore that these deficits may underlie the abnormal development of

ToM²⁸. Previous studies showing deficits in imitation (Smith & Bryson, 1994) and

motor planning (Hughes, 1996a) in autism are consistent with this hypothesis. Russell’s

first direct investigations of action monitoring in autism were promising, although not

compelling (he has not studied instigation deficits, saying that this has been adequately

covered by others under the guise of “generativity”). Russell and Jarrold (1998) found

that on a task involving the launching of missiles towards targets, children with autism

failed to correct errors based on both external and internal feedback. This was

interpreted as indicating an impairment in constructing visual schemata of motor acts,

which are necessary for action monitoring (although the authors acknowledged that

their data could also be consistent with a deficit in flexibility). Russell and Jarrold

(1999) tested higher-level self-monitoring by using a task requiring children with autism

to recall whether they or another person had performed a certain action. Consistent with

their predictions, they found that children with autism demonstrated impaired

performance on this task, suggesting that they were failing to monitor their actions as

their own. However, they also demonstrated some subtle difficulties on memory-based

control tasks. More recent studies have not been so favourable towards Russell’s

theory. Using a range of tasks including monitoring of basic actions, reporting an

intention when the outcome was unintended but desired, and reporting on intended

actions when the action achieved was unexpected, Russell and Hill (2001) did not find

any strong evidence of monitoring impairments in children with autism. Similarly, Hill

and Russell (2002) did not find evidence for a self-monitoring impairment in autism

using a test of memory for actions which involved a self/other source attribution (i.e., a

judgement of who performed the act), inconsistent with Russell and Jarrold’s (1999)

results. These failures to meet the predictions of Russell’s (1997b) theory have led him

to reconsider his conceptualisation of the core EF deficit in autism. Consistent with

Ozonoff (1997b), Russell and Hill (2001) proposed that set-shifting or flexibility may

instead be the core impairment in autism, and that this deficit may have a “homologous”

rather than a “causal” or functional relationship with ToM. They propose that if one

assumes that cognition is a form of set-shifting between domains, then children who are

mentally inflexible would find it challenging to reflect on mental acts (their own and

other people’s).

²⁸ Russell (1997b; Russell & Hill, 2001) also argues that these deficits can account for the range of other EF deficits displayed by individuals with autism. He proposes that action-monitoring and instigation underlie the development of verbal self-regulation (or “inner speech”), which in turn is necessary to hold in mind arbitrary rules. Therefore, individuals with autism are impaired on EF tasks which have arbitrary rules.

Other proponents of the EF hypothesis of autism have presented alternative

emergence accounts of the ToM-EF relationship in autism, although these have not been

as extensively developed as Russell’s either conceptually or empirically. Hughes and

Russell (1993) suggested that a child with an impairment in dealing with novelty and

making decisions due to a damaged SAS (Norman & Shallice, 1980, 1986) would fail to

develop successful social relations, with the developmental outcome being an impaired

ToM. Pennington et al. (1997) proposed that autism is characterised by a severe deficit

in working memory, which results in an early disruption in the planning and execution

of complex behaviour. Because this occurs early in development, it affects the

acquisition and use of concepts that require the integration of information across time

and contexts. Concepts with these requirements include a recognition of one’s own and

others’ intentions and their correspondence or conflict, which is involved in imitation as

well as later ToM abilities. Hence, an early impairment in working memory would

result in the development of a mentalising impairment. An obvious challenge for this

account is the absence of convincing evidence for a working memory deficit in autism

(in the absence of any inhibitory requirements). Ozonoff and McEvoy (1994) suggested

that an early and persistent impairment in the ability to disengage from the external

environment and guide behaviour by internal mental models (see Harris, 1993) would

have significant consequences for the ability to appreciate others’ perspectives (which

requires disengagement from one’s own prepotent thoughts). A problem for all of these

accounts, however, is the lack of convincing evidence of early EF deficits in autism (see

Section 2.2.3), which speaks against the notion of EF impairment as being causally

primary.

The existence of ToM-EF dissociations in individuals with autism, whereby EF

is impaired but ToM intact, presents additional difficulties for these accounts (which in

one way or another all propose that a primary EF impairment underlies the abnormal

development of ToM), although only one study has reported data relevant to this

dissociation. Ozonoff et al. (1991) found that in their sample of high-functioning

individuals with autism, EF deficits were almost universal but ToM deficits only

occurred in a subset of the sample, with the implication being that some individuals

failed EF tasks while passing ToM tasks. This finding is inconsistent with the view that

EF is a necessary prerequisite for ToM and that early EF deficits result in later ToM

impairment. Ozonoff et al. conclude that while EF deficits are primary in autism, they

are unlikely to be causally related to ToM (they adopt a “common neuroanatomical

bases” position, as reviewed below). However, it is possible that differences in the level

of difficulty of the two sets of tasks may account for the pattern of results (Perner &

Lang, 2000).

The reverse emergence account proposed by Perner and colleagues has not been

directly examined with autistic individuals, although they have implied that a ToM (or

metarepresentational) impairment may underlie EF deficits in autism just as for 3-year-

olds (Perner, 1998; Perner & Lang, 2000). The most obvious difficulty with the

application of this account to autism is that children with autism have shown intact

performance on tests which may be regarded as measuring Perner’s “executive

inhibition” (e.g., Ozonoff & Strayer, 1997), which his theory would not predict. ToM-

EF dissociations in autism whereby ToM is impaired but EF intact also contradict his

notion that metarepresentational ability is a prerequisite for EF development. Baron-

Cohen and Robertson (1995) reported a case of a child with autism who failed several

ToM tasks but performed successfully on EF tasks, and Baron-Cohen, Wheelwright,

Stone, and Rutherford (1999b) report the same dissociation in three high-functioning

adults with autism. Of course, the small number of individuals for whom this

dissociation has been noted limits the generalisability of these findings. Perner’s theory

of the ToM-EF relationship in autism requires a direct and systematic investigation

before it may be adequately evaluated, although the available evidence is not overly

favourable towards it.

iv) Common neuroanatomical bases for ToM and EF. The possibility that

ToM and EF impairments may co-occur in autism because of their proximal

neuroanatomical substrates was first proposed by Ozonoff et al. (1991; see also Bishop,

1993). However, Ozonoff (1997a; Ozonoff & McEvoy, 1994) subsequently put forward

an opinion that there may be both performance-based and conceptual links as well.

Baron-Cohen and Swettenham (1997), on the other hand, clearly expressed the view

that ToM and EF are best conceptualised as independent deficits in autism, which

probably co-occur because of their shared frontal origins.

While prefrontal abnormalities have been found in individuals with autism (as

discussed in Section 2.2.3), more convincing evidence for this class of explanation

would involve demonstrating both dorsolateral and medial or orbitofrontal impairment,

as the purported substrates for EF and ToM respectively. Two studies provide indirect

evidence for dorsolateral abnormalities in autism. Luna et al. (2002) found significantly

reduced activation in the dorsolateral prefrontal cortex in individuals with autism during

the performance of a spatial working memory task. Goldberg et al. (2002) also

interpreted the presence of impairments on an eye movement anti-saccade task as

suggestive of dorsolateral prefrontal dysfunction in autism. The idea that autism may

involve medial frontal or orbitofrontal dysfunction has been suggested by a number of

authors (e.g., Bachevalier & Loveland, 2003; Damasio & Maurer, 1978; Mundy, 2003)

on the basis of the region’s apparent role in social behaviour. In a series of studies,

Dawson and colleagues (Dawson et al., 1998, 2002a; Dawson, Osterling, Rinaldi,

Carver, & McPartland, 2001) obtained indirect evidence of ventromedial prefrontal

dysfunction in early autism by demonstrating impairments on tasks previously found to

be linked with ventromedial functioning. Two studies have also found reduced

activation in medial frontal areas during the performance of ToM tasks in individuals

with autism (Castelli et al., 2002; Happé et al., 1996). While these studies do not

provide unequivocal evidence of dorsolateral and medial/orbitofrontal abnormalities in

autism, they are at least consistent with the possibility that ToM and EF impairments

may co-occur in autism because of damage to their proximal neural substrates.

Overall, then, we can conclude only that the nature of the relationship between ToM and

EF in autism remains unclear. The only studies to have conducted direct correlations

between ToM and EF in autism have either failed to partial out the effects of age and IQ

variables (Ozonoff et al., 1991; Zelazo et al., 2002), not examined specific relationships

with components of EF (Ozonoff et al., 1991), or only included one type of EF task

(Colvert et al., 2002). Studies addressing expression accounts of the ToM-EF

relationship have not yet systematically varied the EF requirements of ToM tasks, and

the emergence accounts again struggle with the presence of ToM-EF dissociations in

autism as well as being inconsistent with some of the available data. The accounts

which propose that ToM and EF are related in autism, while intuitively appealing,

therefore remain open to further investigation. Interestingly, these accounts mostly

originate from proponents of the EF hypothesis of autism, who argue that EF deficits

may explain ToM impairments in autism (either because of performance-based or

functional/developmental links). Notably, Baron-Cohen and Leslie - the most

prominent proponents of the ToM hypothesis of autism - both adhere to the view that

ToM and EF are independent deficits in autism. The independence and relative primacy

of ToM and EF in autism are clearly important matters awaiting further empirical work.

These matters are addressed in the current research.

CHAPTER 3

Selection and Description of Measures

3.1 Diagnostic measures

3.1.1 Autism Screening Questionnaire

3.1.2 Autism Diagnostic Interview – Revised

3.2 IQ measures

3.3 ToM measures

3.3.1 Simple false belief task

3.3.2 First-order false belief task

3.3.3 Second-order false belief task

3.3.4 Dewey Stories

3.4 EF measures

3.4.1 Tower of London

3.4.2 Intra-dimensional, Extra-dimensional Set-shifting task

3.4.3 Response Inhibition and Load task

3.4.4 Opposite Worlds

3.4.5 Relational Complexity

3.4.6 Pattern Meanings

3.4.7 Uses of Objects

3.4.8 Stamps task

3.5 Behavioural measures

3.5.1 Measures of repetitive behaviour

3.5.1.1 Repetitive Behaviours Questionnaire

3.5.1.2 Repetitive Behaviours Interview

3.5.2 Measures of social behaviour and communication

3.5.2.1 Social Behaviour Questionnaire

3.5.2.2 Social and communication ADI-R domains

Both of the studies contained in this thesis involve the use of a large range of cognitive,

behavioural, and diagnostic measures. This chapter is devoted to a comprehensive

discussion of each set of measures, including a rationale for the selection of each

measure, a brief overview of its psychometric properties where possible, and a detailed

description of what it entails. This precedes the chapters describing the two studies

partly because of its length (due to the number of measures involved), and partly

because the measures used are common to both studies. The reader is encouraged to

refer back to this chapter when the procedure and results of the studies in Chapters 4

and 6 are being discussed.

3.1 Diagnostic measures

3.1.1 Autism Screening Questionnaire1 (ASQ; Berument, Rutter, Lord, Pickles

& Bailey, 1999)

The ASQ was developed as a screening instrument for autism and other PDDs, based on

current diagnostic criteria and for use with all age groups. It is a 40-item questionnaire

completed by the individual’s primary caregiver. The questions are based on the

Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter, & Le Couteur, 1994 – see

the following section) but have been modified to be easily understandable in a

questionnaire format. It includes questions on reciprocal social interaction, language

and communication, and repetitive and stereotyped behaviours. There are two versions,

one for individuals under the age of six and the other for those aged six and over. A

score of 1 is assigned for the presence of abnormal behaviour and a score of 0 for its

absence. The total score therefore ranges from 0 to 39 (an item on current language

level is not included in the score). Berument et al. found that a score of 15 or more on

the ASQ was the optimal cutoff point for differentiating ASDs from other diagnoses.

The ASQ shows good diagnostic validity and the correlation between the ASQ total

score and the ADI algorithm score is high (Berument et al., 1999).

1 The ASQ has now been published as the Social Communication Questionnaire; however, as the version of the questionnaire used in this research was obtained from the authors prior to publication, the ASQ was deemed to be a more appropriate descriptor.

In the current research, the ASQ was used mainly as a screening instrument to

determine whether or not individuals in the control group (or siblings of individuals

with ASDs and control siblings in the second study) displayed symptoms of autism. If a

participant in one of these groups met the cutoff criterion for an ASD on the ASQ, the

ADI-R was then administered. The ASQ cutoff point was lowered to a more

conservative score of 10 (rather than 15) in this research, to ensure that both i) any

controls scoring highly on the ASQ did not meet ADI-R criteria for an ASD, and ii) any

individuals with mild ASD symptomatology met the criterion and were administered the

ADI-R.
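The screening rule described above can be summarised in a short sketch. It assumes that "meeting the cutoff" means a total at or above the cutoff (as with Berument et al.'s "15 or more"); the function and constant names are illustrative only and are not part of the ASQ materials.

```python
from typing import Sequence

PUBLISHED_CUTOFF = 15  # optimal cutoff reported by Berument et al. (1999)
STUDY_CUTOFF = 10      # lowered cutoff adopted in this research

def asq_total(item_scores: Sequence[int]) -> int:
    """Sum the 39 scored items (1 = abnormal behaviour present, 0 = absent).

    The current-language-level item is assumed to have been left out already.
    """
    if len(item_scores) != 39:
        raise ValueError("expected 39 scored items")
    if any(score not in (0, 1) for score in item_scores):
        raise ValueError("each item must be scored 0 or 1")
    return sum(item_scores)

def needs_adi_r(total: int, cutoff: int = STUDY_CUTOFF) -> bool:
    """Return True if the total meets the cutoff (assumed: total >= cutoff)."""
    return total >= cutoff
```

Under these assumptions, a control participant with a total of 12 would be followed up with the ADI-R in this research, but would not have met the published cutoff of 15.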

3.1.2 Autism Diagnostic Interview – Revised (ADI-R; Lord et al., 1994)

The ADI-R is a modified version of the ADI (Le Couteur et al., 1989), which is a

standardised, semistructured interview for caregivers of individuals for whom autism is

a possible diagnosis. The diagnostic algorithm is based on ICD-10 (WHO, 1992)

criteria for autism, but it can also provide a DSM-IV (APA, 1994) diagnosis as the two

diagnostic systems are very similar. It has demonstrated good reliability and validity

(Lord et al., 1994). The duration of the interview for a practised administrator is

approximately 1.5 – 2 hours. Special training is required for administrators and

approval for use of the instrument is given after completion of a test interview.

The ADI-R consists of five sections: opening questions, communication (both

early and current), social development and play (early and current), repetitive and

restricted behaviours (early and current), and general behaviour problems. Each item is

scored either 0 (behaviour not present), 1 (behaviour probably present but criteria not

fully met), or 2 (behaviour definitely present), and occasionally a score of 3 is used to

indicate extreme severity. An algorithm cutoff score determines whether an individual

meets diagnostic criteria within each of the three domains of abnormality (i.e.,

communication, social interaction, and repetitive/restricted behaviours). In order to

meet diagnostic criteria for autism, the individual must meet criteria in each of these

three domains, as well as exhibiting some abnormality in at least one area by 36 months

of age.
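The decision logic of the algorithm can be sketched as follows. The sketch takes as input whether each domain cutoff has been met (the cutoff values themselves are not reproduced here), and it also covers the more lenient "partial criteria" rule adopted in this research for ASDs other than autism. Whether the 36-month onset requirement applies to the partial rule is not stated in the text, so the sketch assumes it does not; all names are illustrative.

```python
from typing import Mapping

DOMAINS = ("communication", "social_interaction", "repetitive_restricted")

def adi_r_classification(domain_met: Mapping[str, bool],
                         abnormality_by_36_months: bool) -> str:
    """Classify an ADI-R outcome from per-domain cutoff results.

    "full criteria": all three domain cutoffs met plus early abnormality.
    "partial criteria": at least one domain cutoff met (assumption: the
    onset requirement is not applied to this more lenient rule).
    """
    met = [domain_met[domain] for domain in DOMAINS]
    if all(met) and abnormality_by_36_months:
        return "full criteria"
    if any(met):
        return "partial criteria"
    return "criteria not met"
```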

The application and utility of the ADI-R in diagnosing ASDs other than autism

and in differentiating ASD subtypes have not yet been investigated. However, as

individuals with clinical diagnoses of Asperger syndrome and PDDNOS were included

in the current research, a more lenient ADI-R diagnostic criterion was introduced such

that any individual exceeding the cutoff point in at least one of the three domains was

considered to have met criteria for an ASD. Section 4.3.2 in Chapter 4 describes the

results of comparisons between the “full criteria” and “partial criteria” groups in Study

3.2 IQ measures

Two Verbal and two Performance subtests from the Wechsler scales (WPPSI-R, WISC-

III or WAIS-III, depending on the participant’s age) were used to estimate Verbal IQ

(VIQ) and Performance IQ (PIQ) respectively. VIQ and PIQ scores were estimated by

pro-rating sums of scaled scores based on the two subtests for each scale. Verbal

subtests were Vocabulary (providing definitions of words) and Similarities (identifying

the way in which two things are alike). Performance subtests were Picture Completion

(identifying the missing part of a picture) and Object Assembly (assembling pieces of a

puzzle to form a whole object). These subtests were chosen because they are

representative tests of verbal and non-verbal ability2, as well as being the least similar to

other measures in the test battery.
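The pro-rating step is simple arithmetic and can be illustrated with a minimal sketch. It assumes the usual convention of scaling the sum of the administered subtests up to the number of subtests that normally contribute to the scale; that number differs between Wechsler versions and is therefore left as a parameter. Converting the pro-rated sum to an IQ score still requires the norm tables in the relevant test manual, which are not modelled here.

```python
from typing import Sequence

def prorated_sum(scaled_scores: Sequence[float], n_full_subtests: int) -> float:
    """Scale the sum of administered subtest scaled scores to a full-scale sum.

    n_full_subtests is the number of subtests normally contributing to the
    scale (an assumption supplied by the caller, taken from the test manual).
    """
    if not scaled_scores:
        raise ValueError("at least one subtest score is required")
    return sum(scaled_scores) * n_full_subtests / len(scaled_scores)
```

For example, scaled scores of 12 and 10 on two subtests of a hypothetical five-subtest scale give a pro-rated sum of 55.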

3.3 ToM measures

Three different false belief tasks, varying in level of difficulty, were selected as the

main ToM measures for both studies. Emphasis was placed on measures of false belief

as these have been the main focus of studies of the ToM-EF relationship. The tasks

chosen were all in common usage in the literature (unexpected contents/identity,

standard first-order false belief, and second-order false belief; these are all described in

detail below). A more advanced social cognition measure (Dewey Stories) was also

included because of expected ceiling effects on false belief tasks in older control

participants.

2 In the WISC-III (which was used with the most participants), the Vocabulary subtest loads .81 on the VIQ factor and .79 on the Verbal Comprehension (VC) index; and the Similarities subtest loads .75 on the VIQ factor and .72 on the VC factor. The Object Assembly subtest loads .66 on the PIQ factor and .69 on the Perceptual Organisation (PO) index; and the Picture Completion subtest loads .50 on the PIQ factor and .53 on the PO factor. Although the Block Design subtest has the highest loading on the PIQ and PO factors, it was not chosen because it is considered to be a measure of central coherence and therefore would have complicated the interpretation of PIQ scores.

The three false belief tasks were administered in a hierarchy of difficulty, with

different starting points for participants of different ages. This was done mainly to

conserve time within the extensive test battery. The “simple” false belief task

(including unexpected contents and unexpected identity items, as described below) was

considered the easiest of the three tasks, with first-order false belief (or unexpected

transfer) next in the hierarchy and second-order false belief the most difficult. The

more challenging nature of second-order false belief tasks was demonstrated by Perner

and Wimmer (1985) in typically developing children and Baron-Cohen (1989b) in

children with autism. The level of difficulty of “simple” and first-order false belief

tasks has generally been found to be more equal (Wellman et al., 2001), but findings are

consistent with the assumption that an individual who passes the first-order false belief

task is likely to have passed the simple false belief task.

The hierarchy of task administration operated such that participants were

administered either the simple false belief task (for children between 4 and 6 years of

age) or the first-order false belief task (for 7- to 16-year-olds) first, and then only

proceeded to the more difficult task(s) if the initial task was passed3. Pass or failure was

measured by performance on belief questions only, whereby a score of 2/3 or more for

the simple false belief task or 3/6 or more for the first- and second-order tasks was

considered a pass. If the initial (or subsequent) task was failed, failure was also

assumed on the more difficult task(s). If the 7- to 16-year-olds passed the first-order

false belief task, they were assumed to have also passed the simple false belief task;

however, if they failed the first-order false belief task, the simple false belief task was

administered.
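The administration hierarchy described above can be expressed as a short decision procedure. In this sketch, `administer` is a hypothetical callback that runs a task and returns the participant's belief-question score; ages outside the 4 to 16 range are rejected because the text does not specify a starting point for them.

```python
from typing import Callable, Dict

# Minimum belief-question scores counted as a pass (out of 3, 6, and 6).
PASS_MARK = {"simple": 2, "first_order": 3, "second_order": 3}

def run_false_belief_hierarchy(age_years: int,
                               administer: Callable[[str], int]) -> Dict[str, bool]:
    """Return pass/fail for each task, administering only what the rules require."""
    def passed(task: str) -> bool:
        return administer(task) >= PASS_MARK[task]

    # Tasks never administered default to an assumed failure.
    results = {"simple": False, "first_order": False, "second_order": False}
    if 4 <= age_years <= 6:
        results["simple"] = passed("simple")
        if results["simple"]:
            results["first_order"] = passed("first_order")
    elif 7 <= age_years <= 16:
        results["first_order"] = passed("first_order")
        if results["first_order"]:
            results["simple"] = True  # assumed pass, task not administered
        else:
            results["simple"] = passed("simple")
    else:
        raise ValueError("starting point not specified for this age")
    if results["first_order"]:
        results["second_order"] = passed("second_order")
    return results
```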

Only a few studies have investigated the reliability and validity of false belief

tasks, with somewhat equivocal results. An initial study by Mayes, Klin, Tercyak,

Cicchetti, and Cohen (1996) found poor test-retest reliability for standard first-order

false belief tasks; however, Hughes et al. (2000) found fair to moderate reliability across

a wider range of false belief tasks, and high reliability when aggregate scores were used.

Charman and Campbell (1997) found that a range of ToM tasks demonstrated moderate

reliability in individuals with learning disorders. In a sample of children with autism,

Grant, Grayson, and Boucher (2001) found good convergent validity of several false

belief tasks as well as high consistency across task versions.

3 Of note, most studies which have examined the effect of task order on false belief performance have not found any significant order effects (e.g., Gordon & Olson, 1998; Hala et al., 2003).

3.3.1 Simple false belief task (Flavell et al., 1983; Perner et al., 1987)

This task included three items, one of which was an unexpected contents item and two

of which were unexpected identity items. These two types of item resemble each other

closely. Because a higher number of trials per task is preferable in general, they were

grouped together and considered as different items of a single task. This has been done

in previous studies (e.g., Gopnik & Astington, 1988), and its validity was confirmed by

Wellman et al.’s (2001) meta-analysis, which showed no difference in the level of

performance on the two item types. It was also confirmed in this research, where strong

correlations were found across the three task items in the control participants of both

studies (on the belief questions, r ranged from .52 to .7, all ps < .01).

In the first task item (the unexpected contents item), the child is shown a box of

Smarties and asked, “What do you think is inside this box?” After s/he responds, the

box is opened and the child is shown that the box actually contains a pencil. The pencil

is then replaced in the box, and the child is asked, “What is really in the box?” (the

Reality question). S/he is then asked, “When you first saw the box, all closed up like

this, what did you think was inside it, Smarties or a pencil?” (the Own Belief question).

Finally, the participant is asked, “If I show X (parent/sibling) the box all closed up just

as I showed you, and I ask X what he/she thinks is in the box, what do you think X will

say, Smarties or a pencil?” (the Others’ Belief question)4. The other two items (the

unexpected identity items) involve the same questions, except the stimulus for the

second trial is a sponge which is spray-painted to look like a rock (after being asked

what s/he thinks it is, the child is then allowed to squeeze it, then asked what it really is,

and so on); and the stimulus for the third item is a black pen which contains red ink

(after being asked what colour s/he thinks the pen is, the experimenter writes with it to

show that it is red, then the child is asked what colour it is really, and so on). The child

is given a score of 1 or 0 for each of the Reality, Own Belief and Others’ Belief

questions, and the scores for the three trials are summed for each question type.

4 The order of control and belief questions has been found to have no effect on participants’ responses (Eisenmajer & Prior, 1991; Leslie & Frith, 1988).

5 It should be noted that Wellman et al.’s (2001) meta-analysis demonstrated that the medium in which false belief tasks are presented (e.g., video, puppets, real people) had no effect on performance.

3.3.2 First-order false belief task (Wimmer & Perner, 1983; Baron-Cohen et al.,

For the current studies, rather than using puppets, a video was filmed in which six

independent scenes are depicted5. The task is introduced to participants by saying

“Now we are going to watch some short videos. Each video tells a story. After we

finish watching each video, I will ask you some questions about what happened in the

story”. In four of the scenes, an object is placed in Location 1 by Actor 1, who then

leaves the room. Actor 2 moves the object to Location 2, and Actor 1 then re-enters the

room. The participant is then asked i) where Actor 1 will look for the object (the Belief

question), ii) where the object actually is (the Reality question), and iii) where Actor 1

placed the object at the beginning (the Memory question). In another one of the scenes,

Actor 1 places Object 1 in a covered bag and leaves the room, and then Actor 2 replaces

Object 1 with Object 2. The participant is asked i) what Actor 1 thinks is in the bag, ii)

what is actually in the bag, and iii) what Actor 1 placed in the bag in the beginning. In

the remaining scene, Actor 1 draws Picture 1 on a board and leaves the room, then

Actor 2 rubs out Picture 1 and draws Picture 2. The participant is asked i) what picture

Actor 1 thinks is on the board, ii) what picture is actually on the board, and iii) what

picture Actor 1 drew in the beginning. Each scene therefore follows the same basic

structure. In each case, both of the locations, objects or pictures are visible on the

screen when the participants are being asked the questions.

Participants are given a score of 1 or 0 for each of the Belief, Reality, and

Memory questions, and scores for each question type are summed over the six scenes.

In pilot testing, it was found that many participants gained a score of 1 on all items and

found the task easy. A discontinue criterion was therefore introduced whereby if

participants gained full marks for the Belief questions for three consecutive scenes, they

were not shown the remaining scenes and gained automatic credit for these. If this

discontinue criterion was met but participants had scored 0 on any of the Reality or

Memory questions, their overall score in these question categories was calculated on a

pro-rata basis – that is, a percentage correct was calculated for those items administered,

and then multiplied by six.
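The discontinue criterion and pro-rata calculation can be illustrated with a minimal sketch. Each scene is represented as a (Belief, Reality, Memory) tuple of 0/1 scores; the function name and data layout are illustrative assumptions rather than part of the original scoring materials.

```python
from typing import Dict, Iterable, Tuple

def score_false_belief(responses: Iterable[Tuple[int, int, int]],
                       n_scenes: int = 6,
                       run_length: int = 3) -> Dict[str, float]:
    """Score a six-scene task, stopping after three consecutive Belief passes."""
    administered = []
    streak = 0
    for belief, reality, memory in responses:
        administered.append((belief, reality, memory))
        streak = streak + 1 if belief == 1 else 0
        if streak == run_length:
            break  # remaining scenes are not shown
    n = len(administered)
    belief_total = sum(b for b, _, _ in administered)
    reality_total = sum(r for _, r, _ in administered)
    memory_total = sum(m for _, _, m in administered)
    if streak == run_length and n < n_scenes:
        # Automatic Belief credit for the scenes that were not shown.
        belief_total += n_scenes - n
        # Pro-rata: proportion correct on administered scenes, times six.
        reality_total = reality_total * n_scenes / n
        memory_total = memory_total * n_scenes / n
    return {"belief": belief_total, "reality": reality_total, "memory": memory_total}
```

For instance, a participant who passes the Belief question on the first three scenes but misses one Reality question is shown no further scenes and receives a Belief score of 6, a pro-rated Reality score of 4, and a Memory score of 6.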

3.3.3 Second-order false belief task (Perner & Wimmer, 1985; Baron-Cohen,

1989b)

This task was also presented in video format, with each of the six scenes followed by

Belief, Reality, and Memory questions. In four of the scenes, Actor 1 places an object in

Location 1, and leaves the room, but spies on Actor 2 without Actor 2 knowing. Actor

2 moves the object to Location 2, and Actor 1 then re-enters the room. The participant

is then asked i) where Actor 2 thinks that Actor 1 will look for the object (the Belief

question), ii) where the object actually is (the Reality question), and iii) where Actor 1

placed the object at the beginning (the Memory question). In another one of the scenes,

Actor 2 draws a picture on a sheet of paper while Actor 1 watches, then Actor 1 leaves

the room. While Actor 1 secretly watches, Actor 2 decides to draw a different picture

instead. The participant is asked i) what Actor 2 thinks Actor 1 thinks the picture is, ii)

what the picture actually is, and iii) what Actor 2 drew in the beginning. In the

remaining scene, Actor 1 offers Actor 2 an orange and a banana, which are both placed

in a lunchbox. Actor 2 takes the orange, and then Actor 1 leaves the room. While

Actor 1 secretly watches, Actor 2 replaces the orange and takes the banana instead, and

eats it. Actor 1 then re-enters. The participant is asked i) what Actor 2 thinks that Actor

1 thinks she ate, ii) what she actually ate, and iii) what she took in the beginning.

Again, in each case, both locations, pictures or fruits are visible on the screen when the

participants are being asked the questions.

Participants are given a score of 1 or 0 for each of the Belief, Reality, and

Memory questions, and scores for each question type are summed across the six scenes.

The same discontinue criterion was used for this task as for the first-order false belief task.

3.3.4 Dewey Stories (Dewey, 1991)

Dewey (1991) states that she composed this task in 1974 as an informal measure of

knowledge of social norms and human relations. It was chosen as a higher-level, more

advanced measure of social cognition than the false belief tasks. While Dewey (1991)

reports qualitative data on the unusual comments made by individuals with autism in

response to the stories, she does not report any quantitative scoring method or any

results from typically developing or other control samples. No other published study

has used the measure, and there have been no published investigations of its reliability

or validity. Its inclusion in this research can therefore be considered somewhat

experimental. While several of the story items appear to tap mentalistic understanding,

its validity as a measure of ToM remains to be investigated (see Section 4.3.2.2 of

Chapter 4), as it could be argued that the task may also be successfully performed

simply by drawing on knowledge of normative or common social behaviours.

The stimuli for the task are 7 stories, each one paragraph in length, which

describe a sequence of events containing certain social scenarios (Figure 3 contains an

example). The stories used in this study were taken directly from Dewey (1991);

however, one story was shortened and another was substantially modified to be more

appropriate for an Australian sample. Two or more sections from each story are

underlined, and a pair of empty brackets follows each underlined part. Participants are

asked to rate each underlined behaviour according to how they think most people would

judge that behaviour if they witnessed it. They are asked to use the following scale:

Fairly normal behaviour in that situation [ A ]

Behaviour that is a little unusual in that situation [ B ]

Rather strange behaviour in that situation [ C ]

Very eccentric or shocking behaviour in that situation [ D ]

Although there are no set right or wrong answers for each rating, as a way of judging

the typicality of participants’ responses, each response was compared with norms

derived from 30 undergraduate psychology students. Responses of the normative

sample showed an equal split between the frequency of “B” and “C” responses on

several items, and so it was decided to place B and C in the same category of response.

The scoring system worked such that responses closer to the dominant normative

response were assigned a lower score. For items where A was the dominant response (n

= 8), participants who chose A scored 0, B/C scored 1, and D scored 2; and for items

where B/C was the dominant response (n = 9), A scored 1, B/C scored 0, and D scored

1. There were no items where D was the dominant response. Scores were summed

across items to produce a total score, on which lower scores represented a higher social

awareness.
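The scoring system just described amounts to a small lookup table, as the following sketch shows. The item-level dominant responses would come from the normative sample; the data layout and names here are illustrative assumptions.

```python
from typing import Sequence

# Score for each response, keyed by the item's dominant normative response.
# B and C are treated as a single response category.
DEWEY_SCORES = {
    "A":  {"A": 0, "B": 1, "C": 1, "D": 2},
    "BC": {"A": 1, "B": 0, "C": 0, "D": 1},
}

def dewey_total(responses: Sequence[str], dominant: Sequence[str]) -> int:
    """Sum item scores; lower totals indicate more typical (normative) ratings."""
    if len(responses) != len(dominant):
        raise ValueError("one dominant response is needed per rated item")
    return sum(DEWEY_SCORES[d][r] for r, d in zip(responses, dominant))
```

As an example, ratings of A, C and D on three A-dominant items and B on one B/C-dominant item yield a total of 0 + 1 + 2 + 0 = 3.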

Emily, age nineteen, overslept on the morning of her aeroplane trip. When she woke up, there was

just enough time for her to dress and get to the airport, so she skipped her breakfast. [ ] At noon, the

steward came around with lunch, but Emily was so hungry by then that one portion did not satisfy

her. She watched a little girl across the aisle toy with her food, complaining “I can’t eat it.”

Apparently, the father didn’t want any more, because he told the child to just leave it. Emily leant

across the aisle and said, “If your little girl doesn’t want her tray, can you pass it over for me?” [ ]

Figure 3. An example of a Dewey Story.

3.4 EF measures

Because one of the central aims of this research was to conduct a thorough investigation

of the EF profile characteristic of ASDs, as well as to examine the relationship of

various EF components to ToM (these aims are discussed in the next chapter), a strong

emphasis in the test battery was placed on measures of EF. Given the difficulties

following from the task impurity of most widely used EF tests (as discussed in the

previous chapter, Section 2.2.1), specific assessment of component processes using

tasks with high construct validity was given priority in task selection. The component

process approach to EF assessment has been strongly advocated by several authors (e.g.,

Hill, 2004; Ozonoff, 1995a, 1997a, 1997b, 2001). The tasks chosen are relatively

simple and/or include control conditions allowing precise delineation of the underlying

EF process(es) involved, although for some EF domains (e.g., planning), this was not as

easily achievable due to both the nature of the component and the availability of “pure”

tasks. A wide range of EF components was assessed, including planning (measured by

the Tower of London), set-shifting or cognitive flexibility (the Intra-dimensional, Extra-

dimensional set-shifting task), inhibition and its interaction with working memory

(Response Inhibition and Load task and Opposite Worlds), relational reasoning

(Relational Complexity), and generativity (Pattern Meanings, Uses of Objects and the

Stamps task). It was desirable to test each EF domain using both verbal and non-verbal

response modes, which was possible for the inhibition and generativity domains. The

child-friendliness of the tasks was another factor considered in EF task selection. It was

important for the tasks to be appropriate for a fairly large age range, so tasks with a low

floor and high ceiling were regarded as preferable.

3.4.1 Tower of London (Culbertson & Zillmer, 1998b; Shallice, 1982)

The Tower of London (ToL) was first designed as a cognitive measure by Shallice

(1982), who found that it was performed poorly by patients with frontal lobe lesions.

The ToL’s sensitivity to frontal dysfunction has been supported in a number of

subsequent studies using both adult clinical samples (Carlin et al., 2000; Owen et al.,

1990) and head-injured children (Levin et al., 1994, 1996). Shallice proposed that the

ToL specifically measures planning ability, which may be defined as the ability to

generate, select, organise, integrate, and monitor behaviours needed to achieve a future

goal (Culbertson & Zillmer, 1998a; Lezak, 1995). The validity of the ToL as a measure

of planning was supported by Shallice’s (1982) finding that ToL performance did not

covary with measures of visuospatial ability or working memory, although subsequent

studies have found that working memory and inhibition may also be involved in task

performance (e.g., Welsh, Satterlee-Cartmell, & Stine, 1999).

A wide range of administration and scoring procedures for the ToL have been

used in different studies. Three groups of researchers have published proposed

standardised versions of the ToL for use with paediatric populations (Anderson,

Anderson, & Lajoie, 1996; Culbertson & Zillmer, 1998b; Krikorian, Bartok, & Gay,

1994). Both Anderson et al. and Krikorian et al.’s versions require the readministration

of failed items, whereas Culbertson and Zillmer’s version only requires the child to do

each problem once, and uses the number of extra moves made as its main dependent

measure. Culbertson and Zillmer (1998b) argue that the readministration of failed items

significantly increases the amount of on-task time, which is a liability when assessing

younger children and clinical populations with limited attentional capacities. It can also

provoke frustration and distress, leading to decreased motivation and co-operation.

Their version has demonstrated adequate test-retest reliability as well as good criterion-

related, diagnostic and construct validity (Culbertson & Zillmer, 1998a, 1998b).

For these reasons, administration and scoring procedures for the version of the

ToL used in this research were based on those outlined by Culbertson and Zillmer

(1998b). The major differences were that the floor was lowered by including 1- and 2-

move items (rather than beginning with 3-move items); there were four items at each

level of difficulty instead of three; the instructions were slightly modified to encourage

participants to plan moves in advance; and scores were adjusted (see below) to account

for participants who completed problems in fewer moves than the minimum number

because they broke the task rules.

Participants are presented with a tower structure consisting of three wooden

posts of descending heights mounted on a wooden base. Three coloured discs (red,

black and white) are placed on the posts in a standard starting position (see Figure 4).

The participant is then required to rearrange the three coloured discs on the posts so that

the new configuration corresponds to the pattern presented on a 21cm x 15cm stimulus

card. The participants are informed that this must be accomplished in the minimum

number of moves, which is told to them verbally as well as being written at the top of

the stimulus card. In addition, they are told that they must adhere to the following four

rules: i) they can only use one hand to move the discs, ii) they can only move one disc at

a time, iii) discs cannot be placed on the board or table - only on the posts, and iv) they

cannot put more discs on a post than it will hold. Examples of breaking a rule are

demonstrated, each time with the experimenter saying “You can’t do this”.

Figure 4. The starting configuration for the Tower of London stimuli.

Participants are given one 1-move and two 2-move practice examples, during which, if a

rule is broken or extra moves are made, the rules are reiterated and the correct solution

demonstrated. The following instructions are then given:

“Now I am going to set up more disc patterns and see if you can make them on your

board in as few moves as possible. You may find that some of the patterns are difficult,

but do the best you can. Each pattern can be solved. You should look carefully at the

pattern and the board and plan the best move to start with. Take your time planning, as

each move you make counts towards the total. If you think you can’t finish it in the

correct number of moves, then just keep going and try and do it in the fewest number of

moves you can.”

Items range in difficulty from 1 move to 7 moves, with four items at each level of

difficulty. Participants aged between 4 and 13 begin with 1-move items, and

participants aged 14 and over begin with 3-move items (and are given automatic credit

for 1- and 2-move items if they complete at least two 3-move items in the minimum

number of moves). If the first item at a new level of difficulty is failed (either by

breaking the rules or using too many moves), the correct solution is demonstrated.

There is a time limit of 2 minutes on each item, after which the item is discontinued.

Testing is discontinued if participants fail all items at a particular level of difficulty.

Remaining items are assumed to have been failed and are assigned the maximum total

number of moves (i.e., 20).

Five scores are computed for each test item. These are listed in Table 1. The

sum of extra move scores and adjusted extra move scores is calculated for each block of

items (i.e., each level of difficulty) as well as overall. The total number of problems

completed in the minimum number of moves is also computed (i.e., the total number of

problems with an adjusted extra move score of 0). Although Culbertson and Zillmer

(1998b) also calculated initiation and solution times, these were not used in this research

as not all participants were administered all items (due to the different starting points for

different ages and the discontinue rule), making mean times (either overall or block by

block) difficult to calculate and analyse in a meaningful way.

Table 1. The five scores computed for each item on the Tower of London

1. Total number of rule violations: Included moving 2 discs off the posts at the same time, placing more discs on a post than it would hold, and placing discs on the board or table.

2. Extra moves: Total number of moves – minimum moves*

3. Adjusted extra moves: Adjusted moves – minimum moves, where adjusted moves = total moves + (2 x no. of rule violations)

4. Extra move score: Designed to avoid excessive inflation of the “extra moves” index by an extreme number of total moves, this is calculated as follows:

• If extra moves = 0, extra move score = 0

• If extra moves = 1-5, extra move score = 1

• If extra moves ≥ 6, extra move score = 2

5. Adjusted extra move score: Identical to the extra move score except using adjusted extra moves (3) instead of extra moves (2)

*If the total number of moves exceeds 20, it is reduced to 20 to avoid inflation of the

“extra moves” index by excessive moves on a limited number of items. For example, if

a participant executes 24 moves on a 7-move problem, then the score would be

calculated as follows: 20 - 7 = 13. In addition, the total number of moves is assigned a

value of 20 for any item not solved within 2 minutes.
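The scoring rules in Table 1 can be illustrated with a short Python sketch. This is an illustration only, not the scoring procedure actually used in this research. The function and variable names are hypothetical, and two points that the table leaves open are treated as assumptions: that the 20-move cap is applied before the rule-violation adjustment, and that a negative extra-move count (a problem completed in fewer than the minimum moves by breaking rules) receives an extra move score of 0.

```python
MAX_MOVES = 20  # cap on total moves per item; also assigned on a time-out


def band(extra):
    """Collapse an extra-move count into the 0/1/2 score from Table 1."""
    if extra <= 0:  # assumption: negative counts are scored as 0
        return 0
    return 1 if extra <= 5 else 2


def score_item(total_moves, min_moves, violations, timed_out=False):
    """Return the five Table 1 scores for a single Tower of London item."""
    # Unsolved (timed-out) items and excessive totals are both set to 20.
    total = MAX_MOVES if timed_out else min(total_moves, MAX_MOVES)
    extra = total - min_moves
    # Assumption: the 2-move penalty per violation is added to the capped total.
    adjusted_extra = total + 2 * violations - min_moves
    return {
        "rule_violations": violations,
        "extra_moves": extra,
        "adjusted_extra_moves": adjusted_extra,
        "extra_move_score": band(extra),
        "adjusted_extra_move_score": band(adjusted_extra),
    }


# The footnote's example: 24 moves on a 7-move problem gives 20 - 7 = 13.
example = score_item(total_moves=24, min_moves=7, violations=0)
```

Under these assumptions, a 3-move problem completed in 2 moves with one rule violation has an extra move count of -1 but an adjusted extra move count of 1, so it is no longer credited as solved in the minimum number of moves.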

3.4.2 Intra-dimensional, Extra-dimensional (IDED) Set-shifting task (Owen et

al., 1993)

The original version of this task, which forms part of the CANTAB (Cambridge

Neuropsychological Test Automated Battery), was designed as a WCST-like

computerised measure of attentional set-shifting. In comparison to the WCST, the

IDED set-shifting task is simpler to allow participants of a wider age and ability range

to participate, and involves a series of stages containing a number of internal control

conditions to aid the elucidation of the mechanisms involved in successful or

unsuccessful task performance. It has demonstrated fair test-retest reliability (Lowe &

Rabbitt, 1998). Using this task, it was found that patients with Parkinson’s disease and

with frontal lobe damage, but not patients with temporal lobe damage, demonstrated an

inability to shift attention between two perceptual dimensions at the “extra-dimensional

shift” stage (Downes et al., 1989; Owen, Roberts, Polkey, Sahakian, & Robbins, 1991).

However, observations that patients with frontal lobe dysfunction and

Parkinson’s disease may fail the task for different reasons led Owen et al. (1993) to

develop a modified version of the CANTAB procedure, which was designed to

distinguish whether impairments in attentional set-shifting ability are caused by an

inability to release attention from a relevant stimulus dimension (Perseveration), or an

inability to re-engage attention to a previously irrelevant dimension (Learned

Irrelevance). This version (described further below) therefore allows even further

breakdown of the processes involved in set-shifting performance, making it attractive

for inclusion in the current research. Using this modified IDED task, Owen et al. (1993)

found that the difficulty with extra-dimensional shifting demonstrated by patients with

frontal lesions was caused by perseveration to the previously relevant dimension,

whereas patients with Parkinson’s disease tended to show learned irrelevance. When

Turner (1997) used the task with children with autism, she found that low-functioning

participants demonstrated significantly more errors at the extra-dimensional shift stage

of the Perseveration condition, but not the Learned Irrelevance condition.

The modified version of the IDED set-shifting task includes two task conditions,

one intended to assess perseveration and the other to assess learned irrelevance. In both

conditions, each trial consists of two patterns appearing on a computer touchscreen

(their positions randomly alternating between four rectangular boxes to the top, bottom,

left and right of centre of the screen), and the participant is required to choose which

one is “correct” according to an unspecified rule, with feedback provided by the

computer. Participants are given the following instructions:

“This is a game where you have to work out the rule for choosing the right answer. On

the screen you are going to see two patterns. The patterns will appear in any two of four

boxes. One of the patterns is right and the other is wrong, and you must tell the

computer the one you think is right. You do this by touching the pattern on the screen.

There is a rule which you can follow to make sure you make the right choice every time.

The computer will not tell you the rule – you will have to work it out for yourself. To

begin with, there is nothing on the screen to tell you which of the patterns is correct so

when you choose your first answer you will just have to guess. However, the computer

will give you a message after each try to tell you whether you are right or wrong. If you

are right, it will come up with the word “CORRECT”, written in green, and if you are

wrong, it will say “INCORRECT”, written in red. The computer will be keeping track

of how well you are doing. When the computer can tell that you know the rule, the

computer will then change the rule, but it will not tell you that the rule has changed.

You will have to work out the new rule. That won’t happen very often. Do you have

any questions before you start?”

Each condition comprises 8 stages presented in the same fixed order: a simple

discrimination (SD) and reversal (SDR), then a compound discrimination (CD) and

reversal (CDR), then an intra-dimensional shift (IDS) and reversal (IDR), and finally an

extra-dimensional shift (EDS) and reversal (EDR). Participants can only proceed to the

next stage after reaching the criterion of 6 consecutive correct responses.

In the Perseveration condition (see Figure 5), the task begins with subjects being

required to learn which of two geometrical shapes is correct (SD stage). The

subject is then required to reverse the learnt rule and respond to the previously incorrect

stimulus in the target stimulus dimension (SDR). The next stage introduces an

additional stimulus dimension, white lines, which are paired with the shapes. At this

stage the same shape remains correct, with the nature of the lines being irrelevant (CD).

Once that is learnt, the subject is again required to reverse the learnt rule and respond to

the other shape (CDR). The next stage is the IDS stage, where the subject is presented

with new exemplars for both of the stimulus dimensions (shapes and lines). Although

the exemplars are different to the two previous stages, the relevant stimulus dimension

(shape) remains the same. In the EDS stage, the previously irrelevant stimulus

dimension (lines) is replaced by a new stimulus dimension which becomes relevant

(solidity), and the previously relevant dimension (shape) becomes irrelevant. Thus,

participants must shift their attention from a previously relevant to a new stimulus

dimension, and failure reflects perseveration to the previously relevant dimension.

The Learned Irrelevance condition (see Figure 6) proceeds as for the

Perseveration condition for the first 6 stages, except that colour is the relevant

dimension and number the irrelevant dimension. However, in the EDS stage, the

previously irrelevant dimension (number) becomes relevant, and the previously relevant

dimension (colour) is replaced by a new dimension (size), which is irrelevant. Hence, participants must

shift their attention to a previously irrelevant stimulus dimension, and failure reflects

learned irrelevance associated with the previously irrelevant dimension.

[Figure 5 image: table of stimuli for each stage, with columns Stage, Stimuli, Relevant dimension, and Irrelevant dimension.]

Figure 5. Stimuli for the Perseveration condition of the IDED set-shifting task (see text

for explanation of abbreviations). The correct choice is always displayed on the left.

[Figure 6 image: table of stimuli for each stage, with columns Stage, Stimuli, Relevant dimension, and Irrelevant dimension; the dimension labels shown are Colour and Number.]

Figure 6. Stimuli for the Learned Irrelevance condition of the IDED set-shifting task

(see text for explanation of abbreviations). The correct choice is always displayed on

the left.

Failure to achieve the criterion of 6 consecutive correct responses within 50

trials at any one stage results in discontinuation of the test. There is a 1000ms interval

between successive trials. Each condition lasts approximately 10 minutes and the two

conditions are separated by at least 30 minutes of unrelated tests. Unlike Owen et al.’s

(1993) procedure, in the current study the dimensions used in each condition (i.e.,

shape, lines, solidity, colour, number, and size) were consistent across participants (as

shown in Figures 5 and 6), rather than counterbalancing the dimensions across the two

conditions. The only other difference between the current and Owen et al.’s version

was that the Perseveration condition was presented first for all participants (rather than

the order of conditions being counterbalanced across participants).

So that participants could be effectively compared across conditions, the main

index of performance was the number of “errors to criterion” within each stage of the

task (this was also the main index of performance used by Owen et al., 1993). If the test

was discontinued because the criterion of 6 consecutive correct responses was not met

within 50 trials, a value of 25 (the value expected with random responding) was

assigned for the errors to criterion score for subsequent stages of the task which were

not administered.
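The errors-to-criterion index and the discontinuation rule described above can be sketched in Python as follows. This is a minimal illustration rather than the software used to run the task: the names are hypothetical, and the sketch assumes that a stage which is attempted but failed keeps the number of errors actually made, with the value of 25 assigned only to later stages that were never administered.

```python
CRITERION = 6        # consecutive correct responses needed to pass a stage
MAX_TRIALS = 50      # trials allowed per stage before discontinuation
CHANCE_ERRORS = 25   # assigned to stages that were not administered
STAGES = ["SD", "SDR", "CD", "CDR", "IDS", "IDR", "EDS", "EDR"]


def score_stage(responses):
    """Return (errors, passed) for one stage's list of True/False responses."""
    errors = run = 0
    for correct in responses[:MAX_TRIALS]:
        if correct:
            run += 1
            if run == CRITERION:
                return errors, True   # criterion reached; stage ends here
        else:
            errors += 1
            run = 0                   # an error resets the consecutive run
    return errors, False


def errors_to_criterion(responses_by_stage):
    """Score all 8 stages in their fixed order for one task condition."""
    scores = {}
    discontinued = False
    for stage in STAGES:
        if discontinued:
            scores[stage] = CHANCE_ERRORS
            continue
        errors, passed = score_stage(responses_by_stage.get(stage, []))
        scores[stage] = errors
        discontinued = not passed
    return scores


# Hypothetical participant: passes SD after one error, then fails SDR.
demo = errors_to_criterion({"SD": [False] + [True] * 6, "SDR": [False] * 50})
```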

3.4.3 Response Inhibition and Load (RIL) task

The basic idea for this non-verbal computerised test of inhibition, which was created by

the author, was taken from Drewe (1975) but with substantial modifications and

additions. Drewe’s study included two types of task, one involving the requirement to

press a button in response to one type of stimulus but not another (otherwise known as a

“Go-Nogo” task) and the other requiring the participant to inhibit the prepotent response

to match stimuli of the same colour by pressing a button which was opposite in colour

to the stimulus. This latter task type had a control condition which required participants

to press a button which was the same colour as the stimulus. The inclusion of a control

condition was desirable for the current research, as subtraction of scores on this

condition from scores on the inhibition condition allows more precise identification of

the level of performance on the inhibition task condition without the confounding

effects of non-inhibitory processes such as speed of processing and motor coordination.

This “non-matching to target” paradigm was therefore adopted for this research

but modified in order to improve aspects of the methodology. In Drewe’s task, the

stimulus and two response buttons (one red and one blue) were always present and

visible (with the stimulus button lighting up in either red or blue), whereas in the current

task a touch screen was used so that the stimulus was presented only briefly before the

response buttons were presented. This also allowed the two coloured response buttons

to change sides randomly from trial to trial, ensuring that the participant was responding

on the basis of the colour of the response button rather than its spatial location. The

inhibition task condition was also modified so that the colours of the stimulus and

response buttons were different to the control condition (which preceded it). This was

to avoid confounding inhibition with set-shifting (specifically, reversal), as the use of

exactly the same stimuli in control and inhibition conditions means that the inhibition

condition then requires the participant to reverse the stimulus-response contingencies

and therefore directly “shift set” (Ozonoff et al., 1994).

An important addition to the task was an extra condition which involved an

increased working memory load, thereby allowing evaluation of the interaction between

inhibitory capacity and working memory. This condition was included in order to

examine two hypotheses: i) that children with autism (and/or their siblings) may be

impaired only on tasks which combine inhibitory and working memory demands, and ii)

that false belief measures show correlations with performance on tasks that combine

inhibitory and working memory demands, but not with each capacity individually.

Performance on the working memory load condition was compared with that on the

inhibition condition, to assess the specific effect of the working memory load, and also

with performance on the control condition, to assess the combined effect of inhibition

and working memory requirements. The three task conditions are described below.

Condition 1 (Control condition): In this condition, either a pink or green

stimulus circle (approximately 5cm in diameter) appears at the top of a computer touch

screen for 250ms, and then two smaller response circles (approximately 3.5cm in

diameter), one pink and one green, appear simultaneously at the bottom left and right

corners of the screen. Participants are instructed to touch the response circle which is

the same colour as the stimulus circle. Participants have 4s to respond before the

response circles disappear and the next trial begins. An equal number of pink and green

stimulus circles are presented, and the order of presentation is random. As already

mentioned, the response circles change sides randomly (i.e., the pink circle can appear

on either the right or left), to ensure that participants are responding to the colour of the

circle rather than simply its position on the screen. Participants are required to use one

hand only to respond. Performance indices are the percentage of errors (i.e., touching

the wrong-coloured response circle), and the median RT for correct trials.

Condition 2 (Inhibition condition): This condition is identical to Condition 1

except that the colours of the stimulus and response circles are purple and yellow, and

the participant is required to touch the response circle which is the opposite colour to

the stimulus circle. Hence, if the stimulus circle is purple, participants must touch the

yellow response circle, and vice versa. As for Condition 1, performance indices are the

percentage of errors and the median RT for correct trials.

Condition 3 (Working Memory Load condition): In this condition, instead of the

stimulus being a circle, it is either a square, triangle or cross. As in Condition 2,

participants must touch the response circle which is opposite in colour to the stimulus

shape (the colours in this condition are orange and grey). However, at random intervals

between trials, the three shapes are displayed on the screen and the participant must

touch the shape which was presented in the most recent trial. This occurs for 25% of

the trials. The participant must therefore recall the shape of the stimulus as well as

inhibiting the prepotent tendency to respond to the same colour. Performance indices

for the responses to the colour of the stimulus are identical to those in Conditions 1 and

2. For the questions about the shape of the stimulus, performance is measured by the

percentage of errors.

In each condition, participants perform 7 practice trials, during which any errors

are pointed out verbally and corrected. Following the practice trials, there is a pause

during which the participant may ask any further questions. The 60 critical trials then

proceed, during which every third error is pointed out and the participant reminded of

the task rules. The inter-trial interval is 1000ms in all conditions.
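The performance indices for each RIL condition, and the subtraction of control-condition scores from inhibition-condition scores, can be illustrated with the following Python sketch. It is an illustration under stated assumptions, not the analysis code used in this research: the names are hypothetical, each trial is represented as a (correct, RT in ms) pair, and trials with no response within the 4s window are not modelled because their treatment is not specified above.

```python
from statistics import median


def condition_indices(trials):
    """Percentage of errors and median correct RT for one condition.

    trials: list of (correct, rt_ms) tuples, one per critical trial.
    """
    n_errors = sum(1 for correct, _ in trials if not correct)
    correct_rts = [rt for correct, rt in trials if correct]
    return {
        "pct_errors": 100.0 * n_errors / len(trials),
        # The median is undefined if no trial was answered correctly.
        "median_rt": median(correct_rts) if correct_rts else None,
    }


def difference_scores(baseline, target):
    """Target-condition indices minus baseline-condition indices."""
    return {key: target[key] - baseline[key] for key in ("pct_errors", "median_rt")}


# Hypothetical data, four trials per condition for brevity.
control = condition_indices([(True, 500), (True, 600), (True, 700), (False, 900)])
inhibition = condition_indices([(True, 800), (False, 400), (True, 1000), (False, 300)])
cost = difference_scores(control, inhibition)
```

The same subtraction could be applied with the inhibition condition as the baseline and the working memory load condition as the target, in line with the comparisons described above.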

3.4.4 Opposite Worlds (Manly, Robertson, Anderson, and Nimmo-Smith,

This task was selected as an additional measure of inhibition in which, unlike the RIL task,

a verbal response is required. Opposite Worlds is a subtest of the Test of Everyday

Attention for Children (TEA-Ch; Manly et al., 1998), and is similar in design to

Gerstadt et al.’s (1994) Stroop-like day-night task, but instead involves reading the

number “1” as “2” and vice versa. It demonstrates good test-retest reliability (Manly et

al., 2001). It forms part of the “attentional control/switching” factor in the TEA-Ch, but

the naming of this factor reflects the executive nature of the factor rather than accurately

describing the requirements of the tasks that load on it. Manly et al. (1998, 2001)

consider Opposite Worlds to be a test of verbal inhibition, pointing out that as

participants are required to switch from the Opposite to the Same World (control)

condition as well as vice versa, performance on the Opposite World condition may be

attributed to the requirement to inhibit a prepotent verbal response rather than the

demands of task switching. The task has displayed good construct and convergent

validity, correlating significantly with other measures of inhibition (Manly et al., 1998).

Opposite Worlds is administered only to children who are able to read the

numbers 1 and 2. The stimuli are yellow squares linked together in an undulating

pattern on a black background, with each square containing either a 1 or a 2. The task is

introduced by saying: “In this test there are two sorts of world. There is the Same

World, where everything is as you would say it here, and the Opposite World, where

you have to say the opposite of what you would say here”. An example page with two

Same World examples at the top and two Opposite World examples at the bottom is

shown to the participant. The experimenter points to the beginning of the first Same

World example and says: “Here I would say ‘Start, one, one, two, two, one, Stop’”. The

child is encouraged to complete the same item and then the other Same World example.

While the child reads the numbers, the experimenter points to each square in turn, and

does not move onto the next square until the child has said the correct number. After

successful completion of the Same World examples, the experimenter then points to the

first Opposite World example and says: “We’re now going to the Opposite World where

we have to say the opposite. Here, when we see a one we have to say ‘two’, and when

we see a two we have to say ‘one’. This is how to do it: ‘Start, one, one, two, one, two,

Stop’”. Both examples are then completed by the child.

The participant then completes the four test trials in the order: Same World,

Opposite World, Opposite World, Same World. S/he is reminded of the instructions at

the beginning of each trial. The time taken to complete each trial is recorded from the

time the child says “Start” to the time s/he says “Stop”. The number of errors for each

trial is also recorded, with an error being defined as any occasion upon which the child

says a “1” when required to say “2”, or vice versa. A total time score (summing the

time taken across the two trials) and total error score (summing the errors made across

the two trials) are calculated for each of the Same and Opposite World conditions.

3.4.5 Relational Complexity (Waltz et al., 1999)

This task was included as a measure of relational reasoning, which refers to the ability

to “form and manipulate mental representations of relations between objects and

events” (Waltz et al., 1999, p. 119). While relational reasoning is not often included in

lists of EF components, it was assessed mainly to test Halford’s notion that limited

capacity to integrate multiple relations (i.e., relational complexity) may underlie failure

on false belief tasks (Halford, 1993; Halford et al., 1998; see Section 2.3.1.2 in the

previous chapter). Halford et al. (1998) argue that working memory capacity may be

best defined in terms of the complexity of the relations that can be processed in parallel,

and therefore the Relational Complexity task may also be considered a test of working

memory – a domain which is more often considered an aspect of EF. Waltz et al.

(1999) found significant impairments in patients with prefrontal cortical damage on

their Relational Complexity task, and proposed that failures on various types of EF task

could be accounted for by a deficit in relational integration.

The task is similar in format to the Raven Standard Progressive Matrices Test, in

which the missing part of a pattern must be chosen from six alternatives. The current

version is based on Waltz et al.’s (1999) adaptation but has more levels of difficulty,

more pictures within each item, and more alternative answers. In this version, each

problem consists of a 3 x 3 matrix of square-shaped simple geometric pictures, with the

bottom right-hand corner picture missing. Participants are asked to select the missing

picture from eight alternatives (see Figure 7). Problems vary in the number of relational

changes (e.g., in shape, size, rotation), occurring over horizontal and/or vertical

dimensions of the matrix, which must be attended to while selecting the missing picture.

Nonrelational (Level 0 complexity) items consist of identical pictures, with participants

simply having to choose the matching picture from the eight alternatives. The most

difficult relational problems are of Level 4 complexity, requiring the

simultaneous processing of 4 relational changes (see Figure 8). In order to raise the

ceiling of the task, some more difficult items were also included where the relevant

stimulus dimensions do not necessarily vary in a consistent way across the vertical or

horizontal dimensions of the matrix (see Figure 9). An additional item at the end of the

task consists of a matrix with four missing pictures, and participants are required to

move four cut-out pictures into their correct places.

There are 4 problems at each level of difficulty. Participants are instructed to

take their time and point to the correct answer when they are sure of it. They are given

a maximum of two minutes for each problem, except for the final problem with the four

missing pictures, for which they are given three minutes. A score of 1 or 0 is given for each trial.

The task is discontinued if the participant fails all four problems at a particular level of

difficulty, with remaining items assumed to have been failed. The sum of correct

responses is calculated for each level of difficulty and overall.
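The discontinue rule and the summing of correct responses can be sketched in Python as below. This is a hedged illustration with hypothetical names, not the scoring procedure actually used. It assumes that levels are supplied in order of increasing difficulty, and it does not model the final item with four missing pictures.

```python
def score_relational_complexity(levels):
    """Apply the discontinue rule and sum correct responses.

    levels: ordered list of (label, [0/1 item scores]) from easiest to hardest.
    Returns (correct responses per level, overall total).
    """
    per_level = {}
    discontinued = False
    for label, items in levels:
        if discontinued:
            per_level[label] = 0      # remaining items are assumed failed
            continue
        per_level[label] = sum(items)
        if sum(items) == 0:           # all problems at this level failed
            discontinued = True
    return per_level, sum(per_level.values())


# Hypothetical participant who fails every Level 2 problem.
per_level, total = score_relational_complexity([
    ("Level 0", [1, 1, 1, 1]),
    ("Level 1", [1, 0, 1, 1]),
    ("Level 2", [0, 0, 0, 0]),
    ("Level 3", [1, 1, 1, 1]),  # would not be administered; scored as failed
])
```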

Figure 7. Example of a Relational Complexity item with 1 relational change.

Figure 8. Example of a Relational Complexity item with 4 relational changes.

Figure 9. Example of a more difficult Relational Complexity item without consistent

relational changes.

3.4.6 Pattern Meanings (Wallach & Kogan, 1965; Turner, 1999)

Tests of generativity measure the ability to produce multiple novel responses

spontaneously following a single cue or instruction. In this research, the generativity

domain was tested using three different tasks because this aspect of EF has been under-

researched in autism, despite studies demonstrating its potential ability to explain

several symptoms of autism (e.g., Jarrold et al., 1996; Turner, 1997). There are three

basic types of generativity task: word fluency (requiring the participant to generate

words beginning with a certain letter or belonging to a certain category), design fluency

(where participants must produce abstract designs or patterns), and ideational fluency

(requiring generation of uses for objects or interpretations of abstract line drawings).

Word fluency was not tested in this research, mainly because it relies heavily on

vocabulary, making it difficult to disentangle reasons for poor performance (particularly

in autism, where verbal ability is typically impaired). A special emphasis was placed on

ideational fluency, with two tasks of this capacity included, as Turner (1999) found

particularly poor performance on ideational fluency tasks in both low- and high-

functioning individuals with autism.

Pattern Meanings is one of the measures of ideational fluency and requires

verbal generativity. The stimuli are five meaningless line drawings, taken from Wallach

and Kogan (1965) and also used by Turner (1999), which were printed on individual

14.3cm x 9.2cm laminated cards (see Figure 10 for an example). An additional drawing

was used for a practice item. Administration procedures were identical to those used by

Turner (1999), except that participants were given 90s instead of 150s to generate

responses for each item. This shorter interval was introduced following pilot testing, in

which it was found that participants tended to produce only a very small number of

responses in the last minute, and would often become restless and impatient or

inattentive. Before presentation of the practice stimulus, participants are told that the

task is one in which they will be shown some different patterns and asked to think of all

the things the pattern looks like, or what it could be. Participants are then shown the

practice stimulus card (see Figure 11) and asked “What could this be?” Any appropriate

response is reinforced and the participant is encouraged to think of other things the

pattern looks like. The experimenter also makes the following suggestions (if they have

not already been provided by the participant): “a hedgehog”, “someone with spiky hair”,

“sparks from a fire cracker”, and “a brush”. Participants are told that they are allowed

to turn the cards around and view them from any orientation. They are then given the

test stimuli one at a time, and for each one asked “What could this be?” Stimuli are

presented in a random order.

Figure 10. One of the five test stimuli for the Pattern Meanings task.

Figure 11. The practice stimulus for the Pattern Meanings task.

Scoring procedures were similar to those used by Turner (1999), but an extra

“uninterpretable response” category was added. This category was introduced because it

was found during scoring that a number of responses could not be classified in any of

the other categories. Each response was therefore classified as belonging to one of the

following five scoring categories, and the number of responses in each category was

summed across the five test items:

1. Correct response: A response which represents a plausible interpretation of the

pattern.

2. Incorrect response: A response that represents an inappropriate or implausible

interpretation of the pattern (e.g., for the pattern displayed in Figure 10: “this could

be a shoe”).

3. Repetition: A response which is a repetition of a previous response (for the current

stimulus or a previous stimulus).

4. Redundant response: A response that varies from a previous response only in terms

of one minor element or feature of the response (e.g., “two hills”, “two mountains”,

“two sand-hills”, etc.)

5. Uninterpretable response: A nonsensical response, which cannot be interpreted as

fitting into any of the above categories (e.g., “up and down”).

As unusual responses were sometimes difficult to classify, the scoring of Pattern

Meanings (and the Uses of Objects task described below) was more subjective than

other tasks in the protocol. Across all types of fluency tasks used in her study, Turner

(1999) reported 85% inter-rater agreement and kappa values in excess of .70, indicating

satisfactory inter-rater reliability. Because the version of Pattern Meanings used in this

study employed slightly different scoring criteria from Turner, inter-rater reliability of

this version was calculated using a subset of data from 22 participants (sampled

randomly from the ASD and control groups in Study One and the ASD sibling and

control sibling groups in Study Two). There was 93.3% agreement between the two

raters and Cohen’s kappa was .81, indicating excellent inter-rater reliability for this

version.
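For readers unfamiliar with the two agreement statistics reported here, the following Python sketch shows how percentage agreement and Cohen's kappa are computed from two raters' category labels. The function name and the example labels are hypothetical and are not data from this research. Kappa is computed in the standard way, as (observed agreement - chance agreement) / (1 - chance agreement), where chance agreement is derived from each rater's marginal category frequencies.

```python
from collections import Counter


def agreement_and_kappa(rater_a, rater_b):
    """Return (proportion agreement, Cohen's kappa) for two label sequences."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    # Note: kappa is undefined (division by zero) if chance agreement equals 1.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical classifications of six responses by two raters.
rater_1 = ["correct", "correct", "incorrect", "repetition", "correct", "redundant"]
rater_2 = ["correct", "correct", "incorrect", "correct", "correct", "redundant"]
observed, kappa = agreement_and_kappa(rater_1, rater_2)
```

Because kappa corrects for agreement expected by chance, it is lower than the raw proportion of agreement whenever some agreement would be expected from the marginal frequencies alone.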

3.4.7 Uses of Objects (Wallach & Kogan, 1965; Turner, 1999)

Uses of Objects also measures ideational fluency and requires verbal responses. In this

task, the stimuli are six different objects. Three objects are “conventional” items with

well-established functions (a pencil, a brick, and a mug), and three are

“nonconventional” items with no clear or established function (a piece of plain navy

blue material measuring 14 x 51 cm, a 50 cm length of dowelling, and a 90 cm long

piece of clothing elastic). As with Pattern Meanings, administration procedures

matched those used by Turner (1999), but again with the shorter 90s interval in which to

provide responses. The task is introduced as one in which participants will be asked to

think of all the ways in which some different objects could be useful. Participants are

then asked “For example, how could we use a newspaper? Tell me something useful we

could do with it”. Any appropriate suggestions made by the participant are praised and

further responses encouraged. The examples “you could use it to start a fire”, “you

could roll it up and swat flies with it”, and “you could use it to wrap a present” are

provided by the experimenter if not already produced by the participant. Participants

are then asked to think of as many uses as they can for the six different objects, one at a

time. For each of the conventional items, the experimenter gives two examples, one

representing the object’s established function (e.g., “you could use a mug to drink

from”), and one that is more imaginative (e.g., “you could use a mug as a vase for

flowers”). For the nonconventional items, the experimenter gives just one imaginative

example (e.g., “you could use a piece of material to wrap up pencils if you wanted to

carry them”). After the examples are provided, the participants are asked to say all the

other ways in which the object could be useful. The objects are presented in the same

order for each participant (pencil, dowel, brick, material, mug, elastic).

Scoring procedures were again similar to those used by Turner (1999) but, as

with Pattern Meanings, an extra “Uninterpretable responses” category was added. In

addition, a “Non-Useful responses” category was introduced as it was found during

scoring that many responses were plausible things that could be done with the object,

but that did not serve any useful purpose (e.g., “you could sharpen a pencil”). Hence,

each response was categorised into one of the following six scoring categories:

1. Correct response: A response which represents a plausible use for the object.

2. Incorrect response: A response that represents an inappropriate or implausible use

for the object (e.g., for the brick: “eat it”).

3. Repetition: A response which is a repetition of either one of their own previous

responses (for the current object or previous objects) or one of the examples.

4. Redundant response: A response that varies from a previous response only in terms

of one minor element or feature of the response (e.g., for the brick: “to build a

garage”, “a shed”, “a factory”, etc.).

5. Uninterpretable response: A nonsensical response, which cannot be interpreted as

fitting into any of the above categories (e.g., for the piece of fabric: “blow down”).

6. Non-useful response: A response which describes something plausible that could be

done to or with the object, but which does not include a useful purpose for the object

(e.g., for the piece of elastic: “stretch it”).

The number of responses in each category was summed separately for conventional and

nonconventional items, as well as overall. Inter-rater reliability for Uses of Objects was

calculated using a subset of data from 23 participants (again sampled randomly from the

ASD and control groups in Study One and the ASD sibling and control sibling groups in

Study Two). There was 86.8% agreement between the two raters and Cohen’s kappa

was .76, indicating good inter-rater reliability.
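The two agreement statistics reported for the fluency tasks can be illustrated with a minimal sketch. It assumes that each response is assigned exactly one scoring category by each rater. The function names and the example ratings are hypothetical and are not data from this research.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of responses given the same category by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical category codes for six responses (illustration only).
rater_1 = ["correct", "correct", "repetition", "non-useful", "correct", "incorrect"]
rater_2 = ["correct", "correct", "repetition", "correct", "correct", "incorrect"]
print(round(percent_agreement(rater_1, rater_2), 3))
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Kappa is lower than raw agreement because it discounts the matches that two raters would be expected to produce by chance given how often each uses each category.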

3.4.8 Stamps task (Frith, 1972)

This task was based on one used by Frith (1972) as a measure of the spontaneous self-

generation of underlying rules in patterns. It was considered a test of design fluency in

this research despite the fact that Frith did not label her task in this way, as it is a non-

verbal task requiring participants to produce multiple novel responses. The task differs

from standard design fluency measures in that it involves producing patterns from a set

of materials rather than drawing abstract designs. While it has been used far less

frequently than other design fluency tasks, its scoring system allows analysis of a

number of different processes underlying task performance, making it amenable to a

component process approach. In addition, Frith (1972) demonstrated interesting results

with the task in children with autism, who tended to rigidly adhere to the same

underlying pattern rules, used a restricted range of available materials, and did not

generate original patterns.

The task procedure was based on Frith (1972), with some minor procedural and

scoring modifications. Participants are provided with four stamps of different shapes

and colours, and a piece of paper with a line of 16 boxes on it. They are asked to make

whatever pattern they like with the stamps, putting one stamp in each box. There are

eight trials, four using only two of the stamps and four using all four stamps. If the

child does not use all the stamps available during the first eight boxes of a trial (i.e.,

only uses one stamp on the two-stamp trials or fewer than four stamps on the four-stamp

trials), at that point s/he is reminded that there are more stamps available. The two-

stamp and four-stamp trials are presented alternately. The trials are divided up into two

blocks, separated by at least half an hour, with each block consisting of four trials (two

trials with two stamps, and two trials with four stamps).

Four types of scores are calculated for each trial:

1. Complexity. Rules are defined as consistently recurring sub-units of a fixed number

of elements (e.g., the pattern red/green/red/green etc. has an underlying alternation

rule as two elements are repeated over and over again; the pattern red/red/red/red

etc. has an underlying rule to repeat a single element; and the pattern

red/green/black/blue/red/green/black/blue etc. has the underlying rule to repeat a

group of four elements). Ratings of complexity are based on the number of

elements contained in a sub-unit, using the following scale:

i) repetitions of single elements are given the lowest rating of 1

ii) repetitions of two single elements (i.e., alternations) are given a rating of 2

iii) repetitions of three or four single elements are given 3

iv) on two-stamp trials, if two stamps are used in the sequence, but the pattern

consists of more than just an alternation (e.g., red/red/green/red/green/

green), a score of 3 is given

v) on four-stamp trials, if three or four stamps are used but the pattern consists

of more than just cycling through the three or four elements (e.g.,

red/red/green/black/black/blue/red/green/green/black/blue/blue), then a score

of 4 should be given.

In a case where a single rule cannot account for the whole sequence of 16 items, but

only for part of it, the rule must account for at least one half of the sequence in order

to receive its score. If this criterion is not met, the pattern is considered

unidentifiable and given a rating of 1.

2. Rule adherence. All sequences which can be completely accounted for by a single

rule (i.e., a repeated sub-unit or a mirror-reversed pattern) are given a score of 1.

All sequences which are irregular in any way, including those with predominant or

unidentified rules, are given a score of 0.

3. Restriction. In the four trials where four stamps are used to build a pattern, a score

of 1 is given (for each trial) if the child uses fewer than the four stamps available. A

score of 1 is also given if the child only uses one stamp on two-stamp trials.

4. Originality. Any sequence that occurs only once in all of the trials is considered

“original” and given a score of 1. This score is only given if the original sequence

follows an identifiable pattern. If the “original” sequence is random, it scores 0.

Scores are summed across the eight trials to produce overall complexity, rule adherence,

restriction and originality scores for each participant.
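Two of the four Stamps task scores lend themselves to a mechanical check, sketched below for a single trial. This is an illustration under stated assumptions rather than the scoring procedure actually used: a sub-unit is assumed to count only if it repeats at least twice within the row, and a "mirror-reversed pattern" is assumed to mean that the whole row reads the same in both directions. Complexity and originality are not covered, as they involve partial rules and comparisons across trials. All function names are hypothetical.

```python
def smallest_repeating_unit(sequence):
    """Return the shortest sub-unit whose repetition reproduces the whole
    sequence (the final repetition may be cut off), or None if none exists."""
    n = len(sequence)
    for size in range(1, n // 2 + 1):
        if all(sequence[i] == sequence[i % size] for i in range(n)):
            return list(sequence[:size])
    return None


def rule_adherence(sequence):
    """Score 1 if one rule accounts for the whole sequence, otherwise 0.
    Assumption: 'mirror-reversed' is treated as a whole-row palindrome."""
    is_mirror = list(sequence) == list(reversed(sequence))
    if smallest_repeating_unit(sequence) is not None or is_mirror:
        return 1
    return 0


def restriction(sequence, stamps_available):
    """Score 1 if fewer distinct stamps were used than were available."""
    return 1 if len(set(sequence)) < stamps_available else 0


# An alternation across the 16 boxes of a two-stamp trial.
alternation = ["red", "green"] * 8
print(smallest_repeating_unit(alternation))
print(rule_adherence(alternation), restriction(alternation, 2))
```

The same `restriction` check covers both trial types, since using one stamp on a two-stamp trial and using fewer than four on a four-stamp trial are both cases of using fewer stamps than were provided.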

3.5 Behavioural measures

Autistic symptomatology includes impairments in social interaction and

communication, and repetitive behaviours. Social and communication behaviours were

considered together in the current research, for reasons described below in Section

3.5.2.2. Thorough measurement of repetitive behaviours was emphasised, as discussed

in the introduction to Study One (see Section 4.1.1 in Chapter 4).

3.5.1 Measures of repetitive behaviour

3.5.1.1 Repetitive Behaviours Questionnaire (RBQ)

The RBQ was developed by the author as a screening measure to be completed prior to

administration of the Repetitive Behaviours Interview (RBI; see Section 3.5.1.2). The

RBQ covers the same repetitive behaviours as the RBI, but the questions are answered

in a yes/no questionnaire format. The caregiver of the individual completed the

questionnaire. S/he was asked to tick “yes” if his/her child had ever shown the

behaviour in question, regardless of its frequency, and whether recently or in

the past. Any questions that were ticked “yes” were then asked again verbally, with

follow-up questions, in the RBI, but questions which were ticked “no” were not

repeated within the RBI. The purpose of this structure of administration was mainly to

conserve time, given the time-consuming nature of the test protocol and other

interviews.

3.5.1.2 Repetitive Behaviours Interview (RBI; Turner, 1996)

The RBI was developed by Michelle Turner as part of her PhD thesis. Neither the full

RBI nor a thorough description of it has been published, so it is described in detail

here, and the current version is contained in full in Appendix A. It was designed to

measure the presence and severity of a large range of repetitive behaviours, including

those typically displayed by individuals with autism as well as those characteristic of

other clinical groups. Turner’s version of the RBI consists of 59 questions covering 10

categories of repetitive behaviour: stereotyped manipulation of objects, object

attachments, stereotyped movements, tic-like behaviours, self-injurious behaviour,

obsessive-compulsive behaviours, insistence on sameness of environment, rigid

adherence to routines and rituals, repetitive use of language, and circumscribed

interests. Each interview question asks whether or not the caregiver’s child displays a

particular type of behaviour, and includes specific examples of behaviours covered by

the question, so that caregivers are clear about the type of behaviour being targeted and

forgetting is minimised. The interview assesses whether or not the target behaviour is

displayed currently (once a week or more over the last three months), as well as whether

it had ever been displayed previously. These Recent and Lifetime behaviours are rated

separately. In the current research, only the Recent behaviours ratings were used,

because relationships between current cognitive and behavioural functioning were the

central concern.

Scoring procedures. For the classes of behaviour which occur in discrete

episodes (i.e., stereotyped manipulation of objects, stereotyped movements, tic-like

behaviours, self-injurious behaviour, and repetitive use of language), information on the

frequency of the behaviour is coded using an 8-point scale. The codes, which refer to

how often each episode of the behaviour occurs, range from (0) “never” to (7) “almost

constantly”, with intermediate codes referring to the number of episodes occurring per

week and per day (ranging from 1-2 times per week to more than 30 times per day).

Information on the duration of each episode is also included because individuals may

show a particular repetitive behaviour infrequently, but engage in it for a long period of

time; thus, frequency data alone could be misleading. Duration information is coded

using a 5-point scale ranging from (0) “less than one minute” to (5) “30 minutes or

longer”. Caregivers are not given a list of the frequency and duration codes, in order to

prevent response bias. However, if any response is unclear, the caregiver is asked for

the number of times per day the behaviour is shown, or asked to choose between two of

the duration codes. In Turner’s version of the RBI, the circumstances which commonly

lead to each of the discrete-episode type behaviours are also coded in one of eight

categories, including “at no specific time or situation”, “when anxious or tense”, and so

forth.

For other “steady-state” behaviours which do not occur in discrete episodes and

cannot be coded in terms of frequency and duration (i.e., object attachments, obsessive-

compulsive behaviours, insistence on sameness of environment, and rigid adherence to

routines and rituals), the severity of the behaviour is coded using a simpler 3-point

scale. In general, a code of (0) is used to indicate the absence of the target behaviour (or

at least, lack of abnormal levels of the target behaviour), (1) denotes mild inflexibility or

mild-moderate behavioural severity, and (2) indicates marked inflexibility or extreme

severity. Codes of (1) and (2) are specifically operationalised for each question.

Because the nature of some of the behaviours is such that they are shown to some

degree in a normal population (e.g., having regular routines, favourite items and so on),

severity is often gauged by the impact of the behaviour on the rest of the family. Each

of the sections on “steady-state” behaviours is followed by a series of questions about

how the child would react if s/he was prevented from indulging in each behaviour that

has been rated.

The two questions on circumscribed interests are structured slightly differently,

being rated in terms of the usual or unusual nature of the interest, the degree of

obsessionality with the interest, the typical or atypical manifestation of the interest, and

the degree to which it prevents the individual from pursuing other interests.

During interview administration, care is taken to ensure that the same behaviour

is not coded twice, under different questions. In cases where the same behaviour arises

twice or seems to fit under two different categories, the behaviour is coded according to

the most notable feature of the behaviour. For example, if a caregiver reports that their

child continuously kicks around a ball while walking around the house, this behaviour

would be coded under the object manipulation question, rather than the repetitive

pacing question. Similarly, a behaviour is not always coded under the question which

elicits its description by the informant, if it fits more appropriately under another

question. If a child shows two distinct behaviours which both fall under one question,

both are recorded and scored.

Differences in the current version. The version of the RBI used in the current

research differed from Turner’s in several ways. Firstly, only the questions which were

ticked “yes” on the RBQ were asked within the RBI. Secondly, the questions regarding

the circumstances which commonly lead to the display of the discrete-episode type

behaviours were not included. This was because it was found during initial testing that

parents found these questions quite difficult to answer clearly, and also because it was

felt that data gleaned from these questions were not essential for the current study.

Thirdly, the questions about how the child would react if s/he was prevented from

indulging in each of the “steady-state” behaviours were not asked either, for similar

reasons. Finally, the section on compulsive behaviours was expanded from two to five

questions, covering a larger range of behaviours. As a result of the latter three

modifications, the current version of the RBI includes 52 rather than 59 questions.

None of the questions from the original RBI about the presence and severity of the

repetitive behaviours themselves were removed or changed in the current version.

Summary variables used in statistical analyses. The main measures derived

from the current version of the RBI were the presence of behaviour and severity

summary scores for each behavioural category. The presence of behaviour summary

score was calculated by assigning a score of 1 for each question that received a

frequency rating above 0 (or severity rating above 0 for the “steady-state” behaviours),

and then calculating a sum of scores for all questions in the category.
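The presence of behaviour summary score described above amounts to counting the questions in a category with a non-zero rating. A minimal sketch follows, with a hypothetical function name and invented ratings:

```python
def presence_score(ratings):
    """Count the questions in a category with a rating above 0.

    `ratings` holds one code per question: the frequency code for
    discrete-episode behaviours, or the severity code for
    "steady-state" behaviours.
    """
    return sum(1 for rating in ratings if rating > 0)


# Invented frequency codes for a five-question category.
print(presence_score([0, 3, 1, 0, 7]))
```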

The severity summary scores were slightly more complex. For the discrete-

episode behaviours, Turner simply used the frequency codes and did not use the

duration codes in her analyses. In the current research, it was decided that a severity

score which included both frequency and duration information would be a more

accurate reflection of the time spent on each behaviour. For each possible combination

of frequency and duration codes, the maximum number of minutes per week spent

doing the behaviour was calculated. For example, for a behaviour coded (2) “3-6 times

per week” for frequency and (3) “4-9 minutes” for duration, the maximum number of

minutes per week would be 54 (6 x 9). Because there was a very large range in the

number of minutes per week possible (0 to 10080), each combination of codes was then

ranked in severity, such that the lowest number of minutes per week was given a score

of 0 and the highest was given a score of 32. Thus, each of the possible combinations of

frequency and duration codes corresponded with a score between 0 and 32 inclusive.

Each discrete-episode behaviour rated on the RBI was therefore given a severity score

of between 0 and 32, and the severity summary score for each behavioural category

consisted of the sum of the severity scores for each behaviour in that category. For the

“steady-state” behaviours, the severity scores for each behaviour were simply the same

as the 0, 1 or 2 rating assigned during interview, with the severity summary score being

the sum of these scores across the behaviours in each category. The severity summary

scores for all behavioural categories were converted to t scores (with a mean of 50 and

standard deviation of 10) using the grand mean and standard deviation across the autism

and control groups, thereby enabling comparisons across different categories while

controlling for the fact that the number of items and range of scores is variable across

categories6.
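The three steps behind the severity summary scores (maximum minutes per week, ranking of code combinations, and conversion to t scores) can be sketched as follows. Several details are assumptions made for illustration: the combination labels and most of the minute values are invented (only the 6 x 9 = 54 example comes from the text), tied combinations are assumed to share a rank, and the sample standard deviation is assumed for the t score conversion.

```python
from statistics import mean, stdev


def max_minutes_per_week(max_episodes_per_week, max_minutes_per_episode):
    """Upper bound on weekly time spent on a behaviour for one code pair."""
    return max_episodes_per_week * max_minutes_per_episode


def rank_scores(minutes_by_combination):
    """Replace each combination's minutes with its rank, lowest first.
    Assumption: combinations with equal minutes share the same rank."""
    ordered = sorted(set(minutes_by_combination.values()))
    rank_of = {minutes: rank for rank, minutes in enumerate(ordered)}
    return {combo: rank_of[m] for combo, m in minutes_by_combination.items()}


def t_scores(raw_scores):
    """Rescale scores to a mean of 50 and a standard deviation of 10.
    Assumption: the sample standard deviation (n - 1) is used."""
    m, s = mean(raw_scores), stdev(raw_scores)
    return [50 + 10 * (x - m) / s for x in raw_scores]


# Worked example from the text: up to 6 episodes of up to 9 minutes each.
print(max_minutes_per_week(6, 9))

# Invented combinations, labelled a to e, with their maximum minutes per week.
print(rank_scores({"a": 0, "b": 18, "c": 54, "d": 54, "e": 10080}))

# Invented category summary scores pooled across both groups.
print([round(t, 2) for t in t_scores([2, 4, 4, 6])])
```

Ranking compresses the very wide range of possible minutes into a bounded ordinal scale, and the t score conversion then places categories with different numbers of items on a common metric.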

To reduce the number of statistical comparisons required in analyses examining

the relationship between cognitive functioning and repetitive behaviours, Turner further

collapsed the severity summary scores for each behavioural category into four

composite variables: Repetitive Movements, Sameness Behaviour, Repetitive

Language, and Circumscribed Interests. The same composite variables were used in

this research, with the addition of a Compulsive Behaviours variable (due to the

addition of items in this category in the current version). The Repetitive Movements

composite score was the sum of severity summary scores for the stereotyped

manipulation of objects, stereotyped movements, tic-like behaviours, and self-injurious

behaviours categories. The Sameness Behaviour composite score included the severity

summary scores for insistence on sameness of environment, rigid adherence to routines

and rituals, and object attachments. The Compulsive Behaviours, Repetitive Language,

and Circumscribed Interests composite scores were simply the severity summary scores

for those categories.
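The mapping from behavioural categories to the five composite variables can be written out explicitly, as in the sketch below. It assumes that the Compulsive Behaviours composite corresponds to the obsessive-compulsive behaviours category; the dictionary and function names are hypothetical, and the example scores are invented.

```python
# Categories contributing to each composite variable, as described above.
COMPOSITES = {
    "Repetitive Movements": [
        "stereotyped manipulation of objects",
        "stereotyped movements",
        "tic-like behaviours",
        "self-injurious behaviour",
    ],
    "Sameness Behaviour": [
        "insistence on sameness of environment",
        "rigid adherence to routines and rituals",
        "object attachments",
    ],
    # Assumption: this composite draws on the obsessive-compulsive category.
    "Compulsive Behaviours": ["obsessive-compulsive behaviours"],
    "Repetitive Language": ["repetitive use of language"],
    "Circumscribed Interests": ["circumscribed interests"],
}


def composite_scores(category_scores):
    """Sum the severity summary scores of the categories in each composite."""
    return {
        name: sum(category_scores[category] for category in categories)
        for name, categories in COMPOSITES.items()
    }


# Invented severity summary scores (t scores) for one participant.
example = {c: 50 for categories in COMPOSITES.values() for c in categories}
print(composite_scores(example))
```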

6 From this point onwards, the term “severity summary score” will mean the t score.

Reliability and validity. In her PhD, Turner reports test-retest and inter-rater

reliability data for her version of the RBI. In terms of test-retest reliability, she reports

an average of 96% agreement across two administrations with regard to the simple

presence or not of each behaviour. The agreement was 83% for the frequency and

duration codes for the discrete episode behaviours, and 92% for the severity codes for

the “steady-state” behaviours. Inter-rater reliability was very good, at a mean of 99.5%

agreement for the frequency and duration codes, with a corresponding mean Kappa

value of .99. For the severity codes, there was a mean agreement of 91%, with a Kappa

value of .87. Turner (1996) did not explicitly examine the validity of the RBI in her

thesis. However, in the current studies, the overall sum of RBI severity summary scores

across categories correlated highly with the Repetitive/Restricted Behaviours domain of

the ADI-R when calculated across all groups in Studies 1 and 2, r = .73, p < .001,

suggesting good construct validity. The underlying

factor structure of the RBI was also examined in Study 1, the results of which are

reported in Section 4.3.5.1 of Chapter 4.

3.5.2 Measures of social behaviour and communication

3.5.2.1 Social Behaviour Questionnaire (SBQ; Skuse et al., 1997)

The SBQ is a 12-item questionnaire completed by the individual’s caregiver, which was

originally devised for use with a sample of individuals with Turner’s syndrome (Skuse

et al., 1997). It includes 12 statements primarily relating to the child’s everyday social

awareness and behavioural appropriateness; for example, “not aware of other people’s

feelings”, “does not pick up on body language”, and “does not understand how to

behave when out, e.g., in shops or other people’s houses”. These statements are rated as

0 (not at all true), 1 (quite or sometimes true), or 2 (very or often true). Scores therefore

range from 0 to 24. The questionnaire demonstrates good internal consistency, test-

retest reliability, and validity (Skuse et al., 1997).

3.5.2.2 Social and communication ADI-R domains

As the SBQ is a brief, limited measure of social functioning, questions in the social

domain of the ADI-R which related to current functioning were selected and scores

summed to form an additional measure of social behaviours. Similarly, scores on

questions in the communication domain relating to current functioning were also

summed as a measure of communicative ability. Only questions relating to current

functioning were used (rather than all the questions usually used to calculate the

traditional algorithm for social behaviours and communication) because relationships

with current cognitive capacity were of central interest, as well as for the sake of

comparability with the RBI, from which measures of current behaviour only were taken.

These two ADI-R summary scores of current social behaviours and communication

correlated quite highly, r = .77, p < .001. A factor analysis conducted with the two

ADI-R summary scores and the SBQ score also demonstrated that all three measures

loaded on the same factor (the results of this factor analysis are described more fully in

Section 4.3.5.2 of the following chapter). It was therefore decided to create a composite

score of all three measures of social/communicative ability (i.e., the SBQ and the

current social and communication scores from the ADI-R). This was achieved by

conducting a factor analysis and deriving factor scores for each participant using a

regression equation.

CHAPTER 4

Study One: Profile, Primacy, and Independence of Theory of Mind and Executive Function Impairments in Autism Spectrum Disorders

4.1 Introduction
4.1.1 Aims
4.1.2 Hypotheses
4.2 Method
4.2.1 Participants
4.2.2 Procedure
4.3 Results
4.3.1 Data screening
4.3.2 Group comparisons on ToM and EF tasks
4.3.2.1 False belief tasks
4.3.2.2 Dewey Stories
4.3.2.3 Tower of London
4.3.2.4 IDED set-shifting task
4.3.2.5 Response Inhibition and Load task
4.3.2.6 Opposite Worlds task
4.3.2.7 Relational Complexity
4.3.2.8 Pattern Meanings
4.3.2.9 Uses of Objects
4.3.2.10 Stamps task
4.3.2.11 Summary and effect sizes of group comparisons
4.3.3 Universality of ToM and EF deficits
4.3.4 Ability of ToM and EF variables to predict group membership
4.3.5 Behavioural measures: Group comparisons and derivation of indices used in correlational analyses
4.3.5.1 Repetitive Behaviours Interview
4.3.5.2 Social and communicative functioning
4.3.6 Correlations between ToM/EF and behavioural measures
4.3.7 Relationship between ToM and EF
4.3.7.1 Correlations between ToM and EF
4.3.7.2 Dissociations between ToM and EF
4.4 Discussion
4.4.1 Profile of ToM and EF deficits
4.4.2 Primacy of ToM and EF deficits
4.4.3 Independence of ToM and EF deficits
4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs

4.1 Introduction

4.1.1 Aims

Chapter 2’s literature review revealed that individuals with autism consistently display

both ToM and EF deficits, but that the primacy and independence of these two

impairments remain a matter of current debate. The first of the two studies contained in

this thesis was principally aimed at elucidating the profile, primacy, and independence

of ToM and EF deficits in children with ASDs, with the broader aim of clarifying the

structure of the cognitive level of explanation in a causal model of autism. Thus, the

three central aims of Study One were to determine i) the specific profile of ToM and EF

deficits which characterises ASDs; ii) whether impairments in ToM and/or EF can

adequately meet the criteria for a primary cognitive deficit in ASDs, and which appears

to be the most primary; and iii) whether or not ToM and EF impairments are related in

ASDs, and if so, what the nature of that relationship might be (i.e., which theory of the

ToM-EF relationship is best supported by the data). The remainder of this section

describes how these aims were addressed in the current study.

i) Aim 1: Determining the profile of ToM and EF impairments. The specific

profile of ToM and EF impairments in ASDs was examined by comparing the

performance of individuals with ASDs with control participants matched on age and

non-verbal ability on a range of ToM and EF tasks. In particular, emphasis was given to

the precise measurement of a range of EF components. As described in Section 2.2.3 of

Chapter 2, previous studies of EF in autism have been weakened by the use of tasks

which are often impure and/or require non-verbal responses only (which may advantage

individuals with ASDs), and which do not cover the full range of EF components. This

study sought to address those weaknesses, not only in order to provide an accurate map

of the cognitive profile typical of ASDs, but also to help determine whether that profile

may be unique to autism (as, for example, the presence of inhibition deficits would be

inconsistent with the unique EF profile proposed by Ozonoff and colleagues; see

Section 2.2.3) and how each component may relate to ToM ability (discussed further

below). As described in Chapter 3, planning, set-shifting, inhibition, working memory,

relational reasoning, and generativity components were all measured. Both verbal and

non-verbal tests were used where possible. A task involving both inhibitory and

working memory demands was included, following suggestions that only performance on

tasks combining both components is i) impaired in autism and ii) related to ToM. A test of

relational reasoning was included in order to examine Halford’s (1993) notion that the

capacity to integrate multiple relations may be a key ability underlying false belief

understanding (this represents a “common conceptual requirements” account of the

ToM-EF relationship, as described below in hypothesis 3). Generativity was also

assessed in detail, in response to indications that generativity deficits may hold strong

explanatory value in terms of the symptoms of autism (e.g., Jarrold et al., 1996; Turner,

1997).

ii) Aim 2: Determining the primacy of ToM and EF impairments. As we have

seen, common criteria used to judge the primacy of a cognitive deficit to a disorder are

its i) universality in individuals with the disorder, ii) uniqueness to the disorder, iii)

causal precedence or ability to account for the earliest symptoms of the disorder, and iv)

explanatory value or ability to account for the whole range of symptoms displayed by

individuals with the disorder. In this study, all of these criteria were tested in some way

for both the ToM and EF hypotheses of autism except for the third criterion of causal

precedence, as children below the age of 5 were not included in the sample. The main

reason for this was that it was important to test the range of EF components, using both

verbal and non-verbal response modes if possible, which is difficult for a young sample

both because tests in some EF domains (e.g., generativity) are not yet available for this

age group and because the limited verbal abilities of young children constrain the tests

appropriate for use1.

1 Some 4-year-old autistic and control children were tested as part of the WAFSASD (see Method – Section 4.2) and it was found that many were unable to adequately comprehend or perform several of the EF tasks.

The criterion of universality was addressed in this study by calculating the

proportion of participants with ASDs displaying an impairment on the variable in

question, with “impairment” defined as a score more than one standard deviation worse

than the control mean – a stricter cutoff for impairment than the lenient criterion of any

score below the control mean, which was used by Ozonoff et al. (1991). The uniqueness

criterion was tested indirectly, by analysing which ToM and EF variables best predicted

membership of the ASD group (this methodology was also used by Ozonoff et al., 1991,

to assess uniqueness). As individuals from other clinical groups were not assessed,

with the exception of a few children in the control group with mild intellectual

handicaps, this test of uniqueness should be considered as addressing whether the ToM

and EF deficits displayed were unique to individuals with ASDs compared with

individuals of equivalent age and non-verbal ability (rather than being unique to ASDs

compared with all other clinical conditions). Explanatory value was measured by

calculating correlations between ToM/EF variables and behavioural measures of autistic

symptomatology (i.e., social/communicative functioning and repetitive behaviours). A

particular emphasis was placed on a thorough examination of each cognitive

impairment’s relationship with repetitive behaviours and restricted interests, in

comparison with a briefer assessment of social and communicative functioning.

Although this emphasis was not strictly necessary for the exploration of explanatory

value, it was considered important because this third aspect of the autistic triad has been

one of the main grounds for discriminating the ToM and EF hypotheses (that is, both

ToM and EF capabilities show relationships with and appear able to explain social and

communicative impairment, but the ToM hypothesis does not account well for repetitive

behaviours). In addition, these non-social aspects of autistic symptomatology have been

largely neglected in previous research, with only one published study directly

addressing the relationship between ToM/EF and repetitive behaviours (Turner, 1997).
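The universality calculation described under Aim 2 can be expressed compactly. The sketch below assumes that the control standard deviation is the sample standard deviation and that a score falling exactly on the cutoff is not counted as impaired; the function names and example scores are hypothetical.

```python
from statistics import mean, stdev


def is_impaired(score, control_scores, higher_is_better=True):
    """True if the score is more than one control SD worse than the control mean."""
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)
    if higher_is_better:
        return score < control_mean - control_sd
    # For error or latency measures, a higher score is the worse one.
    return score > control_mean + control_sd


def universality(asd_scores, control_scores, higher_is_better=True):
    """Proportion of ASD participants classed as impaired on one variable."""
    flags = [is_impaired(s, control_scores, higher_is_better) for s in asd_scores]
    return sum(flags) / len(flags)


# Invented scores: the control mean is 10 and the control SD is 2.
controls = [8, 10, 12]
print(universality([5, 7, 8, 11], controls))
```

The `higher_is_better` flag is needed because "worse" points in opposite directions for accuracy scores and for error counts.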

iii) Aim 3: Determining the independence of ToM and EF impairments. The

nature of the relationship between ToM and EF in children with ASDs was investigated

by comparing the pattern of correlations between ToM and EF variables in the ASD

participants with similar correlations in the control group. Thus, the presence of

significant correlations between ToM and EF in individuals with ASDs would be

suggestive of an underlying relationship, and the pattern of correlations would show

which EF components may be important for ToM performance or development. The

incidence and direction of dissociations between ToM and EF deficits in the ASD group

were also examined, by calculating the proportion of ASD participants with impaired

EF who displayed intact ToM, and vice versa (with impairment defined in the same way

as for the universality calculations). This allowed assessment of whether one ability

appeared to be a prerequisite for the other (or whether one impairment ever occurred

without the other), which is relevant for the question of primacy as well as helping to

discriminate between the different theories of the ToM-EF relationship – in particular,

the two emergence accounts (see Section 2.3 in Chapter 2).
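The dissociation analysis under Aim 3 reduces to two conditional proportions within the ASD group. A minimal sketch follows, assuming that each participant has already been classed as impaired or intact on ToM and on the EF variable of interest using the universality cutoff; the function name, dictionary keys, and example data are hypothetical.

```python
def dissociation_rates(tom_impaired, ef_impaired):
    """Proportions of dissociations in each direction.

    Both arguments are parallel lists of booleans, one entry per
    ASD participant (True = impaired).
    """
    pairs = list(zip(tom_impaired, ef_impaired))
    n_ef_impaired = sum(1 for _, ef in pairs if ef)
    n_tom_impaired = sum(1 for tom, _ in pairs if tom)
    ef_without_tom = sum(1 for tom, ef in pairs if ef and not tom)
    tom_without_ef = sum(1 for tom, ef in pairs if tom and not ef)
    return {
        # Of those with impaired EF, the proportion with intact ToM.
        "intact_tom_given_impaired_ef": (
            ef_without_tom / n_ef_impaired if n_ef_impaired else None
        ),
        # Of those with impaired ToM, the proportion with intact EF.
        "intact_ef_given_impaired_tom": (
            tom_without_ef / n_tom_impaired if n_tom_impaired else None
        ),
    }


# Invented classifications for five participants.
tom = [True, True, False, False, True]
ef = [True, False, True, True, True]
print(dissociation_rates(tom, ef))
```

A proportion near zero in one direction only would be consistent with one ability acting as a prerequisite for the other, whereas dissociations in both directions would point towards independence.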

4.1.2 Hypotheses

Predictions for the profile of deficits. It was expected that both ToM and EF deficits

would be found in our sample of individuals with ASDs, with poorer performance

expected on higher-level ToM measures. In terms of the specific profile of EF deficits,

based on the outcomes of previous research it was predicted that ASD participants

would show impairments in planning, set-shifting, and generativity, but not inhibition or

working memory. However, consistent with Russell’s (1997b) proposal, it was

hypothesised that ASD participants may show impairments on the task combining

inhibition and working memory requirements. It was expected that in domains where

both verbal and non-verbal measures were used, individuals with ASDs would be more

likely to show impairments on verbal tasks. No specific predictions were made with

regard to performance on the relational reasoning task, which has not been used

previously with individuals with autism; however, given previous findings of intact

working memory in ASDs, it was thought possible that this domain may also be intact

(as it tests Halford’s (1993) notion of working memory).

Predictions for the primacy and independence of deficits. In considering the possible

outcomes of analyses of the primacy and independence of ToM and EF deficits in

individuals with ASDs, a number of different hypotheses are conceivable, all of which

hold different implications for theories of the primary cognitive deficit(s) of autism as

well as theories of the ToM-EF relationship. These hypotheses include the following:

1. There is only a single, primary deficit in ASDs, with no secondary impairments.

This hypothesis would be supported if only ToM or only EF impairments are

displayed by the ASD group. This possibility is not likely given fairly consistent

evidence that both ToM and EF impairments are present in children with ASDs.

2. ToM and EF impairments are related in ASDs such that one deficit is primary and

either causes or explains the other, which is secondary. This possibility could be

consistent with expression, emergence, and common neuroanatomical bases

accounts of the ToM-EF relationship. For example:

i) If EF deficits are primary and cause a ToM deficit because of performance-

based factors (i.e., the expression account), this would be revealed in a

pattern of results showing EF deficits as more primary, significant

correlations between ToM and certain EF components (most likely inhibition

and working memory), and no or few dissociations such that those EF

components are impaired but ToM is intact2.

2 Dissociations in the other direction would also be unlikely as EF should not be intact in individuals with ASDs if EF deficits are primary.

ii) If a ToM impairment is primary and causes a secondary EF deficit because

of functional dependence during development (i.e., Perner’s emergence

account), this would be reflected in a pattern of results demonstrating a ToM

deficit meeting criteria for primacy, significant correlations between ToM

and EF, and no dissociations in ASD participants such that ToM is impaired

but EF is intact (dissociations in the other direction would also be

inconsistent with Perner’s theory, as discussed in Section 2.3.1.3, and

unlikely as per footnote 2).

iii) If an EF (or ToM) deficit is primary and a secondary ToM (or EF) deficit is a

consequence of its neuroanatomical proximity, then one would expect to see

evidence of the primacy of EF but not ToM (or ToM but not EF), and

correlations between ToM and EF, but dissociations would be acceptable

such that EF (or ToM - i.e., the primary domain) is impaired and

performance in the other domain is intact.

3. ToM and EF impairments are related, but neither is primary; there is a third deficit

which is primary and causes both deficits. This result would not be supportive of

either the ToM hypothesis or the EF hypothesis of autism. It would be consistent

with a “common conceptual requirements” account of the ToM-EF relationship.

This hypothesis would be reflected by results showing neither ToM nor EF deficits

adequately meeting the criteria for primacy (as while they would be caused by the

primary deficit, there would not be as direct a relationship with symptoms),

significant correlations between ToM and EF variables, and no or few dissociations

in either direction (at least on tasks with the common conceptual basis).

4. ToM and EF impairments are independent in ASDs, but only one is primary. In this

hypothesis, the most likely explanation for the co-occurrence of the non-primary

impairment would be its neuroanatomical proximity to the primary impairment, but

unlike version iii) of hypothesis 2, the two deficits are not correlated. This lack of

correlation despite neuroanatomical proximity would suggest something unusual

about the ToM-EF relationship in ASDs as compared with typically developing

children. Results would show primacy of one of the deficits but not the other, and

no correlations between ToM and EF deficits. Dissociations would be allowable

such that performance in the primary domain is impaired but the second impairment

does not always occur.

5. ToM and EF impairments are independent in ASDs and both are equally primary.

Like hypothesis 4, the co-occurrence of impairments would be most likely explained

by their neuroanatomical proximity, but unlike hypothesis 4, both are primary.

Results would be expected to demonstrate that both ToM and EF deficits meet

criteria for primacy, but there would be few significant or strong correlations (as

although ToM and EF deficits would have to co-occur in the large majority of ASD

participants, they may not necessarily co-vary in severity). Dissociations would not

be expected to occur if all criteria for primacy were met by both impairments, as

both primary deficits would have to be impaired in each individual with an ASD.

This is a somewhat unlikely outcome as it is improbable that two independent

deficits would both show complete explanatory value for the full range of

symptoms.

6. ToM and EF impairments are independent in ASDs, and neither meets all criteria

for primacy. This represents a more classic “multiple primary deficits” model of

ASDs, in which the deficits both hold causal importance but neither is universal or

can account for the full range of symptoms. Again, the co-occurrence of

independent deficits is most likely to be caused by common neurobiological

substrates. This hypothesis is consistent with at least three different scenarios

regarding cognitive deficits in ASDs, for example:

i) There may be different subgroups of individuals with different primary

deficits (these subgroups may be classified according to level of intellectual

functioning or symptom severity, for example – see Section 1.2 in Chapter

1). In this case, neither ToM nor EF would be universal among the whole

sample, but both may hold good explanatory value within the relevant

subgroup (correlations with symptoms across the whole sample may not be

strong, however). Dissociations in both directions would be expected, such

that one subgroup could show intact ToM but impaired EF, and another

subgroup could show the opposite pattern (there may also be a third group

where both abilities are impaired).

ii) If autism is considered to be a multidimensional spectrum, for example if a

ToM deficit was the basis of one aspect of symptomatology (e.g., social

impairment) and EF deficits were the basis for another aspect (e.g., repetitive

behaviours), then neither deficit would be likely to be universal (if the

sample was heterogeneous and not all individuals showed all symptoms),

and each deficit would only hold explanatory value for the relevant symptom

domain. ToM-EF dissociations in either direction may occur in individuals

who do not display all aspects of symptomatology.

iii) There may be a third (or more) cognitive deficit(s), which may be more

primary than or at least equally primary as ToM and EF deficits. This may

actually also be the case for either of the above two scenarios (i.e., there

could be 3 subgroups characterised by different primary deficits, or 3

independent cognitive deficits underlying the three aspects of

symptomatology). This third deficit may be related to either ToM or EF

deficits, but would not explain them both as in hypothesis 3.

Hence, hypotheses 1-4 all represent different versions of a single primary cognitive

deficit model of autism, whereas hypotheses 5 and 6 both represent multiple primary

deficits models.

In the only previous study to directly address the primacy and independence of

ToM and EF deficits in autism in a similar manner to this study, Ozonoff et al. (1991)

found most support for version iii) of hypothesis 2. They found that ToM and EF were

correlated in autism, but that EF deficits were more primary, as judged by their

universality and uniqueness to autism. Although dissociations were not explicitly

examined, they reported that a subset of their ASD sample showed impaired EF but

intact ToM. They interpreted this pattern of results as suggesting a neuroanatomical

link between ToM and EF deficits in autism, such that they were correlated but the

relationship was not causal at a cognitive level. However, as described in Chapter 2,

Ozonoff et al.’s (1991) study was weakened by i) its use of impure EF tasks and the

employment of an EF composite score which obscured the specific nature of both EF

deficits and the ToM-EF relationship; ii) its lenient definition of impairment; and iii) its

failure to partial out age from the ToM-EF correlations. In addition, Ozonoff et al. did

not examine the presence of ToM-EF dissociations in both directions, the outcome of

which is an important discriminator between the six hypotheses outlined above.

Because of these weaknesses, and because other research relevant to the primacy

and independence of ToM and EF in autism has been equivocal, no strong predictions

about which one of the above hypotheses was likely to be supported were made prior to

conducting this study. Because of weak plausibility, a low likelihood was placed on

hypotheses 1 and 5, and based on previous studies it was also suspected that neither

ToM nor EF deficits may fully meet all criteria for primacy. The current study may

nevertheless be considered an exploratory but extensive investigation of the primacy

and independence of ToM and EF impairments in ASDs. It builds upon Ozonoff et al.’s

(1991) original study and other relevant research by i) utilising a range of EF tasks

designed to tap separate components of EF, the results of which were analysed

separately throughout; ii) adopting a stricter criterion of impairment; and iii) partialling

out age, VIQ and PIQ from all significant correlations. It also explicitly examines the

presence of double dissociations between ToM and EF, and includes investigations of

the explanatory value of ToM and EF deficits as an additional measure of primacy

(which hold particular importance as a way of discriminating between the 3 scenarios

presented in hypothesis 6).

4.2 Method3

4.2.1 Participants

Autism Spectrum Disorders (ASD) Group. There were 48 participants with ASDs

ranging in age from 5 to 18 years. Participants in this group were mainly recruited

through Western Australian autism centres (specialising in assessment and/or therapy

with individuals with ASDs) and support groups, including the Autism Association of

Western Australia, Intervention Services for Autism and Developmental Delay, the WA

Disability Services Commission, and the Asperger Syndrome Support Group.

Participants of a previous study on the genetics of autism conducted through the Centre

for Clinical Research in Neuropsychiatry were also invited to participate in the current

study. The study was advertised using brochures, features in newsletters, and

presentations to professionals and parents. Parents expressed interest by returning a slip

via mail to the research team giving consent to be contacted about the study.

3 Both of the studies in this thesis formed part of the Western Australia Family Study of Autism Spectrum Disorders (WAFSASD), a large-scale project funded by a National Health and Medical Research Council grant awarded to chief investigators Joachim Hallmayer, Murray Maybery, and Dorothy Bishop. The rationale and methodology for this thesis were nevertheless developed largely independently from the broader aims of the WAFSASD. The current author selected and developed all of the cognitive measures used in the thesis (as well as the RBI), was principally responsible for administration of these tasks to participants, and chose and conducted all statistical analyses reported. However, the diagnostic instruments were selected in collaboration with other WAFSASD investigators, and similarly were administered by research assistants for the WAFSASD. In addition, recruitment of families was conducted in collaboration with WAFSASD research assistants. Some of the probands with autism who participated in the WAFSASD were too low-functioning to complete all of the cognitive tasks used in this study, and were not included.

All participants had received a clinical diagnosis of autism (n = 28), Asperger

syndrome (n = 13) or PDDNOS (n = 7) from a health professional (e.g., paediatrician,

psychiatrist, psychologist). The presence of autistic symptomatology in at least one

domain was then verified using the Autism Diagnostic Interview – Revised (ADI-R).

Two participants (one with a clinical diagnosis of autism and one with Asperger

syndrome) were excluded as they did not exceed cutoff scores in any of the three ADI-R

domains (i.e., social interaction, communication, restricted/repetitive behaviour). Of the

remaining 46 participants in the ASD group, 34 met criteria in all three domains of the

ADI-R, 10 met criteria in two domains, and 2 met criteria in one domain only.

Other exclusion criteria were the presence of genetic abnormalities or

neurological dysfunction (e.g., head injury, encephalitis, neurofibromatosis, cerebral

palsy), with the exception of epilepsy4. There were four participants in the ASD group

with comorbid diagnoses, as reported by their parents (2 with dyspraxia, 1 with

epilepsy, and 1 with dyspraxia and epilepsy).

Control Group. Forty-nine control children ranging in age from 5 to 17 years were

recruited to participate in the study. Of these, 46 were typically developing children and

3 had mild intellectual disabilities. The control group was selected to match the ASD

group on age and PIQ (reasons for not matching on VIQ are described below).

Recruitment of this group was mainly achieved through Western Australian schools,

again through brochures and newsletters mailed to parents. Because of difficulty

recruiting sufficient numbers of boys with low PIQ, in some schools all boys whose

parents gave consent were tested on IQ measures, and then the parents of those boys

with PIQs in the range of 60-95 were contacted and asked if they would like to

participate in the larger study. The children with mild intellectual disabilities were

recruited through the WA Disability Services Commission (as controls for children in

the ASD group with PIQs between 60 and 70).

Exclusion criteria were a known or suspected ASD, as well as genetic and

neurological abnormalities. Mothers of control participants completed the Autism

Screening Questionnaire (ASQ) in order to screen for symptoms of autism in the control

group. If participants scored above the cutoff point of 10 on the ASQ, the ADI-R was

administered. One participant, who had a mild intellectual disability, met criteria for

autism on the ADI-R and was excluded from further analysis, leaving 48 participants in

the control group. Two participants in the control group had received clinical diagnoses

of ADHD (as reported by their parents).

4 Although epilepsy is a neurological illness, because it is a common comorbid condition of autism, it was felt that exclusion of participants with epilepsy may result in a sample which was non-representative of autism.

Demographic characteristics of each group are presented in Table 2. The ASD and

control groups were matched on chronological age, t(92) = 1.74, p = .09, and PIQ, t(92)

= .92, p > .1. Children in the ASD group had significantly lower VIQs than the control

group, t(92) = 3.7, p < .001. Because children with autism typically show a significant

discrepancy between VIQ and PIQ, matching groups on Full-Scale IQ or on both VIQ

and PIQ was not considered appropriate or possible. VIQ was therefore included as an

additional independent variable in group comparisons, as described in Section 4.3.2.

All participants had a PIQ of 60 or above, and a VIQ of 50 or above. The proportion of

girls was slightly higher in the control group than in the ASD group, and chi-square

analysis revealed that the difference approached significance, χ2 (1, N = 94) = 3.65, p =

0.06. However, this was not considered to be a problem as analyses conducted to

compare the performance of boys and girls in the control group on all cognitive tasks

revealed no significant differences. Gender was not introduced as an additional

independent variable (IV) in analyses because the number of girls in the ASD group was

considered to be too small.
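
As a rough check on the gender comparison reported above, the following Python sketch computes a Pearson chi-square for a 2 x 2 table using the male and female counts reported in Table 2. It assumes that no continuity correction was applied, which the text does not state; the function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: ASD group, control group. Columns: male, female (counts from Table 2).
stat, p_value = chi_square_2x2(40, 6, 34, 14)
print(f"chi2(1, N = 94) = {stat:.2f}, p = {p_value:.3f}")
```

Under this assumption the statistic rounds to 3.65, in line with the value reported in the text.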

Table 2. Demographic characteristics of the samples

                          ASD group (n = 46)        Control group (n = 48)
Age: Mean (SD, range)     10.73 (3.96, 5-18)        9.49 (2.94, 5-17)
Male: Female              40: 6                     34: 14
PIQ: Mean (SD, range)     96.07 (18.23, 63-138)     99.42 (16.99, 64-137)
VIQ: Mean (SD, range)     91.76 (21.77, 52-150)     106.58 (16.85, 64-138)

The ASD and control groups were also matched in terms of their families’

socioeconomic status. This was assessed using education data from both mothers and

fathers, which was coded using the following system: 1 = up to year 10 (or equivalent)

of high school; 2 = up to year 12 (or equivalent) of high school; 3 = diploma, trade

certificate, apprenticeship, or other traineeship; and 4 = university degree. A chi-square

analysis comparing the education levels of ASD and control parents (the analysis

included both mothers and fathers) revealed that there was no difference in the

education level of the parents of ASD and control children, χ2 (3, N = 150) = 5.71, p >

.1. The difference remained non-significant when only the highest code from each

family was included in the analysis, χ2 (3, N = 89) = 3.35, p > .1.

With an n of 46 in the ASD group and 48 in the control group, the power of the

study to detect medium sized effects (i.e., d = .5) at an alpha level of .05 reached an

acceptable level at .78.
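
The power figure can be approximated with a short normal-approximation calculation, sketched below in Python. The thesis does not state how power was computed, so the formula, the function name, and the choice of one or two tails are assumptions made for illustration. An exact calculation based on the noncentral t distribution would differ slightly.

```python
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05, tails=2):
    """Normal-approximation power for an independent-groups comparison of means."""
    std_normal = NormalDist()
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * (n1 * n2 / (n1 + n2)) ** 0.5
    critical = std_normal.inv_cdf(1 - alpha / tails)
    # Probability of exceeding the critical value in the predicted direction;
    # the negligible opposite-tail contribution is ignored when tails=2.
    return 1 - std_normal.cdf(critical - noncentrality)


one_tailed = approx_power(0.5, 46, 48, tails=1)
two_tailed = approx_power(0.5, 46, 48, tails=2)
print(f"one-tailed: {one_tailed:.2f}, two-tailed: {two_tailed:.2f}")
```

Under this approximation a one-tailed test gives a value close to .78, while a two-tailed test gives a lower value of about .68.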

4.2.2 Procedure

All questionnaires, parental interviews, and cognitive tasks are described in detail in

Chapter 3. Initial screening questions regarding medical history (to assess whether

participants met criteria for participation) were asked of the participant’s mother via

telephone. Informed consent was obtained from the mother of each participant, on

behalf of both herself and her child (direct consent was also obtained from participants

over 12 years of age, with the exception of children whose level of understanding of the

research was judged to be insufficient to give informed consent). Questionnaires were

generally sent to participants’ mothers prior to the first testing session. Tests and

parental interviews were usually administered at the participants’ homes, or in testing

rooms at the Centre for Clinical Research in Neuropsychiatry. The ADI-R took

approximately 2 hours to administer, and the RBI an additional 5-30 minutes, depending

on the number of questions asked. The test battery took approximately 2.5 hours in

total to administer5. The order of test administration was fixed, except that the order of

Wechsler subtests differed according to whether the WPPSI-R, WISC-III or WAIS-III

was administered (the order of subtest administration specified by each test was

retained). Testing was often divided into two sessions, in order to prevent fatigue and

distractibility. For practical reasons, when testing was conducted across more than one

session, the break was not always at the same point within the battery. Some tests were

administered only to participants within a certain age range. The order of testing (not

including other tests administered for WAFSASD) and the age range for each test are

displayed in Table 3.

5 This includes other tests not reported within this thesis but which were conducted as part of the WAFSASD. The IQ, ToM and EF tests took between 1.5 and 2 hours in total.

Table 3. Order of test battery and age range for each test

Test                                                      Age range
1. WPPSI-R:  i) Object Assembly
             ii) Vocabulary
             iii) Picture Completion
             iv) Similarities
   WISC-III: i) Picture Completion
             ii) Similarities
             iii) Vocabulary
             iv) Object Assembly
   WAIS-III: i) Picture Completion
             ii) Vocabulary
             iii) Similarities
             iv) Object Assembly
2. Stamps task – first 4 trials                           5-16
3. Tower of London                                        All ages
4. Dewey Stories                                          10+
5. Simple false belief task                               5-16*
6. First-order false belief task                          5-16*
7. Second-order false belief task                         5-16*
8. IDED set-shifting: Perseveration condition             7+
9. Response Inhibition and Load task                      7+
10. IDED set-shifting: Learned Irrelevance condition      7+
11. Uses of Objects                                       All ages
12. Relational Complexity                                 All ages
13. Stamps task – second 4 trials                         5-16
14. Pattern Meanings                                      All ages
15. Opposite Worlds                                       7-16

*Not all children within this age range were necessarily administered these tasks. The

structure of administration of the false belief tasks is described in Section 3.3 of Chapter 3.

4.3 Results

This section includes subsections covering i) data screening; ii) group comparisons on

ToM and EF tasks; iii) analyses addressing the universality of ToM and EF deficits in

the ASD group; iv) logistic regression analyses examining the ability of ToM and EF

task performance to predict ASD/control group membership; v) group comparisons on

and derivation of indices for the behavioural measures; vi) correlations and multiple

regressions examining the relationship between ToM/EF and behavioural measures; and

vii) analyses examining the relationship between ToM and EF. SPSS (Statistical

Package for the Social Sciences) Version 10.0.5 was used for all analyses.

4.3.1 Data screening

Data from all measures were screened for normality and outliers. For variables with

distributions that did not depart substantially from normality, outliers falling more than

3 standard deviations (SD) from the mean of the group (i.e., ASD or control) were

trimmed to 3 SD from the mean. Several variables demonstrated highly skewed

distributions. Square root, logarithm and inverse transformations were attempted for

these variables. If transformation was successful, the transformed variable was used for

all analyses (including correlations and regressions). For some variables where a large

proportion of participants all gained the same score, transformations were ineffective.

For these variables, scores were dichotomised. Again, the dichotomised variable was

used in all analyses. Relevant specific details regarding outliers, transformations and

dichotomising of scores are included within the results section for each measure.
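
The outlier-trimming rule described above can be sketched as follows. This is an illustrative Python version, not the SPSS procedure actually used. It assumes the sample standard deviation (the text does not say whether the sample or population SD was used), and the example scores are made up. In line with the text, the function would be applied separately to the ASD and control groups.

```python
from statistics import mean, stdev


def trim_to_3sd(scores, limit=3.0):
    """Pull values lying more than `limit` SDs from the group mean back to that boundary."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD; an assumption, see the note above
    lower, upper = m - limit * sd, m + limit * sd
    return [min(max(x, lower), upper) for x in scores]


# Hypothetical scores for one group: 19 typical values and one extreme value.
scores = [10] * 19 + [100]
trimmed = trim_to_3sd(scores)
```

Only the extreme value changes: it is moved to the boundary at 3 SD above the group mean, and the remaining scores are left as they were.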

4.3.2 Group comparisons on ToM and EF tasks

For all group comparisons, the performance of the ASD group as a whole was compared

with the control group. For tests administered only to participants within a certain age

range (see Table 3), t-tests were conducted to check that the participants available from

the ASD and control groups for those tests were still matched on age and PIQ. Results

showed that the groups were matched for all tests, with the exception of Dewey Stories.

The way in which this was handled is described in Section 4.3.2.2.

To address concerns that the range of symptom severity within the ASD group

may affect results (i.e., autism versus other PDD subgroups may display different

patterns of results), comparisons were also conducted between participants in the ASD

group who exceeded cutoff scores in all three domains of the ADI-R (i.e., met “full

criteria”; n = 34) and those who exceeded cutoff scores in only one or two domains (i.e.,

met “partial criteria”; n = 12). The two subgroups were matched on age and PIQ.

Almost all comparisons on cognitive tasks revealed no significant subgroup

differences6. The only task on which the two subgroups showed different patterns of

performance was Pattern Meanings, and these results are reported in Section 4.3.2.8.

For all other tasks, as there were no significant differences it was thought appropriate to

consider the “full criteria” and “partial criteria” subgroups together as one sample for

group comparisons.

As described in Section 4.2.1, four participants in the ASD group and two in the

control group had a clinical diagnosis other than an ASD (e.g., ADHD, epilepsy,

dyspraxia). To check that the presence of a non-ASD diagnosis was not strongly

influencing results, group comparisons were conducted excluding these participants

from the sample. All significant group differences remained significant, so participants

with non-ASD diagnoses were included in all analyses reported.

A consistent approach to group comparisons on each task was followed,

involving the following steps:

1. T-tests (or chi-square analyses for dichotomous variables) comparing the

performance of the ASD and control groups were conducted for all task variables.

2. Scatterplots between task variables and age, VIQ, and PIQ were examined for any

non-linear relationships. No significantly curvilinear relationships were detected.

The relationship between age and some task variables was slightly curvilinear, but

not to an extent that warranted the use of special analyses.

3. Pearson product-moment correlations7 were conducted between the task variables

and age, VIQ, and PIQ.

6 It should be noted that some caution should be exercised in interpreting these non-significant subgroup differences because the power to detect them may not be adequate. However, the lack of subgroup differences is still likely to mean that including the PDD subgroup in the ASD sample did not have any significant effect on the overall result besides increasing the sample size and therefore power of the main analyses.

7 Although some variables were dichotomous, Cohen and Cohen (1983) state that the formula for the point biserial correlation coefficient is computationally equivalent to the formula for the product-moment correlation coefficient. They assert that the difference in formula is of no significance when computer programs are used for data analysis, because whatever formula the program uses will work when variables are scored 0-1. For ease of reporting, r is used throughout the results section to denote both a Pearson product-moment and a point biserial correlation coefficient (which are computed identically anyway).

4. For task variables which were significantly correlated with age and/or PIQ, an

analysis of covariance (ANCOVA) was conducted, mainly to assess whether any

non-significant group differences became significant when extraneous variance

attributable to age and/or PIQ was removed. Miller and Chapman (2001)

recommend the use of ANCOVA in this way as a “noise reduction technique”.

5. If task variables were correlated with VIQ, ANCOVA was not considered to be an

appropriate technique to examine the effect of VIQ on group comparisons, as the

groups were not matched on VIQ. In their review of the use of ANCOVA with

nonrandomly assigned groups, Miller and Chapman (2001) argue that ANCOVA

cannot be used to “control for” group differences on a covariate, which they state is

a highly consistent view in the technical literature. Essentially, this is because when

the covariate and the independent variable (in this case, group) are not independent,

the regression adjustment of the independent variable (IV) may remove part of the

effect of group or produce a spurious effect of group (see Miller & Chapman, 2001,

for further explanation). As an alternative to ANCOVA in this situation, Maxwell

and Delaney (1990) suggest the “blocking” of participants on the covariate, and then

introducing the “blocked” variable as an additional IV in analyses. This strategy

was adopted in the current study. Participants (in the ASD and control groups

combined) were divided into three equal groups according to their VIQ score, and

then VIQ level was used as an IV in a 2-way ANOVA (with group as the other IV)

and the task variable as the dependent variable (DV). In this way, the influence of

VIQ on group comparisons was assessed by examining whether i) the main effect of

group remained significant when VIQ was controlled for by introducing it as an

additional IV, or ii) any non-significant group differences became significant

when the effect of VIQ was separated out. In addition, any group x VIQ

interactions would be of interest in examining the possibility that group differences

were found for some VIQ levels but not others (i.e., interactions would indicate

heterogeneous regression slopes).

6. If dichotomous variables correlated with age and/or IQ variables, the effect of

age/IQ was assessed by conducting logistic regression analyses with the

dichotomous variable as the outcome variable, and group and age/IQ as the

predictor variables. This allowed assessment of the independent contribution of

group to the outcome variable minus the variance attributable to age/IQ.
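
The blocking strategy in step 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the thesis does not describe how tied scores or group sizes not divisible by three were handled, so the rank-based split below is only one reasonable choice, and the VIQ values are made up.

```python
def block_into_thirds(viq_scores):
    """Assign each participant a VIQ level: 0 = low, 1 = middle, 2 = high."""
    n = len(viq_scores)
    # Rank participants by VIQ; ties keep their original order.
    order = sorted(range(n), key=lambda i: viq_scores[i])
    levels = [0] * n
    for rank, idx in enumerate(order):
        # Map ranks 0..n-1 onto three blocks of (nearly) equal size.
        levels[idx] = min(rank * 3 // n, 2)
    return levels


# Hypothetical VIQ scores for nine participants (ASD and control combined).
viq = [52, 118, 97, 85, 104, 150, 91, 76, 110]
levels = block_into_thirds(viq)
```

The resulting level would then serve as the second factor, alongside group, in the 2-way ANOVA on each task variable.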

4.3.2.1 False belief tasks

Examination of distributions of scores from all false belief tasks revealed that a large

proportion of participants (particularly from the control group) gained perfect scores for

both belief and control questions. All variables were therefore recoded as dichotomous

such that a perfect score was coded as 1 and any other score as 0. This fairly strict

scoring criterion was considered appropriate for the age and level of ability of both the

ASD and control groups, as a more lenient scoring system would have produced ceiling

effects. It should be noted, however, that a score of 0 is better interpreted as indicating

an “unstable” false belief performance rather than a true failure on the task.

Five participants (four in the ASD group, one in the control group) were not

administered the First-order and Second-order false belief tasks due to equipment

malfunction. These participants had all passed the Simple false belief task and were

therefore assigned the mean value of other participants in their group who had passed

the Simple false belief task (which was in turn coded dichotomously according to the

criteria described above). As the false belief tasks were administered to a restricted age

range, the overall sample size for all false belief tasks was 89 (n = 43 in the ASD group,

n = 46 in the control group). However, the ns for the memory and reality questions, as

well as the own belief questions in the Simple false belief task, were limited to those

who actually did the task – as these questions were not assumed to be passed or failed

according to performance on other false belief tasks, as was the case for the belief

questions (see Section 3.3 for a description of the structure of administration of the false

belief tasks). Percentages of participants gaining perfect scores (i.e., “perfect scorers”)

in each group for each false belief task (on the belief questions only) are presented in

Table 4.

i) Simple false belief task. A chi-square analysis revealed that there was no

statistically significant difference between the ASD and control groups on the reality

questions, χ2 (1, N = 36) = .82, p > .1. However, on the belief questions referring to the

participant’s own previous belief, significantly fewer children in the ASD group were

perfect scorers, χ2 (1, N = 36) = 4.42, p < .05, indicating that they were more likely to

incorrectly state their own previous beliefs8. On the belief questions referring to others’

beliefs, significantly fewer children in the ASD group were perfect scorers, χ2 (1, N =

89) = 7.25, p < .01, indicating that they were less likely to make accurate predictions

about the beliefs of others.

8 While there were significant group differences on this variable, it was not included in subsequent analyses (e.g., correlations) because the sample size for the variable was considered to be too small.

Performance on the others’ belief questions was significantly correlated with

both age, r = .23, p < .05, and VIQ, r = .37, p < .001. A logistic regression with age,

VIQ and group as the predictors showed that according to the Wald criterion, the

independent contribution of group to variance in others’ belief questions performance

became only marginally significant with age and VIQ partialled out, z = 3.61, p = .06.

ii) First-order false belief task. There was no significant difference between

the ASD and control groups on the reality questions, χ2 (1, N = 76) = 1.74, p > .1, or

memory questions, χ2 (1, N = 76) = .001, p > .1. On the belief questions, a significantly

lower proportion of children in the ASD group gained perfect scores, χ2 (1, N = 89) =

11.34, p < .01, indicating that they were less likely to make accurate predictions about

others’ false beliefs.

Performance on belief questions was significantly correlated with both age, r =

.26, p < .05, and VIQ, r = .44, p < .001. In a logistic regression with age, VIQ and

group as the predictors, the independent contribution of group remained significant, z =

7.60, p < .01.
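As an illustration of the chi-square comparisons of perfect scorers reported above, the following sketch computes a Pearson chi-square for a 2 x 2 table. It is not the software used in the study. The function name is hypothetical, and the cell counts are an assumption: they were reconstructed by applying the Table 4 percentages to the group sizes (e.g., 72.1% of 43 is taken as 31 perfect scorers), and no continuity correction is assumed.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # With 1 df the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts (perfect scorers, others): ASD row (n = 43), then control row (n = 46).
# Simple false belief task, others' belief questions.
chi2_simple, p_simple = chi_square_2x2(31, 12, 43, 3)
# First-order false belief task, belief questions.
chi2_first, p_first = chi_square_2x2(21, 22, 38, 8)
print(round(chi2_simple, 2), round(chi2_first, 2))
```

Under these assumed counts the statistics round to 7.25 and 11.34, which agrees with the values reported for the two tasks.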

iii) Second-order false belief task. As for the other false belief tasks, there

was no significant difference between the ASD and control groups on the reality

questions, χ2 (1, N = 68) = 1.34, p > .1, or memory questions, χ2 (1, N = 68) = .29, p >

.1. There were significantly fewer perfect scorers in the ASD group on the belief

questions, χ2 (1, N = 89) = 4.93, p < .05.

Scores on belief questions were significantly correlated with age, r = .28, p <

.01, and VIQ, r = .44, p < .001. In a logistic regression with age, VIQ and group as the

predictors, the independent contribution of group was no longer significant, z = 2.16, p > .1.

iv) Overall false belief performance indices. An aggregate score was also

calculated across the three false belief tasks, for use in other analyses. The sum of

correct responses on belief questions (including only others’ belief questions for the

simple false belief task) was dichotomised in the same way as for each individual task,

such that a perfect score (15/15) was coded as 1 and any other score as 0. Chi-square

analysis revealed that there were significantly fewer perfect scorers overall in the ASD

group than in the control group, χ2 (1, N = 89) = 8.1, p < .01. The aggregate score was

significantly correlated with age, r = .29, p < .01, and VIQ, r = .42, p < .001. Group

remained a significant predictor of the aggregate score when age and VIQ were

partialled out in a logistic regression, z = 5.96, p < .05.

As the dichotomous scoring system used for the aggregate score was a fairly

strict one, a more lenient scoring criterion was also used for an alternative aggregate

score. In the alternative system, any participant scoring 13 or more out of 15 (i.e.,

making either 0, 1, or 2 incorrect responses) was given a score of 1 (“high scorers”), and

participants with lower scores were assigned a 0 (“low scorers”). A chi-square analysis

showed that significantly fewer ASD participants were high scorers than control

participants, χ2 (1, N = 89) = 6.25, p < .05. The alternative aggregate score correlated

significantly with age, r = .26, p < .05, PIQ, r = .24, p < .05, and VIQ, r = .50, p < .001.

When a logistic regression was conducted with these age and IQ variables and group as

predictors, the effect of group became only marginally significant, z = 2.81, p = .09.
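The strict and lenient dichotomous scoring rules described above can be sketched as follows. This is a minimal illustration only; the function names and the example totals are hypothetical and are not taken from the study data.

```python
def strict_aggregate(total_correct: int, max_score: int = 15) -> int:
    """Return 1 for a perfect score on the belief questions, otherwise 0."""
    return 1 if total_correct == max_score else 0


def lenient_aggregate(total_correct: int, cutoff: int = 13) -> int:
    """Return 1 ("high scorer") for totals at or above the cutoff, otherwise 0."""
    return 1 if total_correct >= cutoff else 0


# Hypothetical totals (out of 15) for four participants.
totals = [15, 14, 13, 9]
strict = [strict_aggregate(t) for t in totals]
lenient = [lenient_aggregate(t) for t in totals]
```

In this example only the first participant is a perfect scorer under the strict rule, whereas the first three are high scorers under the lenient rule.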

Table 4. False belief task results: Percentage of participants in each group with perfect

scores [or high scores in the case of the alternative aggregate score] on belief questions,

and significance of group comparisons

                              ASD group   Control group   p     p with age/IQ control
Simple false belief:
  Own belief                  55.0        87.5            *
  Others’ belief              72.1        93.5            **    -
First-order false belief      48.8        82.6            **    **
Second-order false belief     51.2        73.9            *     -
Aggregate score               39.5        69.6            **    *
Alternative aggregate         [55.8]      [80.4]          *     -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.2 Dewey Stories

With the Dewey Stories task administered to only a small subset of the sample in the

older age range, the ASD (n = 17) and control (n = 18) participants who completed the

task were not matched on age and PIQ. However, the total score on Dewey Stories was

not significantly correlated with either age, r = -.19, p > .1, or PIQ, r = -.30, p = .08, and

so the non-matching of groups on these variables was not considered to be important.

The total score variable did not require transformation. A t-test comparing the

total scores of the ASD group (M = 7.94, SD = 3.91) and control group (M = 5.89, SD =

2.95) revealed a marginally significant group difference in the expected direction, t(33)

= 1.76, p = .09. The total score was significantly correlated with VIQ, r = -.48, p < .01.

Because of the small sample size for this task, VIQ was split into two levels rather than

three. An ANOVA with group and VIQ level as the IVs showed that group differences

in the total score did not remain significant when assessed independently of VIQ,

F(1,31) = .29, p > .1. The group x VIQ level interaction was not significant, F(1,31) =

.81, p > .1.

Because the Dewey Stories task was of uncertain validity as a measure of ToM,

correlations were conducted between the total score and the false belief variables. Raw

correlations were significant for all false belief variables except the simple false belief

task; however, when age, PIQ and VIQ were partialled out, there were no longer any

significant correlations between false belief variables and the Dewey Stories total score.

It therefore appears that the validity of the Dewey Stories task as a measure of

mentalising ability is questionable, and it may be better considered as a measure of

social awareness and understanding of acceptable social behaviours. Throughout

subsequent analyses, the Dewey Stories task is considered a measure of “social

cognition”.

4.3.2.3 Tower of London (ToL)

The main performance indices of the ToL were the overall sum of adjusted extra move

scores (from here on referred to as the total adjusted extra move score) and the total

number of problems completed in the minimum number of moves. These two scores

were highly correlated, r = -.96, p < .001, hence only the total adjusted extra move score

was used in analyses. The number of rule violations per block administered was also

analysed. Because many participants committed no or very few rule violations, this

variable was highly skewed and was recoded as a dichotomous variable, with

participants making 0-1 violations per block being given a score of 0 (“low rule

violators”) and participants making any higher number of violations scored as 1 (“high

rule violators”).

Two participants had missing data on the ToL, one from the ASD group and one

from the control group, and were not included in analyses (resulting in n = 45 for the

ASD group and n = 47 for the control group). A t-test comparing the total adjusted

extra move scores of the ASD group (M = 26.31, SD = 7.78) and control group (M =

22.53, SD = 7.35) revealed that the ASD group made a higher number of extra moves

than the control group, t(90) = 2.40, p < .05. A chi-square analysis also showed that

significantly more participants in the ASD group (44.4%) than the control group

(23.4%) were high rule violators, χ2 (1, N = 92) = 4.56, p < .05.
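The group comparison on the total adjusted extra move score can be checked from the reported summary statistics alone. The sketch below assumes a standard equal-variance (pooled) independent-samples t-test, which is consistent with the reported degrees of freedom of 90 but is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Student's t (equal variances assumed) from group means, SDs and ns."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


# ToL total adjusted extra move score: ASD group first, then control group.
t_tol, df_tol = pooled_t(26.31, 7.78, 45, 22.53, 7.35, 47)
# Dewey Stories total score (Section 4.3.2.2), same ordering.
t_dewey, df_dewey = pooled_t(7.94, 3.91, 17, 5.89, 2.95, 18)
```

With the rounded means and SDs given in the text, these calls return values close to the reported t(90) = 2.40 and t(33) = 1.76.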

Both ToL indices were significantly correlated with age (r = -.41, p < .001, for

the total adjusted extra move score; r = -.37, p < .001, for rule violations), and VIQ (r =

-.39, p < .001, for the total adjusted extra move score; r = -.26, p < .05, for rule

violations). An ANCOVA conducted on the total adjusted extra move score, with group

and VIQ level as the IVs and age as the covariate, revealed that the group difference

remained significant when age and VIQ were controlled, F(1,85) = 4.98, p < .05. The

group x VIQ level interaction was not significant, F(2,85) = .48, p > .1. Group also

remained a significant predictor of rule violation status (low/high) when assessed

independently of age and VIQ in a logistic regression, z = 4.82, p < .05.

4.3.2.4 IDED Set-shifting task

All set-shifting variables were highly skewed, with a large number of participants

making no errors or only one error to criterion in each stage. As a result, all variables

were recoded such that any error score of 0 or 1 was coded as 0 and any higher number

of errors was coded as 1. Because the reversal stages were not crucial to the current

study, only the first reversal stage (SDR) in each condition (i.e., Perseveration and

Learned Irrelevance) was included in analyses. Only the extra-dimensional shift (EDS)

stages were included in subsequent analyses (i.e., correlations etc.) as they were the

central variables of interest.

The overall N for the task (which had a restricted age range) was 72 (n = 36 in

both the ASD and control groups). Due to computer malfunction, data for the

Perseveration condition from one participant in the ASD group were invalid and not

included in analyses. The percentage of participants in each group making only 0 or 1

errors (i.e., “low error scorers”) for each stage in each task condition is displayed in

Table 5. There were no significant group differences on any variable. However, there

was a marginally significant trend for a smaller proportion of participants from the ASD

group to be low error scorers on the EDS stage of the Learned Irrelevance condition, χ2

(1, N = 72) = 3.77, p = .052, suggesting that children with ASDs may have found it

more difficult to shift their attention to a previously irrelevant stimulus dimension.

No variables were significantly correlated with age or PIQ, and only the IDS

stage of the Learned Irrelevance condition was significantly correlated with VIQ, r =

-.34, p < .01. The effect of group remained non-significant when a logistic regression

on this variable was performed with VIQ and group as predictors.

Table 5. IDED Set-shifting task results: Percentage of low error scorers in each group

for each stage of each task condition, and significance of group comparisons

                                ASD group   Control group   p     p with age/IQ control
Perseveration condition
  SD stage                      60.0        66.7            -
  SDR stage                     52.9        47.1            -
  CD stage                      45.3        54.7            -
  IDS stage                     44.9        55.1            -
  EDS stage                     60.0        72.2            -
Learned Irrelevance condition
  SD stage                      63.9        66.7            -
  SDR stage                     77.8        77.8            -
  CD stage                      63.9        61.1            -
  IDS stage                     77.8        77.8            -     -
  EDS stage                     13.9        33.3            -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.5 Response Inhibition and Load (RIL) task

For all RIL task conditions, error variables (i.e., the percentage of errors made) were

highly skewed, with many participants making only 0-2% errors. These variables (with

the exception of the percentage of errors made in choosing the most recently displayed

shape in Condition 3) were recoded such that 0-2% errors was coded as 0 (a “low error

score”), and any higher percentage of errors was coded as 1 (a “high error score”).

However, as this precluded the use of a repeated measures ANOVA to compare

increments in performance across Conditions 1-3, the main error variables used in

analyses for these conditions were:

i) the inhibition error difference score - the difference between the scores

for Condition 2 (the inhibition condition) and 1 (the control condition);

ii) the load error difference score – the difference between the scores for

Condition 3 (the working memory load condition) and Condition 2; and

iii) the inhibition + load error difference score – the difference between the

scores for Conditions 3 and 1.

These difference scores were normally distributed. Four outliers were trimmed: one

control participant’s inhibition and inhibition + load error difference scores, another

control participant’s load error difference score, and one ASD participant’s inhibition +

load error difference score. The distribution of the percentage of errors made in

choosing the most recently displayed shape in Condition 3 (or the shape error score, a

measure of working memory ability under conditions requiring inhibitory control) was

also skewed, but a square root transformation was effective for this variable.
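The three error difference scores defined above can be expressed compactly as follows. This is a sketch only: the class and field names are hypothetical, and the example percentages are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RilErrors:
    """Percentage of errors in each RIL condition (hypothetical field names)."""
    control: float     # Condition 1 (control condition)
    inhibition: float  # Condition 2 (inhibition condition)
    load: float        # Condition 3 (working memory load condition)


def difference_scores(errors: RilErrors) -> dict[str, float]:
    """Compute the inhibition, load, and inhibition + load difference scores."""
    return {
        "inhibition": errors.inhibition - errors.control,
        "load": errors.load - errors.inhibition,
        "inhibition_plus_load": errors.load - errors.control,
    }


# Invented example: 2% errors in Condition 1, 5% in Condition 2, 9% in Condition 3.
scores = difference_scores(RilErrors(control=2.0, inhibition=5.0, load=9.0))
```

By construction, the inhibition + load score is the sum of the inhibition and load scores (here 3 + 4 = 7). The same subtraction applies to median RTs in place of error percentages.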

Although the median RT variables for all conditions demonstrated roughly

normal distributions, for the sake of consistency, an inhibition RT difference score, load

RT difference score, and inhibition + load RT difference score were also calculated

(representing the same comparisons between conditions as for the error data). One

outlier on the inhibition + load RT difference score from a participant in the ASD group

was trimmed.

The overall N for the task was 71 (n = 36 in the ASD group, n = 35 in the

control group). Due to computer malfunction, one participant in the ASD group had

incomplete data in Condition 3, and so error and RT data from that condition as well as

the difference scores involving Condition 3 were not included in analyses. Table 6

displays the mean and SD of each group (and the significance of group comparisons) for

the error and RT difference scores, and the shape error score. There were no significant

group differences on t-tests of the error difference scores, although there was a trend for

the ASD group to show a larger inhibition + load error difference score, t(68) = 1.72, p

= .09. A t-test comparing the shape error scores for Condition 3 revealed that the ASD

group made significantly more shape errors, t(68) = 2.03, p < .05, indicating that in an

inhibition task with a working memory load, individuals with ASDs were less able to

respond accurately on a measure of their working memory ability.

Examination of the error data for each condition separately revealed that there

was a significantly lower proportion of low error scorers in the ASD group (27.8%) than

the control group (51.4%) in Condition 2, χ2 (1, N = 71) = 4.16, p < .05. However, the

inhibition error difference score did not show a significant group difference, probably

because there was also a trend for fewer children in the ASD group to be low error

scorers in Condition 1, the control condition (47.2% vs. 68.6% in the control group), χ2

(1, N = 71) = 3.31, p = .07. The proportion of low error scorers in the ASD group was

also marginally lower for Condition 3 (22.9% vs. 42.9%), χ2 (1, N = 70) = 3.17, p = .08.

Thus, the overall pattern of results for the error scores in Conditions 1-3 suggested that

the ASD group tended to make more errors on all tasks, but their performance accuracy

was not proportionally worse in task conditions with inhibitory and working memory

demands (at least on the inhibitory aspect of the task – note that the ASD group

performed significantly worse on the shape error score, an index of their working

memory performance when the task contained both inhibitory and working memory

demands).

There were no significant differences between groups on any RT difference

scores, and no trends were evident. There were no significant RT differences between

the groups when Conditions 1-3 were analysed separately. In subsequent analyses, only

the error and RT difference scores and the shape error score were used, and separate

error and RT data for Conditions 1-3 were not included.

Table 6. RIL task results: Mean (and SD) of each group, and significance of group

comparisons, for error and RT difference scores and the shape error score

                          ASD group         Control group     p     p with age/IQ control
Error difference scores:
  Inhibition              3.43 (6.37)       1.24 (6.20)       -
  Load                    2.14 (7.91)       0.66 (4.66)       -
  Inhibition + load       5.23 (8.41)       1.95 (7.49)       -
RT difference scores:
  Inhibition              194.53 (166.88)   191.95 (198.21)   -     -
  Load                    116.25 (200.42)   174.53 (193.95)   -     -
  Inhibition + load       314.48 (232.88)   366.57 (214)      -     -
Working memory measure:
  Shape error score       25.52 (21.48)     15.43 (13.89)     *     **

* p < .05; ** p < .01; *** p < .001; - p > .05.

Note: The means and SDs shown for the shape error score are for the raw data, prior to

transformation.

It should be noted that for both the ASD and control groups, there was a significant

increase in both the number of errors made and the time taken to respond in the

inhibition condition (and load condition) compared with the control condition,

indicating that these conditions were more difficult than the control condition and

therefore that the instruction to respond to the opposite colour to the stimulus did

require inhibitory control.

Age was significantly correlated with the shape error score (r = -.32, p < .01),

the inhibition RT difference score (r = -.30, p < .05), and the inhibition + load RT

difference score (r = -.33, p < .01). VIQ was correlated with the load RT difference

score (r = -.24, p < .05). ANCOVAs on the inhibition and the inhibition + load RT

difference scores with age as the covariate did not change the non-significant effect of

group for these variables: F(1,68) = .28, p > .1, for inhibition and F(1,67) = .22, p > .1,

for inhibition + load. The group difference in the shape error score remained significant

when an ANCOVA was conducted with age as a covariate, F(1,67) = 7.84, p < .01. The

group difference on the load RT difference score also remained non-significant when a

two-way ANOVA with group and VIQ as the IVs was conducted, F(1,64) = .22, p > .1.

The interaction between group and VIQ was not significant, F(2,64) = .07, p > .1.

4.3.2.6 Opposite Worlds task

Opposite Worlds task variables used in group comparisons were the Same World error

score, Opposite World error score, Same World time score and Opposite World time

score (each score equating to the sum of two trials). There was one outlier in the ASD

group on the Opposite World time score, which was trimmed to 3 SD from the mean.

In subsequent analyses (to be reported in sections to follow), the error and time

difference scores between the Opposite and Same World conditions were the main

variables used as these were thought to be appropriate summary scores (representing the

performance decrement when inhibitory demands are introduced) for use in correlations

and other analyses. Means and SDs for all variables are displayed in Table 7.
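Trimming an outlier to 3 SD from the mean, as described above, can be sketched as below. Several details are assumptions, because the text does not specify them: the mean and SD are computed within the group with the outlier included, the sample (n - 1) SD is used, and both tails are trimmed. The function name and data are hypothetical.

```python
import statistics


def trim_to_3sd(values: list[float]) -> list[float]:
    """Pull any value beyond mean +/- 3 SD back to that boundary."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator), an assumption
    lower, upper = mean - 3 * sd, mean + 3 * sd
    return [min(max(v, lower), upper) for v in values]


# Invented time scores: nineteen typical values and one extreme value.
times = [10.0] * 19 + [100.0]
trimmed = trim_to_3sd(times)
```

In this invented example the extreme score of 100 is replaced by the upper boundary (about 74.87), while the other nineteen scores are unchanged.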

The N for the task was 65 (n = 29 for the ASD group, n = 36 for the control

group). For the error scores, a two-way repeated measures ANOVA was conducted

with group (ASD, control) as the between-subjects factor and condition (Same World,

Opposite World) as the within-subjects factor. There was a significant main effect of

condition, F(1, 63) = 22.96, p < .001, but the main effect of group was not significant,

F(1, 63) = 1.62, p > .1. The interaction approached significance, F(1, 63) = 3.01, p =

.09, suggesting there was a trend for the ASD group to make comparatively more errors

in the Opposite World condition. Follow-up simple effects analyses showed that there

was no significant difference between the groups in the number of errors made in the

Same World condition, t(63) = .32, p > .1, but there was a marginally significant

difference in the Opposite World error scores, t(63) = 1.71, p = .09.

A two-way repeated measures ANOVA with group as the between-subjects

factor and condition as the within-subjects factor was also conducted on the time scores.

There was a significant main effect of condition, F(1, 63) = 107.77, p < .001, and a

significant effect of group, F(1, 63) = 5.2, p < .05. The interaction was also significant,

F(1, 63) = 7.36, p < .01, indicating that participants in the ASD group took

comparatively longer to complete the Opposite World condition (in other words, they

showed a larger performance decrement from the Same World to the Opposite World

condition compared with the control group). Follow-up analyses confirmed that there

was no significant difference between the ASD and control group on the Same World

time scores, t(63) = 1.36, p > .1, but the ASD group took significantly longer in the

Opposite World condition, t(63) = 2.66, p < .05.

Table 7. Opposite Worlds results: Mean (and SD) of each group for error/time scores

in each condition and difference scores, and significance of group comparisons

                               ASD group       Control group   p     p with age/IQ control
Error variables:
  Same World error score       1.21 (1.57)     1.08 (1.56)     -
  Opposite World error score   2.69 (2.54)     1.78 (1.74)     -     -
  Error difference score       1.48 (2.2)      0.69 (1.45)     -     -
Time variables:
  Same World time score        27.27 (6.42)    25.0 (6.89)     -     -
  Opposite World time score    38.42 (12.55)   31.53 (8.26)    *     *
  Time difference score        11.12 (9.02)    6.53 (4.2)      **    **

* p < .05; ** p < .01; *** p < .001; - p > .05.

Note: The difference scores relate to the interaction term on repeated measures

ANOVAs.
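The link between the difference scores and the interaction term can be made concrete. In a design with two groups and two repeated conditions, the group x condition interaction F is equal to the square of the independent-samples t computed on the difference scores. The sketch below applies this to the rounded time difference score statistics in Table 7, assuming equal variances; the function name is hypothetical.

```python
import math


def interaction_f_from_differences(m1: float, sd1: float, n1: int,
                                   m2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Interaction F(1, df) for a 2 x 2 mixed design, computed as t squared
    from the group means, SDs and ns of the difference scores."""
    df_error = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_error
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t ** 2, df_error


# Time difference score: ASD group (n = 29) first, then control group (n = 36).
f_time, df_time = interaction_f_from_differences(11.12, 9.02, 29, 6.53, 4.2, 36)
```

The result is close to the reported interaction for the time scores, F(1, 63) = 7.36; small discrepancies are expected because the tabled means and SDs are rounded.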

Neither VIQ nor PIQ correlated with any task variables, but age was

significantly correlated with the Same World time score, r = -.42, p < .001, the Opposite

World time score, r = -.43, p < .001, and the Opposite World error score, r = -.28, p <

.05. When age was introduced as a covariate in two-way repeated measures

ANCOVAs, there was no change in any of the results.

4.3.2.7 Relational Complexity

In this task, the main variable used for analyses was simply the total score (i.e., total

number correct), summed across all trials. There was one outlier in the ASD group for

this variable, which was trimmed.

A t-test comparing the total score of the ASD group (M = 9.66, SD = 3.9) and

control group (M = 9.71, SD = 4.2) was not significant, t(92) = .05, p > .1.

The total score correlated with both age, r = .58, p < .001, and VIQ, r = .28, p <

.01. An ANCOVA conducted on the total score with group and VIQ level as IVs and

age as a covariate did not influence the non-significant effect of group, F(1, 87) = .003,

p > .1. There was no significant interaction between group and VIQ level, F(2, 87) =

.18, p > .1.

4.3.2.8 Pattern Meanings

All error variables (i.e., redundant, repetitive, incorrect, and uninterpretable responses)

showed significant positive skew. For redundant responses, a square root

transformation was effective. Repetitions were recoded such that 0 or 1 repetition(s)

was coded as 0 and 2 or more repetitions were coded as 1. Due to the very small

number of incorrect and uninterpretable responses, these two variables were summed to

form a combined incorrect/uninterpretable responses variable, which was recoded such

that 0 errors remained at 0, and 1 or more errors was coded as 1. A “sum of errors”

variable was created, where the number of error responses was summed across all

categories. This variable was also skewed, and was transformed using a logarithmic

transformation. The other major variable for the Pattern Meanings task was the number of

correct responses, which was normally distributed.

There were no statistically significant group differences in the mean number of

correct responses produced, t(92) = 1.38, p > .1, or the sum of errors, t(92) = .14, p > .1.

Similarly, individual analyses of error variables did not reveal any significant group

differences. There was no significant difference in the mean number of redundant

responses, t(92) = .46, p > .1. The proportion of low error scorers in each group was not

significantly different for repetitions, χ2 (1, N = 94) = .001, p > .1, or incorrect/

uninterpretable responses, χ2 (1, N = 94) = .68, p > .1.

However, as mentioned previously, this task was the only one on which the “full

criteria” (i.e., autism) and “partial criteria” (i.e., other PDD) subgroups showed

significant differences. The partial criteria subgroup made significantly more errors

than the full criteria subgroup on the sum of errors variable, t(44) = 2.62, p < .05. When

specific error types were analysed, it was found that the partial criteria subgroup made

significantly more redundant responses, t(44) = 2.49, p < .05, and a higher proportion of

the partial criteria subgroup made a high number of repetitions, χ2 (1, N = 46) = 4.31, p

< .05. There was also a trend for the partial criteria subgroup to make more correct

responses, t(44) = 1.83, p = .07. This pattern of results therefore suggests that the

partial criteria subgroup generated more responses overall, whether correct or not. Each

of the two subgroups was then compared to the control group. It was found that the full

criteria subgroup demonstrated significantly fewer correct responses than controls, t(80)

= 2.06, p < .05, and the partial criteria subgroup produced significantly more error

responses overall than controls, t(58) = 2.44, p < .05 (in terms of specific error types,

the partial criteria subgroup produced significantly more redundant responses than

controls but there were no significant differences for other error types). Because the

two subgroups displayed different patterns of performance, Table 8 displays means,

SDs, the percentage of low scorers for dichotomous variables, and significance of group

comparisons for all variables separately for the two subgroups9.

Across the whole sample, the sum of errors was significantly correlated with

age, r = -.46, p < .001, as were all individual error variables: redundant responses,

r = -.32, p < .01, repetitions, r = -.25, p < .05, and incorrect/uninterpretable responses,

r = -.26, p < .05. VIQ was correlated with the number of correct responses, r = .20, p < .05,

and the sum of errors, r = -.20, p < .05, but of the individual error variables, only

repetitions were significantly correlated with VIQ, r = -.22, p < .05. All group

differences remained non-significant after controlling for these variables when the

whole ASD sample was analysed as one group. When the separate analyses for the full

and partial criteria subgroups were conducted with the relevant age and IQ variables

controlled, the difference between the full criteria subgroup and controls on correct

responses became non-significant, F(1, 76) = 2.31, p > .1, and the difference between

the partial criteria subgroup and controls on the sum of errors became only marginally

significant, F(1, 53) = 3.02, p = .09, although the difference between these two groups

on redundant responses remained significant, F(1, 57) = 7.17, p < .05. There were no

significant interactions between group and VIQ level in any analyses.

9 The two subgroups were not analysed separately in subsequent analyses involving correlations with behavioural and ToM variables, as it was of interest to see whether Pattern Meanings performance correlated with symptom severity (or ToM performance) across the whole sample.

Table 8. Pattern Meanings results: Mean (and SD) of each subgroup [or the percentage

of low error scorers for dichotomous variables], and significance of group comparisons

                                ASD group                            Control group   p     p with age/IQ control
                                Full subgroup    Partial subgroup
Correct responses               21.32 (7.87)     26.50 (9.83)       25.1 (8.42)     *1    -
Sum of errors                   7.21 (9.12)      15.75 (10.81)      7.4 (7.51)      *2    -
Individual error types:
  Redundant                     4.06 (4.94)      8.92 (7.51)        4.23 (4.34)     *2    *2
  Repetition                    [67.6]           [33.3]             [58.3]          -     -
  Incorrect/uninterpretable     [73.5]           [58.3]             [77.1]          -     -

* p < .05; ** p < .01; *** p < .001; - p > .05.

Note: The means and SDs shown for the sum of errors and redundant responses are for

the raw data, prior to transformation.

1 Difference was between full criteria subgroup and controls.

2 Difference was between partial criteria subgroup and controls.

4.3.2.9 Uses of Objects

As for the Pattern Meanings task, all error variables (including the additional non-useful

responses variable) were positively skewed. For redundant responses and repetitions,

log transformations were effective. A square root transformation improved the

distribution of non-useful responses. Again, incorrect and uninterpretable responses

were summed to form a combined variable, which was recoded as dichotomous in the

same way as for the Pattern Meanings task. A “sum of errors” variable was also

created, which was normally distributed for this task. One outlier on this variable from

the control group was trimmed. The total number of correct responses, as well as the

number of correct responses for conventional and non-conventional items separately, all

had approximately normal distributions. Means, SDs, the percentage of low scorers for

dichotomous variables, and significance of group comparisons for all variables are presented in Table 9.

Table 9. Uses of Objects results: Mean (and SD) of each group [or the percentage of

low error scorers for dichotomous variables], and significance of group comparisons

                                ASD group       Control group   p      p with age/IQ control
Correct responses:
  Total                         19.07 (8.99)    26.42 (9.5)     ***    **
  Conventional items            7.04 (3.86)     10.25 (4.35)
  Non-conventional items        12.02 (5.77)    16.17 (6.02)
Sum of errors                   18.41 (12.84)   17.52 (10.09)   -      -
Individual error types:
  Redundant                     6.02 (5.79)     5.42 (3.76)     -      -
  Repetition                    4.28 (3.99)     5.13 (4.55)     -      -
  Non-useful                    6.61 (6.05)     6.38 (5.53)     -      -
  Incorrect/uninterpretable     [63.0]          [64.6]          -      -

* p < .05; ** p < .01; *** p < .001; - p > .05.

Note: The means and SDs shown for redundant responses, repetitions, and non-useful

responses are for the raw data, prior to transformation.

To examine whether or not the ASD group produced proportionally fewer correct

responses on the conventional versus non-conventional items, a two-way repeated

measures ANOVA was performed with group as the between-subjects factor and

condition (conventional, non-conventional) as the within-subjects factor. There was a

significant main effect of group, F(1, 92) = 14.82, p < .001, and condition, F(1, 92) =

155.97, p < .001, but the interaction was not significant, F(1, 92) = 1.16, p > .1,

indicating that the ASD group produced fewer correct responses than the control group

for both conventional and non-conventional items (with the conventional items being

more difficult for both groups), but were not proportionally worse on conventional

items. Because of this, the separate totals for conventional and non-conventional items

were not used in further analyses.

There was no significant difference between groups on the sum of errors, t(92) =

.37, p > .1, and individual analyses of error variables did not reveal any significant

group differences. There was no significant difference in the mean number of

redundant responses, t(92) = .64, p > .1, repetitions, t(92) = 1.22, p > .1, or non-useful

responses, t(92) = .08, p > .1. The proportion of low error scorers in each group was not

significantly different for incorrect/uninterpretable responses, χ2 (1, N = 94) = .02, p > .1.

Age was significantly correlated with the number of correct responses, r = .31, p

< .01, the sum of errors, r = -.34, p < .01, and all individual error variables (except

repetitions): redundant responses, r = -.27, p < .05, non-useful responses, r = -.26, p <

.05, and incorrect/uninterpretable responses, r = -.38, p < .001. VIQ was correlated

with the number of correct responses, r = .45, p < .001, repetitions, r = -.27, p < .05, and

incorrect/uninterpretable responses, r = -.25, p < .05. The group difference in the

number of correct responses remained significant in an ANCOVA with group and VIQ

level as the IVs and age as a covariate, F(1, 87) = 12.66, p < .01. Group remained a

non-significant effect on the sum of errors when age was introduced as a covariate in an

ANCOVA, F(1, 91) = 1.05, p > .1. Similarly, group differences in all individual error

variables remained non-significant when age and/or VIQ was partialled out using either

ANCOVA or logistic regression. There were no significant interactions between group

and VIQ level in any analyses.

4.3.2.10 Stamps task

Both the rule adherence and restriction scores demonstrated highly skewed distributions

and were recoded as dichotomous variables. For rule adherence, a score between 0 and

6 inclusive was coded as 0 and a score of 7 or 8 was coded as 1. For restriction, a score

of 0 was left as 0 and a score between 1 and 8 inclusive was coded as 1. The

complexity and originality scores showed approximately normal distributions. Means

and SDs for the latter two variables and the proportion of low scorers for the former two

variables, along with the significance of group comparisons for all scores, are presented

in Table 10.

The N for the task was 87 (n = 41 for the ASD group, n = 46 for the control

group). T-tests revealed significant group differences on the complexity score, t(85) =

2.73, p < .01, indicating that the ASD group produced less complex patterns than the

control group, and on the originality score, t(85) = 2.81, p < .01, indicating that the ASD

group produced fewer original patterns than the control group. Chi-square analysis

showed a lower percentage of low scorers in the ASD group on

the restriction score, χ2 (1, N = 87) = 5.76, p < .05, indicating that a larger proportion of

the ASD group tended to use fewer stamps than were available. For the rule adherence

score, there was a marginally significant trend for a smaller proportion of the ASD

group to produce patterns adhering to one rule, χ2 (1, N = 86) = 3.50, p = .06, which was

contrary to expectation.
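
As a check on how the 2 × 2 chi-square statistics above are obtained, the following minimal sketch recomputes the restriction score comparison. The cell counts are not reported in the text; they are reconstructed here, as an assumption, from the reported percentages of low scorers (82.9% of 41 ASD participants, i.e. 34; 97.8% of 46 control participants, i.e. 45). The function name is illustrative, and the absence of a continuity correction is also an assumption.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for the table
    [[a, b], [c, d]], using the shortcut N(ad - bc)^2 / (product of marginals)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed counts reconstructed from the reported percentages (not given in the text).
# Rows = group (ASD, control); columns = restriction score coded 0 / coded 1.
asd_coded_0, asd_coded_1 = 34, 7          # 34/41 = 82.9%
control_coded_0, control_coded_1 = 45, 1  # 45/46 = 97.8%

chi2 = chi_square_2x2(asd_coded_0, asd_coded_1, control_coded_0, control_coded_1)
print(f"chi-square(1, N = 87) = {chi2:.2f}")
```

Under these assumed counts the statistic agrees with the reported value of 5.76 to two decimal places.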

The originality score was significantly correlated with both age, r = .34, p < .01,

and VIQ, r = .40, p < .001. The restriction score correlated with VIQ, r = -.36, p < .01.

In a two-way ANCOVA with group and VIQ level as the IVs and age as a covariate, the

group difference in the originality score remained significant, F(1, 80) = 4.55, p < .05.

The interaction between group and VIQ level was not significant, F(2, 80) = .96, p > .1.

When a logistic regression was performed on the restriction score, group was no longer

a significant predictor when it was assessed independently of VIQ, z = 1.47, p > .1.

Table 10. Stamps task results: Mean (and SD) of each group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons

                                                          Significant   Significant difference
Measure                 ASD group       Control group     difference    with age/IQ control
Complexity score        18.63 (3.02)    20.39 (2.98)      **
Originality score       3.17 (2.51)     4.78 (2.8)        **            *
Restriction score       [82.9]          [97.8]            *             -
Rule adherence score    [26.8]          [11.1]            -

* p < .05; ** p < .01; *** p < .001; - p > .05.

4.3.2.11 Summary and effect sizes of group comparisons

Table 11 presents a summary of the results of group comparisons on the main variables

from each cognitive task. Overall, participants in the ASD group performed

significantly more poorly than controls on tasks measuring false belief understanding,

planning, verbal inhibition, working memory (under conditions where inhibition was

required), and both verbal and non-verbal generativity (with different patterns of results

for the two subgroups of ASD participants meeting “full criteria” and “partial criteria”

on the Pattern Meanings task); but not awareness of social norms, set-shifting, non-

verbal inhibition or relational reasoning (although marginally significant differences

were obtained on certain measures of social awareness, set-shifting and non-verbal

inhibition). Age and VIQ influenced some of these results, reducing the significance of

group comparisons for two false belief variables, two verbal generativity variables and

one non-verbal generativity variable.

Table 11. Summary and effect sizes of significant group differences

Measure    Significant difference?    Significant difference with age/IQ control?    Effect size: r (and d)

Simple false belief: Own belief .35 (.75)

Other’s belief - .28 (.58)

First-order false belief .36 (.77)

Second-order false belief - .24 (.50)

False belief aggregate .30 (.63)

False belief alternative aggregate - .26 (.54)

Social Cognition:

Dewey Stories - - .28 (.58)

Planning:

ToL: Adjusted extra move score .24 (.50)

Rule violations .21 (.43)

Set-shifting:

IDED Perseveration condition:

EDS stage errors - -

IDED Learned Irrelevance cond.:

EDS stage errors - - .23 (.47)

Inhibition:

RIL task error difference scores:

Inhibition - -

Load - -

Inhibition + load - - .23 (.47)

RIL task RT difference scores:

Inhibition - -

Load - -

Inhibition + load - -

Opposite Worlds:

Error difference score - - .21 (.43)

Time difference score .31 (.65)

Working memory:

RIL task shape error score .27 (.56)

Relational reasoning:

Relational Complexity score - -

Generativity:

Pattern Meanings:

Correct responses 1 - .23 (.47)

Sum of errors 2 - .34 (.73)

Uses of Objects:

Correct responses .37 (.80)

Sum of errors - -

Stamps task:

Complexity score .28 (.58)

Originality score .29 (.61)

Restriction score - .26 (.54)

Rule adherence score - - .20 (.41)

✓ significant to at least p < .05 level; - p > .05.

1 Difference was between full criteria subgroup and controls only.

2 Difference was between partial criteria subgroup and controls only.

Although Bonferroni corrections were not performed, the group differences followed a

consistent pattern and were all in the expected direction (such that ASD participants

performed more poorly than controls), which suggests that the results are unlikely to

be chance findings. Table 11 also lists the effect sizes obtained for all

significant and marginally significant group differences, as a measure of the strength of

each effect. The “effect size correlation”, or r (Rosenthal, 1991), was used as the

primary measure of effect size. The effect size correlation simply measures the size of

the correlation between the independent and dependent variable (a phi correlation was

calculated for dichotomous variables, which is equivalent to Pearson’s r and point

biserial correlations for continuous variables). However, all values of r were also

converted to d (as shown in Table 11) using an equation supplied by Rosenthal (1991),

and the size of each effect was evaluated using Cohen’s (1988) system for classifying

small, medium and large effects. The largest effect size, and the only one classified as

a large effect, was for Uses of Objects correct responses, a measure of verbal

generativity. Most other effect sizes fell in the medium range, including the Dewey

Stories total score, on which there was only a marginally significant group difference

but for which there was a small sample size. All other variables for which only

marginally significant group differences were found displayed small effect sizes, and

the ToL rule violations also showed only a small effect size.
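
The effect size conversions described above can be sketched as follows. This is a minimal illustration that assumes the standard formulas associated with Rosenthal (1991): r = sqrt(t² / (t² + df)) for a t-test, phi = sqrt(χ² / N) for a 2 × 2 chi-square, and d = 2r / sqrt(1 − r²). The text does not state the exact equations used, so both the function names and the choice of formulas are assumptions; the cut-offs of .2, .5 and .8 are Cohen's (1988) conventional benchmarks for d.

```python
import math


def r_from_t(t, df):
    """Effect size correlation from a t statistic and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


def phi_from_chi2(chi2, n):
    """Phi coefficient from a 2 x 2 chi-square statistic and total sample size."""
    return math.sqrt(chi2 / n)


def d_from_r(r):
    """Convert an effect size correlation r to Cohen's d."""
    return 2 * r / math.sqrt(1 - r ** 2)


def cohen_label(d):
    """Classify |d| using Cohen's conventional benchmarks."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"


# Stamps task complexity score, t(85) = 2.73, as reported in Section 4.3.2.10.
r_complexity = r_from_t(2.73, 85)
print(round(r_complexity, 2), round(d_from_r(round(r_complexity, 2)), 2))

# Stamps task restriction score, chi-square(1, N = 87) = 5.76.
print(round(phi_from_chi2(5.76, 87), 2))
```

Applied to the statistics reported for the Stamps task, these formulas give values that agree with the corresponding entries in Table 11 to two decimal places, which is consistent with (but does not prove) their being the equations used.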

4.3.3 Universality of ToM and EF deficits

Ozonoff et al. (1991) assessed universality of ToM and EF deficits in their study by

calculating the proportion of individuals in their autism group who scored below the

mean of the control group. As discussed in Section 2.2.3, this is a lenient criterion for

defining a deficit. In this study, it was decided to adopt the stricter criterion of a score

more extreme (in the direction of poorer performance) than 1 SD from the mean of the

control group (i.e., in the extreme 16% of control scores for a normal distribution) as the

definition of “impairment”. The universality of a deficit on continuous variables was

therefore assessed by calculating the proportion of participants in the ASD group

scoring more poorly than 1 SD from the mean. This was done only for variables where

significant group differences were found (including variables for which the group

difference did not remain significant when age and IQ variables were partialled out, but

not including variables on which only marginally significant group differences were

found).
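
The universality criterion for continuous variables can be sketched as below. The scores are hypothetical and serve only to illustrate the calculation; the function name, the `higher_is_worse` flag, and the use of the sample SD (n − 1 denominator) are assumptions not specified in the text.

```python
from statistics import NormalDist, mean, stdev


def proportion_impaired(asd_scores, control_scores, higher_is_worse=False):
    """Proportion of ASD scores more than 1 SD poorer than the control mean."""
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)  # sample SD (n - 1); an assumption
    if higher_is_worse:
        cutoff = control_mean + control_sd
        impaired = [s for s in asd_scores if s > cutoff]
    else:
        cutoff = control_mean - control_sd
        impaired = [s for s in asd_scores if s < cutoff]
    return len(impaired) / len(asd_scores)


# Hypothetical scores for illustration only (not data from this study).
control = [10, 12, 14, 16, 18]
asd = [8, 9, 10, 11, 15, 16]
print(proportion_impaired(asd, control))

# Proportion of a normal distribution lying more than 1 SD below the mean,
# i.e. the "extreme 16%" referred to in the text.
print(round(NormalDist().cdf(-1.0), 4))
```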

For variables coded dichotomously, the “more poorly than 1 SD from the mean”

strategy was obviously not feasible, but it was necessary for the calculation of

universality to be comparable to that for continuous variables. To address this, the

percentage of control participants gaining a score of 0 (or 1 if a higher score was poorer)

was calculated, and if it was approximately 16%, the percentage of ASD participants

gaining that score was considered a comparable measure of the universality of a deficit

on that variable (as a score at the 16th percentile corresponds to a score at 1 SD below

the mean for a normal distribution). For the false belief variables, the alternative

aggregate score (see Section 4.3.2.1) was considered the best measure to use in

assessing universality (footnote 10), as 19.6% of control participants gained a score of 0.

Universality was also calculated for the first-order false belief task, on which 17.4% of

control participants scored 0. The percentages for the two other dichotomous variables

where significant group differences were found were not quite as ideal. For ToL rule

violations, 23.4% of control participants gained a high error score of 1. This variable

was therefore recoded using a more lenient criterion such that a score between 0 and 1.5

rule violations per block scored 0, which resulted in a more appropriate 17% of control

participants scoring 1 (footnote 11). For the Stamps task restriction score, only 2.2% of control

participants gained a high restriction score of 1, so this variable was not included in the

universality calculations.

The percentages of ASD participants demonstrating a deficit on the ToM and EF

variables where significant group differences were found are displayed in Table 12. It is

evident that neither ToM nor EF deficits are universal within the ASD sample (footnote 12). The

percentages of ASD participants showing deficits also appear to be fairly comparable

across ToM and EF variables, although there was some variability among the EF

variables. Within the EF tasks, deficits in verbal inhibition and verbal generativity were

the most prevalent.

10 It is worth noting that although an aggregate score was used for the false belief variables, an aggregate or composite score was not calculated across the EF tasks (as was done by Ozonoff et al., 1991) for these universality calculations or for subsequent analyses because it was not thought to be valid or meaningful, particularly in light of the fact that one of the aims of the study was to examine the specific profile of EF deficits in ASDs and the relationship of each EF component with behavioural symptomatology and with ToM. In support of this, although there were some intercorrelations between EF domains, for the most part EF task variables were not significantly correlated with each other and appeared to be measuring different constructs (these correlations are presented in Appendix B and discussed further in Section 4.4.1). The fact that group differences were found on some EF tasks but not others solidifies this view. In addition, within EF domains, verbal and non-verbal measures often did not correlate with each other (i.e., for the different tests of inhibition and generativity). EF variables were therefore considered separately throughout analyses.

11 Group differences were still significant for this recoded variable, χ2 (1, N = 93) = 4.70, p < .05.

12 Even when Ozonoff et al.’s (1991) more lenient criterion for defining a deficit was used, ToM and EF “deficits” still could not be considered universal, with proportions ranging from 60.0 to 82.6%.

Table 12. Universality of ToM and EF deficits in the ASD group

% of ASD group displaying a deficit

False belief alternative aggregate score 44.2

First-order false belief 51.2

Planning:

ToL: Adjusted extra move score 28.9

Rule violations 37.0

Inhibition:

Opposite Worlds time difference score 48.3

Working memory:

RIL task shape error score 37.1

Generativity:

Pattern Meanings: Correct responses 28.3

Sum of errors 26.7

Uses of Objects correct responses 41.3

Stamps task: Complexity score 19.5

Originality score 29.3

4.3.4 Ability of ToM and EF variables to predict group membership

In order to investigate the “uniqueness” of ToM and EF impairments to autism (as

compared with matched controls), a logistic regression analysis was conducted to

examine which cognitive task variables were best able to discriminate the ASD group

from the control group. A direct logistic regression was performed with group as the

outcome variable, and VIQ and all ToM and EF variables on which there were

significant group differences as the predictors. Logistic regression was chosen as the

method of analysis rather than discriminant function analysis because logistic regression

is more suitable when there is a mixture of dichotomous and continuous predictor

variables (Tabachnick & Fidell, 1996). Direct logistic regression evaluates the

independent contribution made by each predictor over and above that of the other

predictors (i.e., each predictor is assessed as if it entered the equation last).

Because not all participants completed every task (mainly due to age limits on

certain tasks, as well as missing data), only those participants with data for all the

predictor variables were included in the logistic regression. There were 27 participants

in the ASD group and 32 in the control group who met these criteria, and these limited

groups were matched on age (M = 11.26, SD = 3.18 for the ASD group; M = 10.13, SD

= 2.27 for the control group), t(57) = 1.58, p > .1, and PIQ (M = 94.52, SD = 15.78 for

the ASD group; M = 99.78, SD = 18.68 for the control group), t(57) = 1.16, p > .1.

A test of the full model with all 12 predictors against a constant-only model was

statistically reliable, χ2 (12, N = 59) = 31.03, p < .01, indicating that the predictors, as a

set, reliably distinguished children with ASDs from controls. The model correctly

classified 84.4% of the control group and 77.8% of the ASD group. Table 13

presents regression coefficients, Wald statistics, odds ratios, and 95% confidence

intervals for odds ratios for each of the 12 predictors. According to the Wald criterion,

the only reliable predictors of group membership were the Opposite Worlds time

difference score (a verbal measure of inhibition) and the number of correct responses on

the Uses of Objects task (a measure of verbal generativity). Performance on first-order

false belief questions approached significance as a predictor (p = .08).
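
The relationship between the quantities reported in Table 13 can be sketched as follows. The sketch assumes that the Wald statistic is the squared ratio of a coefficient to its standard error, (B / SE)², evaluated against a chi-square distribution with 1 df, and that the 95% confidence interval for the odds ratio is exp(B ± 1.96 × SE). Neither assumption is stated in the text, and the function name is illustrative.

```python
import math


def odds_ratio_summary(b, wald, z_crit=1.96):
    """Return (odds ratio, lower CI bound, upper CI bound) from a logistic
    regression coefficient and its Wald statistic, assuming Wald = (B / SE)^2."""
    se = abs(b) / math.sqrt(wald)
    odds_ratio = math.exp(b)
    lower = math.exp(b - z_crit * se)
    upper = math.exp(b + z_crit * se)
    return odds_ratio, lower, upper


# Opposite Worlds time difference score: B = .16, Wald = 3.98 (Table 13).
odds_ratio, lower, upper = odds_ratio_summary(0.16, 3.98)
print(f"OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Because B is reported to only two decimal places, values recovered in this way can be expected to match the tabled odds ratios and intervals only to within rounding error.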

Two possible limitations of this initial analysis were that i) correlations

between variables derived from the same task (or set of tasks) may have affected the

ability of individual variables from those tasks to emerge as significant predictors, and

ii) the ratio of cases to predictors was lower than is ideal for logistic regression. In

order to address these limitations, another logistic regression was conducted where only

one variable from each task was included (VIQ, first-order false belief, ToL adjusted

extra move score, RIL shape error score, Opposite Worlds time difference score, Uses

of Objects correct responses, and Stamps task originality score). The ratio of cases to

predictors was therefore substantially higher in this alternative analysis. Variables were

chosen on the basis of the effect size of group comparisons and their representativeness

of task performance. Results were almost the same as in the initial regression, with the

only difference being that the level of significance of the Uses of Objects correct

responses variable dropped from p = .04 to p = .07. The first-order false belief task variable

remained only marginally significant as a predictor (p = .08; footnote 13). The initial logistic

regression was therefore interpreted as a valid indicator of the ability of each task

variable to predict group membership (footnote 14).

13 When the false belief alternative aggregate score was included instead (as this was the variable used in the universality calculations), the results also remained the same with the exception that the false belief aggregate was a non-significant, rather than a marginally significant, predictor of group membership.

14 As it was possible that VIQ and false belief variables may have affected each other’s contribution due to their significant correlation, another logistic regression (with the initial set of task variables) was conducted without including VIQ. First-order false belief performance was found to be a significant predictor in this analysis, z = 4.12, p < .05. However, rather than suggesting that false belief performance was actually a meaningful predictor of group membership, this pattern of results (i.e., the change in significance of false belief as a predictor when VIQ was included) indicates that false belief understanding did not add significant additional variance to the regression beyond that contributed by VIQ.

Table 13. Logistic regression analysis of group membership as a function of VIQ, ToM and EF variables

                                      Wald test                  95% C.I. for odds ratio
Variables                   B         (z-ratio)    Odds ratio    Lower      Upper
VIQ                         -.04      2.46         .96           .91        1.01
False belief tasks:
  Simple                    .68       .15          1.98          .06        59.92
  1st-order                 -2.05     2.96         .13           .01        1.33
  2nd-order                 .87       .53          2.38          .23        24.56
ToL:
  Adj. extra move score     .03       .18          1.03          .90        1.19
  Rule violations           -.07      .01          .93           .15        5.93
RIL task:
  Shape error score         -.02      .01          .98           .70        1.37
Opposite Worlds:
  Time difference score     .16       3.98*        1.17          1.0        1.38
Uses of Objects:
  Correct responses         -.10      4.03*        .91           .82        1.0
Stamps task:
  Complexity score          -.19      1.36         .83           .60        1.14
  Originality score         .08       .25          1.08          .80        1.47
  Restriction score         -2.26     1.26         .10           .0         5.43

* p < .05; ** p < .01; *** p < .001.

4.3.5 Behavioural measures: Group comparisons and derivation of indices

used in correlational analyses

4.3.5.1 Repetitive Behaviours Interview (RBI)

Group comparisons. Severity summary scores were the main RBI variables used in

analyses. Distributions of the severity summary scores were frequently skewed for the

ASD group, and highly skewed for the control group. Because all transformations

were ineffective, non-parametric statistics were used for group comparisons of the

severity of different types of repetitive behaviours. As expected, Mann-Whitney U tests

revealed that children in the ASD group exhibited significantly more severe repetitive

behaviours in all categories of the RBI (all ps < .001, except for self-injurious

behaviours, where p < .01; see footnote 15). Medians and ranges of the severity summary scores

(expressed as t scores) for the ASD and control groups are presented in Table 14.

Table 14. Median (and range) of RBI severity summary scores for the ASD and control

groups

Median (range) of severity summary scores

RBI category ASD group Control group

Stereotyped manipulation of objects 54 (45-119) 45 (45-75)

Stereotyped movements 58 (46-110) 46 (46-63)

Tic-like behaviours 49 (47-130) 47 (47-57)

Self-injurious behaviours 48 (48-172) 48 (48-67)

Compulsive behaviours 60 (46- 99) 46 (constant)

Object attachments 53 (46-108) 46 (46-60)

Insistence on sameness of environment 60 (45- 98) 45 (45-69)

Rigid adherence to routines and rituals 61 (47-119) 47 (47-54)

Repetitive use of language 61 (46-109) 46 (46-62)

Circumscribed interests 64 (45- 91) 45 (45-73)

15 Non-parametric group comparisons were also conducted for the “presence of behaviour” summary scores, and the outcomes were identical.

Derivation of indices used in correlational analyses. Consistent with Turner’s (1996,

1997) study, severity summary scores from the RBI were summed across categories to

form composite severity summary scores (i.e., Repetitive Movements, Sameness

Behaviour, Compulsive Behaviours, Repetitive Language, and Circumscribed Interests

composites; see Section 3.5.1.2 in Chapter 3), which were used in correlational analyses

with cognitive measures (footnote 16). These composite scores generally demonstrated normal

distributions in the ASD group. One outlier (in the ASD group) on the Repetitive

Movements composite score was trimmed. For variables with skewed distributions,

scatterplots were examined for evidence of curvilinearity and multivariate outliers, and

no major problems were identified.

In order to examine the factor structure of the RBI and the statistical validity of

Turner’s (1996, 1997) categories and composite scores (which were based on classes of

repetitive behaviour derived from the literature), principal components analysis with

varimax rotation was conducted on the severity summary scores from each RBI

category (including the data from both the ASD and control groups). Evaluation of

two- and three-factor solutions indicated that a two-factor model appeared to be more

meaningful. The two factors explained 57.0% of the total variance in the RBI, with

39.7% accounted for by a High-level Repetitive Behaviours factor (eigenvalue 3.97),

and 17.3% by a Low-level Repetitive Behaviours factor (eigenvalue 1.73). Factor

loadings are displayed in Table 15.
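
The percentages of variance quoted above follow directly from the eigenvalues if the principal components analysis was run on the correlation matrix of the ten RBI categories, in which case the total variance equals the number of variables. That is an assumption (the text does not say which matrix was analysed), and the function name below is illustrative.

```python
def percent_variance(eigenvalue, n_variables):
    """Percentage of total variance explained by a component, assuming a PCA
    on the correlation matrix (total variance = number of variables)."""
    return 100.0 * eigenvalue / n_variables


n_rbi_categories = 10  # the ten RBI categories listed in Table 14

high_level = percent_variance(3.97, n_rbi_categories)
low_level = percent_variance(1.73, n_rbi_categories)
print(round(high_level, 1), round(low_level, 1), round(high_level + low_level, 1))
```

The reported eigenvalues of 3.97 and 1.73 correspond to 39.7% and 17.3% of the variance under this assumption, which matches the figures given in the text.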

Table 15. Factor loadings of RBI severity summary scores

RBI category    Factor 1: High-level Repetitive Behaviours    Factor 2: Low-level Repetitive Behaviours

Stereotyped manipulation of objects .449 .707

Stereotyped movements .793

Tic-like behaviours .779

Self-injurious behaviours .540

Compulsive behaviours .752

Object Attachments .668

Insistence on sameness of environment .865

Rigid adherence to routines and rituals .835

Repetitive use of language .721

Circumscribed interests .419

Note: Factor loadings lower than .4 are not shown

16 Correlations were also conducted using the “presence of behaviour” summary scores (which were also summed to form composite scores). These showed an almost identical pattern of correlations with cognitive measures, as well as being highly correlated with the severity summary scores.

As the factors derived differed from the composite scores used by Turner (1996, 1997),

factor scores for each participant were calculated using a regression equation, and these

factor scores were also used in correlational analyses with cognitive measures.

4.3.5.2 Social and communicative functioning

Group comparisons. Group comparisons were conducted for each of the three measures

of social and communicative functioning separately. On the Social Behaviour

Questionnaire (SBQ), one outlier in the control group was trimmed. A t-test revealed

that, as expected, participants in the ASD group (M = 16.22, SD = 5.79) scored

significantly higher on the SBQ, indicating more abnormal social behaviours than the

control group (M = 5.06, SD = 4.75), t(88) = 10.0, p < .001. Unsurprisingly, there were

also significant group differences indicating a higher number of abnormal current

behaviours in the ASD group in the social domain of the ADI-R (ASD group: M =

15.02, SD = 7.92; Control group: M = 2.33, SD = 3.21), t(47) = 2.74, p < .01, and in the

communication domain (ASD group: M = 17.0, SD = 5.54; Control group: M = 4.0, SD

= 4.36), t(47) = 3.97, p < .001.

Derivation of indices used in correlational analyses. As mentioned in Section 3.5.2.2

of the previous chapter, a principal components analysis was conducted with scores

from the SBQ and scores on current behaviours only from the Social and

Communication domains of the ADI-R, which showed that all three measures loaded on

one factor (smallest factor loading = .80) which explained 75.29% of the variance in the

sample (eigenvalue 2.26). Factor scores for each participant were calculated using a

regression equation, on which higher scores indicated more abnormal

social/communicative functioning. This social/communication score was used in all

correlational analyses.

4.3.6 Correlations between ToM/EF and behavioural measures

The explanatory value of ToM and EF impairments was examined by correlating

cognitive task performances with behavioural indices (footnote 17). As the incidence of repetitive

behaviours and abnormal social behaviours was very low in the control group,

correlations between cognitive and behavioural measures were conducted for the ASD

group only. If raw correlations were statistically significant, partial correlations

(controlling for age, PIQ and VIQ) were also conducted. Table 16 displays raw

correlations and relevant partial correlations between cognitive measures and

behavioural factors (i.e., the two RBI factor scores and the social/communication factor

score). High-level repetitive behaviours correlated only with the Uses of Objects correct

responses (in an unexpected direction, such that a higher number of correct responses

was associated with more severe high-level repetitive behaviours), but this correlation

was not significant when age and IQ variables were partialled out. Low-level repetitive

behaviours showed significant raw and partial correlations with the Opposite Worlds

time difference score, a verbal measure of inhibition (in the expected direction, such

that poorer inhibitory ability was correlated with increased severity of low-level

repetitive behaviours). The social/communication factor showed a significant raw

correlation in the expected direction with the Stamps task complexity score, which

remained significant when age and IQ variables were controlled.
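
The logic of partialling out age, PIQ and VIQ can be sketched with the standard recursive formula for partial correlations, in which one covariate is removed at a time. The data below are hypothetical and were constructed so that the raw correlation between x and y is produced entirely by a shared covariate z; the function names are illustrative, and the sketch does not claim to reproduce the statistical package actually used.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, covariates):
    """Correlation between x and y controlling for every sequence in covariates,
    computed by recursively removing one covariate at a time."""
    if not covariates:
        return pearson_r(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: z plays the role of a covariate such as age.
z = [-1, -1, -1, -1, 1, 1, 1, 1]
x = [-2, -2, 0, 0, 0, 0, 2, 2]   # z plus noise unrelated to y
y = [-2, 0, -2, 0, 0, 2, 0, 2]   # z plus noise unrelated to x

print(round(pearson_r(x, y), 2))       # raw correlation
print(round(partial_r(x, y, [z]), 2))  # correlation with z partialled out
```

In this constructed example the raw correlation disappears once the covariate is controlled, which is the pattern described above for the Uses of Objects correct responses; a correlation that survives partialling, as for the Opposite Worlds time difference score, cannot be attributed to the covariates alone.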

These results demonstrate that the behavioural symptoms of ASDs showed

different patterns of correlation with cognitive measures. In general, however, there

were few significant correlations (with high-level repetitive behaviours in particular

being poorly explained by the available data). Of note, the false belief aggregate score

did not correlate significantly with any behavioural factors (footnote 18). Dewey Stories, a

higher-level measure of social cognition, also showed no significant correlations with any

behavioural factors, including social/communicative functioning. The only EF variables

to correlate significantly with behavioural factors were select measures of verbal

inhibition and non-verbal generativity.

17 As for the correlations conducted between age, PIQ, VIQ, and cognitive task variables, the same computational formula was used for correlations between continuous variables (i.e., Pearson product-moment correlation coefficients) and correlations between continuous and dichotomous variables (i.e., point biserial correlation coefficients), as recommended in Cohen and Cohen (1983).

18 Correlations were also conducted for all false belief tasks individually as well as for the alternative aggregate score, but no significant correlations emerged.

Table 16. Raw and partial correlations between cognitive measures and behavioural

factors within the ASD group

                                              Factor score
                                              High-level         Low-level          Social/
Cognitive task                                Rep. Behaviours    Rep. Behaviours    Communication
ToM (n = 43):
  False belief aggregate                      .17                .08                .24
Social Cognition (n = 17):
  Dewey Stories total                         -.18               -.08               .09
Planning:
  ToL (n = 46):
    Adj. extra move score                     -.02               -.04               -.04
    Rule violations                           -.06               .01                -.02
Set-shifting:
  IDED Perseveration condition (n = 35):
    EDS stage errors                          .10                -.33               -.14
  IDED Learned Irrelevance condition (n = 36):
    EDS stage errors                          .0                 .20                .0
Inhibition:
  RIL task error difference scores (n = 35 except inhibition score, n = 36):
    Inhibition                                .03                .12                .22
    Load                                      .04                -.21               -.08
    Inhibition + load                         .05                -.09               .12
  RIL task RT difference scores (n = 35 except inhibition score, n = 36):
    Inhibition                                -.03               .14                -.07
    Load                                      -.12               -.23               -.15
    Inhibition + load                         -.11               -.10               -.19
  Opposite Worlds (n = 29):
    Error diff. score                         .06                -.04               -.03
    Time diff. score                          -.07               .38* (.47*)        .12
Working memory:
  RIL shape error score                       .10                .22                .20
Relational Reasoning:
  Relational Complexity (n = 46):
    Total score                               .18                -.09               -.10
Generativity:
  Pattern Meanings (n = 46):
    Correct responses                         .06                .03                -.06
    Sum of errors                             -.07               .05                .0
  Uses of Objects (n = 46):
    Correct responses                         .33* (.21)         -.05               .0
    Sum of errors                             -.03               -.12               .16
  Stamps task (n = 41):
    Complexity score                          .10                -.07               -.38* (-.52**)
    Originality score                         .28                .09                .13
    Restriction score                         .0                 -.15               -.03
    Rule adherence score                      -.03               -.16               -.29

* p < .05; ** p < .01; *** p < .001.

Note: Partial correlations (shown in parentheses after the raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the behavioural factors.

Correlations between cognitive task variables and RBI composite scores (equivalent to

those used by Turner, 1996, 1997) were also of interest, both in terms of examining

patterns of correlations with more specific types of behaviour and determining whether

these results replicate those reported by Turner. These are presented in Table 17.

Repetitive Movements demonstrated significant raw and partial correlations in the

expected direction with the Opposite Worlds time difference score, a verbal measure of

inhibitory capacity (consistent with the correlation between this variable and Low-level

Repetitive Behaviours). Sameness Behaviour, Compulsive Behaviours and Repetitive

Language did not correlate significantly with any cognitive task variables.

Circumscribed Interests demonstrated significant raw correlations with three variables

(false belief aggregate, Uses of Objects correct responses, and Stamps task restriction

score), all in the direction opposite to that expected; however, none of these correlations

remained significant when age and IQ were partialled out.

Overall, each RBI composite score demonstrated a unique pattern of correlations

with cognitive task variables, although again there were few significant correlations,

with only the Repetitive Movements composite showing a significant partial correlation

with a cognitive variable. When age and IQ were controlled, ToM and social cognition

variables did not correlate with any RBI composite scores. Only one EF measure, of

verbal inhibition, correlated significantly with an RBI composite (Repetitive

Movements, as described above).

Table 17. Raw and partial correlations between cognitive measures and RBI composite scores within the ASD group

                                        RBI composite score
                                        Repetitive     Sameness       Compulsive     Repetitive     Circumscribed
Cognitive task                          Movements      Behaviour      Behaviours     Language       Interests
ToM (n = 43):
  False belief aggregate                .14            .03            .08            .11            .32* (.12)
Social Cognition (n = 17):
  Dewey Stories total                   -.09           -.14           -.14           -.14           -.05
Planning:
  ToL (n = 46):
    Adj. extra move score               -.03           -.04           -.04           .0             -.03
    Rule violations                     -.05           -.01           -.19           .24            -.04
Set-shifting:
  IDED Perseveration condition (n = 35):
    EDS stage errors                    -.31           -.02           .06            .07            -.11
  IDED Learned Irrelevance condition (n = 36):
    EDS stage errors                    .23            .0             .08            .03            -.05
Inhibition:
  RIL task error difference scores (n = 35 except inhibition score, n = 36):
    Inhibition                          .09            .16            .13            -.01           -.20
    Load                                -.22           .01            -.10           .13            -.16
    Inhibition + load                   -.11           .10            .01            .10            -.29
  RIL task RT difference scores (n = 35 except inhibition score, n = 36):
    Inhibition                          .10            .0             -.03           .16            .14
    Load                                -.28           -.12           -.14           -.07           -.13
    Inhibition + load                   -.18           -.07           -.14           .07            -.01
  Opposite Worlds (n = 29):
    Error diff. score                   -.01           -.01           .04            .05            -.06
    Time diff. score                    .37* (.48*)    .08            .0             .09            .10
Working memory (n = 35):
  RIL shape error score                 -.18           .15            .12            .31            .07
Relational Reasoning:
  Relational Complexity (n = 46):
    Total score                         -.06
Generativity:
  Pattern Meanings (n = 46):
    Correct responses                   .07            .0             -.08           .07            .15
    Sum of errors                       .01            -.06           -.09           .13            .05
  Uses of Objects (n = 46):
    Correct responses                   .06            .18            .27            .08            .35* (.17)
    Sum of errors                       -.10           -.05           -.11           .11            -.24
  Stamps task (n = 41):
    Complexity score                    -.03           -.02           .15            -.13           .15
    Originality score                   .12            .13            .23            .24            .31
    Restriction score                   -.16           -.10           .10            .21            -.31* (-.15)
    Rule adherence score                -.16           -.05           -.10           -.08           -.09

* p < .05; ** p < .01; *** p < .001.

Note: Partial correlations (shown in parentheses after the raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the RBI severity summary scores.

4.3.7 Relationship between ToM and EF

The relationship between ToM and EF in the ASD and control groups was investigated

by examining both correlations and dissociations between the two domains. The Dewey

Stories total score was omitted from these analyses, for two main reasons: firstly

because it does not appear to measure ToM and therefore any relationships with EF

would be difficult to interpret within the theoretical frameworks that exist regarding the

ToM-EF relationship; and secondly because only older participants completed the task

and so the sample size for correlations was small.[19]

4.3.7.1 Correlations between ToM and EF

Correlations between task variables were calculated separately for the ASD and control

groups. Again, partial correlations (controlling for the effects of age, VIQ and PIQ)

were conducted if raw correlations were significant. Table 18 presents raw and relevant

partial correlations between ToM and EF task variables within the control group.

Correlations are displayed separately for the various false belief variables rather than the

overall aggregate score because the pattern of correlations was different for the three

tasks. In this group, simple false belief task performance correlated with the ToL

adjusted extra move score, and the Pattern Meanings and Uses of Objects sum of errors

(with all correlations in the expected direction, such that poor false belief performance

correlated with poor EF task performance); however, when age, VIQ and PIQ were

controlled, only the correlation with the Pattern Meanings sum of errors remained

significant. First-order false belief task performance correlated with the ToL adjusted

extra move score, Relational Complexity total score, Uses of Objects correct responses

and Stamps task originality score (all in the expected direction), but only the

correlations with ToL and Uses of Objects variables were significant when age and IQ

were partialled out. Second-order false belief task performance correlated with the ToL

adjusted extra move score and rule violations, the RIL task load error and RT difference

scores and inhibition RT difference score, the Relational Complexity total score, the

Uses of Objects correct responses, and Stamps task originality score (all in the expected

direction except the RIL task load RT difference score); but with age and IQ controlled,

only the correlations with ToL rule violations, RIL task load error and RT difference

scores, and the Stamps task originality score remained significant.

[19] Correlations were conducted out of interest, but revealed little of importance. There were only a small number of significant correlations with EF variables in the control and ASD groups, and a few of these were in the opposite direction to that expected.

Overall, in the control group, ToM variables demonstrated relationships with

measures of planning, non-verbal inhibition (under working memory load conditions),

relational reasoning, and both verbal and non-verbal generativity, but several of the

correlations were mediated by age and IQ effects (there were no significant partial

correlations with relational reasoning ability). All correlations were in the expected

direction - such that poorer performance on EF tasks correlated with poorer false belief

task performance - with the exception of the RIL task load RT difference score. A

possible explanation for this is that participants who performed well on false belief tasks

made fewer errors on the working memory load condition, but at the expense of speed

(i.e., they demonstrated a cautious speed/accuracy tradeoff). Another noticeable aspect

of the pattern of correlations was that there tended to be more correlations with the

second-order than the simple and first-order false belief tasks, which is likely to be

partly due to the fact that only a small proportion of participants in the control group

failed to obtain a perfect score on the lower-order tasks.
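The logic of the partial correlations reported above can be illustrated with a minimal sketch. It assumes the standard recursive formula, in which covariates are partialled out one at a time; the function names and the four-participant data set are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Raw Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y with every covariate partialled out.

    Applies the first-order formula recursively:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores: both task scores depend on "age" but share nothing else.
age = [-3, -1, 1, 3]
false_belief = [-2, -2, 0, 4]   # age plus an independent component
ef_score = [-4, 2, -2, 4]       # age plus a different independent component

print(round(pearson(false_belief, ef_score), 2))              # sizeable raw correlation
print(round(partial_corr(false_belief, ef_score, [age]), 2))  # vanishes once age is controlled
```

In the study itself three covariates were controlled (age, VIQ and PIQ), which in this sketch would correspond to passing three lists as `covariates`. The example mirrors the pattern seen for several variables in the control group, where a significant raw correlation was no longer significant once age and IQ were partialled out.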

Table 19 displays raw and partial correlations between ToM and EF tasks within

the ASD group. In children with ASDs, simple false belief task performance correlated

with the ToL adjusted extra move score, Uses of Objects correct responses, and the

Stamps task restriction score (with all correlations in the expected direction); however,

when age and IQ variables were controlled, only the correlation with the Stamps

restriction score remained significant. First-order false belief task performance

correlated with the ToL adjusted extra move score, Uses of Objects correct responses

and Stamps task originality score (all in the expected direction), but none of the

correlations were significant when age and IQ were partialled out. Second-order false

belief task performance correlated with the ToL adjusted extra move score and rule

violations and the Uses of Objects correct responses (all in the expected direction); none

of these correlations were significant with age and IQ controlled.

Overall, the ASD group showed noticeably fewer significant correlations

between ToM and EF variables than the control group, with only one correlation

remaining significant with age and IQ controlled (between simple false belief

performance and a non-verbal measure of generativity).

Table 18. Raw and partial correlations between ToM and EF measures within the control group

                                          False belief task
EF task                                   Simple            1st-order         2nd-order
ToL (n = 46):
  Adj. extra move score                   -.30* (-.26)      -.40** (-.35*)    -.45** (-.27)
  Rule violations                         -.06              -.01              -.48** (-.34*)
IDED Set-shifting task condition (n = 34):
  Perseveration EDS stage errors          -.27              -.25              -.28
  Learned Irrelevance EDS stage errors    -.13              -.23              -.13
RIL task (n = 33):
  Error difference scores:
    Inhibition                            .27               .07               -.02
    Load                                  -.16              -.33              -.45** (-.54**)
    Inhibition + load                     .12               -.15              -.30
  RT difference scores:
    Inhibition                            -.05              -.05              -.44* (-.35)
    Load                                  .26               .24               .42* (.52**)
    Inhibition + load                     .19               .18               -.02
  Shape error score                       -.20              -.19              -.31
Opposite Worlds (n = 35):
  Error difference score                  .08               -.20              .03
  Time difference score                   .02               -.03              -.09
Relational Complexity (n = 46):
  Total score                             .26               .30* (.07)        .48** (.14)
Pattern Meanings (n = 46):
  Correct responses                       -.02              .28               .15
  Sum of errors                           -.37* (-.32*)     -.10              -.26
Uses of Objects (n = 46):
  Correct responses                       .11               .44** (.30*)      .51*** (.28)
  Sum of errors                           -.34* (-.27)      -.19              -.28
Stamps task (n = 45):
  Complexity score                        .01               .28               .29
  Originality score                       .23               .40** (.25)       .56*** (.40**)
  Restriction score                       .04               .07               .09
  Rule adherence score                    -.10              .02               -.06

* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations (shown in parentheses after the raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each EF task show the sample size for the correlations with the ToM tasks.

Table 19. Raw and partial correlations between ToM and EF measures within the ASD group

                                          False belief task
EF task                                   Simple            1st-order         2nd-order
ToL (n = 43):
  Adj. extra move score                   -.33* (-.07)      -.53*** (-.24)    -.35* (-.04)
  Rule violations                         -.25              -.16              -.30* (-.10)
IDED Set-shifting task condition:
  Perseveration (n = 32):
    EDS stage errors                      -.15              .08               -.23
  Learned Irrelevance (n = 33):
    EDS stage errors                      .15               -.22              -.19
RIL task (n = 32 except inhibition difference scores, n = 33):
  Error difference scores:
    Inhibition                            -.28              -.13              -.08
    Load                                  -.18              -.03              -.17
    Inhibition + load                     -.32              -.09              -.17
  RT difference scores:
    Inhibition                            .05               .13               .0
    Load                                  -.17              .07               .07
    Inhibition + load                     -.16              .13               .04
  Shape error score                       -.09              -.05              .05
Opposite Worlds (n = 29):
  Error difference score                  -.24              .01               .08
  Time difference score                   -.21              .06               -.12
Relational Complexity (n = 43):
  Total score                             .10               .23               .27
Pattern Meanings (n = 43):
  Correct responses                       .19               .01               .26
  Sum of errors                           .02               -.10              .09
Uses of Objects (n = 43):
  Correct responses                       .32* (.11)        .31* (.03)        .48** (.31)
  Sum of errors                           .13               .02               .12
Stamps task (n = 41):
  Complexity score                        .23               .08               -.01
  Originality score                       .11               .39* (.10)        .30
  Restriction score                       -.56*** (-.46**)  -.16              -.21
  Rule adherence score                    .15               .01               -.15

* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations (shown in parentheses after the raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each EF task show the sample size for the correlations with the ToM tasks.

Table 20 presents a summary of the significant partial correlations between ToM and

EF domains in the control and ASD groups, clearly demonstrating the different pattern

of correlations in the two groups.

Table 20. Summary of significant partial correlations between ToM and EF variables in the control and ASD groups

EF domain                     Control group     ASD group
Planning                      ✓ ✓
Set-shifting
Inhibition – Non-verbal       ✓ ✓*
Inhibition – Verbal
Working Memory
Relational Reasoning
Generativity – Verbal         ✓ ✓
Generativity – Non-verbal     ✓                 ✓

* Correlations marked with an asterisk were in the opposite direction to that expected.

Note: Each tick represents one significant correlation between a false belief and an EF variable in that domain.

4.3.7.2 Dissociations between ToM and EF

While the correlative evidence presented in the previous section was suggestive of a

relative independence between ToM and EF in the ASD group compared with controls,

it was also of interest to examine the incidence and direction of dissociations between

ToM and EF deficits within the ASD group. This was achieved by defining a deficit on

any given task in the same way as for the universality calculations in Section 4.3.3. The

proportion of ASD participants with a ToM deficit who displayed unimpaired

performance on EF tasks was calculated, and conversely, the proportion of participants

with a given EF deficit who displayed unimpaired ToM performance was also

calculated. For ease and simplicity of interpretation, the false belief alternative

aggregate score was used as the measure of ToM performance (as for the universality

calculations) rather than analysing all the false belief variables separately. However, all

EF variables on which significant group differences were found were analysed

separately. The results of these calculations are displayed in Table 21.

Table 21. The incidence of ToM-EF dissociations in the ASD group

EF measure                                      %        n

% of ToM-impaired ASD participants with unimpaired EF
  ToL: Adjusted extra moves score               47.4     19
       Rule violations                          36.8     19
  Opposite Worlds time difference score         44.4     9
  RIL shape error score                         54.5     11
  Uses of Objects correct responses             47.4     19
  Stamps task: Complexity score                 83.3     18
               Originality score                50.0     18

% of EF-impaired ASD participants with unimpaired ToM
  ToL: Adjusted extra moves score               23.1     13
       Rule violations                          40.0     20
  Opposite Worlds time difference score         64.3     14
  RIL shape error score                         58.3     12
  Uses of Objects correct responses             44.4     18
  Stamps task: Complexity score                 62.5     8
               Originality score                25.0     12

These data clearly demonstrate that dissociations between ToM and EF occurred

relatively frequently (usually in around 50% of the participants showing impairments)

and in both directions, such that the presence of a ToM impairment did not necessarily

result in an EF impairment and vice versa. These results are consistent with the

correlative data in indicating an independence between ToM and EF deficits in the ASD

group.
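The dissociation calculation described in this section can be sketched as follows. This is an illustration only: the function name is hypothetical, the deficit flags are invented, and missing data are assumed to be coded as None. The example flags were simply constructed so that 9 of 19 impaired participants are spared on the second measure, which is the arithmetic behind a figure of 47.4%.

```python
def dissociation_rate(primary_deficit, other_deficit):
    """Percentage of participants impaired on the primary measure who are
    unimpaired on the other measure.

    Each argument is a list with one entry per participant: True (deficit),
    False (no deficit) or None (task not completed). Returns a tuple of
    (percentage, number of participants impaired on the primary measure).
    """
    # Keep only participants with data on both measures.
    complete = [(p, o) for p, o in zip(primary_deficit, other_deficit)
                if p is not None and o is not None]
    # Deficit status on the other measure, for those impaired on the primary one.
    impaired = [o for p, o in complete if p]
    if not impaired:
        return None, 0
    spared = sum(1 for o in impaired if not o)
    return 100.0 * spared / len(impaired), len(impaired)


# Invented flags: 19 participants with a ToM deficit, 9 of whom have no EF deficit,
# plus 5 participants without a ToM deficit.
tom_deficit = [True] * 19 + [False] * 5
ef_deficit = [False] * 9 + [True] * 10 + [True] * 5

pct, n = dissociation_rate(tom_deficit, ef_deficit)
print(f"{pct:.1f}% of {n} ToM-impaired participants had unimpaired EF")
```

Swapping the two arguments gives the converse proportion, that is, the percentage of EF-impaired participants with unimpaired ToM.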

4.4 Discussion

This section includes four subsections. The first three of these examine the profile,

primacy and independence of ToM and EF deficits in ASDs in this study, comparing the

current findings to those of previous studies and considering alternative interpretations

of the results. In the final section, an attempt is made to interpret the outcomes in terms

of the six alternative hypotheses regarding primacy and independence outlined in the

introduction.

4.4.1 Profile of ToM and EF deficits

As predicted, both ToM and EF deficits were found in this sample of individuals with

ASDs. However, a unique profile of spared and impaired abilities emerged, which

included both expected and unexpected features.

Profile of ToM deficits. In the ToM domain, a higher proportion of ASD

participants than controls displayed unstable performance on all false belief tasks and on

aggregate scores, although partialling out age and VIQ reduced the significance of

group comparisons on the simple and second-order tasks as well as on the alternative

aggregate score (which involved a more lenient scoring criterion). These results suggest

that false belief understanding was significantly impaired in the ASD group, but that on

two of the tasks the impairment was partially attributable to the poorer verbal skills of

the ASD participants.[20] This lack of robustness of ToM deficits on false belief tasks

was underscored by the relatively high percentage of ASD participants who

demonstrated errorless performance on the tasks, which ranged from 48.8% on the

standard first-order task to 72.1% on the simple (unexpected contents and unexpected

identity) tasks, with 39.5% displaying perfect performance across all tasks. According

to the alternative aggregate score which more reliably indicates poor performance,

55.8% of ASD participants were high scorers on false belief tasks. The highest

percentage of first-order false belief task passers found in previous studies was 90%

(Dahlgren & Trillingsgaard, 1996), with the next highest being 55% (Prior et al., 1990).

Although the high 72.1% on the simple false belief task is likely to be an overestimation

due to the fact that perfect performance on it was assumed if the first-order task was

passed (for 7-16 year-olds who began with the first-order task), it is nevertheless clear

that the sample of ASD participants in this study demonstrated better false belief task

performance than the majority of samples from previous studies. The finding that false

belief performance was significantly correlated with both age and VIQ suggests that the

relatively old mean age and high level of verbal ability of the sample probably explain

this good false belief performance, consistent with previous studies demonstrating that

individuals with autism passing false belief tasks tend to be older and have higher verbal

ability (e.g., Eisenmajer & Prior, 1991; Prior et al., 1990; Sparrevohn & Howie, 1995).

[20] Notably, the effect of group on performance on the simple and second-order false belief tasks remained significant when age only was included in the regression, but did not remain significant when VIQ only was included, suggesting that VIQ had a greater impact on the significance of group differences than did age.

The age and level of ability of the control sample meant that a high percentage

of controls also demonstrated flawless false belief performance. However, it is of note

that the use of a fairly strict scoring criterion prevented extreme ceiling effects, with

30.4% of controls demonstrating unstable false belief performance on the aggregate

score. Even using the more lenient alternative aggregate, 19.6% of controls emerged as

low scorers, the majority of whom were between 6 and 10 years of age. This suggests

that beyond the age of 5, either ToM is still developing or other cognitive factors may

be influencing false belief performance (this latter possibility is discussed further below

in Section 4.4.3).

Despite the fact that ceiling effects did not pose a significant problem on the

false belief tasks, the relatively high proportion of perfect performances in both the

ASD and control groups indicates that the assessment of ToM in this study would have

been significantly strengthened by the inclusion of a more advanced ToM task such as

Happé’s (1994a) “Strange Stories” task or Baron-Cohen et al.’s (1997, 2001a) “Eyes

Task”. The use of the Dewey Stories task represented an attempt to tap into

higher-level social cognitive skills; however, its lack of correlation with false belief variables

(after partialling out age, PIQ, and VIQ) indicates that it is questionable as a measure of

mentalising ability and can probably be successfully performed by drawing on more

declarative knowledge of social norms. In light of this, it is noteworthy that ASD

participants did not show a significant impairment on the task, with marginally

significant differences reducing to non-significance when VIQ was accounted for

(although the medium effect size of the group difference suggested that the marginal

significance may have been due to the small size of the sample who completed the task).

This suggests that high-functioning individuals with ASDs often have intact knowledge

of what is considered “normal” or appropriate, but that this knowledge does not aid or

interact with either their mentalising skills or their own social skills (Dewey Stories

performance did not correlate significantly with social/communicative functioning).

Interestingly, a similar pattern of results has been demonstrated previously with patients

with damage to the ventromedial prefrontal cortex (e.g., Saver & Damasio, 1991).

Profile of EF deficits. ASD participants also demonstrated an interesting pattern

of strength and weakness on the various EF components tested. Consistent with

previous research, individuals with ASDs displayed robust planning impairments on the

ToL, both in terms of the number of extra moves made and the frequency with which

the rules of the task were violated. However, the small to medium effect size was

somewhat lower than expected, with Pennington and Ozonoff (1996) reporting an

average effect size of 2.07 on the similar Tower of Hanoi (ToH) task across the studies

conducted up until then. This discrepancy probably cannot be attributed to the age or

level of functioning (i.e., IQ level) of the sample because previous studies using Tower

tasks have also used older, high-functioning participants. One difference between the

ToL administration procedure used in this study compared with other studies using the

ToL and ToH is that during the initial task instructions participants were actively

encouraged to plan the movements of the discs in advance. This may have positively

influenced performance on the task and reduced the size of the difference between the

ASD and control groups. Nevertheless, the fact that planning impairments persisted in

the ASD participants despite this extra cueing provides evidence of the severity of their

deficit in this domain. Furthermore, as the ToL and ToH have been found to hold

slightly different cognitive demands (e.g., Welsh et al., 1999), a comparison of effect

sizes across the two tasks should be viewed with caution (unfortunately, the only other

study to use the ToL rather than the ToH - Hughes et al.’s (1994) study - did not report

standard deviations and therefore the effect size from that study could not be directly

compared). Following from this, it should also be noted that Welsh et al. (1999) found

that ToL performance tapped working memory and inhibition as well as planning

ability, and therefore the poor ToL performance demonstrated by the ASD group may

not necessarily reflect a planning impairment. However, the lack of group differences

on separate and more direct measures of working memory and non-verbal inhibition (as

discussed below) makes this unlikely, supporting the interpretation of the ToL result as

indicating a planning deficit in the ASD group.
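The effect size comparison discussed above can be made concrete with a short sketch. It assumes that the effect size in question is a standardised mean difference (Cohen's d with a pooled standard deviation), which this section does not state explicitly; the function name and the example values are hypothetical.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference between two groups (Cohen's d).

    The difference between the group means is divided by the pooled
    standard deviation, so both SDs and both sample sizes are required.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)


# Hypothetical groups: means two points apart, both with an SD of 2.
print(cohens_d(10.0, 2.0, 20, 8.0, 2.0, 20))
```

Because the pooled standard deviation forms the denominator, an effect size of this kind cannot be recovered from a report that gives group means without standard deviations, which is why no direct comparison with the earlier ToL study was possible.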

The absence of significant impairments in attentional shifting abilities on the

IDED set-shifting task was an unexpected result, given fairly consistent evidence of

set-shifting difficulties in previous studies (e.g., Ciesielski & Harris, 1997; Hughes &

Russell, 1993; Hughes et al., 1994; Ozonoff et al., 1994). A difficulty with mental

flexibility holds an intuitive appeal in explaining autistic symptoms such as

perseveration and rigid adherence to routines and rituals, and Ozonoff (1997b) has

suggested that a shifting impairment may in fact be the key feature of the EF profile

which characterises autism. However, there have been at least two other studies which

have also failed to find set-shifting deficits in autism. Ozonoff et al. (2000) found no

significant difference between their high-functioning autistic participants and controls

on the original IDED set-shifting task from the CANTAB battery. They attributed this

result to the fact that their task was computerised, thereby facilitating the performance

of their autistic participants. However, the fact that their participants with Asperger

syndrome did show impaired performance on the task, along with Hughes et al.’s (1994)

previous finding of a deficit on the same computerised task in individuals with autism,

speaks against this explanation.

Turner (1997) also found that her high-functioning participants with autism

displayed intact performance on both conditions of the modified IDED set-shifting task

used in the current study, although her low-functioning participants demonstrated

impairments on the EDS stage of the Perseveration condition. There was also evidence

of a marginally significant difference in set-shifting performance in the current sample,

but contrary to Turner’s results this occurred in the Learned Irrelevance condition.

These two negative results using the same task (i.e., Turner’s and the current study)

necessitate some decomposition of the requirements of the task. Although the design of

the modified IDED task allows more specific analysis of the component processes

involved in the task than the original version, it appears to do this at the expense of the

impact and obviousness of the shift. As Turner pointed out, in the modified IDED task,

the change in stimulus dimension that occurs in the EDS stage of both conditions (i.e.,

the introduction of the new relevant stimulus dimension of solidity in the Perseveration

condition and the new irrelevant dimension of size in the Learned Irrelevance condition)

signals very clearly that the task has changed. This means that it is fairly easy for the

participant to deduce the rules of responding for that condition without relating them to

the previous condition or becoming easily “stuck” in their previous mode of responding.

This in turn suggests that either the validity of the task as a measure of set-shifting is

questionable or that the nature of the shift required is too easy for high-functioning

individuals with ASDs. The fact that Owen et al. (1993) found different impairments on

the task in patients with frontal lesions and Parkinson’s disease indicates that the

problems on the task will emerge if the shifting deficit is severe enough. Hence, the

lack of convincing evidence of impairments on the task in high-functioning autism may

indicate that a set-shifting or cognitive flexibility deficit may not be as central to autism

as previously thought. Most of the initial studies on which this notion was based used

the WCST as their measure of cognitive flexibility, on which impaired performance

may be caused by a range of different factors. The variability in findings on more pure

set-shifting measures such as the IDED tasks calls into question the importance of the

role of set-shifting in the EF profile characteristic of autism.

Results obtained in the inhibition domain were also contrary to predictions and

added to previous research in an interesting way. Most earlier studies have not found

impairments in inhibition in individuals with autism (Brian et al., 2003; Ozonoff et al.,

1994; Ozonoff & Jensen, 1999; Ozonoff & Strayer, 1997), and those that have found

apparent inhibition deficits have used tasks on which performance could be influenced

by other EF capacities such as cognitive flexibility, working memory or generativity

(Hughes, 1996b; Rinehart et al., 2002; Williams et al., 2002). Notably, all of the studies

finding intact inhibition in autism have used non-verbal tasks except one study in which

the Stroop task was used (Ozonoff & Jensen, 1999). In the present study, previous

findings of unimpaired non-verbal inhibitory control in autism were replicated using the

newly developed RIL task, on which neither accuracy nor RT measures revealed

inhibitory difficulties in the ASD group. However, significant and robust verbal

inhibition impairments were found on the Opposite Worlds test, particularly on RT

measures (a trend was also evident on error measures). This result stands in contrast to

that obtained by Ozonoff and Jensen (1999) with their autistic sample of similar size

and age range using the Stroop, which involves very similar verbal inhibitory

requirements. Closer inspection of Ozonoff and Jensen’s data reveals, though, that the

autism group in their study performed at a very similar level to their ADHD group

(autism group mean = 27.7 versus 27.4 for the ADHD group, on an unspecified scale),

the latter of which did differ significantly from the control group (mean = 32.0). The

lack of a significant difference from controls in the case of the autism group was likely

to have been due to their larger standard deviation (11.4 versus 7.0 for the ADHD

group). However, it is also notable that while the ADHD group was matched with

controls on all age and IQ variables, the autism group was not matched to the control

group on VIQ, PIQ, or Full-Scale IQ (FSIQ), and this was handled by covarying FSIQ

in all group comparisons. As discussed in Section 4.3.2, ANCOVA is not considered an

appropriate statistical technique for accounting for group differences in cases such as

this, as it may also remove part of the effect of group. Ozonoff and Jensen’s result may

therefore represent a false negative. It will be interesting to monitor the outcome of

further studies on verbal inhibition, particularly in regard to how inhibition performance

in autism may be distinguished from that displayed in ADHD.

The interaction between inhibition and working memory was another topic of

interest for this study, with Russell (1997b) proposing that impairments in these

domains only emerge in autism if the task at hand requires both abilities simultaneously.

Although inhibition deficits were revealed on a verbal task with minimal working

memory requirements, results from the non-verbal RIL task were largely consistent with

this proposal. While ASD participants were able to successfully perform the RIL task

condition involving only non-verbal inhibitory requirements (and as discussed further

below also showed intact performance on the Relational Complexity task, which

arguably requires working memory but not inhibition), on the condition involving both

inhibition and working memory requirements, the ASD participants made significantly

more errors on a measure of working memory capacity. There was also a trend for the

ASD group to make more errors on a measure of inhibitory capacity for this condition

as compared with the control condition. This suggests that in situations where both

(non-verbal) inhibition and working memory are required, individuals with ASDs are

unable to maintain an adequate level of performance in either domain, but particularly

in working memory (although it may be the case that the working memory component

of the task was more vulnerable in this case because that task requirement was added

after the inhibitory component and was therefore more novel; or, alternatively, because

it was tested less frequently).

No group differences were identified on the Relational Complexity task,

suggesting that the capacity to integrate multiple relations in parallel (Halford, 1993;

Halford et al., 1998; Waltz et al., 1999) is not impaired in children with ASDs. This

result further indicates that failure on false belief tasks in children with ASDs is unlikely

to have its basis in a working memory or relational reasoning deficit. This was

confirmed by the lack of significant correlations between false belief and Relational

Complexity performance in either the ASD or control group. However, although Waltz

et al. (1999) found that frontal lobe patients were significantly impaired on their version

of the Relational Complexity task, the validity of the task as a measure of relational

reasoning is yet to be determined. It could be argued that the task does not tap working

memory or integrative capacity as strongly as it first appears. As the stimuli and all

possible response choices are always present and visible to the participant, it is possible

that the participant can check the accuracy of each response choice against the

requirements of each relational change one by one, rather than having to hold in mind

all the relevant relational changes simultaneously. All that would then be required is for

the participant to notice all the relational changes which are occurring and accurately

check whether each response choice fits the sequence of each change correctly. These

requirements are quite obviously different to the relational integration arguably required

by false belief tasks. So, while results from the Relational Complexity task did not hold

much promise in suggesting a relational integration difficulty in ASDs, the use of

different kinds of relational complexity task (such as the transitive inference task also

used by Waltz et al., 1999) could be an interesting avenue for further research.

Results from the generativity tasks were more promising. On the verbal Uses of

Objects task, the group difference on the number of correct responses variable met the

criterion for a large effect. This is consistent with previous studies demonstrating

generativity impairments in autism using other tasks (Boucher, 1988; Craig & Baron-

Cohen, 1999; Lewis & Boucher, 1991), and in particular replicates Turner’s (1999)

study, which found that both low- and high-functioning children with autism generated

fewer responses than controls on the Uses of Objects task.[21] However, unlike in Turner’s study,

the ASD sample in this study did not produce a higher number of error responses.

Although the scoring systems used in the two studies were slightly different, even on

categories common to both studies such as redundant responses, there were discrepant

outcomes. Another difference between the studies was that Turner allowed 150s for her

participants to produce responses, whereas only 90s was allowed in the current study. It

is possible that during the extra 60s given in Turner’s study, a pressure to respond had

accumulated over a longer time and so the children with autism produced inappropriate

responses when they were unable to generate correct ones; whereas the children in this

study felt less of a demand to produce a response. Regardless, it appears that the

individuals with ASDs in both studies demonstrated difficulty spontaneously generating

correct verbal responses on this task.

In contrast, results from the Pattern Meanings task revealed no significant group

differences on any variable overall. This was somewhat surprising as it was also

thought to be a test of verbal generativity and ASD participants were found to produce

fewer responses and make more errors on the task in Turner’s (1999) study. However,

more detailed analyses involving the two subgroups of ASD participants meeting “full

criteria” and “partial criteria” on the ADI-R showed that the full criteria subgroup

generated fewer correct responses than controls (although this effect disappeared when

age and VIQ were controlled), and the partial criteria subgroup generated more error

responses than controls (although this effect became marginally significant when age

and VIQ were controlled). This discrepancy between the full and partial criteria

subgroups appeared to be due to a tendency for the partial criteria subgroup to produce

more responses overall than the fully autistic subgroup, such that the partial criteria

subgroup produced as many correct responses as the control group but were also more

likely to produce error responses. This suggests that the less severe subgroup (in terms

of the range and number of symptoms present) reacted to problems generating responses

by producing errors, whereas the more severe subgroup reacted by not producing

responses at all.

[21] Turner (1999) actually calculated the total number of responses overall rather than the number of correct responses. Although it was not reported, the ASD group in the current study also produced significantly fewer responses overall than the control group, replicating Turner.

The failure to replicate Turner’s (1999) findings of significant differences in the

overall sample on the Pattern Meanings task (and the lack of robustness of subgroup

differences) requires further comment. The shorter time period allowed in the current

study may also account for the lack of robust or significant differences, but this does not

seem likely to be the sole cause given the strong generativity deficit displayed on the

Uses of Objects task in the same time period. It could be argued that Pattern Meanings

is less effective at discriminating those with poor generativity, as a larger range of

responses is acceptable than for the Uses of Objects task. Scoring was fairly lenient

for the task as it was often necessary to accept responses which the pattern possibly

could be, even if they were a little far-fetched. This could explain the lack of an overall

difference in the number of error responses made, but even given the lenient scoring,

one would expect a reduced number of correct and total responses if the ASD

participants experienced difficulty producing ideas. It may be that a combination of

these two explanations can account for the lack of significant overall group differences

in this study, in that the majority of children with ASDs were able to produce adequate

responses for a 90s period because they found the task easier than the Uses of Objects

task and the scoring was more lenient, but if the task had been continued for another

minute, they may have started producing fewer and more inappropriate responses.

Consistent with this interpretation, the rate of producing responses was similar for the

control participants across the two studies (approximately 1 every 3.6s in the current

study and 1 every 3.3s for the high-functioning controls in Turner’s study), but the ASD

participants in the current study produced responses at a faster rate (1 every 3.97s) than

the high-functioning ASD participants in Turner’s study (approximately 1 every 4.7s).

Results on the Pattern Meanings task in this study should not, therefore, be interpreted

as evidence against a verbal generativity deficit (although any such deficit on this task

was clearly more subtle than on the Uses of Objects task).
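
The rate comparison above can be restated as responses per minute, which makes the four figures easier to compare directly. The sketch below is illustrative arithmetic only: the seconds-per-response values are the approximate figures quoted in the text, and the function name is a hypothetical helper, not part of the original analysis.

```python
# Illustrative arithmetic only: restates the approximate seconds-per-response
# figures quoted in the text as responses per minute.

# Approximate seconds per response, as quoted in the text.
SECONDS_PER_RESPONSE = {
    "controls, current study": 3.6,
    "controls, Turner (1999)": 3.3,
    "ASD, current study": 3.97,
    "ASD, Turner (1999)": 4.7,
}


def responses_per_minute(seconds_per_response: float) -> float:
    """Convert 'one response every N seconds' into responses per minute."""
    return 60.0 / seconds_per_response


if __name__ == "__main__":
    for group, seconds in SECONDS_PER_RESPONSE.items():
        rate = responses_per_minute(seconds)
        print(f"{group}: about {rate:.1f} responses per minute")
```

A smaller seconds-per-response value gives a higher rate, so the ASD group in the current study (3.97s) responded faster than the high-functioning ASD group in Turner's study (4.7s), as the text states.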

Non-verbal generativity impairments in ASD participants also emerged in this

study, with performances on the Stamps task revealing that individuals with ASDs

produced less complex and fewer original patterns and were more restricted in their use

of the stamps available. There was also a trend for children with ASDs to show less

adherence to one rule for each pattern. Results on the originality and restriction scores

were consistent with Frith (1972); however, contrary to this study, Frith found no

difference in the complexity of patterns produced by her sample of children with autism,

and she also found that her sample showed a very high degree of rule adherence. In the

current study, the lack of rule adherence was likely to have been attributable to a certain

proportion of the ASD participants who produced random patterns with no underlying

rule. These participants may also have been the cause of the lower mean complexity

score of the ASD group, as random or unidentifiable patterns were assigned the lowest

complexity score of 1. The main difference between the two studies was the level of

functioning of the samples, with 14 out of 20 of Frith’s participants having an estimated

PIQ below 60. It is possible that higher-functioning individuals with ASDs may have

opted to produce random patterns when unable to produce original rules, whereas

lower-functioning participants may have simply produced the same pattern repeatedly.

This hypothesis cannot be directly tested in the current sample because all participants

had PIQs above 60. Nevertheless, it is evident that the generativity impairment which

characterises autism extends across both verbal and non-verbal domains as well as

across all levels of functioning.

Concluding comments on the profile of impairments. In summary, then, the

ASD group in this study demonstrated a characteristic profile of strength and weakness

in the cognitive domains tested, with impairments on measures of ToM, planning,

verbal inhibition, tasks combining inhibition and working memory, and both verbal and

non-verbal generativity, but intact performance on tests of awareness of social norms,

set-shifting, non-verbal inhibition and relational reasoning. Consistent with predictions,

the largest effects were on verbal tasks22 (measures of false belief, verbal inhibition and

verbal generativity), consolidating the importance of including tasks involving both

verbal and non-verbal responses where possible. Certain aspects of the profile of

impairments found in this study were inconsistent with initial predictions based on

previous studies, such as the absence of set-shifting deficits, and the presence of

impairments in verbal inhibition. These findings suggest that the EF profile

characteristic of autism as proposed by Ozonoff and colleagues (e.g., Ozonoff, 1997;

Ozonoff & Jensen, 1999) may require modification, and its discriminant validity (i.e.,

its uniqueness to autism as compared with other clinical conditions) merits further

investigation.

22 It should be noted that there were no significant correlations between PIQ and any ToM or EF measures. This indicates that the measures of PIQ on which the control and ASD samples were matched measured different abilities to those measured by the non-verbal EF tasks, and therefore that the relative lack of group differences on non-verbal EF tasks compared with verbal tasks cannot be accounted for by the matching of the groups on PIQ.

It should be pointed out, however, that the neat profile described above of course

assumes reasonable construct validity of each task variable. This assumption deserves

some critical analysis, particularly given the well-documented difficulties with EF

measurement discussed in Chapter 2 (Section 2.2). It is possible that both i) certain

variables are not actually measuring what they are purported to and ii) there is overlap

between the EF domains measured and/or the tasks used in different domains. The ideal

way to address this uncertainty would be to conduct a factor analysis of all the EF

variables in the battery; however, the high number of variables in relation to the number

of participants prevented a valid factor analysis from being performed on this sample.

Interpretation of each variable therefore relied mainly on previous literature as well as

informed qualitative analysis of the requirements of each task. The choice of relatively

pure EF tasks and/or tasks which included control conditions, allowing decomposition

of the processes involved in task performance, facilitated the ease and clarity with

which variables could be interpreted. Examination of raw and partial correlations

(partialling out age, VIQ and PIQ) between EF variables in the control group was also

informative (these are presented in Appendix B), in general demonstrating weak and

relatively few significant correlations between EF domains as well as several strong

intra-domain correlations, thereby validating the notion that the tasks measure mostly

independent constructs. This was the case even for variables which could conceivably

belong in a different category to other more central variables from the same task, such

as rule violations on the ToL or the error variables on the verbal generativity tasks (both

of which could reflect inhibition or working memory), with these variables usually

correlating more strongly with other variables from the same task than those from other

tasks. It appears, therefore, that there is no strong evidence against the construct

validity assumed for each of the EF variables.
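
The logic of partialling out age, VIQ and PIQ from a correlation between two task variables can be illustrated with a short sketch. It uses the standard recursive formula for partial correlation and made-up numbers; the function names and data are hypothetical, and it is not the software or procedure used for the analyses reported in Appendix B.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y with each covariate partialled out.

    Applies the recursive formula
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2)),
    removing one covariate at a time.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # Hypothetical data: both task scores depend on a shared covariate
    # (standing in for age or IQ) plus unrelated task-specific variation.
    covariate = [1, 2, 3, 4, 1, 2, 3, 4]
    noise_a = [1, -1, -1, 1, 1, -1, -1, 1]
    noise_b = [1, 1, 1, 1, -1, -1, -1, -1]
    task_a = [c + e for c, e in zip(covariate, noise_a)]
    task_b = [c + e for c, e in zip(covariate, noise_b)]
    print("raw correlation:", round(pearson(task_a, task_b), 3))
    print("partial correlation:", round(partial_corr(task_a, task_b, [covariate]), 3))
```

In this constructed example the two task scores correlate only because both depend on the shared covariate, so the raw correlation is substantial while the partial correlation is essentially zero. With real data, three covariates would be passed together, for example `partial_corr(task_a, task_b, [age, viq, piq])`.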

4.4.2 Primacy of ToM and EF deficits

Having identified the ToM and EF profile which characterises this sample of individuals

with ASDs, the next question concerns the primacy of each of these deficits. In this

study, primacy was measured by calculating the universality, uniqueness (this criterion

was measured indirectly), and explanatory value of each variable on which significant

group differences were found (or all variables in the case of explanatory value). Results

showed that while ToM and EF deficits showed similar prevalence within the ASD

group, measures of ToM did not successfully discriminate between the ASD and control

groups or show any significant relationships with behavioural measures, yet several EF

indices emerged as significant predictors of autism group membership and two EF

variables correlated significantly with measures of symptomatology. Overall, it would

appear that EF deficits are relatively more primary23 than a ToM deficit in ASDs.

However, before making any strong conclusions, results derived from each index of

primacy require a more detailed discussion.

Universality. The first matter of note is that neither ToM nor EF deficits, as

defined by a score worse than one standard deviation from the mean of the control

group (or a close approximation in the case of dichotomous variables), were universal

among this sample of high-functioning individuals with ASDs. Within the ASD group,

44.2% displayed a ToM deficit and the prevalence of EF deficits ranged from 19.5% to

48.3% (with impairments in verbal inhibition and verbal generativity being the most

prevalent of the EF components). All deficits remained non-universal even using the

more lenient criterion of any score below the mean of the control group, contrary to the

results obtained by Ozonoff et al. (1991), which showed that deficits defined in this way

on their EF composite were almost universal (96%) amongst their autism group whereas

ToM deficits were not (52% on a first-order composite and 87% on a second-order

composite). However, the current results are consistent with most other studies which

report prevalence data on ToM and/or EF impairment in autism, the majority of which

have not found either ToM (see Happé, 1995) or EF deficits (e.g., Liss et al., 2001;

Hughes et al., 1994; Ozonoff & Jensen, 1999) to be universal. It should also be noted

that it is unlikely that EF deficits would have been universal in the Ozonoff et al. (1991)

study if a stricter definition of a deficit had been used.
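
The two deficit definitions discussed above (a score worse than one standard deviation from the control mean, and the more lenient criterion of any score below the control mean) can be sketched as follows. This is a minimal illustration under stated assumptions: the scores are invented, the function name is hypothetical, higher scores are assumed to indicate better performance unless stated otherwise, and the sample standard deviation is used because the text does not specify which estimate was applied.

```python
from statistics import mean, stdev


def deficit_prevalence(asd_scores, control_scores, sd_cutoff=1.0,
                       higher_is_better=True):
    """Percentage of ASD scores counted as a deficit.

    A score counts as a deficit when it is worse than the control mean by
    more than `sd_cutoff` control-group standard deviations. Setting
    `sd_cutoff=0` gives the lenient criterion of any score worse than the
    control mean. Uses the sample standard deviation (an assumption).
    """
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)
    if higher_is_better:
        threshold = control_mean - sd_cutoff * control_sd
        flagged = [score < threshold for score in asd_scores]
    else:
        threshold = control_mean + sd_cutoff * control_sd
        flagged = [score > threshold for score in asd_scores]
    return 100.0 * sum(flagged) / len(flagged)


if __name__ == "__main__":
    # Invented scores for illustration only; not data from the study.
    controls = [10, 12, 14, 16, 18]
    asd = [8, 10, 11, 13, 15, 9, 14, 12]
    print("1 SD criterion:", deficit_prevalence(asd, controls), "%")
    print("lenient criterion:", deficit_prevalence(asd, controls, sd_cutoff=0), "%")
```

As in the comparison with Ozonoff et al. (1991), the lenient criterion always flags at least as many individuals as the stricter one, so prevalence figures are only comparable across studies when the same definition is applied.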

In any case, unless the ToM and EF tasks used were too easy for a proportion of

the participants, these results suggest that neither a ToM nor EF deficit is the single

primary deficit in autism, but rather (as outlined in hypothesis 6 in the introduction),

that either i) different ToM and EF profiles are found in different subgroups of

individuals with autism, rather than both deficits co-occurring in all individuals; ii) ToM

and EF deficits underlie different aspects of symptomatology, and therefore may be

present in differing degrees of severity according to the individual’s position on the

multidimensional autism spectrum; or iii) an unidentified third deficit may be more

primary or at least equally primary. A fourth possibility is also conceivable, which is

that different developmental stages of autism are characterised by different cognitive

profiles. These four possibilities will be re-visited later in this section and discussed

further in Section 4.4.4.

23 The notion of “relative primacy” refers to the relative ability of each impairment to meet the criteria for a primary deficit. Although the term “primary” is usually used in the context of a single primary deficit, in a multiple primary deficits model it is also possible for one deficit to hold superior causal importance (e.g., explanatory value) over another, and therefore have superior relative primacy.

Uniqueness. Results on the uniqueness criterion more clearly discriminated

between the ToM and EF accounts, with verbal measures of inhibition and generativity

being the strongest predictors of autism group membership (deficits on these two

variables were also the most universal among the ASD group and had the largest effect

sizes of all the EF variables), while first-order false belief performance was only a

marginally significant predictor24. While these results do not allow any strong

inferences regarding the uniqueness of these deficits to autism as opposed to other

clinical groups, they do suggest that deficits in verbal inhibition and verbal generativity

are particularly central to ASDs. This is an interesting result given that mental

flexibility and planning deficits were previously thought to be the most significant in

autism. It also adds to the previous study by Ozonoff et al. (1991), which showed that

EF performance was the best predictor of autism group membership, but did not analyse

the key EF components involved.

Explanatory value. In terms of explanatory value, correlations between

cognitive and behavioural measures revealed that ToM variables did not correlate

significantly with any behavioural domain, whereas two EF measures showed

significant relationships with various aspects of autistic symptomatology. The lack of

explanatory value of the ToM tasks, particularly the non-significant relationship

between ToM and social/communicative functioning, was a somewhat surprising result,

although not without precedent (Prior et al., 1990; Sparrevohn & Howie, 1995; the lack

of relationship with repetitive behaviours is also consistent with Turner, 1997). If a

ToM deficit is the primary basis for social/communicative impairments in autism then

one would expect that those who performed poorly on the ToM tasks would have been

those who showed more severe social impairment. Yet this was not the case: although

the correlation was not significant, its direction actually suggested the opposite trend,

such that those with better performance on the false belief aggregate tended to show

more abnormal social/communicative functioning25. The reason for this is unclear, but

in any case it constitutes evidence against the notion that a ToM deficit underlies the

social/communicative impairments which characterise autism. Although an inability to

appreciate others’ mental states is an intuitively appealing explanation for abnormal

social behaviours, a one-to-one relationship between an emergent behaviour and

underlying cognitive deficit cannot be assumed; abnormal social behaviours are not

necessarily caused by an impairment in a social or ToM module (Bowler, 2001). The

existence of a significant correlation in the appropriate direction between

social/communicative functioning and an EF measure casts further doubt on the idea

that ToM deficits underlie the social/communicative symptoms of autism while EF

deficits underlie repetitive behaviours and restricted interests.

24 When the false belief alternative aggregate score was used, it was a non-significant predictor of group membership.

25 This unexpected trend still existed when the false belief variables were analysed separately and when the Social Behaviour Questionnaire and the Social and Communication domains of the ADI-R were analysed separately rather than together as one factor score. This suggests that it was not simply a spurious individual result, which was a possibility given that only one correlation between ToM and social/communicative functioning was conducted (in comparison with the wider range of EF tasks and measures of repetitive behaviour).

EF measures demonstrated better explanatory value than ToM variables, but

there were still only two EF variables showing significant correlations with behaviour: a

measure of verbal inhibition, which correlated with low-level repetitive behaviours

(when the RBI data were broken down further using Turner’s (1996, 1997) categories,

the verbal inhibition measure correlated with repetitive movements); and a non-verbal

generativity measure, which correlated with social/communicative functioning. This

latter result consolidates previous findings of a predictive relationship between EF

impairment and abnormal social behaviours in autism, although previous studies

identified different EF components, most commonly set-shifting, as holding explanatory

value (Berger et al., 2003; Gilotty et al., 2002; McEvoy et al., 1993). This may be

because the previous studies incorporated different measures of set-shifting to the one

used here and did not include any tests of non-verbal generativity. The significant

correlation between verbal inhibition and repetitive movements was consistent with

Turner’s (1997) findings, which showed that performance on a test of “recurrent

perseveration”, on which inhibitory control was required, also correlated significantly

with repetitive movements in her sample of children with autism. This relationship

between inhibitory control and repetitive movements makes intuitive sense; it is easy to

imagine how inhibitory impairment could lead to difficulties “stopping” a particular

movement sequence26.

26 Of course, the usual caveat about correlation and causation applies here. Arguments against the opposite direction of causation (i.e., that the behavioural symptoms may cause the EF deficits) are presented in Section 2.2.3 of Chapter 2.

However, Turner’s (1997) study demonstrated several other significant

correlations between EF and RBI measures which were not replicated in this study.

These included a significant relationship between set-shifting performance on the

modified IDED task and repetitive use of language and circumscribed interests; and

significant associations between performance on generativity measures (including the

verbal generativity tasks used in this study) and sameness behaviours and circumscribed

interests. The lack of any significant partial correlations between verbal generativity

variables and behavioural measures in this study was also surprising given the apparent

centrality of that domain in the analyses addressing the universality and uniqueness

criteria. These discrepancies with Turner’s study are somewhat difficult to explain.

While Turner did not partial out age or ability variables from her correlations (choosing

instead to divide her sample into low- and high-functioning subgroups), the results of

raw correlations in this study were not consistent with Turner’s findings either – the

Uses of Objects correct responses variable actually correlated significantly with

circumscribed interests in the opposite direction to that predicted, and there were no other

significant raw correlations consistent with Turner’s results.

This failure to replicate is reflective of a general paucity of significant

correlations between cognitive and behavioural measures in the ASD group in this

study27. Neither ToM nor EF variables could account for the full range and extent of

autistic symptomatology measured, or even one complete symptom domain. Some

behavioural categories, such as high-level repetitive behaviours and several sub-

categories of the RBI falling under that heading, did not correlate with any cognitive

task variables at all; and conversely, some cognitive variables on which deficits were

significant and relatively prevalent did not show any relationships with

symptomatology. What might explain this? One possibility is that the behavioural

measures used were not sufficiently accurate, sensitive or wide-ranging, but this would

not seem to be the most likely reason given the well-documented diagnostic validity of

the ADI and the wide range and depth of domains covered by the RBI. The fairly

heterogeneous nature of the sample (i.e., the inclusion of participants meeting ADI-R

criteria in only one or two domains) is not a plausible explanation for these results

either, as variation in the range of behaviours displayed is more likely to increase,

rather than decrease, the probability of finding correlations.

27 Pellicano, Maybery, Durkin, & Maley (submitted) also recently found a similar lack of significant correlations between ToM and EF measures and autistic symptomatology (as measured by the ADI-R) in a substantial sample of children with autism.

One potentially influential factor is the behavioural therapy received by most

children with ASDs. It may be the case that relationships between underlying cognitive

deficits and behavioural expressions have been distorted because therapeutic

intervention shapes the nature and severity of the behaviours which would otherwise

occur if no intervention took place (without affecting cognitive functioning as strongly).

Parental discipline would have a similar effect, particularly in the case of repetitive

behaviours; indeed, during the administration of the RBI when questions were asked

regarding how long their child usually indulged in particular repetitive behaviours,

parents would often answer “Until I tell him/her to stop”.

This highlights the importance of considering the interaction between

environmental and genetically based cognitive influences on behavioural expression.

Just as there is no one-to-one mapping between genes and cognition (Karmiloff-Smith,

Scerif, & Thomas, 2002), it is also unlikely that direct or simple relationships exist

between cognition and behaviour. It is probable that the relationship between cognitive

functions and behavioural outcomes is dynamic and changes continuously throughout

development. Hence, correlations between current cognitive status and current

behaviours may not reveal the cascade of processes which has shaped the nature and

severity of those behaviours, and they are likely to be weak and unreliable, resulting in

failures to replicate such as that which occurred with this and Turner’s (1997) study.

Correlations between cognitive and behavioural factors may also be weakened by the

use of parental report as the method of behavioural measurement, rather than direct

observation (this is discussed further in Chapter 7). Explanatory value would probably

be best measured using longitudinal designs, examining correlative and predictive

relationships between early cognitive deficits and both early and later behaviours, using

both observational and parental report methods of behavioural measurement.

Notwithstanding these concerns, the findings on explanatory value are consistent

with the results on the universality criterion in indicating the unlikelihood of a single

primary deficit model (or a model in which both deficits meet all criteria for primacy),

and could suggest that either i) different subgroups within the autism spectrum are

characterised by different cognitive and behavioural profiles, with this variability

obscuring and diluting clear relationships in the overall sample; or ii) a third cognitive

domain which was not measured is at least equally primary and can account for the

behaviours which did not correlate with any of the cognitive variables included in this

study. The “multidimensional spectrum” possibility, as described in hypothesis 6 (ii),

was not supported by these results, as this model (or at least the version described)

would predict strong correlations between particular cognitive deficits and the

behavioural domains they were purported to underlie, regardless of the variability of the

sample.

Concluding comments on the primacy of ToM and EF deficits. So far, then, it

appears that while EF deficits do not adequately or consistently meet all the criteria for

primacy in ASDs, they fare slightly better than the ToM hypothesis. In evaluating the

relative primacy of ToM and EF deficits, however, the comparative level of difficulty of

the ToM and EF tasks is perhaps one of the most important issues to be addressed, as it

is possible that these results simply reflect the fact that the ToM tasks were easier than

the EF tasks. The older, high-functioning nature of the overall sample was necessary to

achieve the aim of specifically assessing the full range of EF components; however, the

consequence of this was that a larger than usual percentage of both ASD and control

participants displayed perfect performance on both first- and second-order false belief

tasks, thereby reducing the universality of ToM deficits. Although performance was not

quite at ceiling in the control group, the high level of performance overall may have

reduced the potential size of the group difference (this was also pointed out by Perner

and Lang (2000) in reference to the Ozonoff et al. (1991) study). This would have the

consequence of decreasing the ability of the ToM measures to discriminate the ASD

group from the control group, therefore affecting their performance on the uniqueness

criterion. The lack of significant correlations between ToM tasks and behavioural

measures could also have been a by-product of task difficulty, because it may have been

the case that ToM task passers still showed social and/or other behavioural impairments;

in other words, the behavioural measures may have been more sensitive than the ToM

measures, thereby reducing the strength of the relationship.

What may be said in defence of the validity of results derived from the ToM

measures in this study? Firstly, the inclusion of the second-order false belief task,

which has previously been failed by 10-18 year-old individuals with autism

(Baron-Cohen, 1989b), extended the range of difficulty in the ToM task domain. Secondly, it is

noteworthy that ToM and EF deficits were of roughly equal prevalence (using the ToM

scoring criterion which more clearly indicates low scorers), with a tendency for the

ToM deficit to be slightly more prevalent than most EF deficits in the current sample.

Given this, its lack of uniqueness and explanatory value cannot be readily discounted as

an artefact of the unequal difficulty of the tasks. Similarly, the significant proportion of

individuals who showed impaired performance on ToM tasks but unimpaired

performance on EF tasks (discussed below) indicates that for some individuals, the false

belief tasks were more difficult than the EF tasks (at least when evaluated with

reference to control group performance). Thirdly, performance on all of the false belief

tasks was far from the ceiling in the ASD group, allowing enough variability in the

sample for correlations with behavioural measures to emerge. The fact that false belief

performance showed medium level correlations with VIQ confirms that it was not at

ceiling and also suggests that it was assessed with some reliability. Finally, these

findings are consistent with previous research, with non-universality typical of all ToM

studies in autism (see Section 2.1.3 in Chapter 2), the discrepancy between the

uniqueness or discriminability of ToM and EF consistent with Ozonoff et al.’s (1991)

results, and the lack of explanatory value of ToM replicating studies by Turner (1997)

on repetitive behaviours and by Prior et al. (1990) and Sparrevohn and Howie (1995) on

social behaviours. For these reasons, the lack of universality, uniqueness and

explanatory value of the ToM deficit in this sample cannot be convincingly rejected as

an uninteresting consequence of the level of difficulty of the false belief tasks.

One additional alternative interpretation of the results indicating superior

primacy of EF deficits on the uniqueness criterion also requires acknowledgement.

When commenting on Ozonoff et al.’s (1991) results, Perner (1998; Perner & Lang,

2000) argued that the finding that an EF deficit discriminates better between ASD and

control groups than a ToM deficit does not necessarily indicate that the EF deficit is

more primary. He argued that a partial impairment in ToM (which he equates with

metarepresentational capacity) should actually result in a more severe impairment in EF,

as the SAS (Supervisory Attentional System) depends on metarepresentational capacity

and so any metarepresentational impairment will be magnified during EF task

performance. However, three findings are inconsistent with this explanation: i) the

roughly equal effect sizes of ToM and EF deficits in the ASD group, which suggest that

the deficits are equally severe; ii) the lack of explanatory value of the ToM tasks

(relative severity of impairment is irrelevant to that index of primacy, and if ToM was

primary it should show relationships with symptoms of autism); and iii) the presence of

a significant proportion of cases showing impaired ToM but intact EF (discussed

below), which this account does not allow for. Therefore, the evidence suggesting

better discriminative ability of EF deficits in this ASD sample appears to be a valid

indicator of superior primacy.

4.4.3 Independence of ToM and EF deficits

Given that EF deficits appear to be more primary than a ToM deficit in ASDs, is it

possible, then, that they can explain or subsume the ToM deficit which also

characterises these individuals? The relative absence of significant correlations and the

frequency of dissociations between ToM and EF performances in the ASD group

suggest that this is not in fact the case, and instead indicate fairly persuasively that the

two deficits are largely independent in individuals with ASDs. Importantly, the fact that

the dissociations occurred in both directions demonstrated that mastery of one domain

did not appear to be a prerequisite for the other. Instead, the dissociations suggest that the

two deficits co-occur in ASDs probably because of their proximal neuroanatomical

substrates. These results stand in contrast to the handful of studies which have found

significant correlations between ToM and EF in autism (Colvert et al., 2002; Ozonoff et

al., 1991; Zelazo et al., 2002), but consolidate previous reports of ToM-EF dissociations

in autistic individuals (Baron-Cohen & Robertson, 1995; Baron-Cohen et al., 1999b;

Ozonoff et al., 1991). As described in Section 2.3.2 of Chapter 2, the three studies

which have found an association between ToM and EF in autism have either failed to

partial out the effects of age and/or IQ or used only one type of EF task28. The

importance of partialling out the effects of age and ability was verified in this study, as

almost all of the several significant raw correlations in the ASD group between false

belief tasks and measures of planning and verbal generativity became non-significant

when these factors were accounted for. The current results are therefore likely to

represent a more reliable indication of the nature of the specific relationship between

ToM and EF in ASDs.

28 In the one study (Colvert et al., 2003) which did partial out age and ability variables, only one EF task was included, the DCCS task. As this task is multifactorial and may be failed for a number of different reasons (see Perner & Lang, 2002), no equivalent task was included in the current battery. It is interesting to note that when ToM-EF correlations have been found in autism, the EF measures have been impure and/or consisted of composite scores.

A number of alternative interpretations of these results are conceivable,

however. In their meta-analysis of the studies on the ToM-EF relationship conducted

up to that point, Perner and Lang (1999) found significant non-homogeneity among the

size of the correlations and proposed that the length of the testing session may be an

important confounding factor. They found a significant negative correlation between

the estimated testing duration per session and the size of the ToM-EF correlation

reported, leading them to suggest that longer testing sessions could result in fatigue

which would affect performance and decrease the strength of the correlation. It is

possible, then, that the relatively long testing sessions in this study (approximately 2.5

hours in total, including all tests in the WAFSASD battery; this was usually divided into

two sessions) may have influenced the strength of the correlations. Similarly, the fact

that the order of test administration was the same for all participants could potentially

have introduced extra fatigue-related variance to performance on the tasks completed

towards the end of each session. The questionable reliability of both ToM and EF tasks

(as discussed in Chapter 2) could also leave the correlations vulnerable to extraneous

variance. However, while these factors may have introduced a degree of extra variance

to the data, it is unlikely that they could have differentially affected the ASD and control

groups in such a way as to fully account for the striking difference in the number of

significant correlations observed in the two groups. Also, it is not the case that the tasks

at the end of the battery showed the weakest correlations, in either the ASD or control

group (e.g., the Opposite Worlds test, which was the last test to be administered,

showed strong correlations with repetitive movements in the ASD group; and the

generativity tasks, which were also administered towards the end of the battery,

demonstrated significant correlations with false belief variables in the control group).

In Section 2.3.1 of Chapter 2, it was argued that the close relationship between

ToM and EF which has been consistently demonstrated in typically developing 3 – 5

year-olds may not necessarily hold for older age groups. It is therefore also possible

that the lack of significant correlations in the ASD group may be a consequence of their

age, and that a relationship would be observed in a younger sample. However, the

presence of a range of significant partial correlations between ToM and EF measures in

the age-matched control group in this study suggests that age was not the most

important factor causing the outcome in the ASD group. Nevertheless, the pattern of

correlations demonstrated in the control group revealed a number of differences to those

typically observed in younger children, suggesting that the nature of the ToM-EF

relationship may change throughout development. First, the significant correlations

occurred mostly (although not always) with the second-order false belief task, which

could reflect both the larger proportion of non-perfect scorers on this task as well as the

higher EF demands of the task (consistent with Tager-Flusberg & Sullivan, 1994b).

Second and more importantly, some EF components such as inhibition and working

memory did not show the usual relationship with false belief performance (the RIL task

Load error difference score, which reflects performance on a task combining inhibitory

and working memory requirements, did correlate with the second-order false belief task,

but other indices of that task such as the shape error score did not correlate with false

belief scores); whereas other variables which have not commonly been associated with

ToM performance, such as planning and generativity, did show significant correlations

with variables from both the first- and second-order tasks.

In order to further explore changes in the ToM-EF relationship with age, the

control group was divided into two age subgroups (5-8 year-olds and 9-18 year-olds)

and raw and relevant partial correlations were conducted separately for the two

subgroups (these are presented in Appendix C). Although the sample size for some of

the correlations was quite small in the younger age subgroup because some tasks were

administered only to participants over the age of 6, the results showed clearly that there

were a larger number of significant ToM-EF correlations in the younger subgroup, and

that the pattern of correlations was different to that observed in the older subgroup29.

While the larger number of significant correlations in the younger subgroup may simply

be due to the increased failure rate on false belief tasks in that subgroup (although note

that several controls over the age of 8 did not demonstrate perfect performance), the fact

that correlations with different EF variables were revealed in the older subgroup

suggests that there are also qualitative differences in the ToM-EF relationship at

different developmental stages. These findings are consistent with the few studies

which have been conducted previously on the ToM-EF relationship in children over the

age of 5, which have also demonstrated a smaller number and different pattern of

correlations compared to studies of younger children (Charman et al., 2002; Perner et

al., 2002a). The mechanisms underlying these developmental changes remain open to

speculation. One possibility is that a functional dependence between ToM and certain

aspects of EF such as inhibition exists as both of these abilities are developing (as

outlined in the emergence accounts), but once a certain level of development takes place

the ToM-EF relationship revolves more around performance-based factors (as proposed

by the expression accounts). However, the dissociability of impairments in the ASD

group is inconsistent with both emergence accounts (this is discussed further later).

Another possibility is that performance-based (or expression) factors explain the

relationship at all ages, but the EF components which influence ToM performance

change with age, as it may be that different skills are required for children of different

ages to solve ToM tasks (e.g., inhibitory control may be more important at a young age

as one’s own perspectives may be more salient)30. Although the typical development of

ToM and EF is not the major focus of this thesis, these ideas and results certainly merit

further exploration in future studies using a wider range of ToM tasks and including

participants with a wider range of ages.

29 Separate correlations for the same two age subgroups were also conducted within the ASD group. Both subgroups showed no significant ToM-EF correlations after age and IQ variables were partialled out, with the exception of the correlation between the Stamps task restriction score and the simple false belief task, which was only significant in the older subgroup. This suggests that ToM and EF deficits were independent in all age ranges included in this sample of ASD participants.

30 The “common conceptual bases” account tested in this study, involving relational complexity, did not receive support in the current results – individuals with ASDs showed a ToM impairment but no relational reasoning deficit, and there were no significant correlations between the Relational Complexity task and false belief variables in either group. Nonetheless, this type of account would be unlikely to explain developmental changes in the ToM-EF relationship as the common conceptual basis occurs because of common task structures, regardless of age.

Returning to the ToM-EF relationship in ASDs, given that the lack of

correlations between ToM and EF found in the ASD group as compared with the control

group cannot be easily dismissed as a result of the length of the testing session or age of

the sample (as the two groups were matched on these factors), the question remains as

to why ToM and EF should be correlated in typically developing children but largely

uncorrelated in children with ASDs, where deficits in both co-exist. Something akin to

the ToMM-SP model proposed by Leslie and colleagues (Leslie & Thaiss, 1992; Leslie

& Roth, 1993) could potentially account for this pattern of results. In this model,

typically developing children may fail false belief tasks because of their processing

requirements (based on the ToM-EF correlations in the control group in this study, the

SP would include planning and generativity as well as inhibitory/working memory

processes), but children with autism fail because they lack a ToMM. Consistent with

the results from this study and in accordance with their predictions, Roth and Leslie

(1998) found that performances on a false belief task and a non-mentalistic control task

with similar processing requirements were significantly correlated in typically

developing children, but not autistic children. Similarly, the lack of correlations

between ToM and EF in the ASD group in this study could reflect the notion that EF

factors did not add any extra variance to their ToM performance – the false belief tasks

were failed because of ToM-specific factors and not because of poor EF. This

interpretation of the current results is consistent with the notion that ToM may be a

domain-specific capacity, although the ToMM-SP model in its original form cannot

adequately account for other results obtained in this study as it holds that a ToM deficit

is primary to autism. This interpretation would also suggest that the correlations

observed in the control group were caused by some individuals performing poorly on

the false belief tasks because of weaknesses in aspects of EF and not because of a ToM

impairment. This explanation would therefore favour an expression account of the

ToM-EF relationship in typical development.

However, while this explanation can account for the lack of ToM-EF

correlations in the ASD participants who failed ToM tasks, it does not explain the lack

of a relationship in ASD participants who showed EF impairments but intact ToM. The

expression account of the ToM-EF relationship would predict that those ASD

participants showing impairments in the EF components which were correlated with

ToM performance in the control group (e.g., planning, generativity) should sometimes

fail ToM tasks because of poor EF, thereby resulting in correlations between ToM and

EF performance. These correlations would only occur in the half of the ASD sample

showing impaired EF and may therefore have been too weak to emerge as significant.

Another possibility, though, is that those ASD participants who scored well on false

belief tasks did so via non-conventional routes to success, such as by using the

compensatory strategies described in Section 2.1.3 of Chapter 2. If so, then those EF

capacities which are normally required for successful ToM performance may not have

been needed, as the task-solving strategy may have been previously learned or taught

and therefore not dependent on online problem-solving skills. This speculation requires

empirical confirmation, however. One method of testing it would be to examine

correlations between EF measures and higher-level, more advanced and ecologically

valid measures of ToM, for which compensatory strategies may be less likely to have

developed.

It is not the case, however, that there was a complete absence of correlations

between ToM and EF in the ASD group. One non-verbal generativity variable (the

Stamps task restriction score) showed a significant correlation with simple false belief

task performance such that poorer generativity (higher restriction) was associated with

unstable performance on the simple false belief task. While generativity has not

previously been an EF component of particular focus in the literature on the ToM-EF

relationship, its potential role in the false belief performance of children with autism has

been highlighted previously by Peterson and Bowler (2000). Peterson and Riggs (1999)

had earlier argued that tests of false belief and subtractive reasoning (assessed by asking

a question such as “If the marble had not been moved, where would it be now?”) require

similar counterfactual reasoning capabilities, in that they both involve processing a

negative counterfactual question of the form “If not-F, then Q” (where F is a known fact

and Q is a question). However, Peterson and Bowler’s (2000) results, which showed

that subtractive reasoning ability appeared to be necessary but not sufficient for accurate

false belief performance in children with autism, led them to suggest that the false belief

task required a crucial additional factor: that of generativity. They argued that in

subtractive reasoning tasks, children are given the supposition “not-F” as part of the

problem, but in false belief tasks it must be generated. A generativity impairment could

therefore explain why children with autism found false belief tasks more difficult than

subtractive reasoning tasks in their study. They proposed that both subtractive

reasoning and generativity were additional requirements for successful false belief

performance, beyond basic mentalistic understanding. Although no subtractive

reasoning tasks were included in this study, this kind of model fits quite well with the

current data, which suggested that ToM performance was largely independent of EF-

related factors in the ASD group, but that generativity played some role in false belief

performance. It is not clear, however, why the restriction score did not show significant

correlations with performance on the more difficult first- and second-order false belief

tasks (although these correlations were in the predicted direction). Nevertheless, this

result requires a slight modification to the two ToMM-SP-like and compensatory

strategy accounts proposed above, indicating that it may be the case that some

individuals with ASDs showed unstable performance on simple false belief tasks

because of a generativity impairment.

4.4.4 Towards a “multiple primary deficits” model of ToM and EF in ASDs

The next challenge is to attempt to unite this rather complex set of results on the profile,

primacy, and independence of ToM and EF deficits into a coherent theoretical

framework. In the introduction to this chapter, six hypotheses regarding the primacy

and independence of ToM and EF in ASDs and their implications for theories of autism

and models of the ToM-EF relationship were considered. The first hypothesis was that

there is only a single, primary deficit in ASDs, with no secondary impairments. As both

ToM and EF deficits were present in this sample of individuals with ASDs, this

hypothesis was not supported. Hypotheses 2 and 3 represented different scenarios in

which ToM and EF deficits were related in ASDs, either because one caused the other

or because both were caused by a third, more primary deficit. Neither of these

hypotheses was supported in this study: the evidence suggests that ToM and EF

deficits are largely independent in ASDs, with their co-occurrence most likely explained

by the neuroanatomical proximity of their substrates. Hypotheses 4, 5, and 6 all proposed that ToM and

EF deficits were independent, but differed in terms of the primacy of those deficits.

Notwithstanding the other explanations for the results considered in previous sections of

this discussion, the non-universality and incomplete explanatory value of both ToM and

EF deficits in this study indicate that neither ToM nor EF deficits meet all the criteria

for primacy. This rules out hypothesis 4 (that either ToM or EF is the single primary

cognitive impairment of ASDs) and also excludes hypothesis 5 (that both ToM and EF

deficits are primary).

This leaves hypothesis 6: that ToM and EF impairments are independent in

ASDs, and neither meets all criteria for primacy. Of the six hypotheses, this found the

most support in the results from this study. Three different versions of this “multiple

deficits” model were presented in the introduction: i) different ToM and EF profiles are

found in different subgroups of individuals with autism, rather than both deficits co-

occurring in all individuals (in this model, explanatory value across all ASD individuals

may be low because the presence of different subgroups may obscure relationships in

the overall sample); ii) ToM and EF deficits underlie different aspects of

symptomatology, and therefore may be present in differing degrees of severity

according to the individual’s position on a multidimensional autism spectrum; or iii)

there may be an unidentified third deficit which is at least equally primary (and may

underlie symptoms which were unrelated to ToM and EF). A fourth version was also

proposed in Section 4.4.2 of this discussion: that iv) different stages of development of

individuals with ASDs may be associated with different primary cognitive deficits.

Each of these four possibilities will now be considered in turn.

i) Subgroups. Subgroups of individuals with ASDs can be classified or defined

in several different ways, such as by severity of symptoms, the domains in which

symptoms are present, or level of intellectual impairment (e.g., Beglinger & Smith,

2001; Prior et al., 1998). The only subgroups which were explicitly examined in this

study were the “full criteria” and “partial criteria” subgroups, defined according to the

number of domains in which a higher-than-threshold level of symptomatology was

present (as assessed by the ADI-R). Although these two subgroups showed different

patterns of performance on the Pattern Meanings task, which suggested that their verbal

generativity difficulties may be expressed in slightly different ways, there were no other

differences between the two subgroups on any other EF or ToM measures. This

indicates that the number of domains in which symptoms are present does not relate

systematically to the profile of ToM and EF deficits displayed. However, this

subgrouping method did not distinguish which symptom domains were present within

the “partial criteria” subgroup, which may have obscured more fine-grained differences.

While other subgroup divisions were not specifically analysed, the lack of

significant or strong correlations between ToM and EF variables and any measures of

symptom severity also suggests that subgroups based on overall symptom severity

(rather than presence of symptoms in particular domains) are also unlikely to be

associated with consistent profiles of ToM and EF deficits. It is a stronger possibility

that subgroups based on level of functioning (as measured by IQ) may be characterised

by different ToM and EF profiles, as several group differences on both ToM and EF

measures were mediated by VIQ. Previous research has also suggested that level of

functioning (which has been measured by adaptive skills as well as IQ) has shown the

most promise in discriminating subgroups and predicting outcome (see Beglinger &

Smith, 2001; Fein et al., 1999; Stevens et al., 2000). When comparisons were

conducted between “low VIQ” and “high VIQ” subgroups within the ASD group31, it

was found that the low VIQ subgroup performed significantly more poorly on ToM

measures (including both aggregate scores), and on one EF measure (the ToL).

However, there were no other EF task differences such that the high VIQ subgroup

performed more poorly than the low VIQ subgroup, which suggests that this subgroup

division did not map directly onto “ToM-impaired, EF-intact” and “EF-impaired, ToM-

intact” subgroups, instead indicating that the low VIQ group was more impaired in ToM

and equally impaired in most domains of EF relative to the high VIQ group.

However, this assumes that there are only two subgroups based on ToM-EF

performance and VIQ. This is unlikely, as there is at least a third subgroup showing

both ToM and EF deficits (as the incidence of ToM-EF dissociations was not 100%).

Furthermore, there may be more subgroup divisions which vary according to the

specific EF profile displayed. A better way of determining how many subgroups based

on ToM and EF performance there are and how they relate to other measures such as IQ

or symptomatology would be to employ cluster analysis, where the characteristics of

ToM-EF clusters could be examined to determine how the subgroups should be defined

behaviourally. This would require a large sample which varied considerably on IQ and

symptomatology. Conclusions about the validity of the subgroup notion therefore await

further investigations, although subgroups based on symptom domains or symptom

severity were not strongly supported by the current data.
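As a rough illustration of the cluster-analytic approach suggested above, the following Python sketch groups participants by their ToM and EF scores. The choice of k-means, the number of clusters, the starting centroids, and all z-scores are assumptions made purely for illustration; the text above does not specify a clustering method, and no such analysis was conducted in this study.

```python
import math


def kmeans(points, start_centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) from caller-supplied starting centroids.

    Returns (labels, centroids), where labels[i] is the index of the
    centroid nearest to points[i] once assignments stop changing.
    """
    centroids = [tuple(c) for c in start_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical (ToM z-score, EF z-score) pairs; lower values mean poorer performance.
scores = [
    (-2.1, 0.2), (-1.8, -0.1), (-2.4, 0.4),     # low ToM, average EF
    (0.1, -1.9), (0.3, -2.2), (-0.2, -1.7),     # average ToM, low EF
    (-2.0, -2.1), (-1.7, -1.8), (-2.3, -2.0),   # low on both
]
labels, centroids = kmeans(scores, [scores[0], scores[3], scores[6]])
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

In a real analysis the resulting clusters would then be compared on external measures such as IQ or symptomatology, and the number of clusters would need to be justified empirically rather than fixed in advance as it is here.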

ii) An autism spectrum with multiple dimensions. Rather than proposing several

discrete subgroups, this model conceives of autistic symptomatology as varying on a

more continuous spectrum. In a single primary deficit model, this spectrum could be

unidimensional, but in a multiple deficits model, there would need to be more than one

dimension, with ToM and EF deficits each underlying a different dimension. In the

version of this model presented in the introduction, these dimensions corresponded to

symptom domains, such that, for example, a ToM deficit was the basis for social

impairment and EF deficits accounted for repetitive behaviours. Thus, each ASD

individual’s profile of ToM and EF deficits would determine the nature and severity of

their symptomatology (so, a more severe ToM deficit would be associated with more

severe social impairment). In this model, the apparent presence of subgroups of “ToM-

impaired, EF-intact” and “EF-impaired, ToM-intact” individuals would be an artefact of

the arbitrary cutoff for “impairment” within a continuous distribution of scores. There

were no bimodal distributions of continuous variables in this study (although ToM

performance was highly skewed), suggesting the dimensional variation notion is

appropriate at least for EF performances. However, the particular version of the model

alluded to above was not supported in this study, with ToM performance showing no

significant correlations with behavioural measures and various EF deficits correlating

significantly with both social/communicative functioning and repetitive behaviours. As

discussed earlier, environmental and developmental factors may have contributed to

these results; however, as it stands, the data are not consistent with this model.

31 A PIQ-based division did not reveal any significant subgroup differences in ToM or EF performance.

Alternative versions of this model are nevertheless possible. For example, rather

than the dimensions corresponding to symptom domains, there may be one dimension

for number and severity of symptoms and one for level of functioning (as proposed by

Szatmari et al., 2002). Perhaps EF deficits could then be associated with the former

dimension (as they showed greater explanatory value) and a ToM deficit could be

associated with level of functioning (as ToM performance covaried more strongly with

VIQ). The weak and incomplete explanatory value of EF deficits is inconsistent with

this possibility, but the notion of a multidimensional spectrum appears more suited to

the distributions of scores on cognitive tasks (particularly EF tasks) and warrants further

investigation.
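The point made above about the arbitrary cutoff for “impairment” can be sketched in Python. The example below classifies the same continuous scores under two different cutoffs and counts the resulting profiles. The function names, the z-scores, and both cutoff values are hypothetical and are not the impairment criteria used in this study.

```python
from collections import Counter


def profile(tom_z, ef_z, cutoff):
    """Label one participant's ToM/EF profile for a given impairment cutoff."""
    tom_impaired = tom_z <= cutoff
    ef_impaired = ef_z <= cutoff
    if tom_impaired and ef_impaired:
        return "both impaired"
    if tom_impaired:
        return "ToM-impaired, EF-intact"
    if ef_impaired:
        return "EF-impaired, ToM-intact"
    return "neither impaired"


def count_profiles(scores, cutoff):
    """Count how many participants fall into each profile at this cutoff."""
    return Counter(profile(tom_z, ef_z, cutoff) for tom_z, ef_z in scores)


# Hypothetical (ToM z-score, EF z-score) pairs from one continuous distribution.
scores = [(-1.2, -0.4), (-1.8, -1.1), (-0.6, -1.7),
          (-1.4, -1.6), (-0.3, -0.2), (-1.1, -1.3)]

for cutoff in (-1.0, -1.5):
    print(cutoff, dict(count_profiles(scores, cutoff)))
```

With these made-up scores, a cutoff of -1.0 yields three participants with both deficits and one of each dissociated profile, whereas a cutoff of -1.5 yields no participants with both deficits, one “ToM-impaired, EF-intact” and two “EF-impaired, ToM-intact”. The apparent subgroups therefore shift with the threshold even though the underlying scores are unchanged.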

iii) A third deficit. As neither ToM nor EF deficits were able to account for the

full range of symptoms displayed by this sample of individuals with ASDs, it is possible

that there is a third (or more) cognitive deficit(s) which could explain those symptoms.

As stated in the introduction, this possibility is compatible with both of the accounts just

described, rather than competing with them. The relative primacy and the relationship

of this third deficit to ToM and EF would be open for investigation. This study does not

allow any inferences about what this deficit might be, but based on current research the

most obvious candidate would be weak central coherence, which has been found to

characterise individuals with autism in a number of studies (Happé, 1994b, 1996, 1997;

Jolliffe & Baron-Cohen, 2000; Shah & Frith, 1983, 1993). Happé (2000) has argued

that weak central coherence is independent from ToM and can explain aspects of autism

which ToM cannot, although another study found that ToM and central coherence were

related (Jarrold, Butler, Cottington, & Jimenez, 2000). The universality, uniqueness,

causal precedence, and in particular the explanatory value of weak central coherence

and its relationship with ToM and EF will be interesting topics for further research.

iv) Different developmental stages. This fourth variant of a multiple primary

deficits model holds that the primacy of ToM and EF deficits in autism may not remain

consistent throughout different stages of development. This would mean, of course, that

the criterion of “causal precedence” would not necessarily be an appropriate index of

primacy. Previous research has generally not found strong EF deficits in young children

with autism, while ToM deficits (or at least impairments in the proposed precursors to

ToM) and social abnormalities have been more consistently documented (see Sections

2.1.3 and 2.2.3 in Chapter 2). It could be hypothesised that an impairment in ToM

holds more explanatory value in the early stages of autism, but that deficits in EF

somehow become more primary with age. Furthermore, deficits in the various

components of EF could also change in primacy as they develop; for example,

inhibition impairments could be more important early on (as inhibitory control is

typically one of the first EF components to develop), with planning and generativity

impairments (which typically reach their capacity during adolescence) becoming more

central to autism later in development32. It may be the case that the age at which a

particular capacity usually develops is the age at which its abnormal development has

the most impact on behaviour. The relatively old mean age of the sample could

therefore explain why the correlations between ToM and behavioural measures were not

significant, and the variability in the age of the sample could account for the relatively

small number and size of significant correlations with EF components, as well as the

non-universality of both ToM and EF deficits. This is obviously speculative and relies

on the findings of previous studies given that the early stages of the development of

autism were not studied in this research. This hypothesis would be best assessed using

longitudinal studies of the development of ToM and EF and their relationship with

behaviour throughout development. Notably, its plausibility is supported by previous

findings that in children with Williams syndrome (which is also a genetically based

developmental disorder), a change in the nature of cognitive deficits is observed at

different developmental stages (Paterson, Brown, Gsodl, Johnson, & Karmiloff-Smith,

1999).

32 When the sample was divided into younger (5-8 year-old) and older (9-18 year-old) subgroups, the results of group comparisons were consistent with this hypothesis: group differences in verbal inhibition were significant for the younger subgroup but not the older subgroup, and planning and generativity impairments were significant for the older subgroup but only marginally significant for the younger subgroup (these analyses are presented in Appendix D).

If this developmental account is accepted, is it then possible that ToM and EF

are related processes in younger children with autism (i.e., below the age of five years)?

In Section 2.3.1 of Chapter 2, it was argued that the existence of dissociable

impairments in two abilities at a certain age cannot be used to infer the independence of

the two abilities throughout earlier development. As suggested earlier in regard to

typically developing children, it may be the case that aspects of EF depend on ToM for

their development, as argued by Perner (or vice versa as suggested by Russell, although

this is less likely because of the lack of EF deficits found in young children with autism

as well as EF’s later developmental trajectory), but that the two domains become

independent after the crucial stage of development has passed. However, if this was the

case, it would be unlikely at later ages for deficits in ToM to exist without deficits in EF

(this occurred in a significant proportion of this sample for all EF components). While

double dissociations could occur if impairment in one domain was acquired after the

initial stage of development of the other domain, “ToM-impaired, EF-intact”

dissociations could not occur if ToM was impaired from an early age, as this would

result in abnormal development of EF (and vice versa if early EF impairments caused a

ToM deficit). This suggests that EF deficits in ASDs are not a consequence of an early

ToM deficit. Moreover, the presence of double dissociations in this sample provides

evidence against both emergence accounts of the ToM-EF relationship (as well as the

“common conceptual basis” accounts). The only situation in which an emergence

account may be plausible would be if EF deficits caused ToM to develop abnormally,

but this ToM deficit was not apparent at later ages because the use of compensatory

strategies “masked” the impairment. Nevertheless, it appears that ToM and EF deficits

in ASDs are best explained as occurring independently, most likely linked by their

neurobiological substrates, but possibly varying in primacy according to the age at

which they usually have the most impact on behaviour.

In sum, then, results from this study suggest that deficits in ToM and certain aspects of

EF characterise individuals with ASDs; neither of these deficits meets criteria for a

single primary deficit, but EF deficits (in particular, deficits in verbal inhibition and

generativity) are relatively more primary; and the deficits appear to be independent.

This pattern of results suggests that a “multiple primary cognitive deficits” account best

explains ASDs, but it remains to be seen which version of this model is most

appropriate (or, perhaps, which combination of these models – this is discussed further

in the General Discussion in Chapter 7). Distinguishing between these models relies

fairly heavily on determining the relationship of each cognitive deficit with behavioural

symptom domains; however, the difficulties with measuring the explanatory value of

cognitive deficits in a precise and thorough manner (due to the indirectness and

complexity of cognitive-behavioural relationships, as discussed in Section 4.4.2)

prevented strong conclusions from being made on this basis. Similarly, while the non-

universality of both ToM and EF deficits indicated that neither of them was singularly

primary, the inferior primacy of ToM based on its lack of explanatory value remains

debatable (although its non-significant ability to discriminate ASD from control

individuals supported this inference). Another method of confirming the primacy of

ToM and EF deficits and testing various multiple deficits models is to examine the

prevalence of these deficits and their independent occurrence or co-occurrence in

relatives of individuals with ASDs – thereby determining their potential as independent

subclinical markers of the ASD genotype. That is the focus of Chapters 5 and 6.

CHAPTER 5

Literature Review: The Broad Autism Phenotype

5.1 Autism as a genetic disorder

5.2 The broad phenotype

5.2.1 The behavioural phenotype

5.2.2 The cognitive phenotype

5.2.2.1 General intellectual ability

5.2.2.2 Specific cognitive deficits

As mentioned at the end of Chapter 4, examining ToM and EF deficits as potential subclinical

markers of the ASD genotype is another method of determining their primacy to ASDs.

If a cognitive deficit is primary to autism, then its prevalence in individuals who carry

the autism genotype (or at least the genotype for the relevant autistic trait), including

those with a milder or lesser variant who do not meet criteria for an ASD diagnosis,

should be higher than in the normal population (Bailey et al., 1996). An elevated

incidence of a particular cognitive weakness in first-degree relatives of individuals with

autism therefore provides evidence of the centrality of that deficit to autism. The

independent incidence of those weaknesses in certain subgroups of families, and their

relationship with behavioural traits, would also have implications for the various

multiple deficits models presented in the previous chapter (this is discussed further in

the introduction to Study Two in Chapter 6).

This chapter contains a literature review of the genetics of autism and the broad

autism phenotype, as a background for the rationale and methodology developed in

Study Two. As the arguments outlined above depend upon the assumption that autism

is a genetic disorder, the first section of the review presents evidence for that

assumption. The second section of the review describes research on the behavioural and

cognitive characteristics of the broad autism phenotype, including previous studies of

ToM and EF deficits in relatives of individuals with ASDs. The review also highlights

why it is important to study ToM and EF in the broad phenotype and what needs to be

addressed in further studies.

5.1 Autism as a genetic disorder

Because autism did not appear to run in families (i.e., it was rare for children with

autism to have parents with autism), for several years a genetic basis to the disorder was

rejected in favour of environmental causes such as cold, detached child-rearing practices

or “refrigerator” parenting (Bettelheim, 1967; Eisenberg & Kanner, 1956). Early

reports by Kanner and Asperger themselves of social and communicative difficulties

and obsessional characteristics in parents of children with autism and Asperger

syndrome were commonly interpreted as causing abnormal development in their

children rather than reflecting a genetically based milder phenotype. However, these

notions came under increasing doubt as it was realised that it would be rare for autistic

individuals to develop close relationships and therefore have children, and as studies

failed to find evidence of abnormal parenting styles (see Cantwell, Baker, & Rutter,

1979). Recognition of associations with mental retardation (Lockyer & Rutter, 1969)

and epilepsy (Rutter, 1970) provided further evidence of a biological basis. A key study

by Folstein and Rutter (1977) helped establish autism as a genetic disorder, finding a

significant difference in the concordance rates for autism in monozygotic (MZ; 36%)

versus dizygotic (DZ; 0%) same-sex twins. Furthermore, they found that the majority

of MZ twins who did not have autism showed some type of cognitive deficit, usually

involving language. These findings have since been replicated in two large-scale

studies (Bailey, Le Couteur, Gottesman, & Bolton, 1995; Steffenburg et al., 1989),

although Bailey et al. (1995) found higher MZ concordance rates of 60% for autism and

92% for a broader spectrum of cognitive or social abnormalities (the DZ concordance

for this broad spectrum was also higher at 10%). Based on their results, Bailey et al.

(1995) estimated that the heritability for autism is greater than 90%.

An elevated recurrence risk for autism has also been observed in siblings,

ranging from 2% (Boutin et al., 1997; Minton, Campbell, Green, Jennings, & Samit,

1982) to 6% (Baird & August, 1985) and averaging at around 2.2% across studies

(Szatmari, Jones, Zwaigenbaum, & MacLean, 1998), compared with a population base

rate of around 0.1% (Fombonne, 2003). An increased rate of ASDs more broadly in

twins and other relatives of individuals with autism has also been reported (Bailey et al.,

1995; Bolton et al., 1994; Le Couteur et al., 1996), indicating that the genetic liability is

not restricted to a narrowly defined disorder. Family members with ASDs do not

always covary in diagnostic subtype or symptom severity, with MacLean et al. (1999)

finding no familial aggregation of ASD subtype (i.e., autism, Asperger syndrome, or

PDDNOS), and Le Couteur et al. (1996) finding as much variation in symptom severity

and intellectual ability within MZ twin pairs as between pairs. These findings, along

with the rapid decrease in risk rates from MZ twins to DZ twins and siblings to more

distant relatives (the latter being very low; e.g., Delong & Dwyer, 1988), indicate that

the genetic mechanisms of autism are not simple or Mendelian in nature and are likely

to involve epistatic effects, that is, interactions among several genes (Pickles et al.,

1995; Rutter, 2000; Szatmari, 1999). Studies involving linkage analysis more directly

indicate the presence of multiple susceptibility loci for autism (e.g., Risch et al., 1999;

Yonan et al., 2003). The existence of MZ twins discordant for autism and high

phenotypic variability within twin pairs also suggests that environmental or other

factors play a role, although it remains unclear what these factors may be. Bailey,

Palferman, Heavey, and Le Couteur (1998) favour genetic instability (e.g., caused by a

somatic mutation), gene-environment interactions, and/or stochastic factors as

explanations for variability in phenotypic expression.
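As a rough illustration of the size of the familial elevation described above, the sibling recurrence risk can be expressed relative to the population base rate. The sketch below uses only the two figures quoted in this section (around 2.2% and around 0.1%); the function name is hypothetical, and the resulting ratio is a simple calculation rather than a statistic reported by the cited studies.

```python
# Figures quoted in the text; the ratio itself is an illustrative calculation.
sibling_recurrence_risk = 0.022  # ~2.2% across studies (Szatmari et al., 1998)
population_base_rate = 0.001     # ~0.1% (Fombonne, 2003)


def relative_risk(risk_in_relatives: float, base_rate: float) -> float:
    """Return the risk in relatives as a multiple of the population base rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return risk_in_relatives / base_rate


sibling_relative_risk = relative_risk(sibling_recurrence_risk, population_base_rate)
print(f"Sibling risk is roughly {sibling_relative_risk:.0f} times the base rate")
```

Because both inputs are approximate, the ratio is best read as an order-of-magnitude indication of familial aggregation rather than a precise estimate.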

5.2 The broad phenotype

Numerous studies have now demonstrated that milder forms or lesser variants of autistic

symptomatology, which do not meet criteria for a diagnosis of autism, are frequently

exhibited in relatives of individuals with autism (Bolton et al., 1994; Landa et al., 1992;

Le Couteur et al., 1996; Pickles et al., 2000; Piven et al., 1990b, 1991, 1994; Piven,

Palmer, Jacobi, Childress, & Arndt, 1997b), giving rise to the notion of a spectrum of

autistic traits or “broad phenotype” of autism. As described earlier, studying the

characteristics of the broad phenotype in non-autistic relatives is a useful method of

identifying which traits are primary to autism. Exploration of the broad phenotype has

also been helpful in identifying possible genetic mechanisms (e.g., whether traits are

contributed by both parents) and in increasing the power to identify genes linked with

autism. If the broad phenotype is considered to be a collection of individual traits, each

of which could be related to one of the several genes which combine to cause autism,

then using the broader phenotype in linkage analysis can boost the power to find genes

involved in autism by increasing the number of “affected” individuals available for

analysis (Folstein, Bisson, Santangelo, & Piven, 1998; Piven, 1999).

5.2.1 The behavioural phenotype

Most studies attempting to identify or define the broad autism phenotype have focussed

on documenting behavioural signs, generally either by conducting family history

interviews about the presence of social and communicative difficulties and repetitive

behaviours in family members, or by conducting more direct interviews and

assessments of personality characteristics and psychiatric disorders. As already noted,

family history studies (which have generally used the Family History Interview, a semi-

structured interview specifically designed to assess the broad autism phenotype) have

consistently found social abnormalities, communicative difficulties, and repetitive

stereotyped behaviours in a substantial minority of relatives of individuals with autism

(Bailey et al., 1995; Bolton et al., 1994; Le Couteur et al., 1996).

The use of personality assessment tools such as the Personality Assessment

Schedule (PAS; Tyrer, 1988) has revealed that parents of children with autism rate

significantly higher than controls on several personality characteristics relating to social

interaction such as aloof, untactful, shy, schizoid, oversensitive to criticism and lacking

in empathy (Murphy et al., 2000; Narayan, Moyes, & Wolff, 1990; Piven et al., 1991,

1994, 1997c; Wolff, Narayan, & Moyes, 1988). Using Baron-Cohen, Wheelwright,

Skinner, Martin, and Clubley’s (2001b) Autism Spectrum Quotient (a self-report

questionnaire designed to assess features of the broad autism phenotype), Bishop et al.

(in press-a) recently found elevated ratings on the “social skills” and “communication”

subscales in parents of children with ASDs¹. Abnormal pragmatic communication

styles have also been detected in some parents using both interviews and direct

assessments of narrative discourse (Landa et al., 1992; Wolff et al., 1988), although

structural language skills are usually found to be intact (Bishop et al., in press-b;

Pilowsky, Yirmiya, Shalev, & Gross-Tsur, 2003). A history of language delay appears

to be a more equivocal finding, with most studies reporting language delay in only a

small proportion of relatives (for a review, see Bailey et al., 1998). Obsessional traits

and repetitive behaviours have been found to be relatively less common than social and

communicative impairments in relatives of autistic individuals (Bailey et al., 1995;

Bolton et al., 1994), although Piven et al. (1997c) found fairly high rates (almost 50%)

of the personality trait “rigid” in the parents of multiplex families in their study. As

pointed out by Bailey et al. (1998), the infrequency of behaviours in this category in the

broad phenotype may be a consequence of the insensitivity or inappropriateness of the

measures used rather than reflecting the secondary or unimportant nature of those

symptoms in autism. In support of the importance of repetitive behaviours, Silverman

et al. (2002) found that the severity of repetitive behaviours showed a high level of

familiality in multiplex families, whereas there was little evidence for familiality in

social or verbal communication domains.

¹ These parents were part of the WAFSASD, and therefore were the parents of the probands and siblings in the current research.

The risk of psychiatric disorders other than autism in relatives of autistic

individuals has also been found to be elevated. In particular, increased rates of major

depression have been documented in parents (Bolton, Pickles, Murphy, & Rutter, 1998;

Piven et al., 1990b, 1991; Piven & Palmer, 1999; Smalley, McCracken, & Tanguay,

1995). In all of these studies, the majority of depressive episodes have been found to

occur prior to the birth of the child with autism, suggesting that they cannot be

explained by the burden of caring for a disabled child. The incidence of anxiety

disorders may also be increased in relatives of autistic probands, but these findings have

been less consistent. Piven et al. (1991) found an elevated rate of anxiety disorder in

parents, and increased rates of social phobia have also been reported (Piven & Palmer,

1999; Smalley et al., 1995), but other studies have not found evidence of phobic

disorders (Bolton et al., 1998; Piven et al., 1991) or anxiety disorders in general (Bolton

et al., 1998). However, two studies have found higher rates of obsessive-compulsive

disorder in relatives of autistic probands (Bolton et al., 1998; Hollander, King, Delaney,

Smith, & Silverman, 2003) with the recent study by Hollander et al. showing that the

occurrence of obsessive-compulsive traits or disorder in parents of multiplex families

was significantly more likely if the autistic children showed high levels of repetitive

behaviours. There is no consistent evidence for higher rates of other psychiatric

disorders such as schizophrenia or substance abuse (Bolton et al., 1998; Piven et al.,

1991; Smalley et al., 1995).

5.2.2 The cognitive phenotype

While studies of the behavioural features of the broad autism phenotype have been

informative, individual behavioural signs suffer from the problem of low diagnostic

specificity (Bailey et al., 1998), and are therefore of limited utility as unique indicators

of the broad phenotype. In addition, because behavioural phenotypes are multiply

determined and have indirect and complex relationships with underlying genotypes (i.e.,

the same genotype can give rise to different phenotypes, and the same phenotype can

arise from a range of genotypes; Gottesman & Gould, 2003; Karmiloff-Smith et al.,

2002), they are not an ideal basis for identifying genetic mechanisms. Researchers have

therefore concurrently searched for a more basic subclinical marker of the autism

genotype – or “endophenotype” – at the level of cognition. An endophenotype may be

described as an “intermediate phenotype” or “vulnerability marker” which is unseen by

the unaided eye (e.g., a neurophysiological, biochemical, endocrinological, or cognitive

feature – i.e., not at the level of behaviour) and is somewhere between the disorder’s

phenotype and the distal genotype (Gottesman & Gould, 2003). Endophenotypes are

believed to represent a genetic liability to the disorder in unaffected individuals, and

may only be indirectly related to classic symptoms of the disorder (Leboyer et al., 1998;

Skuse, 2001). The identification of endophenotypes for complex genetic disorders may

help address questions about aetiology and establish markers for diagnosis and

classification (Gottesman & Gould, 2003).

The presence of a cognitive endophenotype is suggested when unaffected

relatives of individuals with autism show a raised incidence of a cognitive deficit (or

strength) that is associated with autism, but to a milder degree than in individuals with

autism themselves (Hill & Frith, 2003). Studies of the cognitive phenotype have tended

either to focus on the IQ profiles of relatives of autistic individuals or to investigate

the presence of specific deficits in ToM, EF, and central coherence.

5.2.2.1 General intellectual ability

Because approximately 70% of individuals with autism are mentally retarded

(Fombonne, 2003) and autistic individuals in general tend to have better Performance

than Verbal IQ (e.g., Lockyer & Rutter, 1970), several studies have examined the

possibility that the broad phenotype may be similarly characterised by an increased

incidence of mental retardation and/or a Verbal-Performance IQ discrepancy. Several

small early studies involving very low-functioning autistic probands found a higher rate

of mental retardation in their relatives than in the general population (August, Stewart,

& Tsai, 1981; Baird & August, 1985; Minton et al., 1982). However, larger and more

recent studies have not replicated this result, finding that mental retardation occurs only

in association with autism and not in isolation, or at least at no greater incidence than

for the general population (Bailey et al., 1995; Folstein et al., 1999; Fombonne, Bolton,

Prior, Jordan, & Rutter, 1997; Freeman et al., 1989; Piven et al., 1990b; Smalley &

Asarnow, 1990; Szatmari et al., 1993). This suggests that the genetic liability for autism

is not usually for mental retardation alone (Bailey et al., 1998). The discrepancy

between earlier and later studies may be due to the severe retardation of the probands in

earlier studies. Consistent with this possibility, August et al. (1981), Baird and August

(1985) and Boutin et al. (1997) all reported higher rates of cognitive disabilities

(including language delay, learning disabilities, and mental retardation) in relatives of

low-functioning probands (but see Piven et al., 1990b, and Szatmari et al., 1993, both of

whom found no association between the proband’s IQ and the cognitive and academic

functioning of their relatives; Starr et al., 2001 also found comparable familial loading

for the broad phenotype in low and high IQ autism families).

Although there does not appear to be an increased incidence of mental

retardation, a number of studies have found significantly lower Verbal or Performance

IQs than controls and/or significant discrepancies between verbal and non-verbal ability

in first-degree relatives of individuals with autism. Consistent with the pattern typically

observed in autistic individuals, Minton et al. (1982) found that siblings of autistic

children had significantly lower VIQ than PIQ on the WISC-R and WAIS. Similarly,

Leboyer, Plumet, Goldblum, Perez-Diaz, and Marchaland (1995) found that siblings of

autistic females showed significantly lower verbal abilities than siblings of Down

syndrome controls, but there was no difference in visuospatial abilities across the two

sibling groups. The lower verbal abilities in the siblings of autistic probands appeared

to be due to a proportion of brothers who showed particularly discrepant verbal and

visuospatial abilities. However, other studies examining the IQ profile of relatives of

autistic probands have either reported no IQ differences from controls at all (Freeman et

al., 1989; Ozonoff, Rogers, Farnham, & Pennington, 1993; Szatmari et al., 1993) or

have found exactly the opposite pattern of discrepancy. Three large-scale studies have

found small but significant VIQ-PIQ discrepancies in parents of individuals with autism

whereby VIQ was significantly higher than PIQ (Folstein et al., 1999; Fombonne et al.,

1997; Piven & Palmer, 1997). Fombonne et al. (1997) found this pattern in both parents

and siblings of autistic probands irrespective of the test version used (WISC-R versus

WAIS) and after controlling for SES. Folstein et al. (1999) observed superior VIQ to

PIQ only in parents, finding no difference in siblings. While Fombonne et al. (1997)

and Piven and Palmer (1997) both used Down syndrome controls, in the former study

VIQ was significantly higher in the autism relatives and there was no difference

between the groups in PIQ, whereas in the latter study PIQ was significantly lower in

the autism relatives than in the Down syndrome relatives and there was no difference in VIQ.

There may be a number of reasons for these inconsistencies regarding the

presence and direction of VIQ-PIQ discrepancies in relatives of autistic individuals.

Firstly, siblings and parents of autistic probands do not appear to demonstrate the same

IQ profile, with most sibling studies finding superior PIQ to VIQ or no difference (with

the exception of Fombonne et al., 1997), while studies involving parents tend to find the

opposite pattern. It has been argued that parents are by definition “selected” for

parenthood in that their social and communicative functioning must be sufficient for

partnership and children, and they may therefore be less impaired than siblings in

domains such as VIQ (e.g., Piven & Palmer, 1997). Secondly, there is some evidence

that there may be at least two subgroups of parents (and possibly siblings) showing

different IQ profiles. In Folstein et al.’s (1999) study, parents with early language

delays demonstrated lower VIQ than parents without language delays and no VIQ-PIQ

discrepancy, leading the authors to suggest that there may be two or more patterns of IQ

performance in parents of autistic probands. Consistent with this, Freeman et al. (1989)

reported that approximately equal numbers of relatives showed VIQ-PIQ discrepancies

in both directions (although this could equally reflect random differences). While

subgroups of parents (and siblings) showing better VIQ than PIQ appear to contradict

the pattern typically found in individuals with autism, several studies have now shown

that children with high-functioning autism and Asperger syndrome often show higher

VIQ than PIQ (Goodman, 1989; Klin, Volkmar, Sparrow, Cicchetti, & Rourke, 1995;

Szatmari et al., 1990). The possibility that parents in general may be less impaired than

siblings is therefore consistent with the finding that parents more often show IQ

discrepancies in favour of VIQ (mirroring the pattern observed in higher-functioning

probands). There is also the possibility that high-functioning autism is genetically

different to low-functioning autism (MacLean et al., 1999; Szatmari et al., 2002), and is

associated with different IQ profiles in relatives; direct correlations between proband

and relative IQ have generally not been reported, however. Finally, the role of other

methodological differences between studies such as the IQ subtests used, sampling

methods, unit of analysis (aggregation of familial data versus inclusion of individual

sibling scores), age and gender of the probands and/or relatives, and range of ASD

diagnoses included is yet to be clarified.

5.2.2.2 Specific cognitive deficits

Given the variability in studies of IQ profiles, researchers have increasingly turned their

attention to the investigation of specific cognitive deficits as potential endophenotypes

for autism. Studies of the specific cognitive phenotype have been driven by concurrent

research on primary cognitive deficits in autism, focussing on the three main current

cognitive theories: ToM, EF, and weak central coherence. This not only allows more

precise delineation of the broad cognitive phenotype, but also represents a method of

testing the primacy of deficits in those domains to autism.

To date, only three published studies have examined the mentalising abilities of

relatives of individuals with autism, with contrasting results. Ozonoff et al. (1993)

found no significant differences between siblings of autistic individuals and learning-

disabled controls on three ToM tasks. However, they acknowledged that according to

their power analyses, the ToM measures used were not sensitive enough to detect any

deficits in non-autistic siblings, and they suggested the use of higher-level tasks. Baron-

Cohen and Hammer (1997) employed Baron-Cohen et al.’s (1997) Eyes Task with

parents of children with Asperger syndrome. They found that both mothers and fathers

in the Asperger group showed subtle but significant impairment on the task compared

with control mothers and fathers. Using the same task, a recent study by Dorris, Espie,

Knott, and Salt (2004) replicated these findings in siblings of children with Asperger

syndrome, who displayed poorer performance on the task compared with control

siblings.

EF performance in relatives of autistic probands has been addressed in several

studies, most of which have focussed on measures of planning and set-shifting.

Significantly poorer performance by parents of individuals with autism compared with

control parents (including parents of children with learning disabilities and Down

syndrome) on Tower tasks (i.e., the Towers of Hanoi and London and the Stockings of

Cambridge test from the CANTAB battery) was found by Hughes, Leboyer, and

Bouvard (1997) and Piven and Palmer (1997), and the same result in siblings was

obtained by Ozonoff et al. (1993) and Hughes, Plumet, and Leboyer (1999). In Hughes

et al.’s (1997) study, the difference in planning ability was restricted to fathers only, and

in both of the studies by Hughes et al. (1997, 1999) a planning deficit was restricted to

a subset of the relatives of autistic probands, with group differences only emerging

clearly when the proportions of participants showing a deficit were compared.
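The distinction between comparing group means and comparing the proportions of participants showing a deficit can be sketched with a small example. The scores below are entirely hypothetical (they are not data from Hughes et al. or any other cited study), and the deficit criterion of two standard deviations below the control mean is an assumption made for illustration only, not necessarily the criterion used in those studies.

```python
from statistics import mean, stdev

# Hypothetical planning scores (higher = better). NOT data from any cited study.
control_scores = [10, 11, 12, 12, 13, 13, 14, 15]
relative_scores = [4, 5, 12, 13, 13, 14, 15, 16]  # two low scorers, the rest typical

# Assumed deficit criterion: more than 2 SDs below the control mean.
cutoff = mean(control_scores) - 2 * stdev(control_scores)


def proportion_with_deficit(scores, threshold):
    """Return the share of participants scoring below the deficit threshold."""
    return sum(score < threshold for score in scores) / len(scores)


print(f"Means: controls {mean(control_scores):.1f}, "
      f"relatives {mean(relative_scores):.1f}")
print(f"Proportion with deficit: controls "
      f"{proportion_with_deficit(control_scores, cutoff):.2f}, relatives "
      f"{proportion_with_deficit(relative_scores, cutoff):.2f}")
```

In this constructed case the group means differ only modestly, because most relatives score in the typical range, whereas the proportion falling below the cutoff separates the groups more clearly. This is the kind of pattern in which a deficit confined to a subset of relatives may be missed by a comparison of means alone.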

Findings of no group differences on the WCST in parents (Szatmari et al., 1993)

or siblings (Ozonoff et al., 1993) were initially suggestive of intact cognitive flexibility

in relatives of individuals with autism. However, two subsequent studies using the

IDED set-shifting task found that a subset of both parents and siblings of autistic

probands demonstrated difficulties with the extra-dimensional shift stage of the task

(Hughes et al., 1997, 1999). Hughes et al. (1999) postulated that this discrepancy

between the results observed using the WCST and the IDED task may be due to the fact

that the IDED task involves a total change of stimuli at each shift and so perseverative

responses are limited to high-level dimensional shifting difficulties rather than specific

exemplars. However, an argument that the WCST is lower-level than the IDED task is

inconsistent with findings on probands themselves, who generally show difficulties on

the WCST more often than the IDED task. It may be the case that subsets of the

samples tested with the WCST may have shown a deficit as in the Hughes et al. studies,

but this was not directly examined.

The two studies by Hughes et al. (1997, 1999) also incorporated working

memory measures, with different patterns of results for parents and siblings. Both

studies included a spatial working memory task involving a high demand for strategy

use and therefore purportedly the “central executive” (Baddeley, 1986), and a simple

spatial span task with low executive or strategic requirements which served as a control

task. Parents of autistic probands showed intact spatial spans, but fathers made a

significantly higher number of errors than normal control fathers on the more strategic

working memory task (Hughes et al., 1997). However, there was no difference between

the fathers of autistic probands and the fathers of learning disabled controls, indicating

that the deficit was not unique to autism families. By contrast, siblings of autistic

probands showed superior spatial spans to siblings of both developmentally delayed and

normal controls (as well as better verbal short-term memory for recently presented

items), but there were no group differences on the more strategic working memory task

(Hughes et al., 1999). Together, these results suggest that a working memory deficit is

not as reliable or unique a characteristic of the broad phenotype as problems with

planning and set-shifting.

Other components of EF such as inhibition and generativity have not been as

well studied in relatives of autistic probands. Hughes et al. (1999) included a verbal

generativity task (word fluency) in their battery with siblings, and found both a

significant group difference overall in the number of words generated and a higher

proportion of “low fluency” participants in the autism sibling group. This promising

result requires replication and extension to parent samples using both verbal and non-

verbal generativity tasks. Similarly, tests of both verbal and non-verbal inhibition are

yet to be employed with either siblings or parents.

The possibility that weak central coherence may characterise the broad autism

phenotype has also received some attention. No evidence of a relative strength in the

Block Design subtest from the Wechsler scales, which purportedly indicates weak

central coherence (Shah & Frith, 1993; Happé, 1994c), was found by Szatmari et al.

(1993) or Fombonne et al. (1997) in parents or siblings, even when the analysis was

restricted to relatives with the broad phenotype. Using the Embedded Figures Test,

arguably a more direct test of central coherence, Baron-Cohen and Hammer (1997)

found that both mothers and fathers of children with Asperger syndrome were faster to

identify hidden shapes (indicating a tendency for piecemeal, detail-focussed

processing). Happé, Briskman, and Frith (2001) included a larger range of both verbal

and visuospatial measures of central coherence with both parents and siblings, and

found that parents of children with autism – particularly fathers – showed a significant

bias towards piecemeal processing across the four tasks used compared with parents of

children with dyslexia and with no disorder. There were no significant differences

among the sibling groups, however.

Research on specific cognitive deficits has therefore revealed that deficits in

ToM, EF, and central coherence may all be characteristic of the broader phenotype, but

that results across studies are often inconsistent. In all three domains, significant

differences have been found in parents of autistic probands but often not in their

siblings, contrary to the notion that parents should be less impaired than siblings

because of the selection for parenthood described earlier. Even on measures of planning

and set-shifting where significant differences among sibling groups were found, Hughes

et al. (1999) noted that the deficits were not as strong in siblings as in parents. While it

is not clear that the selection for parenthood should extend beyond social and

communicative capabilities to cognitive characteristics, these parent-sibling

discrepancies still require explanation. Hughes et al. (1999) proposed that the use of

computerised tasks may favour young participants over parents, but this does not

account for studies using non-computerised tasks. Happé et al. (2001) suggested that

the tasks used may not be sufficiently sensitive in younger subjects. However, many

studies, including theirs, have found differences in parents (whom one would expect to be

at a higher level than their children) using the same tasks as for siblings². These authors

also suggested that genetically determined cognitive weaknesses may only emerge at a

certain age or become more pronounced with age. This would be an unusual finding in

domains such as ToM and certain aspects of EF, though, which typically develop

relatively early in life. These parent-sibling discrepancies therefore remain difficult to

explain.

² Low sensitivity may be a result of floor effects in children as well as ceiling effects, but there is no evidence of floor level performance in the sibling studies reported above.

A number of issues appear worthy of further investigation in future broad

phenotype studies. Firstly, identification of the profile of EF performance in relatives of

individuals with autism on tasks measuring the full range of EF components is yet to be

achieved and will augment research on EF deficits in probands. Secondly, comparison

of inter-task correlations in relatives of probands with autism versus control relatives

could be informative, with Hughes et al. (1997, 1999) finding unusual associations

between task performances in both parents and siblings of autistic individuals, which

they interpreted as suggesting the use of different strategies in performing the tasks.

Thirdly, given that cognitive deficits are often only found in a subset of relatives, it

remains to be seen whether this subset display both ToM and EF deficits and therefore

represent a general “cognitive impairment” subgroup, or whether there are different

subgroups with different types of cognitive deficit (most studies have only examined

one cognitive domain). Fourthly, the relationship between performance on cognitive

tasks and the presence of certain behavioural traits is another important issue. Hughes

et al. (1997) found a modest but significant correlation between a composite EF score

and interviewers’ pre-test impressions of social abnormalities in parents of autistic

probands, and Briskman, Happé, and Frith (2001) found that parents of autistic

individuals who reported more preference for nonsocial activities in everyday life

tended to show weaker central coherence on testing. These findings suggest that the

cognitive weaknesses observed in the broad autism phenotype may hold relevance in

accounting for subtle behavioural traits also displayed by parents and siblings, but the

nature and specificity of these cognitive-behavioural relationships remain unclear, and

could be important for assessing the validity of the “multidimensional spectrum”

version of the multiple primary deficits model of ASDs (see Section 4.4.4 in Chapter 4).

Finally, while several studies have examined relationships between the IQ of the

proband and the behavioural or cognitive traits of family members, no published studies

have directly correlated performances of probands and relatives on the same ToM or EF

tasks. This could be a useful method of identifying which aspects of cognitive

functioning in autism are the most highly familial (and therefore which may be most

strongly coded in the autism genotype). In sum, greater specification of the cognitive

and behavioural characteristics of the broad autism phenotype will hopefully aid

progress in identifying relationships between genotype, endophenotype, and behavioural

phenotype both in autism and its milder variants. These issues are examined in Study Two.
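The proposal above to correlate the performances of probands and relatives on the same task can be sketched as follows. This is a minimal illustration only: the paired scores are hypothetical, the function name is an assumption, and a Pearson coefficient is used for simplicity, although other indices of familial resemblance could be substituted.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores on one task; each position is one family (proband, sibling).
proband_scores = [3, 5, 6, 8, 9]
sibling_scores = [7, 8, 8, 10, 11]

r = pearson_r(proband_scores, sibling_scores)
print(f"Proband-sibling correlation on this task: r = {r:.2f}")
```

Computing such a coefficient separately for each task would indicate which aspects of cognitive functioning show the strongest familial resemblance, subject to the usual cautions about small samples and restricted score ranges.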

CHAPTER 6

Study Two: Theory of Mind and Executive Function in Siblings of Individuals with Autism Spectrum Disorders

6.1 Introduction

6.1.1 Aims

6.1.2 Hypotheses

6.2 Method

6.2.1 Participants

6.2.2 Procedure

6.3 Results

6.3.1 Sibling group comparisons on ToM and EF tasks

6.3.1.1 False belief tasks

6.3.1.2 Tower of London

6.3.1.3 IDED Set-shifting task

6.3.1.4 Response Inhibition and Load task

6.3.1.5 Opposite Worlds task

6.3.1.6 Pattern Meanings

6.3.1.7 Uses of Objects

6.3.1.8 Stamps task

6.3.1.9 Summary of sibling group comparisons

6.3.2 Comparisons between ASD siblings and ASD probands

6.3.3 Ability of cognitive variables to predict sibling group membership

6.3.4 Proband-sibling relationships within the ASD families

6.3.4.1 Correlations between proband IQ and siblings’ cognitive performances

6.3.4.2 Correlations between probands’ and siblings’ cognitive performances

6.3.5 Prevalence of deficits in ASD siblings

6.3.6 Correlations between ToM and EF

6.3.7 Dissociations between ToM and EF

6.3.8 Results from behavioural measures

6.4 Discussion

6.4.1 Endophenotype status of ToM and EF impairments

6.4.2 Differentiating the multiple deficits models

6.1 Introduction

6.1.1 Aims

As reviewed in Chapter 5, weaknesses in both ToM and EF have been identified in

parents and/or siblings of autistic probands in previous studies, but findings have been

inconsistent and there are several empirical issues yet to be examined. Study Two is an

investigation of ToM and EF deficits in siblings of individuals with ASDs. The main

aims of this study were i) to identify whether ToM or EF performance meets criteria for

an endophenotype or vulnerability marker for the autism genotype, and thereby seek

confirmation of the results of Study One regarding the relative primacy of ToM and EF

in ASDs; and ii) to collect further information relevant to distinguishing the various

multiple deficits models presented in Chapter 4 (Section 4.4.4). These aims were

addressed in several ways.

i) Aim 1: Determining endophenotype status. In this study, those ToM and EF

tasks on which probands with ASDs showed significantly poorer performance than

control probands in Study One were administered to siblings of individuals with ASDs

(“ASD siblings”) and control siblings. These tasks therefore included measures of false

belief understanding, planning, and both verbal and non-verbal inhibition and

generativity (even though non-verbal inhibition was found to be intact in probands using

the RIL task, that task was administered because probands showed difficulties in the

condition combining working memory and inhibitory requirements). Although only

marginal differences were found between proband groups on the IDED set-shifting task,

this task was also administered to siblings to enable comparison with Hughes et al.’s

(1999) previous findings on siblings using the original IDED task. The inclusion of

both verbal and non-verbal inhibition and generativity tasks represents an extension to

previous research.

Gottesman and Gould (2003, p. 639) outline five criteria for the identification of

an endophenotype. The following numbered points list these criteria and describe how

they were tested in the current research.

1. “The endophenotype is associated with illness in the population”. This was

demonstrated in Study One, which showed that both ToM and EF deficits were

associated with having an ASD diagnosis.

2. “The endophenotype is primarily state-independent (manifests in an individual

whether or not illness is active)”. This criterion is of limited relevance in the case

of ASDs, as the disorder is present throughout the lifetime of the affected individual

(unlike, for example, depression or schizophrenia). Of course, one would still

expect the endophenotype to manifest in affected individuals.

3. “Within families, endophenotype and illness co-segregate”. This was indirectly

assessed by examining whether there was an increased incidence of ASDs in

siblings of ASD probands as compared with the normal population (the control

siblings were not used as a comparison group in this situation because having any

child with a clinical diagnosis of an ASD was an exclusion criterion for control

families). However, comparisons between affected siblings and controls on ToM

and EF variables (to assess whether the siblings with ASDs showed a profile of

deficits similar to that of the ASD probands) were not conducted because the size of the

affected group was too small for meaningful analyses (see Section 6.2.1),

particularly on some tasks which were administered to participants within a

restricted age range.

4. “The endophenotype found in affected family members is found in nonaffected

family members at a higher rate than in the general population”. This was tested by

examining whether there were any group differences in ToM or EF performance

between ASD and control siblings which remained significant after siblings with

ASD diagnoses were excluded. The ability of any deficits to discriminate ASD

siblings from control siblings was also calculated, as any useful endophenotype

should be unique to the disorder in question (Skuse, 2001).

5. “The endophenotype is heritable”. This was assessed by calculating correlations

between the ToM and EF performances of the ASD siblings in Study Two, and i)

the level of intellectual ability and ii) the ToM and EF performances of the ASD

probands in Study One, with the assumption that significant correlations would be

suggestive of a degree of familiality to the trait.

A sixth feature would also be expected, which is:

6. “The severity of the endophenotype in nonaffected family members is milder than

in the affected family members” (Slaats-Willemse, Swaab-Barneveld, de

Sonneville, van der Meulen, & Buitelaar, 2003). Relative severity of any ToM and

EF deficits in unaffected siblings as compared with affected probands was assessed

by comparing the effect sizes of any significant differences between sibling groups

with effect sizes for the proband groups.

Hence, this study addressed criteria 4, 5, and 6. The ability of ToM and/or EF deficits to

sufficiently meet these three criteria would suggest that they could be endophenotypes

for ASDs and therefore that they have some degree of primacy to ASDs. If one deficit

is better able to meet these criteria than the other, this would suggest superior relative

primacy.

ii) Aim 2: Testing multiple deficits models. If ToM and/or EF deficits were

identified in ASD siblings, the pattern of results would have implications for the various

versions of the multiple deficits model presented in Section 4.4.4 of Chapter 4.

Although the “third primary deficit” and “different developmental stages” versions of

the multiple primary deficit model were not tested in this study, it was possible to

examine (somewhat indirectly) the plausibility of the other two versions (the

“subgroups” and “multidimensional spectrum” models). If there were different

subgroups of ASDs displaying different ToM and EF profiles, then assuming these

subgroups corresponded with different ASD genotypes, it would be predicted that

similar subgroupings would be evident in the broad autism phenotype. This would be

demonstrated by results indicating the presence of “ToM-impaired, EF-intact” and “EF-

impaired, ToM-intact” siblings as was the case for probands (although impairments

would be more subtle), and furthermore it would be expected that siblings

demonstrating a particular ToM-EF profile would be the siblings of probands

demonstrating that same profile. These possibilities were examined in several ways.

Firstly, the prevalence of any deficits identified in group comparisons was calculated, to

examine whether they appeared to occur only in a subset of ASD siblings. Secondly,

correlations between ToM and EF were calculated in both the ASD and control sibling

groups to investigate whether the two domains were as independent in ASD siblings as

they were in probands or, if not, whether siblings showed unusual

patterns of association between the two domains. Thirdly, the incidence of ToM-EF

dissociations was examined. Finally, the correlations between proband and sibling

scores on ToM and EF tasks were used as an indication of possible familial aggregation of

ToM-EF profiles, as described earlier.

The version of the multidimensional spectrum model examined in this study was

the one in which ToM and EF were purported to underlie different domains of

symptomatology, although it is acknowledged that other versions (which may be more

plausible based on the results of Study One) are possible. Abnormal social behaviours

and repetitive behaviours in ASD siblings were both measured in this study. Although

it would be expected that unaffected ASD siblings would not show symptoms of autism

even if they showed a cognitive endophenotype, under this version of the spectrum

model it would still be predicted that ToM or EF weaknesses would be associated with

increased levels of symptomatology in the relevant domain, even if that

symptomatology was subclinical. Therefore, this model was examined by first

analysing sibling group differences on behavioural measures (to confirm that some

subclinical symptomatology was present in ASD siblings), and then calculating

correlations between ToM and EF performances and these behavioural measures within

the ASD sibling group. However, it was recognised that this methodology may be

subject to the same problems as the cognitive-behavioural correlations calculated in Study One.

6.1.2 Hypotheses

Predictions for endophenotype status. Given that neither ToM nor EF deficits met all of

the criteria for primacy in Study One, no strong predictions were made with regard to

whether or not either domain would adequately meet all criteria for an endophenotype

of ASDs, although previous research has suggested that both ToM and EF have

potential endophenotype status. More confident predictions could be made in terms of

relative primacy, as EF deficits were found to be relatively more primary in Study One.

On this basis, it would be expected that i) significant weaknesses in ASD siblings on EF

tasks (which are less severe than the deficits displayed by probands) would be more

likely than on ToM tasks, and would be better able to predict group membership; and ii)

correlations between the EF performance of ASD siblings and probands would be

stronger than correlations between ToM performances of ASD siblings and probands.

Furthermore, it would be expected that the EF variables which demonstrated the

strongest evidence of primacy in ASD probands (i.e., verbal inhibition and verbal

generativity) would be the most likely variables on which weaknesses in ASD siblings

would emerge. However, given the concerns with interpretation of some of the findings

relevant to primacy in Study One, the possibility remained open that a ToM deficit may

also meet criteria for an endophenotype to the same degree as EF deficits.

Predictions for multiple deficits models. Based on the results of Study One and on

previous research, it was predicted that if ToM and EF weaknesses were found, they

would only be evident in a subset of ASD siblings. Beyond that, however, given that

the results of the analyses relevant to the different multiple deficits models were very

much dependent upon the results of analyses relevant to determining endophenotype

status, no specific predictions were made prior to conducting the study. This aspect of

the study may therefore be considered exploratory.

6.2 Method

6.2.1 Participants

Siblings of ASD Group (“ASD siblings”)1. There were 108 siblings in this group,

ranging in age from 4 to 29 years. These siblings came from 68 families; thus, in some

cases there was more than one sibling per family. Six siblings had received clinical

diagnoses of ASDs: three with diagnoses of autism, one with Asperger syndrome, and

two with PDDNOS. Three additional siblings had received diagnoses indicating

language impairment. As for the control group in Study One, autistic symptomatology

in siblings was assessed using the ASQ, and the ADI-R was administered for anyone

scoring above 10. All six siblings with clinical diagnoses of ASDs and two of the three

with language impairment met either full or partial criteria for autism on the ADI-R. In

addition, two other siblings without clinical diagnoses met partial criteria on the ADI-R.

Hence, there were 10 siblings altogether who met criteria for an ASD, which is 9.3% of

the ASD sibling group – a 31-fold increase compared to the population prevalence for

ASDs, which is around 0.3% (including ASDs besides autism; Fombonne, 2003).
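
The arithmetic behind these two figures can be laid out as a short worked example. The sketch below uses only the numbers reported in the paragraph above; the variable names are illustrative and are not part of the original analysis.

```python
# Figures reported in the text; variable names are illustrative only.
n_asd_siblings = 108          # total ASD sibling group
n_meeting_criteria = 10       # siblings meeting full or partial ADI-R criteria
population_prevalence = 0.3   # percent, as cited from Fombonne (2003)

# Percentage of the ASD sibling group meeting criteria for an ASD.
sibling_rate = 100 * n_meeting_criteria / n_asd_siblings

# Fold increase, computed from the rounded percentage as quoted in the text.
fold_increase = round(sibling_rate, 1) / population_prevalence

print(f"Rate in ASD siblings: {sibling_rate:.1f}%")
print(f"Fold increase over population prevalence: {fold_increase:.0f}")
```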

Exclusion criteria were the same as for Study One (genetic abnormalities or

neurological dysfunction, except for epilepsy). No siblings were excluded for these

reasons. There were 10 ASD siblings with other clinical diagnoses (6 with ADHD, 2

with epilepsy, 1 with dyspraxia, and 1 with dyslexia).

1 This group included siblings of some probands with ASDs who were not included in Study One of this thesis because they were too low-functioning (but who were recruited as participants in the WAFSASD). Given that the ASD siblings were themselves matched with the siblings of the control group on age and PIQ, the inclusion of the siblings of low-functioning children with ASDs was considered to be valid. The relationship between siblings’ performance on cognitive tasks and the level of functioning of the proband was also examined, as reported in Section 6.3.4.1.

Siblings of Control Group (“Control siblings”). Sixty-seven control siblings ranging in

age from 4 to 24 years participated in the study. These siblings came from 49 families.

No siblings in this group had clinical diagnoses of ASDs or exceeded the cutoff

criterion on the ASQ. Three control siblings had other clinical diagnoses (2 with

ADHD and 1 with epilepsy).

Demographic characteristics of each group are presented in Table 22. The ASD and

control siblings were matched on chronological age, t(173) = .90, p > .1, and PIQ,

t(173) = .08, p > .1. All participants had a PIQ and VIQ of 60 or above. ASD siblings

had significantly lower VIQs than control siblings, t(173) = 2.28, p < .05. However,

when ASD siblings who met full or partial ADI-R criteria were excluded, the difference

in VIQs became only marginally significant, t(163) = 1.74, p = .08. There was a higher

proportion of girls in the control sibling group (68.7% vs. 42.6% in the ASD sibling

group), χ2 (1, N = 175) = 11.27, p < .01. This is probably because, in

attempting to select proband samples matched on gender, the male child in the

family was often selected as the control proband, resulting in a higher proportion of female

siblings in the control sibling group. To ensure sibling group comparisons were not

influenced by gender, it was included as an independent variable in all analyses. This

also enabled evaluation of any group by gender interactions (e.g., it may be that brothers

of ASD probands show greater heritability of autistic-like cognitive traits than sisters).
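
The gender comparison reported above can be checked from the male and female counts given in Table 22. The following is a minimal sketch, assuming a Pearson chi-square without continuity correction (the thesis does not state which variant was reported by SPSS); the function name is hypothetical. Under that assumption the statistic agrees with the reported value to two decimal places.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail p-value equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Male and female counts taken from Table 22.
chi2, p = chi_square_2x2(62, 46,   # ASD siblings: male, female
                         21, 46)   # control siblings: male, female
print(f"chi2(1, N = 175) = {chi2:.2f}, p = {p:.4f}")
```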

Table 22. Demographic characteristics of the sibling samples

                         ASD siblings              ASD siblings, ADI-R        Control siblings
                                                   subgroup excluded
N                        108                       98                         67
Age: Mean (SD, range)    11.33 (5.38, 4-29)        11.62 (5.42, 4-29)         10.61 (4.71, 4-24)
Male: Female             62: 46                    53: 45                     21: 46
PIQ: Mean (SD, range)    107.63 (17.08, 70-149)    (17.7, 70-149)             107.42 (16.18, 58-146)
VIQ: Mean (SD, range)    102.41 (15.02, 66-141)    103.67 (14.37, 66-141)     107.61 (14.14, 72-138)

With an n of 108 in the ASD sibling sample and 67 in the control sibling sample, the

power of the study to detect medium sized effects (i.e., d = .5) at an alpha level of .05

was excellent at .94.
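
The thesis does not state how this power figure was obtained. As a rough check, the sketch below uses a normal approximation to the independent-samples comparison of means; both the approximation and the function name are assumptions made here for illustration, not the author's method. With the reported group sizes and d = .5, the one-tailed case comes out close to the reported .94, whereas the two-tailed case gives roughly .90.

```python
import math
from statistics import NormalDist

def approx_power(n1, n2, d, alpha=0.05, two_tailed=True):
    """Normal-approximation power for an independent-samples comparison of means."""
    z = NormalDist()
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * math.sqrt(n1 * n2 / (n1 + n2))
    tail_alpha = alpha / 2 if two_tailed else alpha
    critical = z.inv_cdf(1 - tail_alpha)
    # Probability of exceeding the critical value in the predicted direction.
    return z.cdf(noncentrality - critical)

power_one_tailed = approx_power(108, 67, 0.5, two_tailed=False)
power_two_tailed = approx_power(108, 67, 0.5, two_tailed=True)
print(f"one-tailed: {power_one_tailed:.2f}, two-tailed: {power_two_tailed:.2f}")
```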

6.2.2 Procedure

The same tests, questionnaires and interviews were used in this study as in Study One2,

all of which are described in Chapter 3. The procedure for this study was also identical

to that used in Study One (see Section 4.2.2 of Chapter 4).

6.3 Results

This section includes analyses addressing i) group comparisons between ASD and

control siblings on ToM and EF tasks, both before and after exclusion of siblings who

met ADI-R criteria for an ASD; ii) the relative severity of any weaknesses in ASD

siblings compared with ASD probands; iii) the ability of task variables to predict

ASD/control sibling group membership; iv) relationships between probands’ and

siblings’ scores; v) the prevalence of deficits in the ASD sibling group; vi) correlations

between ToM and EF variables; vii) dissociations between ToM and EF performances;

and viii) results from behavioural measures. Hence, analyses i) to iv) assess the

endophenotype status of ToM and EF, and v) to viii) are aimed primarily at assessing

the subgroup and multidimensional spectrum models. As for Study One, SPSS Version

10.0.5 was used for all analyses. Data screening was handled in the same way as for

Study One (see Section 4.3.1 in Chapter 4).

6.3.1 Sibling group comparisons on ToM and EF tasks

The consistent approach to group comparisons that was used in Study One (as described

in Section 4.3.2) was also followed in this study. The ASD and control sibling groups

remained matched on age and PIQ for all tests which were administered to participants

within a restricted age range. For some tasks (all false belief tasks, the Opposite Worlds

task, the RIL task, the IDED set-shifting task, and the Stamps task), the two groups of

siblings who completed those tasks, including participants meeting full or partial criteria

on the ADI-R, were also matched on VIQ.

2 Although there were no group differences on the Dewey stories and Relational Complexity tasks in the proband study, these tasks were actually also administered to the siblings in this study. Hence, the order and length of task administration was identical across the two studies. Results from these tasks were analysed out of interest and there were no significant differences between the sibling groups.

As mentioned in Section 6.2.1, gender was included as an independent variable

in all sibling group comparisons on continuous variables. In the case of dichotomous

variables, gender effects were assessed by conducting separate chi-square analyses for

brothers (i.e., ASD brothers versus control brothers) and sisters (ASD sisters versus

control sisters). These separate analyses are only reported if there were significant

group differences for one gender but not the other (or if group differences were in

opposite directions for the two genders); otherwise, only the results of overall chi-

square analyses including both genders are reported. For all variables, separate means

and standard deviations (or proportions of high/low scorers) for brothers and sisters are

only reported if there were significant group by gender interactions on the task, or if

displaying separate data for brothers and sisters was meaningful for other reasons.

Similarly, in analyses where age, PIQ and/or VIQ were covaried or controlled, gender

was only included as an independent variable if significant interactions involving

gender had been found in initial group comparisons.

In all sibling group comparisons where there were significant group differences,

analyses were repeated after excluding all ASD siblings who met full or partial criteria

on the ADI-R. Results of these repeat analyses are reported in each of the relevant

sections.

As in Study One, the influence of participants with non-ASD diagnoses (of

which there were 10 in the ASD sibling group and 3 in the control sibling group) was

checked by repeating all group comparisons after excluding these participants from the

sample. This did not affect any of the results (i.e., both non-significant and significant

differences remained so, both before and after exclusion of participants meeting full or

partial ADI-R criteria), with the exception of the Stamps task complexity score. The

change in the result for this variable is reported in Section 6.3.1.8.

6.3.1.1 False belief tasks

As in Study One, a large proportion of participants gained perfect scores for both belief

and control questions on all false belief tasks. All variables were recoded as

dichotomous such that a perfect score was coded as 1 and any other score as 0.

Four participants (two ASD siblings and two control siblings) were not

administered the First-order and Second-order false belief tasks due to equipment

malfunction. These participants had all passed the Simple false belief task and were

therefore assigned the mean value of other participants in their group who had passed

the Simple false belief task. The overall sample size for all false belief tasks (which

were administered to participants within a restricted age range) was 148 (87 ASD

siblings and 61 control siblings). As for Study One, the ns for the memory and reality

questions, as well as the “own belief” questions in the Simple false belief task, were

limited to those who actually did the task (as these questions were not assumed to be

passed or failed according to performance on other false belief tasks, as was the case for

the belief questions). Percentages of participants gaining perfect scores on belief

questions (i.e., “perfect scorers”) in each group for each false belief task are presented

in Table 23.

i) Simple false belief task. Chi-square analyses revealed that there was no

statistically significant difference between the ASD and control siblings on reality

questions, χ2 (1, N = 51) = 1.34, p > .1, belief questions referring to the participant’s

own previous belief, χ2 (1, N = 51) = .84, p > .1, or belief questions referring to others’

beliefs, χ2 (1, N = 148) = .95, p > .1. There were no significant differences when

brothers’ and sisters’ results were analysed separately.

Performance on the questions relating to the participant’s own previous belief

was significantly correlated with VIQ, r = .34, p < .05. Performance on others’ belief

questions was significantly correlated with both age, r = .33, p < .001, and VIQ, r = .35,

p < .001. Group remained a non-significant predictor of performance on both own

belief questions, z = .1, p > .1, and others’ belief questions, z = .48, p > .1, when VIQ

(and age in the case of the latter variable) was controlled using logistic regression.

ii) First-order false belief task. The performance of ASD and control

siblings did not differ significantly on reality questions, χ2 (1, N = 113) = .31, p > .1,

memory questions, χ2 (1, N = 113) = .01, p > .1, or belief questions, χ2 (1, N = 148) =

1.20, p > .1. Results were the same for brothers and sisters.

Performance on belief questions was significantly correlated with both age, r =

.51, p < .001, and VIQ, r = .20, p < .05. In a logistic regression with age, VIQ and

group as predictors of performance on belief questions, the independent contribution of

group remained non-significant, z = 1.97, p > .1.

iii) Second-order false belief task. Again, there was no significant difference

between the ASD and control siblings on reality questions, χ2 (1, N = 127) = .06, p > .1,

memory questions, χ2 (1, N = 127) = .09, p > .1, or belief questions, χ2 (1, N = 148) =

.11, p > .1, and no difference in the results when brothers and sisters were examined

separately.

Scores on belief questions were significantly correlated with age, r = .47, p <

.001, and VIQ, r = .34, p < .001. Group was not a significant predictor of performance

on belief questions in a logistic regression with age, VIQ and group as the predictors, z

= .04, p > .1.

iv) Overall false belief performance indices. As for the probands, an

aggregate score and a more lenient alternative aggregate score were calculated for

siblings. There was no significant group difference in the proportion of perfect scorers

on the aggregate score, χ2 (1, N = 148) = .01, p > .1, or in the proportion of high scorers

on the alternative aggregate, χ2 (1, N = 148) = .57, p > .1. Results were the same when

brothers and sisters were analysed separately.

Both aggregate scores were correlated with age (r = .52, p < .001, for the

aggregate score and r = .48, p < .001, for the alternative aggregate) and VIQ (r = .32, p

< .001, for the aggregate score and r = .34, p < .001, for the alternative aggregate).

When logistic regressions were performed with age, VIQ, and group as the predictors,

group was not a significant predictor of either the aggregate score, z = .42, p > .1, or the

alternative aggregate, z = .14, p > .1.

Table 23. False belief task results: Percentage of siblings in each group with perfect scores [or high scores for the alternative aggregate] on belief questions, and significance of group comparisons

                              ASD siblings    Control siblings    p    p with age/IQ control
Simple false belief:
  Own belief                  74.2            85.0                -    -
  Others’ belief              90.8            95.1                -    -
First-order false belief      82.8            75.4                -    -
Second-order false belief     74.7            77.0                -    -
Aggregate score               71.3            70.5                -    -
Alternative aggregate         [80.5]          [85.2]              -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.2 Tower of London (ToL)

As in Study One, the number of rule violations per block administered was highly

skewed and was recoded as a dichotomous variable, with participants making 0-1

violations per block being given a score of 0 (“low rule violators”) and participants

making any higher number of violations scored as 1 (“high rule violators”).
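
The recoding rule described above can be written as a one-line threshold. The sketch below is illustrative only: the function name and the six example values are hypothetical and are not thesis data.

```python
def dichotomise(values, cutoff):
    """Return 0 for scores at or below the cutoff and 1 for anything higher."""
    return [0 if v <= cutoff else 1 for v in values]

# Hypothetical rule violations per block for six participants (not thesis data).
violations_per_block = [0, 0.5, 1, 1.5, 3, 0]

# 0 = "low rule violator" (0-1 violations per block), 1 = "high rule violator".
status = dichotomise(violations_per_block, cutoff=1)
high_rate = 100 * sum(status) / len(status)
print(status, f"{high_rate:.1f}% high rule violators")
```

The same function covers the other skewed variables in this chapter by changing the cutoff, for example `cutoff=2` for the 0-2% error criterion used for the RIL error variables.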

Two ASD siblings had missing data on the ToL, and were not included in

analyses. A two-way ANOVA comparing the total adjusted extra move scores of the

ASD siblings and control siblings revealed no significant effect of group or gender, and

no significant interaction; largest F(1, 169) = .06, p > .1 (ASD siblings: M = 21.65, SD

= 8.72; Control siblings: M = 21.99, SD = 9.60). A chi-square analysis also showed that

the proportion of high rule violators in the ASD sibling group (25.5%) did not differ

significantly from the proportion in the control sibling group (29.9%), χ2 (1, N = 173) =

.40, p > .1. This difference was non-significant for both brothers and sisters.

The total adjusted extra moves score correlated significantly with age, r = -.73, p

< .001, and rule violations were significantly correlated with both age, r = -.58, p <

.001, and VIQ, r = -.17, p < .05. An ANCOVA conducted on the total adjusted extra

move score revealed that the group difference remained non-significant when age was

covaried, F(1,170) = .50, p > .1. Group also remained a non-significant predictor of

rule violation status (low/high) when age and VIQ were assessed independently in a

logistic regression, z = .19, p > .1.

6.3.1.3 IDED Set-shifting task

All set-shifting variables were again highly skewed, and all variables were recoded such

that any participant making 0 or 1 errors was assigned a score of 0 (“low error scorers”)

and any participant making a higher number of errors was given a score of 1 (“high

error scorers”).

The overall N for the task (which had a restricted age range) was 129 (81 ASD

siblings and 48 control siblings). Due to computer malfunction, data for the

Perseveration condition from one ASD sibling were invalid and not included in

analyses. The percentage of low error scorers for each stage in each task condition is

displayed in Table 24. There were no significant group differences overall on any

variable. When brothers and sisters were analysed separately, a significant difference

was observed in the SD stage of the Learned Irrelevance condition such that there was a

higher proportion of high error scorers among sisters of ASD probands than among

control sisters, χ2 (1, N = 70) = 5.08, p < .05. There was no significant difference

between the brother groups on this variable, and no discrepancies in the results from

brothers and sisters on other variables.

Errors made on the SD stage of the Perseveration condition correlated

significantly with both age, r = -.24, p < .01, and PIQ, r = -.17, p < .05. Age was also

significantly correlated with errors made in the Learned Irrelevance condition on the SD

stage, r = -.21, p < .05, and the IS stage, r = -.19, p < .05. Group remained a non-

significant predictor of performance on these variables when logistic regressions were

performed with age and group (and PIQ where relevant) as predictors; the largest z =

1.80, p > .1. When brothers and sisters were analysed separately for the Learned

Irrelevance SD stage variable, group was again a significant predictor of performance

on that variable for sisters when age was accounted for, z = 4.14, p < .05. The group

difference also remained significant when sisters meeting full or partial ADI-R criteria

were excluded, χ2 (1, N = 69) = 5.29, p < .05.

Table 24. IDED Set-shifting task results: Percentage of low error scorers in each sibling group for each stage of each task condition, and significance of group comparisons

                                ASD siblings    Control siblings    p    p with age/IQ control
Perseveration condition:
  SD stage                      76.3            75.0                -    -
  SDR stage                     57.5            70.8                -
  CD stage                      71.3            75.0                -
  IDS stage                     76.3            83.3                -
  EDS stage                     68.8            75.0                -
Learned Irrelevance condition:
  SD stage – brothers only      74.3            85.7                -    -
  SD stage – sisters only       80.0            97.1                *    *
  SDR stage                     77.8            81.3                -
  CD stage                      70.4            83.3                -
  IDS stage                     77.8            87.5                -    -
  EDS stage                     27.2            25.0                -

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.4 Response Inhibition and Load (RIL) task

For all RIL conditions, error variables (representing the percentage of errors made) were

again highly skewed, with many participants making a low percentage of errors. These

variables (with the exception of the shape error score) were recoded such that 0-2%

errors was coded as 0 (a “low error score”), and any higher percentage of errors was

coded as 1 (a “high error score”). However, as for Study One, the main error variables

used in analyses were the inhibition error difference score, load error difference score,

and inhibition + load error difference score. These difference scores were normally

distributed. Five outliers were trimmed: one ASD sibling’s inhibition and load error

difference scores, and the inhibition error difference score for two other ASD siblings

and one control sibling. The distribution of the shape error score was slightly positively

skewed but transformation was not considered necessary.

Median RT variables for all conditions demonstrated roughly normal

distributions, but again, an inhibition RT difference score, load RT difference score, and

inhibition + load RT difference score were also calculated. Five outliers were trimmed:

one control sibling’s inhibition RT difference score, one ASD sibling’s load and

inhibition + load RT difference scores, and the inhibition + load RT difference scores of

one ASD and one control sibling.

The overall N for the task was 126 (79 ASD siblings and 47 control siblings).

Table 25 displays the mean and SD of each group (and the significance of group

comparisons) for error and RT difference scores, and the shape error score. On the error

difference scores, there were no significant main effects of group or gender and no

significant group x gender interactions. There were no significant differences in any of

the individual conditions when error data were examined separately for each condition

(either overall or for brothers or sisters). The shape error score did not differ

significantly between ASD and control siblings, F(1, 122) = 1.90, p > .1, and there was

no significant effect of gender, F(1, 122) = .40, p > .1, and no significant interaction,

F(1, 122) = .01, p > .1. On both the RT difference scores and the separate RT data for

each condition, there were no significant main effects of group or gender and no

significant interactions. In subsequent analyses, only the error and RT difference scores

were used, and separate error and RT data for Conditions 1-3 were not included (nor

was gender included as a factor).

There were a number of significant correlations between age and IQ variables

and both error and RT difference scores from the RIL task. The inhibition + load error

difference score correlated significantly with VIQ, r = -.24, p < .01. The shape error

score correlated significantly with age, r = -.39, p < .001, PIQ, r = -.30, p < .01, and

VIQ, r = -.18, p < .05. The inhibition RT difference score was significantly correlated

with age, r = -.24, p < .01, and VIQ, r = -.24, p < .01. Finally, the inhibition + load RT

difference score correlated significantly with age, r = -.20, p < .05. Group differences

remained non-significant (or group was a non-significant predictor) for all of the above

variables, with the exception of the shape error score, when age and/or IQ variables

were partialled out using ANCOVA or logistic regression. For the shape error score,

group differences became significant when age, PIQ, and VIQ were introduced as

covariates in an ANCOVA (VIQ was not “blocked” and used as an additional IV

because the groups were matched on VIQ for the RIL task), F(1, 121) = 4.40, p < .05.

This indicates that when extraneous variance caused by age and IQ factors was

removed, ASD siblings were found to make a significantly higher number of errors than

control siblings on a measure of working memory on a task where inhibition was

required. Importantly, this group difference in the shape error score remained

significant when siblings who met full or partial ADI-R criteria were excluded from the sample,

F(1, 116) = 4.08, p < .05.

Table 25. RIL task results: Mean (and SD) of each sibling group, and significance of group comparisons, for error and RT difference scores and the shape error score

Variable                   ASD siblings      Control siblings   p    p with age/IQ control
Error difference scores:
  Inhibition               0.76 (2.80)       1.10 (3.39)        -
  Load                     0.73 (3.67)       0.35 (3.47)        -
  Inhibition + load        1.43 (3.88)       1.49 (3.50)        -    -
RT difference scores:
  Inhibition               159.35 (147.76)   140.92 (141.82)    -    -
  Load                     145.44 (132.66)   188.73 (137.83)    -
  Inhibition + load        304.52 (203.07)   329.05 (216.90)    -    -
Working memory measure:
  Shape error score        13.42 (13.57)     9.50 (12.02)       -    *

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.5 Opposite Worlds task

No transformations were required on Opposite Worlds task variables. Two ASD

siblings and two control siblings demonstrated outlying scores (two on the Same World

error score, one on the Same World time score, and one on both the Same and Opposite

World time scores), which were trimmed. Means and SDs for all variables are presented in Table 26.

The N for the task was 100 (56 ASD siblings and 44 control siblings). For the

error scores, a three-way repeated measures ANOVA was conducted with group and

gender as between-subjects factors and condition (Same World, Opposite World) as the

within-subjects factor. There was a significant main effect of condition, F(1, 96) =

19.77, p < .001, but there was no significant main effect of group, F(1, 96) = .24, p > .1,

or gender, F(1, 96) = 1.80, p > .1. There was a significant interaction between group

and condition, F(1, 98) = 4.63, p < .05, but no other significant interactions. Follow-up

simple effects analyses showed that there was no significant difference between the

groups in the Same World error score, t(98) = 1.31, p > .1, or the Opposite World error

score, t(98) = 1.02, p > .1; however, the control siblings demonstrated a significantly

larger error difference score than the ASD siblings (as reflected in the interaction).

Examination of the pattern of results suggested that this was due to a combination of

both the ASD siblings tending to make slightly more Same World errors than the

control siblings, and the control siblings making slightly more Opposite World errors

than ASD siblings.

Time scores were also analysed using a three-way repeated measures ANOVA

with group and gender as the between-subjects factors and condition as the within-

subjects factor. There was a significant main effect of condition, F(1, 96) = 182.56, p <

.001, but no significant effect of group, F(1, 96) = 1.55, p > .1, or gender, F(1, 96) =

.02, p > .1. The interaction between group and gender was not significant, F(1, 96) =

.02, p > .1; however, there was a marginally significant interaction between group and

condition, F(1, 96) = 3.93, p = .05, and a significant interaction between gender and

condition, F(1, 96) = 9.51, p < .01. The group x gender x condition interaction was not

significant, F(1, 96) = .27, p > .1. With regard to the group x condition interaction,

follow-up analyses showed that there was a marginally significant difference between

the groups in the Same World time score such that ASD siblings took slightly longer

than control siblings, t(98) = 1.81, p = .07, but no significant difference in the Opposite

World time score, t(98) = .77, p > .1. In terms of the gender x condition interaction,

follow-up analyses indicated that there was no significant gender difference in either the

Same World time score, t(98) = .64, p > .1, or the Opposite World time score, t(98) =

.98, p > .1, but brothers demonstrated a significantly larger time difference score than

sisters (as reflected in the interaction). The pattern of results suggested that this was due

to a tendency both for sisters to take slightly longer in the Same World condition and for

brothers to take slightly longer in the Opposite World condition.

Table 26. Opposite Worlds results: Mean (and SD) and significance of group comparisons for each sibling group for error/time scores in each condition and difference scores, and for each gender for time scores

Variable                       ASD siblings    Control siblings   p    p with age/IQ control
Error variables:
  Same World error score       1.22 (1.41)     0.87 (1.27)        -    -
  Opposite World error score   1.68 (1.86)     2.05 (1.67)        -    -
  Error difference score       0.42 (1.70)     1.12 (1.56)        *    *
Time variables:
  Opposite World time score    31.75 (8.95)    30.47 (7.39)       -    -
  Time difference score        6.19 (5.60)     6.94 (4.16)        -    *

Variable                       Brothers        Sisters            p    p with age/IQ control
  Opposite World time score    32.05 (9.03)    30.42 (7.57)       -    -
  Time difference score        7.97 (5.27)     5.24 (4.42)        **   **

* p < .05; ** p < .01; *** p < .001; - p > .05.

Note: The difference scores relate to the interaction term on repeated measures ANOVAs.

Age was significantly correlated with all task variables: the Same World error score, r =

-.26, p < .01, Opposite World error score, r = -.27, p < .01, Same World time score, r =

-.50, p < .001, and the Opposite World time score, r = -.56, p < .001. VIQ correlated

significantly with the Same World error score, r = -.20, p < .05, and the Opposite World

time score, r = -.27, p < .01. Age and VIQ were introduced as covariates (VIQ was

covaried because the groups were matched on VIQ for this task) in a two-way group x

condition repeated measures ANCOVA on the error scores and three-way ANCOVA on

the time scores (including gender as a between-subjects factor, as there were

interactions involving gender for the time scores). There was no change in any of the

results with the exception that the group x condition interaction on the time scores

increased in significance, F(1, 94) = 6.56, p < .05. This interaction remained significant

when siblings meeting full or partial criteria on the ADI-R were excluded, F(1, 90) =

6.94, p < .05. The interaction between group and condition for the error scores also

remained significant when siblings meeting ADI-R criteria were excluded, F(1, 94) =

6.69, p < .05 (age and VIQ were not covaried in this analysis as there was no difference

in the original result when these variables were accounted for).

6.3.1.6 Pattern Meanings

Individual error types were not analysed in this study (this level of detail was not

considered essential, particularly given that the ASD and control groups in Study One

did not show significant differences in error variables). As for Study One, the sum of

errors variable was skewed and was therefore log-transformed. The number of

correct responses variable was normally distributed.

There was no significant difference in the number of correct responses produced

by ASD siblings (M = 26.71, SD = 9.55) and control siblings (M = 26.52, SD = 8.13),

no significant effect of gender on this variable, and no significant group by gender

interaction; largest F(1, 171) = 1.90, p > .1. The sum of errors was not significantly

different between ASD siblings (Median = 4, Range = 0-42, prior to transformation) and

control siblings (Median = 5, Range = 0-56), F(1, 171) = 2.58, p > .1. There was a

trend for brothers to make more error responses than sisters, F(1, 171) = 3.65, p = .06,

but the interaction between group and gender was not significant for the sum of errors

variable, F(1, 171) = .16, p > .1.

The sum of errors was correlated with age, r = -.51, p < .001, and VIQ, r = -.24,

p < .01. Group comparison of the sum of errors remained non-significant in an

ANCOVA with group and VIQ level as the IVs and age as a covariate, F(1, 168) =

2.23, p > .1, and the interaction between group and VIQ level was not significant, F(2,

168) = .06, p > .1.

6.3.1.7 Uses of Objects

As for the Pattern Meanings task, individual error types were not analysed. No

transformation was necessary for the sum of errors variable, but four outliers were

trimmed (for 2 ASD siblings and 2 control siblings). The total number of correct

responses, as well as the number of correct responses for conventional and non-

conventional items separately, were all normally distributed. Table 27 displays means

and SDs for these variables for each sibling group, and the significance of group

comparisons.

One ASD sibling had missing data and was not included in analyses. In a three-

way repeated measures ANOVA on the number of correct responses with group and

gender as the between-subjects factors and condition (conventional, non-conventional)

as the within-subjects factor, there was a significant main effect of condition such that

more correct responses were generated in the non-conventional condition than in the

conventional condition, F(1, 170) = 305.84, p < .001, but no significant main effect of

group, F(1, 170) = .67, p > .1, or gender, F(1, 170) = 2.50, p > .1. There were no

significant interactions between any variables. Separate totals for conventional and

non-conventional items were not used in further analyses. In a two-way group x gender

ANOVA on the sum of errors, there was no main effect of group, F(1, 170) = 1.12, p >

.1. Brothers (M = 19.38, SD = 13.56) produced significantly more error responses than

sisters (M = 15.09, SD = 12.60), F(1, 170) = 5.21, p < .05, but there was no significant

interaction between group and gender, F(1, 170) = .01, p > .1.

The number of correct responses was significantly correlated with both age, r =

.51, p < .001 and VIQ, r = .24, p < .01. The sum of errors also correlated significantly

with both age, r = -.36, p < .001, and VIQ, r = -.22, p < .01. Group differences in both

correct responses and the sum of errors remained non-significant in ANCOVAs where

group and VIQ level were the IVs and age was covaried, and there were no significant

interactions between group and VIQ level in either analysis.

Table 27. Uses of Objects results: Mean (and SD) of each sibling group, and significance of group comparisons

Variable                     ASD siblings     Control siblings   p    p with age/IQ control
Correct responses:
  - Total                    25.57 (10.32)    27.55 (11.46)      -    -
  - Conventional items       19.57 (7.80)     21.0 (8.99)
  - Non-conventional items   22.76 (8.22)     24.34 (9.75)
Sum of errors                16.73 (12.69)    17.72 (14.04)      -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

6.3.1.8 Stamps task

Both the rule adherence and restriction scores demonstrated highly skewed distributions

and were recoded as dichotomous variables, in the same way as for Study One. For rule

adherence, a score between 0 and 6 inclusive was coded as 0 and a score of 7 or 8 was

coded as 1. For restriction, a score of 0 was left as 0 and a score between 1 and 8

inclusive was coded as 1. The complexity and originality scores were approximately

normally distributed. Means and SDs for the latter two variables and the proportion of

low scorers for the former two variables, along with the significance of group

comparisons for all scores, are presented in Table 28.
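The dichotomous recoding described above can be sketched as follows. This is a minimal illustration only: the function names and the example scores are hypothetical, and the 0 to 8 score range is assumed from the cut-offs given in the text.

```python
def recode_rule_adherence(score: int) -> int:
    """Rule adherence: scores 0-6 are coded 0, scores 7-8 are coded 1."""
    if not 0 <= score <= 8:
        raise ValueError("rule adherence score assumed to lie between 0 and 8")
    return 1 if score >= 7 else 0


def recode_restriction(score: int) -> int:
    """Restriction: a score of 0 stays 0, scores 1-8 are coded 1."""
    if not 0 <= score <= 8:
        raise ValueError("restriction score assumed to lie between 0 and 8")
    return 0 if score == 0 else 1


# Hypothetical raw scores, for illustration only (not thesis data).
rule_adherence_raw = [0, 6, 7, 8]
restriction_raw = [0, 1, 5, 8]

rule_adherence_coded = [recode_rule_adherence(s) for s in rule_adherence_raw]
restriction_coded = [recode_restriction(s) for s in restriction_raw]
```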

Table 28. Stamps task results: Mean (and SD) of each sibling group [or the percentage of low scorers for dichotomous variables], and significance of group comparisons

Variable                ASD siblings    Control siblings   p    p with age/IQ control
Complexity score        18.90 (3.28)    20.18 (3.86)       *    *
Originality score       4.21 (3.04)     4.34 (2.64)        -    -
Restriction score       [94.6]          [88.7]             -
Rule adherence score    [19.8]          [21.3]             -    -

* p < .05; ** p < .01; *** p < .001; - p > .05.

The N for the task was 154 (92 ASD siblings and 42 control siblings). There was a

significant group difference on the complexity score, F(1, 150) = 4.68, p < .05,

indicating that the ASD siblings produced less complex patterns than control siblings.

The effect of gender was not significant for this variable, F(1, 150) = .10, p > .1, and the

interaction between group and gender was not significant, F(1, 150) = .02, p > .1. In a

two-way ANOVA on the originality scores, there was no significant effect of group or

gender, and no significant interaction, largest F(1, 150) = .67, p > .1. Chi-square

analyses revealed that there was no significant group difference in the percentage of low

scorers on the restriction score, χ2 (1, N = 154) = 1.77, p > .1, or the rule adherence

score, χ2 (1, N = 154) = .05, p > .1. Results were the same when brothers and sisters

were analysed separately.

Originality scores were significantly correlated with both age, r = .46, p < .001,

and VIQ, r = .23, p < .01. Age also correlated significantly with complexity scores, r =

.44, p < .001, and rule adherence scores, r = .35, p < .001. In a two-way ANCOVA

with group and VIQ level as the IVs and age as a covariate, the group difference in the

originality score remained non-significant, F(1, 147) = .03, p > .1. The interaction

between group and VIQ level was not significant, F(2, 147) = .20, p > .1. When a

logistic regression was performed on the rule adherence score, group remained a non-

significant predictor when it was assessed independently of age, z = .01, p > .1. In an

ANCOVA with age as a covariate, the group difference in the complexity score

remained significant, F(1, 151) = 5.78, p < .05. The difference also remained

significant when participants meeting full or partial criteria on the ADI-R were excluded

from the sample, F(1, 142) = 4.0, p < .05. However, when participants with non-ASD

diagnoses were additionally excluded, the group difference in the complexity score

became only marginally significant, F(1, 131) = 2.96, p = .09.

6.3.1.9 Summary of sibling group comparisons

In summary, ASD siblings performed significantly more poorly than control siblings on

tasks measuring working memory (under conditions where inhibition was required) and

non-verbal generativity. When participants with non-ASD diagnoses were excluded,

the group difference on the non-verbal generativity measure became only marginally

significant. Sisters of ASD probands also made more errors than sisters of control

probands on the simple first stage of the IDED set-shifting task Learned Irrelevance

condition. Control siblings showed larger error and time difference scores on the

Opposite Worlds test, but this appeared to be a spurious result that was equally

attributable to ASD siblings performing slightly (but not significantly) more poorly on

the Same World condition and control siblings performing slightly (but not

significantly) more poorly on the Opposite World condition. There were no significant

group differences on measures of ToM, planning, set-shifting (i.e., in the EDS stages),

non-verbal inhibition, or verbal generativity.

All of the significant group differences remained significant when ASD siblings

meeting full or partial criteria on the ADI-R were excluded, which indicates that

criterion 4 for endophenotype status was met (see Section 6.1.1). However, as the

poorer performance on the simple first stage of the IDED set-shifting task in sisters of

ASD probands did not correspond with a deficit displayed by the ASD probands

themselves, this suggests that the weakness displayed by ASD sisters on this task was

not indicative of an endophenotype for ASDs (as it violates criterion 1). Therefore, this

variable is not included in subsequent analyses examining other criteria for

endophenotype status. In addition, because the group differences on the Opposite

Worlds error and time difference scores appeared to be spurious outcomes deriving

from two non-significant differences in opposite directions and therefore did not

represent meaningful strengths or weaknesses in the ASD sibling group (i.e., were not

candidates for an endophenotype), these variables were not included in subsequent

analyses either. Hence, the RIL task shape error score and the Stamps task complexity

score were the only two candidate endophenotype variables remaining.

6.3.2 Comparisons between ASD siblings and ASD probands

If these two variables are possible endophenotypes for ASDs, then it would be expected

that the performance displayed by ASD siblings would be poorer than that of control

siblings, but not as poor as that of ASD probands. To compare the severity of deficits

across these three groups, it was decided not to directly compare the scores, as the

groups were not matched on PIQ or VIQ (and therefore any differences could be

attributable to those variables). Instead, the effect sizes of the differences found

between ASD and control siblings were calculated and compared with the effect sizes of

the differences between the two proband groups in Study One. These two sets of effect

sizes are presented in Table 29. It is evident that the effect sizes for the sibling group

differences (both small effects) are smaller than those for the proband group differences

(both medium effects), consistent with predictions.

Table 29. Effect sizes, r (and d), of significant group differences between sibling groups and between proband groups

Measure                          ASD versus control siblings   ASD versus control probands
Working memory:
  RIL task shape error score     .15 (.31)                     .27 (.56)
Generativity:
  Stamps task complexity score   .18 (.36)                     .28 (.58)
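The text does not state how the d values were obtained from r. One common conversion is d = 2r / sqrt(1 - r^2); the sketch below applies it to the r values in Table 29, on the assumption (not confirmed by the source) that this or a similar formula was used. The results agree with the tabled d values to within about .01, which would be consistent with r having been rounded before reporting.

```python
import math


def d_from_r(r: float) -> float:
    """Convert a correlation effect size r to Cohen's d (standard formula)."""
    return 2 * r / math.sqrt(1 - r ** 2)


# r values as reported in Table 29; the tabled d values are kept for comparison.
table_29 = {
    "RIL shape error, siblings": (0.15, 0.31),
    "RIL shape error, probands": (0.27, 0.56),
    "Stamps complexity, siblings": (0.18, 0.36),
    "Stamps complexity, probands": (0.28, 0.58),
}

for label, (r, reported_d) in table_29.items():
    print(f"{label}: r = {r:.2f}, converted d = {d_from_r(r):.3f}, reported d = {reported_d:.2f}")
```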

6.3.3 Ability of cognitive variables to predict group membership

In order to examine whether either of the two candidate endophenotype variables was able to

discriminate ASD siblings from control siblings, a direct logistic regression was

performed with group as the outcome variable, and the RIL task shape error score and

Stamps task complexity score as the predictors. There were 65 ASD siblings and 42

control siblings with data for both predictor variables, and these limited groups were

matched on age (M = 11.27, SD = 2.62 for ASD siblings; M = 11.59, SD = 2.74 for

control siblings), t(105) = .61, p > .1, PIQ (M = 107.69, SD = 17.74 for ASD siblings; M

= 103.83, SD = 15.81 for control siblings), t(105) = 1.15, p > .1, and VIQ (M = 105.65,

SD = 14.60 for ASD siblings; M = 107.43, SD = 13.33 for control siblings), t(105) =

.64, p > .1. Thus, none of these matching variables were included in the regression.

A test of the full model with both predictors against a constant-only model was

statistically reliable, χ2 (2, N = 107) = 8.69, p < .05, indicating that the two predictors

together reliably distinguished ASD from control siblings. 90.8% of ASD siblings and

35.7% of control siblings were classified correctly by the model, with an overall success

rate of 69.2%. This pattern of results suggests that the model was sensitive to ASD

sibling group membership, but not specific. Table 30 presents regression coefficients,

Wald statistics, odds ratios, and their 95% confidence intervals for each predictor.

According to the Wald criterion, only the Stamps task complexity score was a

significant individual predictor of group membership. Of note, when age, PIQ, and VIQ

were included in the regression, the RIL task shape error score also became marginally

significant as a predictor, z = 2.74, p = .098 (note that the group difference in the shape

error score also became significant only after the age and IQ variables were controlled).
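The classification figures above can be checked with simple arithmetic. In the sketch below, the counts of correctly classified siblings are inferred from the reported percentages and group sizes; they are not reported directly in the text. Treating ASD sibling status as the "positive" class, 90.8% corresponds to sensitivity and 35.7% to specificity.

```python
# Group sizes and percentages as reported in the text.
n_asd, n_control = 65, 42
sensitivity, specificity = 0.908, 0.357

# Inferred counts of correctly classified siblings (an assumption from rounding).
correct_asd = round(sensitivity * n_asd)
correct_control = round(specificity * n_control)

# Overall success rate across both groups.
overall = (correct_asd + correct_control) / (n_asd + n_control)
print(f"Correct ASD siblings: {correct_asd} of {n_asd}")
print(f"Correct control siblings: {correct_control} of {n_control}")
print(f"Overall success rate: {overall:.1%}")
```

The low specificity means that most control siblings were classified as ASD siblings, so the overall rate largely reflects the larger ASD sibling group.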

Table 30. Results of logistic regression analysis of sibling group membership

                                                                 95% C.I. for odds ratio
Variables              B      Wald test (z-ratio)   Odds ratio   Lower    Upper
RIL task:
  Shape error score    -.02   1.85                  .98          .95      1.01
Stamps task:
  Complexity score     .20    5.16*                 1.22         1.03     1.44

*p < .05; ** p < .01; *** p < .001.
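In logistic regression the odds ratio for a predictor is the exponential of its coefficient B. The short check below applies this standard relationship to the coefficients in Table 30; it is an illustration of how the columns relate, not a reanalysis of the data.

```python
import math

# Coefficients (B) as reported in Table 30.
coefficients = {
    "RIL shape error score": -0.02,
    "Stamps complexity score": 0.20,
}

# Odds ratio = exp(B): the multiplicative change in odds per one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.2f}")
```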

6.3.4 Proband-sibling relationships within the ASD families

To examine correlations between ASD probands’ and ASD siblings’ scores on cognitive

measures, data were used from one sibling in each family who was closest in age to the

proband. Correlations were conducted for all ToM and EF variables regardless of

whether or not they were tasks on which ASD siblings demonstrated weaknesses, as it

was possible that sibling performances could covary with proband performances even if

the sibling performances were in the normal range. Because of concerns that age would

strongly mediate relationships between probands’ and siblings’ scores, age-scaled

scores were calculated for use in all correlations. For each variable, the regression

equation: predicted score = slope x age + intercept was calculated using the combined

control proband and sibling samples. If the relationship between the variable and age

was curvilinear, the log of age was used in this equation instead, if it resulted in a more

linear relationship (this was the case for ToL rule violations, IDED set-shifting Learned

Irrelevance condition EDS stage errors, the sum of errors on the Pattern Meanings and

Uses of Objects tasks, and the Stamps task complexity score). ASD participants (both

probands and siblings) then had their scores converted to age-scaled z-scores by

subtracting the predicted score from the obtained score and dividing by the standard

error of the estimate. Because linear regression equations could not be calculated for

dichotomous variables, some of these were not used in the proband-sibling correlations

(all IDED set-shifting variables except the EDS stage in the Learned Irrelevance

condition, and the Stamps task rule adherence and restriction scores). For variables

which were not excessively skewed before dichotomisation, the original continuous

form of the variable was used in correlations (the false belief aggregate score, ToL rule

violations, and the IDED Learned Irrelevance EDS stage errors).
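The age-scaling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and data are hypothetical, and the standard error of the estimate is computed with n - 2 degrees of freedom, which is the usual definition but is not specified in the text.

```python
import math


def fit_age_norms(ages, scores, use_log_age=False):
    """Fit predicted score = slope * age + intercept on the control sample.

    Returns (slope, intercept, see), where see is the standard error of the
    estimate. If use_log_age is True, log(age) replaces age as the predictor.
    """
    x = [math.log(a) for a in ages] if use_log_age else list(ages)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, scores))
    see = math.sqrt(ss_res / (n - 2))
    return slope, intercept, see


def age_scaled_z(score, age, slope, intercept, see, use_log_age=False):
    """z = (obtained score - score predicted from age) / standard error of estimate."""
    x = math.log(age) if use_log_age else age
    return (score - (slope * x + intercept)) / see


# Hypothetical control data (ages in years, raw task scores), not thesis data.
control_ages = [6, 8, 10, 12, 14]
control_scores = [10, 14, 17, 22, 25]
slope, intercept, see = fit_age_norms(control_ages, control_scores)

# Hypothetical ASD participant aged 10 with a raw score of 16.
z = age_scaled_z(16, 10, slope, intercept, see)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, z = {z:.2f}")
```

Fitting the equation on controls only means that a z-score of zero represents the score expected of a typically developing child of the same age.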

6.3.4.1 Correlations between proband IQ and siblings’ cognitive

performances

The relationship between ASD probands’ level of functioning and their siblings’

performances on cognitive tasks was assessed by calculating correlations between

probands’ IQ scores and siblings’ scores on cognitive measures. This was particularly

important as the probands of the sibling groups were not matched on either VIQ or PIQ

(as probands who were part of the WAFSASD but were too low-functioning to

participate in Study One had siblings who were included in Study Two). The IQ data

from the low-functioning probands who were not included in Study One (but whose

siblings were included in Study Two) were included in these correlations. Age-scaled

scores were used for the siblings’ scores, but no z-scores were calculated for probands’

IQs (as these were already age-scaled). If raw correlations were significant, partial

correlations were also conducted where sibling VIQ and PIQ were controlled, to ensure

that the correlations were not mediated by relationships between proband and sibling IQ

(the correlations between proband and sibling PIQ and VIQ were both significant; PIQ:

r = .25, p < .01; VIQ: r = .30, p < .01).

Table 31. Raw and partial correlations between proband PIQ and VIQ and siblings’ scores on ToM and EF measures

                                                      Proband IQ score
                                                PIQ                VIQ
Sibling score on cognitive task             Raw      Partial   Raw      Partial
False belief aggregate score                .45**    .25*      .32*     .08
ToL:
  Adjusted extra moves score                .24*     .08       .26*     .07
  Rule violations                           -.11               -.10
IDED set-shifting Learned Irrelevance condition:
  EDS stage errors                          .01                .16
RIL task:
  Inhibition error difference score         -.13               -.29*    -.05
  Load error difference score               -.13               .0
  Inhibition + load error diff. score       -.20               -.25
  Inhibition RT difference score            -.15               -.09
  Load RT difference score                  .24                .24
  Inhibition + load RT diff. score          .06                .10
  Shape error score                         -.05               -.01
Opposite Worlds:
  Error difference score                    .05                -.01
  Time difference score                     -.19               .03
Pattern Meanings:
  Correct responses                         -.07               -.12
  Sum of errors                             -.22               -.02
Uses of Objects:
  Correct responses                         .03                -.06
  Sum of errors                             -.20               -.16
Stamps task:
  Complexity score                          -.11               .05
  Originality score                         -.08               -.07

*p < .05; ** p < .01; *** p < .001.

Results of these correlations are displayed in Table 31. Siblings’ false belief aggregate

score and ToL adjusted extra moves score both correlated significantly with both the

PIQ and VIQ of probands, and the RIL task inhibition error difference score correlated

significantly with proband VIQ. However, when sibling PIQ and VIQ were controlled,

only the correlation between proband PIQ and siblings’ false belief aggregate score

remained significant (this correlation also remained significant when ASD siblings

meeting full or partial ADI-R criteria were excluded). None of the variables on which

significant differences between sibling groups were observed correlated significantly

with either proband PIQ or VIQ, suggesting that the non-matching of the autistic

probands of Study Two’s participants did not affect the outcome of group comparisons

in this study. Overall, these results indicate that the ToM performance of siblings of

autistic individuals was related to the level of functioning of probands, but EF variables

were not.

6.3.4.2 Correlations between probands’ and siblings’ cognitive

performances

Correlations between probands’ and siblings’ cognitive task performances were limited

to the sample of siblings whose autistic brother or sister participated in Study One.

Only correlations between identical task variables for each family member were

examined (rather than all correlations between different tasks). Surprisingly, there were

no significant raw correlations between probands’ and siblings’ scores on any variable

(therefore, no partial correlations were conducted). This suggests that ToM and EF

performances were not strongly familial.

6.3.5 Prevalence of deficits in ASD siblings

The prevalence of deficits in ASD siblings on the two potential endophenotype

variables was calculated in the same way as the universality of deficits in Study One.

The proportion of ASD siblings scoring below the 16th percentile of control siblings (or

above the 84th percentile in the case of the error variable) was 24.1% on the RIL task

shape error score, and 19.6% on the Stamps task complexity score. Therefore,

impairments on these variables clearly only occurred in a subset of ASD siblings.
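The prevalence calculation can be illustrated with the sketch below, which counts the proportion of one group falling beyond a percentile cut-off defined on the control group. The percentile method (linear interpolation), the function names, and the data are all assumptions for illustration; the text does not specify how percentiles were computed.

```python
def percentile(values, p):
    """Percentile by linear interpolation (one of several common definitions)."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (k - lower)


def prevalence_below(sibling_scores, control_scores, p=16):
    """Proportion of siblings scoring below the p-th percentile of controls."""
    cutoff = percentile(control_scores, p)
    return sum(s < cutoff for s in sibling_scores) / len(sibling_scores)


def prevalence_above(sibling_scores, control_scores, p):
    """Proportion of siblings scoring above the p-th percentile of controls
    (for variables such as error scores, where high values indicate poor performance)."""
    cutoff = percentile(control_scores, p)
    return sum(s > cutoff for s in sibling_scores) / len(sibling_scores)


# Hypothetical scores, for illustration only (not thesis data).
control = list(range(1, 102))
asd_siblings = [10, 15, 17, 20, 30]
rate = prevalence_below(asd_siblings, control)
print(f"Proportion below the control 16th percentile: {rate:.1%}")
```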

6.3.6 Correlations between ToM and EF

Although no ToM deficit was identified in the ASD sibling group as compared with

control siblings, suggesting that the notion of a “ToM-impaired, EF-intact” subgroup

was somewhat invalid, correlations between ToM and EF were still of interest to

determine whether ASD siblings showed an unusual pattern of association between the

two domains (which may suggest that they used different strategies to solve the tasks,

even if no performance decrement was observed). Correlations between ToM and EF

variables were calculated separately for the ASD and control sibling groups. As in

Study One, partial correlations (controlling for the effects of age, VIQ and PIQ) were

also conducted if significant raw correlations were observed.
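A partial correlation of the kind described above removes the linear influence of the control variables from both measures before correlating them. The sketch below uses the standard recursive formula for partialling out one covariate at a time; the function names and data are hypothetical, and the thesis analyses were presumably run in a statistics package rather than with code like this.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation between x and y controlling for each list in covariates."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data in which x and y are related only through a shared
# covariate (standing in for age); noise terms are mutually uncorrelated.
age = [-1, -1, -1, -1, 1, 1, 1, 1]
noise_x = [-1, -1, 1, 1, -1, -1, 1, 1]
noise_y = [-1, 1, -1, 1, -1, 1, -1, 1]
tom = [2 * a + e for a, e in zip(age, noise_x)]
ef = [2 * a + e for a, e in zip(age, noise_y)]

raw_r = pearson(tom, ef)
partial_r = partial_corr(tom, ef, [age])
print(f"raw r = {raw_r:.2f}, partial r = {partial_r:.2f}")
```

In this constructed example the raw correlation is substantial but the partial correlation is essentially zero, which is the pattern described in the text as a correlation being mediated by age or IQ.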

Table 32 presents the raw and relevant partial correlations between ToM and EF

task variables within control siblings. As for Study One, correlations are displayed

separately for the various false belief variables rather than the overall aggregate score

because the pattern of correlations was different for the three tasks. In the control

sibling group, simple false belief task performance correlated with measures of planning

and verbal generativity (with all correlations in the expected direction, such that poor

false belief performance correlated with poor EF task performance); however, when age,

VIQ and PIQ were controlled for, none of these correlations remained significant. First-

order false belief task performance correlated with measures of planning, non-verbal

inhibition (with working memory load), working memory (with inhibition

requirements), verbal inhibition, and both verbal and non-verbal generativity (all in the

expected direction with the exception of the RIL task load error difference score).

However, when age and IQ variables were partialled out, correlations remained

significant only with measures of planning, verbal inhibition, and non-verbal

generativity. Second-order false belief task performance correlated with measures of

planning and verbal and non-verbal generativity (all in the expected direction); with

age, PIQ and VIQ controlled, there were no significant correlations with planning

measures.

Overall, in the control group, ToM variables demonstrated significant

relationships with all EF domains measured except for set-shifting, but several of the

correlations were mediated by age and IQ (there were no significant partial correlations

with non-verbal inhibition measures). All correlations were in the expected direction,

such that poorer performance on EF tasks correlated with poorer false belief task

performance. Ceiling effects on the simple false belief task resulted in a paucity of

significant correlations with EF variables for that task.

Table 32. Raw and partial correlations between ToM and EF variables within control siblings

                                                      False belief task
                                       Simple             First-order        Second-order
EF task                                Raw      Partial   Raw      Partial   Raw      Partial
ToL (n = 61):
  Adj. extra move score                -.26*    -.01      -.61***  -.32*     -.55***  -.21
  Rule violations                      -.01               -.49***  -.22      -.37**   -.05
IDED Set-shifting Perseveration condition (n = 42):
  EDS stage errors                     a                  -.17               .01
IDED Set-shifting Learned Irrelevance condition (n = 42):
  EDS stage errors                     a                  -.01               -.01
RIL task (n = 41):
 Error difference scores:
  Inhibition                           a                  -.13               -.09
  Load                                 a                  .32*     .31       -.07
  Inhibition + load                    a                  .18                -.15
 RT difference scores:
  Inhibition                           a                  .09                .15
  Load                                 a                  .22                -.01
  Inhibition + load                    a                  .20                .09
 Shape error score                     a                  -.35*    -.21      -.13
Opposite Worlds (n = 43):
  Error diff. score                    a                  -.34*    -.42**    -.13
  Time diff. score                     a                  -.38*    -.32*     -.12
Pattern Meanings (n = 61):
  Correct responses                    .14                .20                .25*     .28*
  Sum of errors                        -.15               -.34**   -.03      -.30*    .02
Uses of Objects (n = 61):
  Correct responses                    .31*     .17       .36**    .13       .52***   .34**
  Sum of errors                        -.02               -.23               -.19
Stamps task (n = 61):
  Complexity score                     .23                .45***   .33*      .41**    .26*
  Originality score                    .06                .28*     .07       .24
  Restriction score                    -.16               -.03               -.05
  Rule adherence score                 -.25               -.16               -.19

* p < .05; ** p < .01; *** p < .001.

Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the ToM tasks.

a = No correlation could be calculated as all participants had perfect scores on the false belief task.

For ASD siblings, correlations were conducted both before and after excluding

individuals meeting ADI-R criteria for an ASD. For the sake of brevity, only

correlations after exclusion of this subgroup are reported, as priority was given to

determining the pattern characteristic of the broad phenotype without any siblings with

ASDs in the sample (see footnote 3). Table 33 displays these raw and partial correlations. In this

group, simple false belief task performance correlated with two measures of non-verbal

generativity (with both correlations in the expected direction), both of which remained

significant when age and IQ variables were partialled out. First-order false belief task

performance correlated with variables from all EF domains tested except for set-shifting

and verbal generativity (all in the expected direction), but only the correlations with one

planning measure, non-verbal inhibition (with a working memory load), and one non-

verbal generativity measure were significant when age and IQ were controlled. Second-

order false belief task performance correlated with measures of planning, non-verbal

inhibition, and verbal and non-verbal generativity (all in the expected direction), but

only one correlation with a non-verbal inhibition measure remained significant when

age and ability variables were partialled out.
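As an illustration of what "partialling out" involves, the following is a minimal sketch of a partial correlation computed with the standard recursive formula, removing one covariate at a time. It is not the analysis code used in this study (the statistical software is not specified here); the function names and the toy data are assumptions introduced purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y with every covariate partialled out.

    Applies the recursive partial correlation formula, peeling off
    one covariate per recursion step.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical, not from the study): both scores rise with age,
# plus noise terms that are unrelated to age and to each other.
age = [0, 1, 2, 3, 4, 5, 6, 7]
tom = [a + e for a, e in zip(age, [1, -1, -1, 1, 1, -1, -1, 1])]
ef = [a + e for a, e in zip(age, [1, 1, -1, -1, -1, -1, 1, 1])]

raw = pearson(tom, ef)                  # strong raw association driven by age
partial = partial_corr(tom, ef, [age])  # association left once age is removed
```

In this toy case the raw correlation is high only because both scores depend on age, so the partial correlation falls to approximately zero. In the analyses reported here the covariate list would correspond to age, VIQ, and PIQ.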

Overall, the ASD siblings showed a pattern of raw correlations fairly similar to that of

the control siblings. However, partial correlations showed a different pattern from

controls, with the ASD siblings showing no significant partial correlations between

ToM measures and verbal inhibition or verbal generativity variables, but demonstrating

significant partial correlations of ToM variables with measures of non-verbal inhibition.

Like control siblings, ASD siblings also showed significant partial correlations between

ToM variables and measures of planning and non-verbal generativity.

Table 34 presents a summary of the significant partial correlations between ToM

and EF domains in the control and ASD sibling groups as well as the control and ASD

proband groups from Study One. The most striking aspect of this table is the clear

relative absence of significant ToM-EF correlations in ASD probands compared with all

other groups. It also shows that while the pattern of correlations displayed by ASD

siblings did not mirror that demonstrated by the ASD probands, it was also qualitatively

different to the pattern displayed by control siblings. Nevertheless, it is additionally

evident that the control groups from both studies did not show identical patterns of

correlation (this is discussed further in Section 6.4.2).

Footnote 3: When siblings with ASDs were included, the pattern of correlations was similar.

Table 33. Raw and partial correlations between ToM and EF variables within ASD siblings

False belief task
EF task    Simple    First-order    Second-order

ToL (n = 78):
  Adj. extra move score    -.15  -.29**  -.04  -.34**  -.17
  Rule violations    -.20  -.24*  -.23*  -.13
IDED Set-shifting Perseveration condition (n = 58):
  EDS stage errors    a  .16  -.04
IDED Set-shifting Learned Irrelevance condition (n = 59):
  EDS stage errors    a  .01  .15
RIL task (n = 57):
  Error difference scores:
    Inhibition    a  .0  -.34**  -.31*
    Load    a  -.41**  -.42**  .11
    Inhibition + load    a  -.42**  -.40**  -.20
  RT difference scores:
    Inhibition    a  -.07  -.02
    Load    a  .20  .04
    Inhibition + load    a  .09  .02
  Shape error score    a  -.32*  -.07  -.19
Opposite Worlds (n = 45):
  Error diff. score    a  -.09  -.19
  Time diff. score    a  -.29*  -.09  -.25
Pattern Meanings (n = 79):
  Correct responses    .14  -.02  -.01
  Sum of errors    -.09  -.21  -.15
Uses of Objects (n = 79):
  Correct responses    .15  .21  .30**  .08
  Sum of errors    -.10  -.19  -.08
Stamps task (n = 76):
  Complexity score    .33**  .24*  .38**  .30*  .24*  .09
  Originality score    .22  .30**  .15  .32**  .16
  Restriction score    .05  -.10  -.16
  Rule adherence score    -.35**  -.28*  -.28*  -.22  -.08

* p < .05; ** p < .01; *** p < .001.

Note: Partial correlations controlled for age, VIQ and PIQ. All tests were two-tailed. Ns listed for each task show the sample size for correlations with the ToM tasks.

a = No correlation could be calculated as all participants had perfect scores on the false belief task.

Table 34. Summary of partial correlations between ToM and EF variables in the control and ASD probands and siblings

EF domain    Control siblings    Control probands

Planning

Set-shifting

Inhibition – Non-verbal *

Inhibition – Verbal

Working Memory

Generativity – Verbal

Generativity – Non-verbal

* Correlations marked with an asterisk were in the opposite direction to that expected.

Note: Each tick represents one significant correlation between a false belief and an EF

variable in that domain.

6.3.7 Dissociations between ToM and EF

The presence of ToM-EF dissociations in the ASD sibling group was assessed in the

same way as for Study One. It should again be noted that because ToM was not

impaired in ASD siblings relative to the control siblings, the presence of any “ToM-

impaired, EF-intact” dissociations is somewhat misleading in that ToM was not

impaired in the group as a whole. However, these calculations were still of interest as it

may have been the case that those siblings showing EF deficits were also more likely to

be low scorers on ToM tasks. As in Study One, the false belief alternative aggregate

score was used as the measure of ToM performance (14.8% of control siblings were low

scorers on this variable, indicating that defining ASD siblings with low scores as

“impaired” was comparable with the definition of impairment for continuous variables).

The two candidate endophenotype EF variables were analysed separately. The results

of these calculations are displayed in Table 35, and demonstrate that ToM and EF

impairments did not always co-occur in the same ASD siblings. Rather, the EF-

impaired siblings were equally or more likely to show intact ToM than impaired ToM.

However, it is also notable that both the ToM-impaired and the EF-impaired siblings

were more likely than the sibling group as a whole to demonstrate impairments in the

other domain (e.g., 55.6% of ASD siblings showing impaired non-verbal generativity

also scored poorly on the false belief aggregate, compared with 19.5% of the ASD

sibling group as a whole).

Table 35. The incidence of ToM-EF dissociations in the ASD siblings

EF measure                      % of ToM-impaired ASD siblings with unimpaired EF
RIL task shape error score      50.0    4
Stamps task complexity score    52.9    9

EF measure                      % of EF-impaired ASD siblings with unimpaired ToM
RIL task shape error score      88.9    18
Stamps task complexity score    44.4    18
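The percentages in Table 35 are conditional proportions: among siblings classified as impaired in one domain, the share classified as unimpaired in the other. The following is a minimal sketch of that calculation, assuming each sibling has already been flagged as impaired or unimpaired on each measure. The function name and the six example siblings are hypothetical and do not reproduce the study data.

```python
def dissociation_rates(tom_impaired, ef_impaired):
    """Return two percentages from paired impairment flags.

    First value: % of ToM-impaired cases with unimpaired EF.
    Second value: % of EF-impaired cases with unimpaired ToM.
    Assumes at least one impaired case in each domain.
    """
    pairs = list(zip(tom_impaired, ef_impaired))
    n_tom = sum(1 for t, _ in pairs if t)
    n_ef = sum(1 for _, e in pairs if e)
    tom_only = sum(1 for t, e in pairs if t and not e)
    ef_only = sum(1 for t, e in pairs if e and not t)
    return 100 * tom_only / n_tom, 100 * ef_only / n_ef


# Hypothetical flags for six siblings (illustration only).
tom_flags = [True, True, False, False, False, False]
ef_flags = [True, False, True, True, False, False]

tom_with_intact_ef, ef_with_intact_tom = dissociation_rates(tom_flags, ef_flags)
```

Here one of the two ToM-impaired siblings has intact EF (50%), and two of the three EF-impaired siblings have intact ToM (about 67%).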

6.3.8 Results from behavioural measures

Both the SBQ (which measures current social behaviours) and the RBQ (which

measures the lifetime presence of repetitive behaviours) were completed by parents of

siblings (both are described in Chapter 3, Section 3.5). The ASQ was not used as a

measure of subclinical behavioural traits as it was not considered valid for use in this

way, being designed to discriminate individuals with autism from typically developing

individuals rather than measure the severity of any autistic-like symptomatology in

typically developing individuals. As the RBQ was originally intended only as a

screening measure (not enough siblings demonstrated an adequate number of repetitive

behaviours for analyses using the RBI to be meaningful), only the overall sum was used

rather than composite scores for each behavioural category.

Comparisons between ASD and control siblings on the SBQ and RBQ were

conducted to assess whether there was an increased incidence of behavioural

symptomatology in ASD siblings. These comparisons were conducted both with the

overall group and with siblings meeting full or partial ADI-R criteria excluded. The

summary variables from both measures were highly skewed and were not amenable to

transformation, so non-parametric tests (Mann-Whitney U) were used for group

comparisons. Two ASD siblings and three control siblings had missing data on all three

questionnaires and were not included in analyses.
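For readers unfamiliar with the Mann-Whitney U statistic, the sketch below computes it by direct pair counting, with ties contributing half a point. This is an illustrative assumption about how U may be obtained, not the procedure of the statistical package used here, and it omits the p-value. Packages also differ in which of the two U values they report.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the (a, b) pairs in which the value from sample a is
    larger, with tied pairs counted as 0.5. The two statistics always
    sum to len(a) * len(b).
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    return u_a, len(a) * len(b) - u_a


# Toy scores (hypothetical): complete separation, then a sample with a tie.
separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
tied = mann_whitney_u([1, 2], [2, 3])
```

Because the two U values sum to the product of the group sizes, groups of 106 and 64 give a maximum possible U of 6784, and a U near half that value indicates little separation between the groups.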

On the SBQ overall sum, there was no significant difference between ASD and

control siblings, U = 3176.5, N1 = 106, N2 = 64, p > .1, and the difference remained

non-significant when siblings meeting full or partial ADI-R criteria were excluded, U =

2710.0, N1 = 97, N2 = 64, p > .1. On the RBQ, however, there was a trend for parents

to report more repetitive behaviours in control siblings than in ASD siblings, U =

2818.0, N1 = 106, N2 = 64, p = .06, and this difference became significant when

siblings meeting ADI-R criteria for an ASD were excluded, U = 2320.5, N1 = 97, N2 =

64, p < .01. One explanation for these unexpected findings (see footnote 4) could be that parents of a

child with autism were more likely to under-report autistic-like symptomatology in their

non-autistic children, as their benchmark for comparison (e.g., what might be

considered to be repetitive use of language) was set much higher. This seems more

likely than ASD siblings actually displaying less behavioural symptomatology than

control siblings, given previous research on the broad behavioural phenotype (see

Chapter 5, Section 5.2.1). Of note, ASD siblings meeting full or partial ADI-R criteria

scored significantly higher than remaining ASD siblings on both the SBQ and RBQ, as

would be expected, which suggests that these measures successfully discriminated

individuals with ASD diagnoses from those without ASD diagnoses, but were not

accurate measures of symptom severity in individuals without ASD diagnoses.

While it was initially intended to use data from these two questionnaire

measures to examine correlations between cognitive and behavioural measures within

the ASD sibling group, the outcomes of these group comparisons suggest that the

behavioural data are not likely to be a valid indicator of behavioural severity, making

correlations difficult to interpret. When these correlations were conducted with the

overall ASD sibling sample, there were a number of significant correlations between

both ToM and EF variables and behavioural measures (most of which remained

significant when age, PIQ, and VIQ were partialled out), but when siblings meeting full

or partial ADI-R criteria were excluded, many of these correlations became non-

significant. Therefore, because of concerns about their interpretation, these correlations

are not reported.

Footnote 4: When results from the ASQ were analysed, a similar pattern emerged: when siblings with ASD diagnoses were excluded, it was found that control siblings scored more highly than ASD siblings in both the communication and interests domains as well as overall.

6.4 Discussion

6.4.1 Endophenotype status of ToM and EF impairments

In the introduction to this chapter, six features of endophenotypes were described, with

criteria 4, 5, and 6 being tested in the current study. Did any ToM or EF variables meet

these three criteria?

i) Criterion 4. The first criterion tested in this study was that the

endophenotype should be found in siblings of probands with ASDs at a higher rate than

in the general population (or in this case, a higher rate than siblings of control

probands). Group comparisons between siblings of individuals with ASDs and

siblings of controls revealed few significant differences. There were no significant

group differences on measures of ToM, planning, set-shifting, non-verbal inhibition, or

verbal generativity. However, weaknesses in working memory (within an inhibition

task) and non-verbal generativity emerged as the two main candidate endophenotypes.

Importantly, group differences on these variables remained significant when siblings

with ASDs were excluded. Although the group difference in non-verbal generativity

became only marginally significant when individuals with non-ASD diagnoses were

excluded, this does not necessarily mean that a non-verbal generativity deficit in the

broad phenotype is an unimportant artefact of pathology unrelated to ASDs, but may

reflect the possibility that the broad phenotype is itself characterised by higher rates of

non-ASD diagnoses and these individuals also display a more abnormal cognitive

profile (see footnote 5).

Sisters of ASD probands also performed significantly more poorly than control

sisters on the simple first stage of the IDED set-shifting task (in the Learned Irrelevance

condition). As there were no significant differences observed in later stages of the task

which involve shifting set, this difference probably reflects either attentional or

motivational differences rather than a deficit in a component of EF. As previously

stated, the fact that ASD probands did not display a deficit on this simple first stage

variable further indicates that it is not likely to represent a useful endophenotype for

ASDs. It is unclear why the difference occurred only in sisters and only in one of the

task conditions, but it may have been that ASD sisters were more prone to fatigue (the

Learned Irrelevance condition was administered after the Perseveration condition).

Footnote 5: Of the ASD sibling group, 9.3% had a non-ASD diagnosis compared with 4.5% of control siblings.

There were no other significant interactions between group and gender, suggesting that

brothers and sisters of ASD probands were equally susceptible to any inherited

cognitive weaknesses.

Unexpectedly, control siblings showed significantly larger error and time

difference scores on the Opposite Worlds task, which is superficially suggestive of a

strength in verbal inhibition in ASD siblings. However, this difference resulted from a

non-significant weakness in ASD siblings in the control condition combined with a non-

significant weakness in control siblings in the inhibition condition. As the control

siblings were not actually significantly poorer in the inhibition condition, it would be

difficult to argue that the result reflects a strength in inhibition in the ASD siblings, and

it appears more likely that it was a rather spurious outcome of two non-meaningful but

additive differences.

Overall, then, the results relevant to criterion 4 were consistent with the

prediction, based on the results of Study One, that EF deficits would be more likely to

demonstrate superior relative primacy over a ToM deficit as measured by their presence

in siblings of individuals with ASDs. Non-autistic siblings of ASD probands exhibited

a broad cognitive phenotype characterised by weaknesses in the non-verbal generation

of novel ideas and working memory performance in situations combining working

memory and inhibitory requirements, but no impairment in ToM. However, there are a

number of caveats and additions to this conclusion. Firstly, the sensitivity of the ToM

tasks used in this study may not have been sufficient to detect subtle weaknesses in

mentalising abilities (although while ceiling effects were observed on the simple false

belief task, there was a significant proportion of both ASD and control siblings who

showed unstable performance on the other tasks). This limitation is particularly

pertinent given the wide age range of participants in this study, which was larger than in

Study One. The use of more advanced and naturalistic ToM tasks would strengthen future

broad phenotype studies (previous findings using higher-level ToM tasks are discussed

further below). Nevertheless, it is apparent that the broad autism phenotype is not

characterised by a significant impairment in basic ToM abilities. Secondly, the

prediction that the EF variables which showed the strongest evidence of primacy in

Study One (i.e., verbal inhibition and verbal generativity) would be the most likely to

emerge as endophenotypes in siblings was not borne out. ASD siblings did not show

impairments in either verbal inhibition or verbal generativity (or in planning, which was

also found to be impaired in probands). These negative findings call into question the

primacy of those EF domains to ASDs – although alternatively, it is possible either that

i) the tasks used in these domains were also lacking in sensitivity, or ii) impairments in

these domains as well as in ToM always result in an ASD phenotype, and therefore are

not seen in a milder form in unaffected relatives. Thirdly, only non-verbal generativity

performance significantly predicted membership in the ASD sibling group (although the

RIL task shape error score also became a marginally significant predictor when age and

IQ variables were included in the regression); and while the two variables together

successfully predicted membership of the ASD sibling group in 90.8% of cases, they

also misclassified 64.3% of control siblings. Hence, their utility as endophenotypes is

limited by their poor uniqueness or specificity as markers of genetic vulnerability.
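The trade-off described above can be made explicit with a small worked calculation. The sketch below assumes that the 90.8% figure corresponds to sensitivity (ASD siblings correctly classified) and that the 64.3% figure is the false-positive rate among control siblings; this reading is an interpretation of the text, not a reported statistic. The counts passed to the function are hypothetical and serve only to show the arithmetic.

```python
def classification_rates(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity %, specificity %) from a 2x2 classification table."""
    sensitivity = 100 * true_pos / (true_pos + false_neg)
    specificity = 100 * true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Specificity implied by misclassifying 64.3% of control siblings.
implied_specificity = 100 - 64.3

# Hypothetical counts (not the thesis data): 10 ASD siblings, 10 controls.
sens, spec = classification_rates(true_pos=9, false_neg=1, true_neg=4, false_pos=6)
```

Under this reading the implied specificity is about 35.7%, which is why high accuracy for the ASD sibling group does not by itself make the two variables useful markers.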

How do the results of sibling group comparisons compare with previous studies?

As reviewed in Section 5.2.2.2 of Chapter 5, studies on cognitive deficits in siblings of

autistic probands have generally found fewer and smaller differences than studies with

parents, and in that sense this study is consistent with other sibling studies (see footnote 6). Only one

previous study has employed similar false belief tasks with siblings of children with

autism (Ozonoff et al., 1993), which also found no evidence of mentalising deficits.

However, the ToM tasks used in both Ozonoff et al.’s and the current study may not

have been difficult or high-level enough to detect subtle weaknesses, especially in older

siblings. Dorris et al.’s (2004) finding of impaired performance on the higher-level

Eyes Task in siblings of children with Asperger syndrome suggests that ToM

difficulties may indeed be revealed if more advanced ToM tasks are used, although the

“purity” and validity of the Eyes task as a measure of ToM is questionable (Dorris et al.,

2004).

EF in siblings of probands with ASDs has been investigated in two previous

studies (Hughes et al., 1999; Ozonoff et al., 1993). Neither the interaction between

working memory and inhibition nor non-verbal generativity was tested in these two

studies, so the positive results in these domains in the current study are new findings.

Unlike this study, planning difficulties in ASD relatives were reported in both of the

two previous studies, and Hughes et al. (1999) also found evidence of weaknesses in

set-shifting and verbal generativity in their ASD sibling sample. The sample size was

much larger for the current study than for either of these two previous studies (108

siblings of probands with ASDs compared with 18 in Ozonoff et al. and 31 in Hughes et

al.), ruling out power as an explanation for these discrepancies. As was the case in this

study, Ozonoff et al. also included siblings of probands with other ASDs besides

Footnote 6: Moreover, the parents of probands with ASDs who were tested as part of the WAFSASD did show more EF deficits than the siblings (Wong, Maybery, Bishop, Maley, & Hallmayer, in preparation).

autism; however, Hughes et al. included only siblings of probands with a full diagnosis

of autism, and the mean IQ of the probands in that study was also lower than in this

study (even though siblings of the lower-functioning probands in WAFSASD were

included, the mean PIQ and VIQ of the probands was still higher in this study than in

Hughes et al.’s study). It is possible that the broad cognitive phenotype is expressed

more strongly when the proband has a full autism diagnosis and is lower-functioning,

therefore explaining the increased incidence of planning, set-shifting and verbal

generativity deficits in the siblings in that study. However, contrary to this explanation,

there were no significant partial correlations between proband IQ and sibling

performance on planning, set-shifting or verbal generativity tasks in this study.

Another possible explanation for the discrepancies in the EF results obtained

across sibling studies is that the tasks employed differed in such a way as to favour the

siblings in this study. As discussed in Section 4.4.1 of Chapter 4, the administration of

the ToL in this study differed slightly from most other studies in that forward planning

was actively encouraged, which may have masked any weaknesses. While Hughes et

al. (1999) used the original version of the IDED set-shifting task which had also been

used to demonstrate set-shifting deficits in probands (Hughes et al., 1994), the modified

version used in this study did not reveal deficits in probands either in Study One of this

research or in high-functioning probands in previous research (Turner, 1997), making it

unsurprising that siblings demonstrated intact performance on it in this study. It is also

interesting, however, that Ozonoff et al. (1993) did not find any evidence of a deficit in

cognitive flexibility in siblings using the WCST. The lack of any significant differences

on the verbal generativity tasks used in this study was a little more surprising, as Turner

(1999) found that autistic probands actually showed more striking deficits on ideational

fluency tasks (used in this study) than on the word fluency task used by Hughes et al.

(1999). However, while only 90 s were allowed for generating responses in this study,

Hughes et al. allowed 120 s, which may have been the extra time needed to reveal

generativity difficulties in siblings of autistic probands.

Of additional note, Hughes et al. (1999) did not find overall group differences in

their sibling groups on either of their continuous measures of planning or set-shifting,

but found differences only when particular variables were dichotomised and the

proportions of siblings classified as passers or failers were compared. In this study, set-

shifting variables were already dichotomised (although using a different criterion from

Hughes; see footnote 7), but no differences in the proportion of poor performers were found. When

the current ToL data were re-analysed using a dichotomous pass/fail performance

criterion (such that a “failer” was anyone who completed fewer than 50% of the problems

in the minimum number of moves), again no group differences were revealed. It

therefore does not appear that existing group differences were merely “hidden” by the

methods of analysis used in this study; furthermore, the fact that Hughes et al. were only

able to find significant group differences on certain variables after dichotomising their

data suggests that the deficits observed lacked robustness and were not highly prevalent.

ii) Criterion 5. The familiality of ToM and EF abilities was assessed by

calculating correlations between the cognitive functioning of ASD siblings and the IQ

and cognitive functioning of ASD probands. The only significant relationship to

emerge (after the mediating effect of sibling IQ was controlled) was between proband

PIQ and siblings’ false belief performance, indicating that siblings were more likely to

perform poorly on false belief tasks if the proband with autism was low-functioning

(non-verbally). Interestingly, this implies a small degree of familiality of ToM

performance in ASD siblings, even though they did not display evidence of a ToM

impairment and the correlation between proband and sibling false belief performance

was not itself significant. There were no significant relationships between siblings’ EF

performances and proband IQ or EF performances. This suggests that siblings’ EF

abilities were not strongly familial, even on the tasks on which they displayed

significant weaknesses. Hence, the results did not support the prediction that EF

performances would be more likely to show evidence of familiality than ToM

performance. Impairments in non-verbal generativity and working memory (in an

inhibitory context), which were identified as potential endophenotypes on the basis of

sibling group comparisons, did not meet the criterion of heritability.

Do these results indicate that autism is not a genetic disorder, that there is no

broad autism phenotype, or that the cognitive impairments displayed by ASD siblings

were random and unrelated to genetic vulnerability? There are several alternative

explanations. It is possible that i) different genetic factors underpin the variation found

in siblings than in probands - for example, many genes are known to cause mental

retardation, but these genes do not influence the normal variation of IQ; ii) the measures

used lacked sensitivity to the milder deficits displayed by siblings, thereby weakening

Footnote 7: When Hughes’ criterion was used (i.e., achieving six consecutive correct responses on the EDS stage), there were still no significant group differences, although on this task version there were ceiling effects using this method (i.e., the large majority of siblings achieved criterion in both conditions).

proband-sibling correlations; or iii) non-shared environmental factors contributed a

significant amount of variance, therefore reducing the size of correlations. It is also of

note that even under a perfect model of a monogenic disorder, the correlation between

siblings would only be 0.5. The general lack of significant proband-sibling

relationships in this study is consistent with previous studies by Piven et al. (1990) and

Szatmari et al. (1993), both of which found no association between proband IQ and the

cognitive functioning of first-degree relatives. No previous studies have reported direct

correlations between probands’ and siblings’ performances on specific cognitive tasks.

iii) Criterion 6. The notion that any endophenotype displayed in nonaffected

family members would be less severe than in affected probands was not so much a

criterion, but rather an expected feature of endophenotypes. The comparison of effect

sizes of significant differences in this study with the proband differences in Study One

confirmed this expectation, with smaller effect sizes displayed for both candidate

endophenotype variables in this study. However, it should be noted that the smaller

effect sizes could have been caused by a smaller proportion of siblings than probands

showing a deficit, rather than the severity of the deficit being milder across the sibling

sample. It was not possible to directly compare the performances of the probands and

siblings, as they were not matched on PIQ or VIQ (see footnote 8). The proportion of siblings showing

a deficit is discussed further below, in Section 6.4.2.

Concluding comments on endophenotype status. In summary, weaknesses in

working memory (in a context where inhibition was required) and non-verbal

generativity were identified in ASD siblings when compared with control siblings,

making these two variables candidates for endophenotypes of ASDs (ASD sisters also

showed poorer performance on a simple discrimination stage of the IDED set-shifting

task, but this was ruled out as a potential endophenotype because probands did not show

a deficit on that variable). However, while these two variables also met criterion 6 (i.e.,

deficits in those domains were less severe in siblings than in probands), they failed to

meet criterion 5 (heritability). They also lacked specificity as predictors of ASD sibling

group membership (misclassifying a high proportion of control siblings), and they did

not show the strongest evidence of primacy in Study One. Therefore, the evidence for

their validity and utility as endophenotypes for ASDs was not strong or consistent.

Footnote 8: It was possible to compare the performances of only those probands and siblings who showed a deficit (as defined by a score worse than 1 SD from the mean) on the RIL task shape error score, as the proband and sibling samples were matched on age, PIQ, and VIQ in this case. Consistent with expectation, the ASD siblings showed significantly better performance than ASD probands on that variable when z-scores were compared, t(34) = 2.76, p < .01.

Overall, while EF deficits demonstrated greater relative primacy than a ToM deficit,

neither ToM nor EF showed convincing evidence of their primacy in this study (these

results are fairly consistent with Study One, but were even less compelling). This

outcome occurs in the context of frequent inconsistencies across sibling studies in the

autism field and between studies of probands, siblings, and parents, and probably

reflects the likelihood that genotype-phenotype relationships in the autism spectrum are

complex and indirect, even when the phenotype is at the level of cognition (this is

discussed further in Chapter 7). Nevertheless, further studies incorporating higher-

level, more sensitive tasks may still prove useful in identifying more subtle weaknesses

in family members.

6.4.2 Differentiating the multiple deficits models

The “subgroups” and “multidimensional spectrum” versions of the multiple primary

deficits model of ASDs were both indirectly examined in this study. The lack of any

ToM impairment in ASD siblings was in itself inconsistent with both of these models,

indicating that the notion of a ToM-impaired sibling subgroup or the idea that a ToM

impairment was the basis for certain types of subclinical symptomatology both lacked

support. The prediction that any cognitive deficits would only occur in a subset of ASD

siblings was confirmed, however, with impairments in working memory (in an

inhibition context) and non-verbal generativity being demonstrated by 24.1% and 19.6%

of siblings respectively. This is consistent with previous studies of cognitive abilities in

relatives of individuals with ASDs, and suggests that endophenotypes or markers of

genetic vulnerability are only expressed in a certain subgroup of relatives (although it is

possible that impairments in other domains not measured here may turn out to be more

prevalent among relatives).

The results of analyses examining the presence of ToM-EF dissociations further

suggested that there may be more than one of these subgroups, each with a different

cognitive profile. While impairment in ToM did not occur with a significantly greater

frequency in ASD siblings than in control siblings, those ASD siblings who did display

an unstable ToM were also more likely to show EF deficits than the ASD sibling group

as a whole (and vice versa, those with a non-verbal generativity deficit were also more

likely to show unstable ToM performance). However, there was also a high frequency

of ToM-EF dissociations in both directions. This pattern of results suggests that within

the subgroup of relatives expressing the endophenotype, there were two further

subgroups: an “EF-impaired, ToM-intact” group and a “both ToM and EF impaired”

group. However, it is also possible that these do not represent valid subgroups, but

instead that ToM and EF performance varied on a more continuous spectrum (with the

two spectrums covarying to some degree), and only some siblings fell on the “impaired”

side of the arbitrary cutoff for impairment. It would be interesting to test the validity of

the various sibling “subgroups” by investigating whether there are systematic

differences between their genotypes (or, perhaps, gene-environment interactions; see

Bauminger & Yirmiya, 2001).

Correlations between ToM and EF in siblings of individuals with ASDs were

also investigated in this study, although given the lack of ToM impairment this set of

analyses did not really address the “subgroups”-driven idea that ToM and EF

impairments may be independent in ASD siblings (this would not be expected, given

that the ASD siblings did not display a ToM deficit and the account outlined in Study One

proposes that the relative independence between ToM and EF in probands occurs partly

because their ToM deficit is caused by ToM-specific factors). Instead, the results of

ToM-EF correlations were more relevant to the ancillary issue of whether ASD siblings

showed unusual patterns of association between the two domains. Results demonstrated

that, compared to control siblings, ASD siblings did show a different pattern of

associations between ToM and EF performances. Unlike control siblings, ASD siblings

showed no significant partial correlations between measures of ToM and verbal

inhibition or verbal generativity, but showed three significant partial correlations

between ToM and non-verbal inhibition measures. Hughes et al. (1999) found similar

evidence of unusual associations between tasks for ASD siblings (although the

correlations were between EF tasks in that study). This is consistent with the notion

that ASD siblings may use different strategies to solve cognitive tasks than siblings of

children without autism, even though their overall level of performance may not be

impaired. Nevertheless, it must be noted that control siblings and control probands also

showed a different pattern of correlations between ToM and EF. One explanation for

this may be that the siblings in this study were older, on average, than the probands in

Study One and the age range was larger. If the hypothesis that the ToM-EF relationship

changes with development is correct, these differences in correlations across the two

control samples would be expected. As the ASD and control sibling groups were

matched on age and similar in age range, the difference in the pattern of correlations

between these two groups more meaningfully suggests that ASD siblings are

characterised by unusual associations between ToM and EF. However, it is also

possible that the different patterns of ToM-EF correlations displayed in different control

samples could simply reflect the fact that ToM-EF relationships are weak or even

spurious in some cases, resulting in variable outcomes in different samples. The

hypothesis that siblings (and probands) use unconventional strategies to solve ToM (or

EF) tasks would be better addressed by systematically varying the problem-solving

demands of ToM tasks and observing the differential effects of these manipulations in

ASD and control samples.
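As an illustrative aside, the logic of the partial correlations discussed above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the analysis conducted in this study: it assumes a single covariate (age), uses simulated rather than real scores, and the names `pearson_r`, `partial_r`, `tom`, and `ef` are hypothetical. It shows how two measures that both improve with age can correlate strongly in raw form yet show little association once age is partialled out.

```python
import math
import random


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_r(x, y, z):
    """First-order partial correlation between x and y, controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated data (an assumption for illustration only): both scores depend
# on age but are otherwise unrelated to one another.
rng = random.Random(0)
age = [rng.uniform(6, 18) for _ in range(200)]
tom = [0.5 * a + rng.gauss(0, 1) for a in age]
ef = [0.5 * a + rng.gauss(0, 1) for a in age]

raw = pearson_r(tom, ef)
adjusted = partial_r(tom, ef, age)
print(f"raw r = {raw:.2f}, partial r (age controlled) = {adjusted:.2f}")
```

With more than one covariate (for example, age and IQ together), the same idea is usually implemented by correlating the residuals of each measure after regressing it on all covariates, rather than by the first-order formula used here.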

It was initially intended to examine the “multidimensional spectrum” idea in this

study by calculating correlations between cognitive and behavioural measures within

the ASD sibling group. Unfortunately, the behavioural measures of social impairment

and repetitive behaviours did not appear to be valid indicators of behavioural severity in

siblings without ASD diagnoses, with the higher levels of symptoms reported in control

siblings probably reflecting a tendency for parents of children with ASDs to under-

report subtle autistic-like symptomatology in non-autistic siblings (e.g., the parent of a

child with autism may answer the question “Does your child pace or move around

repetitively?” negatively with regard to their non-autistic child as compared with their

child with autism, whereas a control parent may answer positively because many

children display occasional restlessness). This meant that correlations between

cognitive and behavioural measures could not be calculated, as the behavioural

measures lacked validity. It was interesting, however, that this under-reporting was not

evident on the measure of social behaviour, which may be an indication of an increased

incidence of social impairment in ASD siblings (although this is highly speculative).

Future investigations of cognitive-behavioural relationships in relatives of individuals

with ASDs may benefit from the employment of observational measures of behaviour or

other more direct measures which do not rely on parental report (this is discussed

further in Chapter 7).

In sum, then, the results from this study were not able to contribute as much as

was hoped to the question of which multiple deficits model may be the most appropriate

for ASDs. Indeed, the results relevant to endophenotype status did not identify multiple

deficits in siblings (i.e., deficits in both ToM and EF domains). It was nevertheless

evident that the two EF deficits showing the most promise as endophenotypes only

characterised a subgroup of ASD siblings, and that the presence of an additional ToM

deficit may represent a more subsidiary subgroup. However, it was not possible to

determine whether these represented valid subgroups (as opposed to ends of a spectrum)

or whether they were also associated with increased levels of subclinical behavioural

symptomatology. The role of ToM and EF in ASDs therefore remains a question with a

somewhat nebulous answer.

CHAPTER 7

General Discussion: Constructing an Explanatory Model for ASDs

7.1 Summary of the findings

7.2 Methodological strengths and limitations

7.3 Conclusions on constructing an explanatory model for ASDs

7.4 Future directions

7.1 Summary of the findings

The major findings of this research may be summarised as follows:

1. Individuals with ASDs demonstrated a profile of spared and impaired cognitive

abilities which differed in important ways from that reported in previous research. They showed

impairments in ToM, planning, verbal inhibition, working memory (when inhibitory

control was also required), and both verbal and non-verbal generativity, but intact

performance on tests of awareness of social norms, set-shifting, non-verbal

inhibition and relational reasoning. Deficits on verbal tasks were more common

than on non-verbal tasks, and several task performances were mediated by VIQ.

The deficits in verbal inhibition and in working memory in an inhibitory context

were new findings which suggested that the previously proposed “typical EF

profile” of individuals with ASDs, in which inhibition is spared (e.g., Ozonoff &

Jensen, 1999), may need revision.

2. Results did not support a single primary cognitive deficit model of ASDs. Neither

ToM nor EF deficits met the criteria of universality or explanatory value. This

corroborates the findings of several previous studies which have

demonstrated similar outcomes (as described in Chapter 2).

3. However, EF deficits showed superior relative primacy compared with a ToM

deficit, as judged by their superior ability to discriminate individuals with ASDs

from controls, and the higher number of significant correlations with aspects of

behavioural symptomatology. In particular, deficits in verbal inhibition and verbal

generativity appeared to be the most primary.

4. ToM and EF were found to be largely independent deficits in ASDs, as measured by

the paucity of significant correlations between the two domains and the

dissociability of the impairments in both directions. This was the first study to

demonstrate this independence of the two deficits in ASDs. It indicated that

although EF deficits were relatively more primary, they could not explain or

subsume ToM as a secondary deficit. These findings were also inconsistent with

both the “common conceptual bases” and “emergence” accounts of the ToM-EF

relationship in typical development, thereby providing the most support for

the “common neuroanatomical bases” and/or “expression” accounts.

5. No ToM or EF variables demonstrated strong or consistent potential as

endophenotypes for ASDs, although EF deficits showed better potential than a ToM

deficit. Weaknesses in working memory (in a context where inhibition was

required) and in non-verbal generativity were identified in ASD siblings when

compared with control siblings; however, performance in these domains was not

strongly familial and the variables lacked specificity as predictors of ASD sibling

versus control group membership.

Hence, in sum, ToM and EF were found to be independently impaired in ASDs, but

neither impairment was universal, showed strong relationships with symptoms, or was a

useful candidate for an endophenotype for ASDs. EF deficits consistently showed

superior primacy in comparison with ToM. The results of both studies indicated that a

multiple primary deficit model is more suitable for ASDs than a single primary deficit

model, but it was not possible to determine which type of multiple deficits model was

the most appropriate. There appeared to be different subgroups of both ASD probands

and ASD siblings demonstrating different cognitive profiles, but results were also

compatible with the notion of a more continuous spectrum (with the apparent subgroups

an artefact of the arbitrary cutoff for the definition of “impairment”). The way in which

these subgroups or spectrums should be defined behaviourally was also unclear,

although results were more consistent with a classification system based on level of

functioning as opposed to symptom domains or symptom severity. The possibility that

the primacy of deficits changes with development also remains open, and there may be

other equally primary deficits which were not measured in this research.

7.2 Methodological strengths and limitations

As described in Chapters 4 and 6, the current research incorporated several

methodological improvements upon previous studies. One of the major strengths was

the use of relatively process-pure tests of a range of EF components, several of which

included in-built control conditions allowing isolation of the relevant ability. Tests

requiring both verbal and non-verbal responses were used, and most tasks had several

levels of difficulty in order to be suitable for individuals of a wide range of ages. Large

sample size was also a significant strength of this research; for example, the number of

siblings who participated in Study Two was more than three times higher than the

number for the largest previous study on ToM or EF in siblings. Statistical approaches

were thorough and the effects of potentially confounding variables such as age and IQ

were carefully examined and accounted for throughout all analyses.

One of the methodological weaknesses of this research was the limited range of

ToM measures employed. While the Dewey Stories task was included as a higher-level

social cognition measure, it was of questionable validity as a measure of ToM. The

otherwise exclusive choice of false belief tasks was the result of i) the fact that theories

of the ToM-EF relationship have been based largely around false belief, ii) the need to

constrain the length of the test battery, and iii) the fact that the few high-level, advanced

ToM tasks that exist suffer from the problem of a lack of process purity (this is

discussed further in Section 7.4). Based on previous research (e.g., Baron-Cohen,

1989b), it was also expected that the failure rate of ASD probands on the false belief

tasks used (particularly the second-order tasks) would be higher than was found in this

research – with the current result indicating that a ToM deficit is not as severe or

prevalent in ASDs as some authors have argued (e.g., Baron-Cohen, 1995; Leslie &

Roth, 1993). The high success rates of control probands and both ASD and control

siblings on the false belief tasks caused difficulties for the interpretation of task results

as indicative of a lack of ToM impairment in ASD siblings or of a mild severity of

impairment in ASD probands. However, as discussed in Section 4.4.2 of Chapter 4, the

lack of discriminative ability (or “uniqueness”) and explanatory value of ToM in Study

One was not easily dismissible as a consequence of the level of difficulty of false belief

tasks, as i) ToM and EF deficits were of roughly equal prevalence in the ASD group, ii)

a significant proportion of individuals showed impaired performance on ToM tasks but

unimpaired performance on EF tasks, and iii) performance on all of the false belief tasks

was far from the ceiling in the ASD group – as confirmed by the significant medium-

level correlations between false belief performance and VIQ (which also suggests that

the measures have some reliability). In addition, although ToM performance in the

control group and in the two sibling groups was high, it was not at ceiling. The use of

an additional higher-level, more sensitive ToM task would nevertheless have

strengthened this research, particularly for the detection of any subtle weaknesses in

mentalising ability in ASD siblings, especially those in middle childhood and older.

Another possible limitation was the use of parental questionnaires and

interviews as indices of the presence and severity of behavioural symptomatology.

Parental report is subjective and dependent upon the individual parent’s framework for

judging abnormality. More direct and objective methods such as systematic behavioural

observation techniques may have provided more valid measures of behavioural

variation in individuals without ASDs and possibly resulted in stronger relationships

with underlying cognitive deficits. Bishop and Norbury (2002) found that diagnostic

measures of autism based on parental interview (the ADI-R) and direct observation (the

ADOS-G: Autism Diagnostic Observation Schedule – Generic; Lord et al., 2000)

resulted in widely discrepant outcomes for several children, and they noted that it is

usually recommended that information from parental report and observational

techniques be combined. However, observational methods have their own limitations,

such as time-intensiveness and the possibility of poor ecological validity (as behaviour

can only be observed for a limited time and in a restricted range of situations, and the

presence of an observer may alter the nature of the behaviour displayed). There are also

essentially no observational scales available for social/communicative functioning and

repetitive behaviours which are appropriate for recording both normal and abnormal

variation in these behavioural domains, rather than being directed at diagnosing

pathology.

While the characteristics of the ASD proband sample (e.g., age, level of

functioning, range of symptom severity) were not considered to be a major limitation as

the sample was generally appropriate for the research aims and the tasks used, it should

be recognised that the sample characteristics limit the scope of the conclusions.

Szatmari and colleagues have suggested that low-functioning autism may arise from

different genetic mechanisms from high-functioning autism (e.g., Szatmari, 1999;

Szatmari et al., 2002); therefore, the high-functioning nature of the ASD probands in this

research may limit the generalisability of the findings. However, the inclusion of low-

functioning probands would have required substantial alterations to the design of the

study, as many of the tasks were inappropriate for individuals with mental retardation.

Similarly, the relatively old age of the sample limited the conclusions that could be

drawn, particularly with regard to the possibility of changes in the primacy of and

relationship between ToM and EF impairments with development, but again the

inclusion of participants below the age of five years would have caused difficulties for

task selection. The non-matching of the ASD and control probands on VIQ was not

ideal, but this is a somewhat inevitable aspect of autism research given the typical PIQ-

VIQ discrepancy displayed by individuals with ASDs (making it difficult to match

controls on both PIQ and VIQ), and VIQ was taken into account in all analyses. The

analysis of ASD probands with different ASD diagnoses (e.g., autism, Asperger

syndrome) together as one sample may also be subject to criticism, but its validity was

attested by the finding that significant differences between probands meeting full ADI-

R criteria for autism and those meeting partial ADI-R criteria occurred only on one task.

Nevertheless, the variability of the sample is likely to have increased standard

deviations which may have reduced the likelihood of finding significant or large

differences from controls in group comparisons (marginal differences were found on

several tasks, and several significant differences were attenuated when age and/or IQ

variables were controlled).

Finally, although there were a small number of control probands with mild

mental retardation, the control proband group consisted largely of typically developing

individuals, matched to ASD probands on age and PIQ. Without having a control group

of individuals with other disabilities (e.g., Down’s syndrome), it is not possible to test

whether simply having any developmental disability may have resulted in some of the

deficits observed. This distinction is necessary for any deficit to meet the uniqueness

criterion for primacy, and it is also relevant for the interpretation of deficits in ASD

siblings, who may show adverse effects of living with a sibling with a disability

(although it is difficult to see why this would affect certain EF components and not

others; for further discussion, see Bauminger & Yirmiya, 2001).

7.3 Conclusions on constructing an explanatory model for ASDs

Taking into account these constraints, what broad conclusions about explanatory models

of ASDs can be made on the basis of this research? As already stated, we can fairly

confidently reject a conceptualisation of autism as a unitary syndrome with a single

primary cognitive deficit. Notwithstanding the possibility that there is another cognitive

deficit which was not measured in this research and which could explain both ToM and

EF deficits as secondary to it (which is unlikely as ToM and EF were found to be

unrelated impairments), the current findings clearly and consistently demonstrate that

ASDs cannot be explained by a single primary deficit.

These findings consolidate recent research on cognitive impairments in ASDs,

which has increasingly moved away from the notion of a single primary deficit.

Psychologists studying cognitive deficits in ASDs have to some extent lagged behind

other researchers focussing on genetic and neurobiological aetiologies, who have been

arguing for some time that “any attempt to demonstrate a single cause for all cases of

autism appears to be futile” (Gillberg & Coleman, 1992, p. 283). This lag was driven

by the hope that the identification of a single cognitive deficit would provide a

diagnostic marker for autism and a unified explanation for the range of unusual

behaviours displayed by individuals with ASDs. However, it could be argued that the

failure to find such a cognitive marker was somewhat predictable given the

heterogeneity evident at the genetic, neurobiological, and behavioural levels of

explanation. This highlights the importance of an integrated approach to ASD research,

where findings from all levels of explanation constrain and inform each other (see

Bailey et al., 1996; Tager-Flusberg, 1999a).

Nevertheless, recognition of this need for integration is only the first step

towards the discovery of which kind of multiple deficits model may best explain ASDs.

Several key questions remain. Are ASDs best conceptualised as a group of distinct

subtypes or a multidimensional spectrum? How should these subgroups or spectrums

be defined and operationalised? Are they associated with different genotypes or

neuropathologies? Are there other cognitive impairments besides ToM and EF which

may be equally primary in ASDs, and if so what are they? Do the various cognitive

impairments change in primacy or causal status with development? It may be the case

that a combination of the various multiple deficits models will end up forming the best

explanatory paradigm; for example, there may be a multidimensional autism spectrum

within which certain clusters often occur (this kind of model has been proposed

previously by Beglinger & Smith, 2001), where more than two cognitive deficits are

present and these deficits change in primacy throughout development. The problem

with this sort of model is that its complexity makes it very difficult to test empirically.

The methodological and conceptual difficulties with determining which

integrated causal model can explain ASDs are characteristic of research on complex

genetic disorders in general, especially when dealing with disorders of development.

The task of beginning with behaviour and tracing the causal chain back through

cognition to the level of biology is full of hazards. The mapping of genotype to

phenotype is neither direct nor specific (Karmiloff-Smith et al., 2002), and a

neurological abnormality which occurs early in development can trigger a complex

chain of both structural and functional changes. This means that diverse pathogenic

processes may lead to similar behavioural phenotypes, and conversely, similar

pathogenic processes may lead to divergent behavioural symptoms (Courchesne,

Townsend, & Chase, 1995). In the words of Gottesman and Gould (2003):

In diseases with classic or Mendelian genetics as their distal causes, genotypes are usually

indicative of phenotypes. However, this degree of genetic certainty does not exist for diseases

with complex genetics. Genetic probabilism aptly describes the process by which a particular

genotype gives rise to phenotype. Epigenetic factors may also be of critical importance for

modifying the development of phenotypes, and such modifications may be influenced by

genotype or environment or be entirely stochastic in origin. Thus, models of complex genetic

disorders predict a ballet choreographed interactively over time among genotype, environment,

and epigenetic factors, which gives rise to a particular phenotype (p. 636 – references not

included).

The recognition of this multilayered complexity will be necessary to make progress in

the construction of an explanatory model for ASDs. While admirable attempts at

integrated models have been made (e.g., Courchesne et al., 1995; Dawson et al., 2002b;

Waterhouse et al., 1996), we are still far from understanding how to tie together the

diverse array of often inconsistent findings across all levels of explanation. This is not

least because our understanding of the interactions between genes, neurobiology,

cognition and behaviour in typical development is crude and fragmentary at best.

7.4 Future directions

An integrative approach to autism research ideally requires both clarity within each

level of explanation and consistency and integration between the various levels of

explanation. Therefore, further research needs to address remaining questions at the

cognitive level of explanation, as well as the integration of cognitive findings with

research on behavioural outcomes, neurobiological substrates, and genetic mechanisms.

So which issues at the cognitive level of explanation deserve further attention?

Although neither ToM nor EF impairments appear to be a core marker for autism, the

study of their nature, primacy and relationship can still inform research on causal

models of ASDs as well as the study of ToM, EF, and their relationship in typical

development. Firstly, there is a clear need to develop more high-level, ecologically

valid measures of ToM. It is evident that ToM develops beyond the ability to

understand false belief, and yet there are few tasks available for investigating more

advanced ToM development. Those that are available (e.g., the Eyes task, Strange

Stories) suffer from a lack of process purity, relying heavily on other abilities such as

face perception, emotion recognition, and verbal comprehension. Ecological validity is

an important task property as attempts to rigorously control the conditions of the task

can end up altering the essence of the phenomenon under study (Volkmar et al., 2004),

but the search for ecological validity often comes at the expense of task purity. The

challenge is therefore to develop ecologically valid tasks which have in-built control

conditions that allow isolation of the target ability. Such tasks would allow more

precise examination of the extent of higher-level ToM development in individuals with

ASDs and their first-degree relatives. They would also aid the investigation of the

nature of the ToM-EF relationship in children and adults over the age of five, which will

be important both for the extension of theory and empirical findings on the ToM-EF

relationship at later ages and for the interpretation of findings on the ToM-EF

relationship in clinical samples in this older age range.

The development of ToM tasks which incorporate the component process

approach may also help uncover any compensatory strategies which may be used to aid

ToM performance. The use of alternative strategies often becomes a “default”

explanation for intact performance on ToM tasks, yet this hypothesis has not been

directly tested, relying only on indirect evidence such as neuroimaging data. More

direct tests could involve systematic manipulation of the problem-solving requirements

of high-level ToM tasks to examine how performance is affected when certain strategies

cannot be used. Such multiple-condition tasks may also represent a method of

investigating the validity of one of the proposed explanations for the intriguing lack of

correlations between ToM and EF in individuals with ASDs (i.e., the possibility that the

lack of significant correlations between ToM and EF in individuals with ASDs who

show EF impairment is due to the use of alternative strategies for ToM performance; see

Section 4.4.3 in Chapter 4).

A number of aspects of EF in ASDs also merit further investigation. The

findings of this research suggest that deficits in generativity play a key role in ASDs.

The impairment in verbal generativity displayed by ASD probands demonstrated the

largest effect size, was one of the most prevalent deficits, and was a significant

discriminator between the ASD and control groups, and an impairment in non-verbal

generativity was one of the few significant deficits to emerge in ASD siblings. Further

investigation of generativity in individuals with ASDs across a wider age range and using

a wider range of tasks therefore appears worthwhile, although it awaits the development

of generativity tasks appropriate for young children. It is interesting to note that

generativity tasks are generally the most unstructured of EF tasks, as they require the

participant to produce novel responses, as opposed to reacting to stimuli presented to

them as part of a structured task. It would therefore also be interesting to see whether

the apparent severity of impairment displayed on generativity tasks is due to a specific

problem with generativity, or whether EF impairments in general are better detected and

therefore more severe on unstructured tasks which are more representative of many real-

life situations (i.e., have higher ecological validity), regardless of which EF component

is involved. This could be addressed by designing more unstructured tests of other EF

components.

Impairments in verbal inhibition and on tasks combining inhibitory and

working memory demands were also new findings in this research

which await replication. If further studies confirm the existence of an inhibitory

impairment in ASDs, this raises additional questions about the discriminant validity of

the EF profile in ASDs as compared with other disorders such as ADHD. Even if EF

impairments are not singularly primary in autism and therefore do not strictly need to

meet the uniqueness criterion, the question remains as to why similar EF impairments

result in such different behaviours in different disorders. Is it the case that EF

impairments must co-occur with certain other cognitive impairments, or emerge at a

particular point in development – or both – in order to produce the unique behaviours

displayed by individuals with ASDs?

The integration of cognitive and behavioural levels of explanation is the next

challenge in constructing an integrated explanatory model of ASDs. To begin with, in

order to examine the relationships between cognition and behaviour in a more precise

manner, we first require more accurate measures of behaviour. As mentioned

previously, parental report and observational techniques each have their own set of

problems. A combination of approaches may be necessary to gain a complete picture of

the nature and severity of behavioural symptomatology (Bishop & Norbury, 2002).

However, it will first be necessary to develop observational scales which are appropriate

for capturing both the normal range of variation in behaviour and the extremes of

abnormality. Without this, it remains unclear whether ToM and EF impairments do

actually underlie the behaviours that they are commonly purported to – or indeed

whether they hold any explanatory value at all. If ToM and/or EF do not show strong

relationships with behaviour, it remains possible that they are simply pleiotropic effects

– that is, they may be related to the genetic mechanisms which cause autism but

unrelated to its behavioural phenotype. This does not seem plausible given the results

of previous research and our knowledge about the behavioural effects of cognitive

deficits such as EF impairment in other disorders (e.g., individuals with frontal lobe

damage), but it is a possibility which needs to be ruled out using valid behavioural

measures.

The need for longitudinal studies which track the development of cognition and

behaviour in children with ASDs from an early age is a theme which has recurred

throughout this thesis as well as in autism research in general. These studies will be

crucial for i) determining whether ToM or EF impairments have causal precedence, ii)

examining relationships between ToM and EF impairments and their proposed

precursors such as joint attention (e.g., Leekam & Moore, 2001; Mundy, 2003) and

imitation (Rogers, 1999; Rogers & Pennington, 1991), and iii) investigating how early

cognitive deficits affect the nature and severity of both early and later behavioural

symptomatology. Until recently, cognitive theories of ASDs have largely ignored the

process of development and instead proposed essentially static impairments which

supposedly persist throughout the affected individual’s lifetime. However, the

importance of considering developmental factors when conducting research on

developmental psychopathologies is being increasingly emphasised (e.g., Bishop, 1997;

Karmiloff-Smith, 1992; Tager-Flusberg, 1999a; Thomas & Karmiloff-Smith, 2002).

For example, Karmiloff-Smith (1997) recommended six changes of approach for

research in developmental cognitive neuroscience:

1. The recognition that plasticity is the rule, not simply a specialised response to injury.

2. The identification of constraints on plasticity.

3. A focus on the dynamics of development at multiple levels.

4. The recognition that specialisation within some brain regions is the product of development,

not its starting point.

5. A focus not only on the end state but also how the child progressively develops to the end

state.

6. The in-depth analysis of the different processes by which seemingly normal surface

behaviour can be produced by a brain that has developed differently from the outset. (p.

Increased recognition of the role of interactive developmental processes in cognitive

performance and behavioural outcomes in the field of autism has paralleled this more

general shift (e.g., Bowler, 2001; Burack, Charman, Yirmiya, & Zelazo, 2001;

Courchesne et al., 1995; Happé, 2001; Steele, Joseph, & Tager-Flusberg, 2003; Tager-

Flusberg, 2001). However, while the need for longitudinal studies is often discussed, it

is rarely enacted. This is partly because autism is still difficult to diagnose early,

although recent progress in early diagnosis (see Charman & Baird, 2002) may facilitate

the ease with which longitudinal studies can be conducted. Targeting newborn siblings

of individuals with ASDs, who have an increased likelihood of developing an ASD, is

another method of identifying possible participants early for longitudinal monitoring.

The ability to conduct studies of cognitive impairment from a very young age has also

been hampered by the lack of appropriate tasks for young children, although again such

tasks are becoming increasingly available.

How might findings at the cognitive level of explanation inform research on the

neurobiological substrates of ASDs? One possible avenue results from the finding that

ToM and EF deficits were independent in ASDs, which indicates that their co-

occurrence is most likely to be explained by their neuroanatomical proximity. This

suggests that the functioning of the prefrontal cortex, both in its ventromedial and

dorsolateral aspects, is disrupted in individuals with ASDs. However, there is no clear

evidence of structural frontal abnormality in ASDs. Instead, it is possible that cortical

networks involving frontal regions may have been disrupted during development, or that

neurotransmitters which are particularly active in these regions are deficient¹ (e.g., a

dopaminergic deficit may underlie the range of cognitive deficits displayed by

individuals with autism, as suggested by Pennington et al., 1997). It therefore appears

that investigations of the development of cortical networks involving the prefrontal

cortex and neurotransmitter systems which heavily populate frontal areas would be

worthwhile targets for neurobiological studies of ASDs.

The relationship between cognitive (and behavioural and neurobiological)

findings and underlying genetic mechanisms will arguably be one of the most

important and fruitful links in the search for the aetiology of autism. This research

suggested that there may be different subgroups of individuals with ASDs, possibly

defined by their level of functioning, which have different ToM and EF profiles.

Similarly, relatives of individuals with ASDs who demonstrated endophenotypes or

cognitive vulnerability markers also appeared to show a variety of cognitive profiles.

However, in both probands and siblings, it was also possible (and perhaps even more

likely) that cognitive performances varied on a more continuous spectrum, with the

apparent subgroups a result of classifying individuals scoring below a certain point as

“impaired”. One method of distinguishing between these two possibilities would be to

conduct a cluster analysis (based on cognitive performances) on a large sample of

individuals with ASDs and their relatives, and test firstly whether any meaningful

clusters emerged which showed unique behavioural characteristics, and secondly

whether any such clusters showed distinct genotypic markers and/or neuropathologies.

This kind of research has been conducted previously with promising results (Dawson,

Klinger, Panagiotides, Lewy, & Vastelloe, 1995; this study used the subgroup

classification system proposed by Wing and Gould (1979) based on differences in social

¹ However, note that if the severity of ToM and EF deficits depended directly on the extent of neurotransmitter deficiency, then significant correlations between ToM and EF might be expected in the ASD population.

behaviour rather than using cluster analysis). The notion of different subgroups of

individuals with ASDs characterised by different genetic mechanisms has been

previously proposed to explain the heterogeneity evident at all levels of explanation

(e.g., Szatmari, 1999; Tager-Flusberg & Joseph, 2003). However, it remains to be seen

how these subgroups should be defined, or indeed, whether the notion of subgroups

holds any validity at all. The kind of multi-level approach proposed above seems the

most appropriate way of approaching this problem, although cluster analysis is limited

by the absence of objective rules for defining the boundaries of each subgroup (Lorr,

1994).
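
The contrast between discrete subgroups and a continuous spectrum can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated rather than drawn from this research, the two scores stand for hypothetical standardised ToM and EF composites, k-means is used only as one convenient clustering method, and the variance ratio is a crude heuristic rather than an objective rule for deciding whether clusters exist.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two score profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
            for p in points]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; initial centres are k randomly sampled points."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centres)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # Move the centre to the mean profile of its members.
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                # Keep an empty cluster's centre where it was.
                updated.append(centres[c])
        if updated == centres:
            break
        centres = updated
    return assign(points, centres), centres


def within_ss(points, labels, centres):
    """Total squared distance of points from their assigned centre."""
    return sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))


def variance_ratio(points):
    """Within-cluster SS for two clusters relative to one cluster.

    Values near 0 mean a two-cluster split absorbs most of the variance;
    values closer to 1 mean the split explains little.
    """
    labels1, centres1 = kmeans(points, 1)
    labels2, centres2 = kmeans(points, 2)
    return (within_ss(points, labels2, centres2)
            / within_ss(points, labels1, centres1))


def simulate(n, profile_means, sd, seed):
    """Draw n (ToM, EF) z-score pairs, cycling through the given mean profiles."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(m, sd) for m in profile_means[i % len(profile_means)])
            for i in range(n)]


# Hypothetical scenario A: two subgroups, one ToM-impaired and one EF-impaired.
subgroups = simulate(200, [(-2.0, 0.0), (0.0, -2.0)], 0.5, seed=1)
# Hypothetical scenario B: a single continuous distribution of performance.
continuum = simulate(200, [(-1.0, -1.0)], 1.1, seed=2)

ratio_subgroups = variance_ratio(subgroups)
ratio_continuum = variance_ratio(continuum)
print("two-cluster variance ratio, subgroups:", round(ratio_subgroups, 2))
print("two-cluster variance ratio, continuum:", round(ratio_continuum, 2))
```

In this sketch a two-cluster solution always reduces the within-cluster variance to some degree, even for the continuous sample, which is why the ratio alone cannot establish that subgroups are real. This mirrors the limitation noted above: any cut-off on the ratio would be a matter of judgement, and the clusters would still need to be validated against behavioural, genotypic, or neuropathological markers.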

The current research has consolidated and extended previous work by

demonstrating that i) neither ToM nor EF impairments meet criteria for a single primary

deficit in ASDs, ii) ToM and EF impairments are independent and do not explain each

other, and iii) multiple deficits models involving subgroups or spectrums which are

probably not based on symptom domains or severity, and where deficits are not

considered static and unchanging, are the best place to focus future research efforts.

Studies of cognitive mechanisms in ASDs and their relationship with behaviour and

biological substrates should move away from attempting to find a specific core

cognitive deficit which could “explain autism” and instead focus upon mapping the

profile of deficits and examining how these deficits change over time and interact with

the other levels of explanation. The challenge will be to develop creative methods and

strategies for implementing an integrated, developmental approach which recognises the

complexities and dynamics of the genotype-phenotype interactions that underlie autism.

REFERENCES

Abu-Akel, A. (2003). A neurobiological mapping of theory of mind. Brain Research

Reviews, 43, 29-40.

Akshoomoff, N., Pierce, K., & Courchesne, E. (2002). The neurobiological basis of

autism from a developmental perspective. Development & Psychopathology,

14(3), 613-634.

Alexander, M. P. (2002). Disorders of language after frontal lobe injury: Evidence for

the neural mechanisms of assembling language. In D. T. Stuss & R. T. Knight

(Eds.), Principles of frontal lobe function (pp. 159-167). London: Oxford

University Press.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental

disorders. (4th ed.). Washington, DC: Author.

Ames, D., Cummings, J. L., Wirshing, W. C., Quinn, B., & Mahler, M. (1994).

Repetitive and compulsive behavior in frontal lobe degenerations. Journal of

Neuropsychiatry & Clinical Neurosciences, 6(2), 100-113.

Anderson, C. V., Bigler, E. D., & Blatter, D. D. (1995). Frontal lobe lesions, diffuse

damage, and neuropsychological functioning in traumatic brain-injured patients.

Journal of Clinical & Experimental Neuropsychology, 17(6), 900-908.

Anderson, P. (2002). Assessment and development of executive function (EF) during

childhood. Child Neuropsychology, 8(2), 71-82.

Anderson, P., Anderson, V., & Lajoie, G. (1996). The Tower of London Test:

Validation and standardization for pediatric populations. Clinical

Neuropsychologist, 10(1), 54-65.

Anderson, S. W., Bechara, A., Damasio, H., Tranel, D., & Damasio, A. R. (1999).

Impairment of social and moral behavior related to early damage in human

prefrontal cortex. Nature Neuroscience, 2(11), 1032-1037.

Anderson, S. W., Damasio, H., Jones, R., & Tranel, D. (1991). Wisconsin Card Sorting

Test performance as a measure of frontal lobe damage. Journal of Clinical &

Experimental Neuropsychology, 13(6), 909-922.

Anderson, V. (1998). Assessing executive functions in children: Biological,

psychological, and developmental considerations. Neuropsychological

Rehabilitation, 8(3), 319-349.

Anderson, V. A., Anderson, P., Northam, E., Jacobs, R., & Catroppa, C. (2001).

Development of executive functions through late childhood and adolescence in

an Australian sample. Developmental Neuropsychology, 20(1), 385-406.

Anderson, V., Levin, H. S., & Jacobs, R. (2002). Executive functions after frontal lobe

injury: A developmental perspective. In D. T. Stuss & R. T. Knight (Eds.),

Principles of frontal lobe function (pp. 504-527). London: Oxford University

Press.

Astington, J. W., Harris, P. L., & Olson, D. R. (Eds.). (1988). Developing theories of

mind. Cambridge: Cambridge University Press.

Atkinson, R., & Shiffrin, R. (1968). Human memory: A proposed system and its control

processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning

and motivation. New York: Academic Press.

August, G. J., Stewart, M. A., & Tsai, L. (1981). The incidence of cognitive disabilities

in the siblings of autistic children. British Journal of Psychiatry, 138, 416-422.

Bach, L., Davies, S., Colvin, C., Wijeratne, C., Happé, F., & Howard, R. (1998). A

neuropsychological investigation of theory of mind in an elderly lady with

frontal leucotomy. Cognitive Neuropsychiatry, 3(2), 139-159.

Bach, L. J., Happé, F., Fleminger, S., & Powell, J. (2000). Theory of mind:

Independence of executive function and the role of the frontal cortex in acquired

brain injury. Cognitive Neuropsychiatry, 5(3), 175-192.

Bachevalier, J. (1994). Medial temporal lobe structures and autism: A review of clinical

and experimental findings. Neuropsychologia, 32(6), 627-648.

Bachevalier, J., & Loveland, K. A. (2003). Early orbitofrontal-limbic dysfunction and

autism. In D. Cicchetti & E. Walker (Eds.), Neurodevelopmental mechanisms in

psychopathology (pp. 215-236). New York: Cambridge University Press.

Baddeley, A. D. (1986). Working memory. Oxford: Oxford University Press.

Baddeley, A. (1996). Exploring the central executive. Quarterly Journal of

Experimental Psychology A, 49A(1), 5-28.

Baddeley, A. (2002). Fractionating the central executive. In D. T. Stuss & R. T. Knight

(Eds.), Principles of frontal lobe function (pp. 246-260). London: Oxford

University Press.

Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The

psychology of learning and motivation (Vol. 8). New York: Academic Press.

Bailey, A., Le Couteur, A., Gottesman, I., & Bolton, P. (1995). Autism as a strongly

genetic disorder: Evidence from a British twin study. Psychological Medicine,

25(1), 63-77.

Bailey, A., Palferman, S., Heavey, L., & Le Couteur, A. (1998). Autism: The phenotype

in relatives. Journal of Autism & Developmental Disorders, 28(5), 369-392.

Bailey, A., Phillips, W., & Rutter, M. (1996). Autism: Towards an integration of

clinical, genetic, neuropsychological, and neurobiological perspectives. Journal

of Child Psychology and Psychiatry, 37(1), 89-126.

Baird, G., Charman, T., Baron-Cohen, S., Cox, A., Swettenham, J., Wheelwright, S., &

Drew, A. (2000). A screening instrument for autism at 18 months of age: A 6-

year follow-up study. Journal of the American Academy of Child and Adolescent

Psychiatry, 39(6), 694-702.

Baird, T. D., & August, G. J. (1985). Familial heterogeneity in infantile autism. Journal

of Autism & Developmental Disorders, 15(3), 315-321.

Baron-Cohen, S. (1988). Social and pragmatic deficits in autism: Cognitive or affective?

Journal of Autism & Developmental Disorders, 18(3), 379-402.

Baron-Cohen, S. (1989a). Are autistic children "behaviorists"? An examination of their

mental-physical and appearance-reality distinctions. Journal of Autism &

Developmental Disorders, 19(4), 579-600.

Baron-Cohen, S. (1989b). The autistic child's theory of mind: A case of specific

developmental delay. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 30(2), 285-297.

Baron-Cohen, S. (1989c). Do autistic children have obsessions and compulsions?

British Journal of Clinical Psychology, 28(3), 193-200.

Baron-Cohen, S. (1989d). Perceptual role taking and protodeclarative pointing in

autism. British Journal of Developmental Psychology, 7(2), 113-127.

Baron-Cohen, S. (1991a). Do people with autism understand what causes emotion?

Child Development, 62, 385-395.

Baron-Cohen, S. (1991b). Precursors to a theory of mind: Understanding attention in

others. In A. Whiten (Ed.), Natural theories of mind: Evolution, development

and simulation of everyday mindreading (pp. 233-251). Oxford: Blackwell.

Baron-Cohen, S. (1991c). The theory of mind deficit in autism: How specific is it?

British Journal of Developmental Psychology, 9(2), 301-314.

Baron-Cohen, S. (1992). Debate and argument: On modularity and development in

autism: A reply to Burack. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 33(3), 623-629.

Baron-Cohen, S. (1994). How to build a baby that can read minds: Cognitive

mechanisms in mindreading. Cahiers de Psychologie Cognitive, 13(5), 513-552.

Baron-Cohen, S. (1995). Mindblindness: An essay on autism and theory of mind.

Cambridge, MA: MIT Press.

Baron-Cohen, S. (1998). Does the study of autism justify minimalist innate modularity?

Learning & Individual Differences, 10(3), 179-191.

Baron-Cohen, S. (2000). Theory of mind and autism: A fifteen year review. In S.

Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other

minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp. 3-

20). London: Oxford University Press.

Baron-Cohen, S., Allen, J., & Gillberg, C. (1992). Can autism be detected at 18 months?

The needle, the haystack, and the CHAT. British Journal of Psychiatry, 161,

839-843.

Baron-Cohen, S., Campbell, R., Karmiloff-Smith, A., Grant, J., & Walker, J. (1995).

Are children with autism blind to the mentalistic significance of the eyes?

British Journal of Developmental Psychology, 13, 379-398.

Baron-Cohen, S., & Goodhart, F. (1994). The "seeing-leads-to-knowing" deficit in

autism: The Pratt and Bryant probe. British Journal of Developmental

Psychology, 12(3), 397-401.

Baron-Cohen, S., & Hammer, J. (1997). Parents of children with Asperger syndrome:

What is the cognitive phenotype? Journal of Cognitive Neuroscience, 9(4), 548-

Baron-Cohen, S., Jolliffe, T., Mortimore, C., & Robertson, M. (1997). Another

advanced test of theory of mind: Evidence from very high functioning adults

with autism or Asperger Syndrome. Journal of Child Psychology & Psychiatry

& Allied Disciplines, 38(7), 813-822.

Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a

"theory of mind"? Cognition, 21(1), 37-46.

Baron-Cohen, S., Leslie, A. M., & Frith, U. (1986). Mechanical, behavioural and

Intentional understanding of picture stories in autistic children. British Journal

of Developmental Psychology, 4(2), 113-125.

Baron-Cohen, S., & Ring, H. (1994). A model of the mindreading system:

Neuropsychological and neurobiological perspectives. In C. Lewis & P. Mitchell

(Eds.), Children's early understanding of mind: Origins and development (pp.

183-207). Hove, UK: Lawrence Erlbaum Associates.

Baron-Cohen, S., Ring, H., Moriarty, J., Schmitz, B., Costa, D., & Ell, P. (1994).

Recognition of mental state terms: Clinical findings in children with autism and

a functional neuroimaging study of normal adults. British Journal of Psychiatry,

165(5), 640-649.

Baron-Cohen, S., Ring, H. A., Wheelwright, S., Bullmore, E. T., Brammer, M. J.,

Simmons, A., & Williams, S. C. R. (1999a). Social intelligence in the normal

and autistic brain: An fMRI study. European Journal of Neuroscience, 11, 1891-

Baron-Cohen, S., & Robertson, M. M. (1995). Children with either autism, Gilles de la

Tourette syndrome or both: Mapping cognition to specific syndromes.

Neurocase: Case Studies in Neuropsychology, Neuropsychiatry, & Behavioural

Neurology, 1(2), 101-104.

Baron-Cohen, S., & Swettenham, J. (1997). Theory of mind in autism: Its relationship

to executive function and central coherence. In D. J. Cohen & F. R. Volkmar

(Eds.), Handbook of autism and pervasive developmental disorders (2nd ed., pp.

880-893). New York: John Wiley & Sons.

Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y., & Plumb, I. (2001a). The

"Reading the mind in the eyes" Test revised version: A study with normal adults,

and adults with Asperger syndrome or high-functioning autism. Journal of Child

Psychology & Psychiatry & Allied Disciplines, 42(2), 241-251.

Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J., & Clubley, E. (2001b). The

Autism-Spectrum Quotient (AQ): Evidence from Asperger syndrome/high-

functioning autism, males and females, scientists and mathematicians. Journal of

Autism & Developmental Disorders, 31(1), 5-17.

Baron-Cohen, S., Wheelwright, S., Stone, V., & Rutherford, M. (1999b). A

mathematician, a physicist and a computer scientist with Asperger syndrome:

Performance on folk psychology and folk physics tests. Neurocase, 5(6), 475-

Bartsch, K., & Wellman, H. (1989). Young children's attribution of action to beliefs and

desires. Child Development, 60(4), 946-964.

Bartsch, K., & Wellman, H. M. (1995). Children talk about the mind. New York:

Oxford University Press.

Bauman, M. L. (1999). Autism: Clinical features and neurobiological observations. In

H. Tager-Flusberg (Ed.), Neurodevelopmental disorders: Developmental

cognitive neuroscience (pp. 383-399). Cambridge, MA: The MIT Press.

Bauman, M. L., & Kemper, T. L. (1994). Neuroanatomic observations of the brain in

autism. In M. L. Bauman & T. L. Kemper (Eds.), The neurobiology of autism

(pp. 119-145). Baltimore, MD: Johns Hopkins University Press.

Bauminger, N., & Kasari, C. (1999). Brief report: Theory of mind in high-functioning

children with autism. Journal of Autism & Developmental Disorders, 29(1), 81-

Bauminger, N., & Yirmiya, N. (2001). The functioning and well-being of siblings of

children with autism: Behavioral-genetic and familial contributions. In J. A.

Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of

autism: Perspectives from theory and research (pp. 61-80). Mahwah, NJ:

Lawrence Erlbaum Associates.

Beglinger, L. J., & Smith, T. H. (2001). A review of subtyping in autism and proposed

dimensional classification model. Journal of Autism & Developmental

Disorders, 31(4), 411-422.

Bell, M. A., & Fox, N. A. (1992). The relations between frontal brain electrical activity

and cognitive development during infancy. Child Development, 63(5), 1142-

Bennetto, L., Pennington, B. F., & Rogers, S. J. (1996). Intact and impaired memory

functions in autism. Child Development, 67, 1816-1835.

Benson, G., Abbeduto, L., Short, K., Bibler-Nuccio, J., & Maas, F. (1993).

Development of a theory of mind in individuals with mental retardation.

American Journal on Mental Retardation, 98(3), 427-433.

Berger, H. J., Aerts, F. H., van Spaendonck, K. P., Cools, A. R., & Teunisse, J.-P.

(2003). Central coherence and cognitive shifting in relation to social

improvement in high-functioning young adults with autism. Journal of Clinical

& Experimental Neuropsychology, 25(4), 502-511.

Berger, H. J., Van Spaendonck, K. P., Horstink, M. W., Buytenhuijs, E. L., Lammers, P.

W. J. M., & Cools, A. R. (1993). Cognitive shifting as a predictor of progress in

social understanding in high-functioning adolescents with autism: A prospective

study. Journal of Autism & Developmental Disorders, 23(2), 341-359.

Bertrand, J., Mars, A., Boyle, C., Bove, F., Yeargin-Allsopp, M., & Decoufle, P. (2001).

Prevalence of autism in a United States population: The Brick Township, New

Jersey, investigation. Pediatrics, 108(5), 1155-1161.

Berument, S. K., Rutter, M., Lord, C., Pickles, A., & Bailey, A. (1999). Autism

screening questionnaire: Diagnostic validity. British Journal of Psychiatry, 175,

444-451.

Bettelheim, B. (1967). The empty fortress: Infantile autism and the birth of the self.

New York: Free Press.

Beveridge, M., Jarrold, C., & Pettit, E. (2002). An experimental approach to executive

fingerprinting in young children. Infant & Child Development, 11(2), 107-123.

Biro, S., & Russell, J. (2001). The execution of arbitrary procedures by children with

autism. Development & Psychopathology, 13(1), 97-110.

Bishop, D. V. (1993). Annotation: Autism, executive functions and theory of mind: A

neuropsychological perspective. Journal of Child Psychology & Psychiatry &

Allied Disciplines, 34(3), 279-293.

Bishop, D. V. M. (1997). Cognitive neuropsychology and developmental disorders:

Uncomfortable bedfellows. Quarterly Journal of Experimental Psychology A,

50A(4), 899-923.

Bishop, D. V. M. (2000). What's so special about Asperger syndrome? The need for

further exploration of the borderlands of autism. In A. Klin & F. R. Volkmar

(Eds.), Asperger syndrome (pp. 254-277). New York: Guilford Press.

Bishop, D. V. M., Maybery, M., Maley, A., Wong, D., Hill, W., & Hallmayer, J. (in

press-a). Using self-report to identify the broad phenotype in parents of children

with autistic spectrum disorders: A study using the Autism-Spectrum Quotient.

Journal of Child Psychology & Psychiatry.

Bishop, D. V. M., Maybery, M., Wong, D., Maley, A., Hill, W., & Hallmayer, J. (in

press-b). Are phonological processing deficits part of the broad autism

phenotype? American Journal of Medical Genetics (Neuropsychiatric Genetics).

Bishop, D. V., & Norbury, C. F. (2002). Exploring the borderlands of autistic disorder

and specific language impairment: A study using standardised diagnostic

instruments. Journal of Child Psychology & Psychiatry & Allied Disciplines,

43(7), 917-929.

Bjorklund, D. F., & Harnishfeger, K. K. (1990). The resources construct in cognitive

development: Diverse sources of evidence and a theory of inefficient inhibition.

Developmental Review, 10(1), 48-71.

Blair, J., Sellars, C., Strickland, I., Clark, F., Williams, A., Smith, M., & Jones, L.

(1996). Theory of mind in the psychopath. Journal of Forensic Psychiatry, 7(1),

15-25.

Bolton, P., Macdonald, H., Pickles, A., Rios, P., Goode, S., Crowson, M., Bailey, A., &

Rutter, M. (1994). A case-control family history study of autism. Journal of

Child Psychology & Psychiatry & Allied Disciplines, 35(5), 877-900.

Bolton, P., Pickles, A., Murphy, M., & Rutter, M. (1998). Autism, affective and other

psychiatric disorders: Patterns of familial aggregation. Psychological Medicine,

28(2), 385-395.

Boone, K. B., Ponton, M. O., Gorsuch, R. L., Gonzalez, J. J., & Miller, B. L. (1998).

Factor analysis of four measures of prefrontal lobe functioning. Archives of

Clinical Neuropsychology, 13(7), 585-595.

Boucher, J. (1988). Word fluency in high-functioning autistic children. Journal of

Autism & Developmental Disorders, 18(4), 637-645.

Boucher, J. (1996). What could possibly explain autism? In P. Carruthers & P. K. Smith

(Eds.), Theories of theories of mind (pp. 223-241). Cambridge: Cambridge

University Press.

Boutin, P., Maziade, M., Merette, C., Mondor, M., Bedard, C., & Thivierge, J. (1997).

Family history of cognitive disabilities in first-degree relatives of autistic and

mentally retarded children. Journal of Autism & Developmental Disorders,

27(2), 165-176.

Bowler, D. M. (1992). "Theory of mind" in Asperger's syndrome. Journal of Child

Bowler, D. M. (2001). Autism: Specific cognitive deficit or emergent end point of

multiple interacting systems? In J. A. Burack, T. Charman, N. Yirmiya, & P. R.

Zelazo (Eds.), The development of autism: Perspectives from theory and

research (pp. 219-235). Mahwah, NJ: Lawrence Erlbaum Associates.

Bowler, D. M., & Briskman, J. A. (2000). Photographic cues do not always facilitate

performance on false belief tasks in children with autism. Journal of Autism &

Brian, J. A., Tipper, S., Weaver, B., & Bryson, S. (2003). Inhibitory mechanisms in

autism spectrum disorders: Typical selective inhibition of location versus

facilitated perceptual processing. Journal of Child Psychology & Psychiatry &

Briskman, J., Happé, F., & Frith, U. (2001). Exploring the cognitive phenotype of

autism: Weak "central coherence" in parents and siblings of children with autism:

II. Real-life skills and preferences. Journal of Child Psychology & Psychiatry &

Brothers, L. (1996). Brain mechanisms of social cognition. Journal of

Psychopharmacology, 10(1), 2-8.

Brown, R., Hobson, R., Lee, A., & Stevenson, J. (1997). Are there "autistic-like"

features in congenitally blind children? Journal of Child Psychology &

Psychiatry & Allied Disciplines, 38(6), 693-703.

Bruner, J., & Feldman, C. (1993). Theories of mind and the problem of autism. In S.

minds: Perspectives from autism. Oxford: Oxford University Press.

Brunet, E., Sarfati, Y., Hardy-Baylé, M. C., & Decety, J. (2000). A PET investigation of

the attribution of intentions with a non-verbal task. NeuroImage, 11, 157-166.

Bryson, S. E., Landry, R., & Wainwright, J. A. (1997). A componential view of

executive dysfunction in autism: Review of recent evidence. In J. A. Burack & J.

T. Enns (Eds.), Attention, development, and psychopathology (pp. 232-255).

New York: The Guilford Press.

Buitelaar, J. K., Swaab, H., van der Wees, M., Wildschut, M., & van der Gaag, R. J.

(1996). Neuropsychological impairments and deficits in theory of mind and

emotion recognition in a non-autistic boy. European Child & Adolescent

Psychiatry, 5(1), 44-51.

Burack, J. A. (1994). Selective attention deficits in persons with autism: Preliminary

evidence of an inefficient attentional lens. Journal of Abnormal Psychology,

103(3), 535-543.

Burack, J. A., Charman, T., Yirmiya, N., & Zelazo, P. R. (Eds.). (2001). The

development of autism: Perspectives from theory and research. Mahwah, NJ:

Burgess, P. W. (1997). Theory and methodology in executive function research. In P.

Rabbitt (Ed.), Methodology of frontal and executive function (pp. 81-116). Hove,

UK: Psychology Press.

Burgess, P. W. (2000). Strategy application disorder: The role of the frontal lobes in

human multitasking. Psychological Research, 63(3-4), 279-288.

Burgess, P. W., Alderman, N., Evans, J., Emslie, H., & Wilson, B. A. (1998). The

ecological validity of tests of executive function. Journal of the International

Neuropsychological Society, 4(6), 547-558.

Cabeza, R., & Nyberg, L. (2000). Imaging cognition II: An empirical review of 275

PET and fMRI studies. Journal of Cognitive Neuroscience, 12(1), 1-47.

Cantwell, D. P., Baker, L., & Rutter, M. (1979). Families of autistic and dysphasic

children: I. Family life and interaction patterns. Archives of General Psychiatry,

36(6), 682-687.

Capps, L., Kehres, J., & Sigman, M. (1998). Conversational abilities among children

with autism and children with developmental delays. Autism, 2(4), 325-344.

Carlin, D., Bonerba, J., Phipps, M., Alexander, G., Shapiro, M., & Grafman, J. (2000).

Planning impairments in frontal lobe dementia and frontal lobe lesion patients.

Neuropsychologia, 38(5), 655-665.

Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and

children's theory of mind. Child Development, 72(4), 1032-1053.

Carlson, S. M., Moses, L. J., & Breton, C. (2002). How specific is the relation between

executive function and theory of mind? Contributions of inhibitory control and

working memory. Infant & Child Development, 11(2), 73-92.

Carlson, S. M., Moses, L. J., & Hix, H. R. (1998). The role of inhibitory processes in

young children's difficulties with deception and false belief. Child Development,

69(3), 672-691.

Carpenter, M., Pennington, B. F., & Rogers, S. J. (2001). Understanding of others'

intentions in children with autism. Journal of Autism & Developmental

Disorders, 31(6), 589-599.

Carruthers, P. (1996). Autism as mind-blindness: An elaboration and partial defence. In

P. Carruthers & P. K. Smith (Eds.), Theories of theories of mind (pp. 257-273).

Cambridge: Cambridge University Press.

Carruthers, P., & Smith, P. K. (Eds.). (1996). Theories of theories of mind. Cambridge:

Cambridge University Press.

Casanova, M. F., Buxhoeveden, D. P., Switala, A. E., & Roy, E. (2002). Minicolumnar

pathology in autism. Neurology, 58(3), 428-432.

Case, R. (1985). Intellectual development: From birth to adulthood. New York:

Academic Press.

Castelli, F., Frith, C., Happé, F., & Frith, U. (2002). Autism, Asperger syndrome and

brain mechanisms for the attribution of mental states to animated shapes. Brain,

125, 1839-1849.

Chakrabarti, S., & Fombonne, E. (2001). Pervasive developmental disorders in

preschool children. JAMA: Journal of the American Medical Association,

285(24), 3093-3099.

Chandler, M. J., Fritz, A. S., & Hala, S. M. (1989). Small scale deceit: Deception as a

marker of 2-, 3- and 4-year-olds' early theories of mind. Child Development, 60,

1263-1277.

Chandler, M., & Hala, S. (1994). The role of personal involvement in the assessment of

early false belief skills. In C. Lewis & P. Mitchell (Eds.), Children's early

understanding of mind: Origins and development (pp. 403-425). Hillsdale, NJ:

Lawrence Erlbaum.

Channon, S., & Crawford, S. (2000). The effects of anterior lesions on performance on a

story comprehension test: Left anterior impairment on a theory of mind-type

task. Neuropsychologia, 38(7), 1006-1017.

Channon, S., Flynn, D., & Robertson, M. M. (1992). Attentional deficits in Gilles de la

Tourette syndrome. Neuropsychiatry, Neuropsychology, & Behavioral

Neurology, 5(3), 170-177.

Charman, T. (2000). Theory of mind and the early diagnosis of autism. In S. Baron-

Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds:

Perspectives from developmental cognitive neuroscience (2nd ed., pp. 422-441).

London: Oxford University Press.

Charman, T., & Baird, G. (2002). Practitioner review: Diagnosis of autism spectrum

disorder in 2- and 3-year-old children. Journal of Child Psychology &

Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further

test of the metarepresentation theory of autism: A research note. Journal of

Charman, T., & Baron-Cohen, S. (1995). Understanding photos, models, and beliefs: A

test of the modularity thesis of theory of mind. Cognitive Development, 10(2),

287-298.

Charman, T., & Campbell, A. (1997). Reliability of theory of mind task performance by

individuals with a learning disability: A research note. Journal of Child

Charman, T., Carroll, F., & Sturge, C. (2001). Theory of mind, executive function and

social competence in boys with ADHD. Emotional & Behavioural Difficulties,

6(1), 31-49.

Charman, T., & Lynggaard, H. (1998). Does a photographic cue facilitate false belief

performance in subjects with autism? Journal of Autism & Developmental

Disorders, 28(1), 33-42.

Chelune, G. J., & Baer, R. A. (1986). Developmental norms for the Wisconsin Card

Sorting Test. Journal of Clinical & Experimental Neuropsychology, 8(3), 219-

Christensen, K. J., Kim, S. W., Dysken, M. W., & Hoover, K. M. (1992).

Neuropsychological performance in obsessive-compulsive disorder. Biological

Psychiatry, 31(1), 4-18.

Cicerone, K. D., & Tanenbaum, L. N. (1997). Disturbance of social cognition after

traumatic orbitofrontal brain injury. Archives of Clinical Neuropsychology,

12(2), 173-188.

Ciesielski, K. T., & Harris, R. J. (1997). Factors related to performance failure on

executive tasks in autism. Child Neuropsychology, 3(1), 1-12.

Clark, P., & Rutter, M. (1981). Autistic children's responses to structure and to

interpersonal demands. Journal of Autism & Developmental Disorders, 11(2),

201-217.

Cohen, J. (1988). Statistical power analysis for the behavioural sciences. (2nd ed.).

Hillsdale, NJ: Lawrence Erlbaum.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the

behavioral sciences. (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.

Collette, F., & van der Linden, M. (2002). Brain imaging of the central executive

component of working memory. Neuroscience & Biobehavioral Reviews, 26(2),

105-125.

Collette, F., van der Linden, M., & Salmon, E. (1999). Executive dysfunction in

Alzheimer's disease. Cortex, 35(1), 57-72.

Colvert, E., Custance, D., & Swettenham, J. (2002). Rule-based reasoning and theory of

mind in autism: A commentary on the work of Zelazo, Jacques, Burack and

Frye. Infant & Child Development, 11(2), 197-200.

Corcoran, R. (2000). Theory of mind in other clinical conditions: Is a selective 'theory

of mind' deficit exclusive to autism? In S. Baron-Cohen, H. Tager-Flusberg, &

D. J. Cohen (Eds.), Understanding other minds: Perspectives from

developmental cognitive neuroscience (2nd ed., pp. 391-421). London: Oxford

University Press.

Corcoran, R., Mercer, G., & Frith, C. D. (1995). Schizophrenia, symptomatology and

social influence: Investigating "theory of mind" in people with schizophrenia.

Schizophrenia Research, 17(1), 5-13.

Courchesne, E. (1997). Brainstem, cerebellar and limbic neuroanatomical abnormalities

in autism. Current Opinion in Neurobiology, 7(2), 269-278.

Courchesne, E., Townsend, J., Akshoomoff, N. A., Saitoh, O., Yeung-Courchesne, R.,

Lincoln, A. J., James, H. E., Haas, R. H., Schreibman, L., & Lau, L. (1994).

Impairment in shifting attention in autistic and cerebellar patients. Behavioral

Neuroscience, 108(5), 848-865.

Courchesne, E., Townsend, J., & Chase, C. (1995). Neurodevelopmental principles

guide research on developmental psychopathologies. In D. Cicchetti & D. J.

Cohen (Eds.), Developmental psychopathology (Vol. 1: Theory and methods, pp.

195-226). Oxford: John Wiley & Sons.

Cox, C. S., Fedio, P., & Rapoport, J. L. (1989). Neuropsychological testing of

obsessive-compulsive adolescents. In J. L. Rapoport (Ed.), Obsessive-

compulsive disorder in children and adolescents (pp. 73-85). Washington, DC:

American Psychiatric Press.

Craig, J., & Baron-Cohen, S. (1999). Creativity and imagination in autism and Asperger

syndrome. Journal of Autism & Developmental Disorders, 29(4), 319-326.

Cripe, L. I. (1996). The ecological validity of executive function testing. In R. J.

Sbordone & C. J. Long (Eds.), Ecological validity of neuropsychological testing

(pp. 171-202). Delray Beach, FL: GR Press/St Lucie Press.

Culbertson, W. C., & Zillmer, E. A. (1998a). The construct validity of the Tower of

LondonDX as a measure of the executive functioning of ADHD children.

Assessment, 5(3), 215-226.

Culbertson, W. C., & Zillmer, E. A. (1998b). The Tower of LondonDX: A standardized

approach to assessing executive functioning in children. Archives of Clinical

Neuropsychology, 13(3), 285-301.

Dadds, M. R., Schwartz, S., Adams, T., & Rose, S. (1988). The effects of social context

and verbal skill on the stereotypic and task-involved behaviour of autistic

children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 29(5),

669-676.

Dagher, A., Owen, A. M., Boecker, H., & Brooks, D. J. (1999). Mapping the network

for planning: A correlational PET activation study with the Tower of London

task. Brain, 122(10), 1973-1987.

Dahlgren, S., Dahlgren Sandberg, A., & Hjelmquist, E. (2003). The non-specificity of

theory of mind deficits: Evidence from children with communicative disabilities.

European Journal of Cognitive Psychology, 15(1), 129-155.

Dahlgren, S. O., & Trillingsgaard, A. (1996). Theory of mind in non-retarded children

with autism and Asperger's syndrome: A research note. Journal of Child

Damasio, A. R., & Maurer, R. G. (1978). A neurological model for childhood autism.

Archives of Neurology, 35, 777-786.

Davis, H. L., & Pratt, C. (1995). The development of children's theory of mind: The

working memory explanation. Australian Journal of Psychology, 47(1), 25-31.

Dawson, G. (1991). A psychobiological perspective on the early socio-emotional

development of children with autism. In D. Cicchetti & S. L. Toth (Eds.),

Rochester symposium on developmental psychopathology (Vol. 3: Models and

integrations, pp. 207-234). Rochester, NY: University of Rochester Press.

Dawson, G., & Adams, A. (1984). Imitation and social responsiveness in autistic

children. Journal of Abnormal Child Psychology, 12(2), 209-225.

Dawson, G., & Fernald, M. (1987). Perspective-taking ability and its relationship to the

social behavior of autistic children. Journal of Autism and Developmental

Disorders, 17, 487-498.

Dawson, G., Klinger, L. G., Panagiotides, H., Lewy, A., & Castelloe, P. (1995).

Subgroups of autistic children based on social behavior display distinct patterns

of brain activity. Journal of Abnormal Child Psychology, 23(5), 569-583.

Dawson, G., & Lewy, A. (1989). Arousal, attention, and the socioemotional

impairments of individuals with autism. In G. Dawson (Ed.), Autism: Nature,

diagnosis, and treatment (pp. 49-74). New York: The Guilford Press.

Dawson, G., Meltzoff, A. N., Osterling, J., & Rinaldi, J. (1998). Neuropsychological

correlates of early symptoms of autism. Child Development, 69(5), 1276-1285.

Dawson, G., Munson, J., Estes, A., Osterling, J., McPartland, J., Toth, K., Carver, L., &

Abbott, R. (2002a). Neurocognitive function and joint attention ability in young

children with autism spectrum disorder versus developmental delay. Child

Development, 73(2), 345-358.

Dawson, G., Osterling, J., Rinaldi, J., Carver, L., & McPartland, J. (2001). Brief report:

Recognition memory and stimulus-reward associations: Indirect support for the

role of ventromedial prefrontal dysfunction in autism. Journal of Autism &

Dawson, G., Webb, S., Schellenberg, G. D., Dager, S., Friedman, S., Aylward, E., &

Richards, T. (2002b). Defining the broader phenotype of autism: Genetic, brain,

and behavioral perspectives. Development & Psychopathology, 14(3), 581-611.

de Villiers, J. (2000). Language and theory of mind: What are the developmental

relationships? In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.),

Understanding other minds: Perspectives from developmental cognitive

neuroscience (2nd ed., pp. 83-123). London: Oxford University Press.

de Villiers, J. G., & de Villiers, P. A. (2000). Linguistic determinism and the

understanding of false beliefs. In P. Mitchell & K. J. Riggs (Eds.), Children's

reasoning and the mind (pp. 191-228). Hove, UK: Psychology Press.

Deb, S., & Thompson, B. (1998). Neuroimaging in autism. British Journal of

Psychiatry, 173, 299-302.

Delis, D. C., Squire, L. R., Bihrle, A., & Massman, P. J. (1992). Componential analysis

of problem-solving ability: Performance of patients with frontal lobe damage

and amnesic patients on a new sorting test. Neuropsychologia, 30(8), 683-697.

DeLong, G., & Dwyer, J. T. (1988). Correlation of family history with specific autistic

subgroups: Asperger's syndrome and bipolar affective disease. Journal of Autism

& Developmental Disorders, 18(4), 593-600.

Dempster, F. N. (1992). The rise and fall of the inhibitory mechanism: Toward a unified

theory of cognitive development and aging. Developmental Review, 12(1), 45-

Dempster, F. (1993). Resistance to interference: Developmental changes in a basic

processing mechanism. In M. L. Howe & R. Pasnak (Eds.), Emerging themes in

cognitive development (Vol. 1: Foundations, pp. 3-27). New York: Springer-

Verlag.

Dennett, D. C. (1978). Beliefs about beliefs. The Behavioral and Brain Sciences, 4, 568-

Dewey, M. (1991). Living with Asperger's syndrome. In U. Frith (Ed.), Autism and

Asperger syndrome (pp. 184-206). Cambridge: Cambridge University Press.

Diamond, A. (1985). Development of the ability to use recall to guide action, as

indicated by infants' performance on AB. Child Development, 56(4), 868-883.

Diamond, A. (2002). Normal development of prefrontal cortex from birth to young

adulthood: Cognitive functions, anatomy, and biochemistry. In D. T. Stuss & R.

T. Knight (Eds.), Principles of frontal lobe function (pp. 466-503). London:

Diamond, A., & Goldman-Rakic, P. S. (1989). Comparison of human infants and rhesus

monkeys on Piaget's AB task: Evidence for dependence on dorsolateral

prefrontal cortex. Experimental Brain Research, 74, 24-40.

Diamond, A., Prevor, M. B., Callender, G., & Druin, D. P. (1997). Prefrontal cortex

cognitive deficits in children treated early and continuously for PKU.

Monographs of the Society for Research in Child Development, 62(4), 1-205.

Diamond, A., & Taylor, C. (1996). Development of an aspect of executive control:

Development of the abilities to remember what I said and to "Do as I say, not as

I do". Developmental Psychobiology, 29(4), 315-334.

Donnellan, A. M., Anderson, J. L., & Mesaros, R. A. (1984). An observational study of

stereotypic behavior and proximity related to the occurrence of autistic child-

family member interactions. Journal of Autism & Developmental Disorders,

14(2), 205-210.

Dorris, L., Espie, C. A. E., Knott, F., & Salt, J. (2004). Mind-reading difficulties in the

siblings of people with Asperger's syndrome: Evidence for a genetic influence in

the abnormal development of a specific cognitive domain. Journal of Child

Psychology & Psychiatry, 45(2), 412-418.

Downes, J. J., Roberts, A. C., Sahakian, B. J., Evenden, J. L., Morris, R. G., & Robbins,

T. W. (1989). Impaired extra-dimensional shift performance in medicated and

unmedicated Parkinson's disease: Evidence for a specific attentional

dysfunction. Neuropsychologia, 27, 1329-1343.

Drewe, E. (1974). The effect of type and area of brain lesion on Wisconsin Card Sorting

Test performance. Cortex, 10(2), 159-170.

Drewe, E. A. (1975). An experimental investigation of Luria's theory on the effects of

frontal lobe lesions in man. Neuropsychologia, 13, 421-429.

Duncan, J., Burgess, P., & Emslie, H. (1995). Fluid intelligence after frontal lobe

lesions. Neuropsychologia, 33(3), 261-268.

Duncan, J., Emslie, H., Williams, P., Johnson, R., & Freer, C. (1996). Intelligence and

the frontal lobe: The organization of goal-directed behavior. Cognitive

Psychology, 30(3), 257-303.

Eisenberg, L., & Kanner, L. (1956). Early infantile autism, 1943-55. American Journal

of Orthopsychiatry, 26, 556-566.

Eisenmajer, R., & Prior, M. (1991). Cognitive linguistic correlates of "theory of mind"

ability in autistic children. British Journal of Developmental Psychology, 9(2),

351-364.

Elliott, R., McKenna, P., Robbins, T., & Sahakian, B. (1995). Neuropsychological

evidence for frontostriatal dysfunction in schizophrenia. Psychological

Medicine, 25(3), 619-630.

Eslinger, P. J. (1996). Conceptualizing, describing, and measuring components of

executive function: A summary. In G. R. Lyon & N. A. Krasnegor (Eds.),

Attention, memory, and executive function (pp. 367-395). Baltimore, MD: Paul

H. Brookes.

Eslinger, P. J. (1998). Neurological and neuropsychological bases of empathy.

European Neurology, 39(4), 193-199.

Eslinger, P. J., Biddle, K. R., & Grattan, L. M. (1997). Cognitive and social

development in children with prefrontal cortex lesions. In N. A. Krasnegor, G.

R. Lyon, & P. S. Goldman-Rakic (Eds.), Development of the prefrontal cortex:

Evolution, neurobiology, and behavior (pp. 295-335). Baltimore, MD: Paul H.

Brookes.

Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after

bilateral frontal lobe ablation: Patient EVR. Neurology, 35(12), 1731-1741.

Eslinger, P. J., Grattan, L. M., Damasio, H., & Damasio, A. R. (1992). Developmental

consequences of childhood frontal lobe damage. Archives of Neurology, 49, 764-

Espy, K. A., Kaufmann, P. M., Glisky, M. L., & McDiarmid, M. (2001). New

procedures to assess executive functions in preschool children. Clinical

Neuropsychologist, 15(1), 46-58.

Espy, K. A., Kaufmann, P. M., McDiarmid, M. D., & Glisky, M. L. (1999). Executive

functioning in preschool children: Performance on A-not-B and other delayed

response format tasks. Brain & Cognition, 41(2), 178-199.

Fein, D., Stevens, M., Dunn, M., Waterhouse, L., Allen, D., Rapin, I., & Feinstein, C.

(1999). Subtypes of pervasive developmental disorder: Clinical characteristics.

Child Neuropsychology, 5(1), 1-23.

Fine, C., Lumsden, J., & Blair, R. (2001). Dissociation between "theory of mind" and

executive functions in a patient with early left amygdala damage. Brain, 124(2),

287-298.

Flavell, J. H., Flavell, E. R., & Green, F. L. (1983). Development of the appearance-

reality distinction. Cognitive Psychology, 15(1), 95-120.

Fletcher, P. C., Happé, F., Frith, U., Baker, S. C., Dolan, R. J., Frackowiak, R. S. J., &

Frith, C. D. (1995). Other minds in the brain: A functional imaging study of

"theory of mind" in story comprehension. Cognition, 57, 109-128.

Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.

Folstein, S. E., Bisson, E., Santangelo, S. L., & Piven, J. (1998). Finding specific genes

that cause autism: A combination of approaches will be needed to maximize

power. Journal of Autism & Developmental Disorders, 28(5), 439-445.

Folstein, S., & Rutter, M. (1977). Infantile autism: A genetic study of 21 twin pairs.

Journal of Child Psychology & Psychiatry & Allied Disciplines, 18(4), 297-321.

Folstein, S. E., Santangelo, S. L., Gilman, S. E., Piven, J., Landa, R., Lainhart, J., Hein,

J., & Wzorek, M. (1999). Predictors of cognitive test patterns in autism families.

Journal of Child Psychology & Psychiatry & Allied Disciplines, 40(7), 1117-

Fombonne, E. (2003). Epidemiological surveys of autism and other pervasive

developmental disorders: An update. Journal of Autism & Developmental

Disorders, 33(4), 365-382.

Fombonne, E., Bolton, P., Prior, J., Jordan, H., & Rutter, M. (1997). A family study of

autism: Cognitive patterns and levels in parents and siblings. Journal of Child

Fonagy, P., Steele, M., Steele, H., Leigh, T., Kennedy, R., Mattoon, G., & Target, M.

(1995). Attachment, the reflective self, and borderline states: The predictive

specificity of the Adult Attachment Interview and pathological emotional

development. In S. Goldberg, R. Muir, & J. Kerr (Eds.), Attachment theory:

Social, developmental, and clinical perspectives (pp. 233-278). New York:

Analytic Press.

Freeman, B., Ritvo, E. R., Mason-Brothers, A., Pingree, C., Yokota, A., Jenson, W.,

McMahon, W., Peterson, B., Mo, A., & Schroth, P. (1989). Psychometric

assessment of first-degree relatives of 62 autistic probands in Utah. American

Journal of Psychiatry, 146(3), 361-364.

Freeman, N. H., & Lacohée, H. (1995). Making explicit 3-year-olds' implicit

competence with their own false beliefs. Cognition, 56(1), 31-60.

Freeman, N. H., Lewis, C., & Doherty, M. (1991). Preschoolers' grasp of a desire for

knowledge in false-belief prediction: Practical intelligence and verbal report.

British Journal of Developmental Psychology, 9(1), 139-157.

Frith, C. D. (1992). The cognitive neuropsychology of schizophrenia. Hillsdale, NJ:

Lawrence Erlbaum.

Frith, C., & Frith, U. (2000). The physiological basis of theory of mind: Functional

neuroimaging studies. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen

(Eds.), Understanding other minds: Perspectives from developmental cognitive

Frith, U. (1972). Cognitive mechanisms in autism: Experiments with color and tone

sequence production. Journal of Autism and Childhood Schizophrenia, 2(2),

160-173.

Frith, U., & Happé, F. (1994). Autism: Beyond "theory of mind". Cognition, 50(1-3),

115-132.

Frith, U., Happé, F., & Siddons, F. (1994). Autism and theory of mind in everyday life.

Social Development, 3(2), 108-124.

Frith, U., Morton, J., & Leslie, A. M. (1991). The cognitive basis of a biological

disorder: Autism. Trends in Neurosciences, 14, 433-438.

Frye, D. (1999). Development of intention: The relation of executive function to theory

of mind. In P. D. Zelazo, J. W. Astington, & D. Olson (Eds.), Developing

theories of intention: Social understanding and self-control (pp. 119-132).

Mahwah, NJ: Lawrence Erlbaum Associates.

Frye, D. (2000). Theory of mind, domain specificity, and reasoning. In P. Mitchell & K.

J. Riggs (Eds.), Children's reasoning and the mind (pp. 149-167). Hove, UK:

Psychology Press.

Frye, D., & Zelazo, P. D. (1998). Complexity: From formal analysis to final action.

Behavioral and Brain Sciences, 21(6), 836-837.

Frye, D., Zelazo, P. D., Brooks, P. J., & Samuels, M. C. (1996). Inference and action in

early causal reasoning. Developmental Psychology, 32(1), 120-131.

Frye, D., Zelazo, P. D., & Burack, J. A. (1998). Cognitive complexity and control: I.

Theory of mind in typical and atypical development. Current Directions in

Psychological Science, 7(4), 116-121.

Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule-based reasoning.

Cognitive Development, 10(4), 483-527.

Fuster, J. M. (2000). Prefrontal neurons in networks of executive memory. Brain

Research Bulletin, 52(5), 331-336.

Gallagher, H. L., & Frith, C. D. (2003). Functional imaging of 'theory of mind'. Trends

in Cognitive Sciences, 7(2), 77-83.

Gallagher, H., Happé, F., Brunswick, N., Fletcher, P., Frith, U., & Frith, C. (2000).

Reading the mind in cartoons and stories: An fMRI study of 'theory of mind'

in verbal and nonverbal tasks. Neuropsychologia, 38(1), 11-21.

Garcia-Villamisar, D., & Della Sala, S. (2002). Dual-task performance in adults with

autism. Cognitive Neuropsychiatry, 7(1), 63-74.

Garner, C., Callias, M., & Turk, J. (1999). Executive function and theory of mind

performance of boys with fragile-X syndrome. Journal of Intellectual Disability

Research, 43(6), 466-474.

Garretson, H. B., Fein, D., & Waterhouse, L. (1990). Sustained attention in children

with autism. Journal of Autism & Developmental Disorders, 20(1), 101-114.

George, M. S., Costa, D. C., Kouris, K., Ring, H. A., & Ell, P. J. (1992). Cerebral blood

flow abnormalities in adults with infantile autism. Journal of Nervous & Mental

Disease, 180(7), 413-417.

German, T. P., & Leslie, A. M. (2000). Attending to and learning about mental states. In

P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 229-

252). Hove, UK: Psychology Press.

Gerstadt, C. L., Hong, Y. J., & Diamond, A. (1994). The relationship between cognition

and action: Performance of children 3½-7 years old on a Stroop-like day-night

test. Cognition, 53(2), 129-153.

Gillberg, C., & Coleman, M. (1992). The biology of the autistic syndromes (2nd ed.).

London: Mac Keith Press.

Gilotty, L., Kenworthy, L., Sirian, L., Black, D. O., & Wagner, A. E. (2002). Adaptive

skills and executive function in autism spectrum disorders. Child

Godefroy, O., Cabaret, M., Petit-Chenal, V., Pruvo, J.-P., & Rousseaux, M. (1999).

Control functions of the frontal lobes: Modularity of the central-supervisory

system? Cortex, 35(1), 1-20.

Goel, V., Grafman, J., Sadato, N., & Hallett, M. (1995). Modeling other minds.

Neuroreport, 6(13), 1741-1746.

Goldberg, M., Lasker, A., Zee, D., Garth, E., Tien, A., & Landa, R. (2002). Deficits in

the initiation of eye movements in the absence of a visual target in adolescents

with high functioning autism. Neuropsychologia, 40(12), 2039-2049.

Golden, C. J. (1981). The Luria-Nebraska Children's Battery: Theory and formulation.

In G. W. Hynd & J. E. Obrzut (Eds.), Neuropsychological assessment of the

school-aged child (pp. 277-302). New York: Grune & Stratton.

Goldman-Rakic, P. S. (1995). Architecture of the prefrontal cortex and the central

executive. In J. Grafman, K. J. Holyoak, & F. Boller (Eds.), Structure and

functions of the human prefrontal cortex (pp. 71-83). New York: New York

Academy of Sciences.

Goldman-Rakic, P. S., & Leung, H.-C. (2002). Functional architecture of the

dorsolateral prefrontal cortex in monkeys and humans. In D. T. Stuss & R. T.

Knight (Eds.), Principles of frontal lobe function (pp. 85-95). London: Oxford

University Press.

Goldstein, G., Johnson, C. R., & Minshew, N. J. (2001). Attentional processes in

autism. Journal of Autism & Developmental Disorders, 31(4), 433-440.

Goodman, R. (1989). Infantile autism: A syndrome of multiple primary deficits?

Gopnik, A. (1993). How we know our minds: The illusion of first-person knowledge of

intentionality. Behavioral & Brain Sciences, 16(1), 1-14, 29-113.

Gopnik, A., & Astington, J. W. (1988). Children's understanding of representational

change and its relation to the understanding of false belief and the appearance-

reality distinction. Child Development, 59(1), 26-37.

Gopnik, A., & Meltzoff, A. N. (1997). Words, thoughts, and theories. Cambridge, MA:

MIT Press.

Gopnik, A., & Wellman, H. M. (1994). The theory theory. In L. A. Hirschfeld & S. A.

Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture

(pp. 257-293). Cambridge: Cambridge University Press.

Gordon, A. C. L., & Olson, D. R. (1998). The relation between acquisition of a theory

of mind and the capacity to hold in mind. Journal of Experimental Child

Psychology, 68(1), 70-83.

Gottesman, I. I., & Gould, T. D. (2003). The endophenotype concept in psychiatry:

Etymology and strategic intentions. American Journal of Psychiatry, 160(4),

636-645.

Grafman, J. (1994). Alternative frameworks for the conceptualization of prefrontal lobe

functions. In F. Boller & J. Grafman (Eds.), Handbook of Neuropsychology

(Vol. 9, pp. 187-201). Amsterdam: Elsevier Science.

Grant, C. M., Grayson, A., & Boucher, J. (2001). Using tests of false belief with

children with autism: How valid and reliable are they? Autism, 5(2), 135-145.

Grant, D. A., & Berg, E. (1948). A behavioral analysis of degree of reinforcement and

ease of shifting to new responses in a Weigl-type card-sorting problem. Journal

of Experimental Psychology, 38, 404-411.

Grattan, L. M., Bloomer, R. H., Archambault, F. X., & Eslinger, P. J. (1994). Cognitive

flexibility and empathy after frontal lobe lesion. Neuropsychiatry,

Neuropsychology, and Behavioral Neurology, 7(4), 251-259.

Gregory, C., Lough, S., Stone, V., Erzinclioglu, S., Martin, L., Baron-Cohen, S., &

Hodges, J. R. (2002). Theory of mind in patients with frontal variant

frontotemporal dementia and Alzheimer's disease: Theoretical and practical

implications. Brain, 125(4), 752-764.

Griffith, E. M., Pennington, B. F., Wehner, E. A., & Rogers, S. J. (1999). Executive

functions in young children with autism. Child Development, 70(4), 817-832.

Grodzinsky, G. M., & Diamond, R. (1992). Frontal lobe functioning in boys with

attention-deficit hyperactivity disorder. Developmental Neuropsychology, 8(4),

427-445.

Grossman, M. (2002). Frontotemporal dementia: A review. Journal of the International

Hala, S., Hug, S., & Henderson, A. (2003). Executive function and false-belief

understanding in preschool children: Two tasks are harder than one. Journal of

Cognition & Development, 4(3), 275-298.

Hala, S., & Russell, J. (2001). Executive control within strategic deception: A window

on early cognitive development? Journal of Experimental Child Psychology,

80(2), 112-141.

Halford, G. S. (1993). Children's understanding: The development of mental models.

Hillsdale, NJ: Lawrence Erlbaum.

Halford, G. S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by

relational complexity: Implications for comparative, developmental, and

cognitive psychology. Behavioral & Brain Sciences, 21(6), 803-864.

Happé, F. G. E. (1994a). An advanced test of theory of mind: Understanding of story

characters' thoughts and feelings by able autistic, mentally handicapped, and

normal children and adults. Journal of Autism & Developmental Disorders,

24(2), 129-154.

Happé, F. G. E. (1994b). Annotation: Current psychological theories of autism: The

"Theory of Mind" account and rival theories. Journal of Child Psychology &

Happé, F. G. E. (1994c). Wechsler IQ profile and theory of mind in autism: A research

note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35(8),

1461-1471.

Happé, F. G. (1995). The role of age and verbal ability in the theory of mind task

performance of subjects with autism. Child Development, 66(3), 843-855.

Happé, F. G. E. (1996). Studying weak central coherence at low levels: Children with

autism do not succumb to visual illusions. Journal of Child Psychology and

Psychiatry, 37(7), 873-877.

Happé, F. G. E. (1997). Central coherence and theory of mind in autism: Reading

homographs in context. British Journal of Developmental Psychology, 15, 1-12.

Happé, F. (1999). Understanding assets and deficits in autism: Why success is more

interesting than failure. Psychologist, 12(11), 540-546.

Happé, F. (2000). Parts and wholes, meaning and minds: Central coherence and its

relation to theory of mind. In S. Baron-Cohen, H. Tager-Flusberg, & D. J.

Cohen (Eds.), Understanding other minds: Perspectives from developmental

cognitive neuroscience (2nd ed., pp. 203-221). London: Oxford University

Press.

Happé, F. (2001). Social and nonsocial development in autism: Where are the links? In

J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development

of autism: Perspectives from theory and research (pp. 237-253). Mahwah, NJ:

Happé, F., Briskman, J., & Frith, U. (2001). Exploring the cognitive phenotype of

autism: Weak "central coherence" in parents and siblings of children with

autism: I. Experimental tests. Journal of Child Psychology & Psychiatry &

Happé, F., Ehlers, S., Fletcher, P., Frith, U., Johansson, M., Gillberg, C., Dolan, R.,

Frackowiak, R., & Frith, C. (1996). 'Theory of mind' in the brain. Evidence from

a PET scan study of Asperger syndrome. Neuroreport, 8(1), 197-201.

Happé, F., & Frith, U. (1995). Theory of mind in autism. In E. Schopler & G. B.

Mesibov (Eds.), Learning and Cognition in Autism (pp. 177-197). New York:

Plenum Press.

Happé, F., & Frith, U. (1996). The neuropsychology of autism. Brain, 119, 1377-1400.

Happé, F., Malhi, G. S., & Checkley, S. (2001). Acquired mind-blindness following

frontal lobe surgery? A single case study of impaired 'theory of mind' in a

patient treated with stereotactic anterior capsulotomy. Neuropsychologia, 39(1),

83-90.

Harnishfeger, K. K., & Bjorklund, D. F. (1993). The ontogeny of inhibition

mechanisms: A renewed approach to cognitive development. In M. L. Howe &

R. Pasnak (Eds.), Emerging themes in cognitive development (Vol. 1:

Foundations, pp. 28-49). New York: Springer-Verlag.

Harris, P. (1993). Pretending and planning. In S. Baron-Cohen, H. Tager-Flusberg, & D.

J. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 228-

246). Oxford: Oxford University Press.

Harris, P. L., & Leevers, H. J. (2000). Pretending, imagery, and self-awareness in

autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.),

Understanding other minds: Perspectives from developmental cognitive

Head, D., Bolton, D., & Hymas, N. (1989). Deficit in cognitive shifting ability in

patients with obsessive-compulsive disorder. Biological Psychiatry, 25(7), 929-

Heaton, R. K., Grant, I., & Matthews, C. G. (1991). Comprehensive norms for an

expanded Halstead-Reitan battery: Demographic corrections, research findings,

and clinical applications. Odessa, FL: Psychological Assessment Resources.

Heilman, K., Watson, R., & Valenstein, E. (1993). Neglect and related disorders. In K.

Heilman & E. Valenstein (Eds.), Clinical Neuropsychology (3rd ed., pp. 279-

336). New York: Oxford University Press.

Hermelin, B., & O'Connor, N. (1970). Psychological experiments with autistic children.

New York: Pergamon.

Hill, E. L. (2004). Executive dysfunction in autism. Trends in Cognitive Sciences, 8(1),

26-32.

Hill, E. L., & Frith, U. (2003). Understanding autism: Insights from mind and brain.

Philosophical Transactions of the Royal Society of London B, 358(1430), 281-

Hill, E. L., & Russell, J. (2002). Action memory and self-monitoring in children with

autism: Self versus other. Infant & Child Development, 11(2), 159-170.

Hobson, R. P. (1989). Beyond cognition: A theory of autism. In G. Dawson (Ed.),

Autism: Nature, diagnosis, and treatment (pp. 22-48). New York: The Guilford

Press.

Hobson, R. P. (1993). Understanding persons: The role of affect. In S. Baron-Cohen, H.

Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives

from autism (pp. 204-227). Oxford: Oxford University Press.

Hoff, A. L., & Kremen, W. S. (2003). Neuropsychology in schizophrenia: An update.

Current Opinion in Psychiatry, 16(2), 149-155.

Hollander, E., King, A., Delaney, K., Smith, C. J., & Silverman, J. M. (2003).

Obsessive-compulsive behaviors in parents of multiplex autism families.

Psychiatry Research, 117(1), 11-16.

Holroyd, S., & Baron-Cohen, S. (1993). Brief report: How far can people with autism

go in developing a theory of mind? Journal of Autism & Developmental

Disorders, 23(2), 379-385.

Hughes, C. (1996a). Brief report: Planning problems in autism at the level of motor

control. Journal of Autism & Developmental Disorders, 26(1), 99-107.

Hughes, C. (1996b). Control of action and thought: Normal development and

dysfunction in autism: A research note. Journal of Child Psychology &

Hughes, C. (1998a). Executive function in preschoolers: Links with theory of mind and

verbal ability. British Journal of Developmental Psychology, 16, 233-253.

Hughes, C. (1998b). Finding your marbles: Does preschoolers' strategic behavior

predict later understanding of mind? Developmental Psychology, 34(6), 1326-

Hughes, C. (2001). Executive dysfunction in autism: Its nature and implications for the

everyday problems experienced by individuals with autism. In J. A. Burack, T.

Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of autism:

Perspectives from theory and research (pp. 255-275). Mahwah, NJ: Lawrence

Erlbaum Associates.

Hughes, C., Adlam, A., Happé, F., Jackson, J., Taylor, A., & Caspi, A. (2000). Good

test-retest reliability for standard and advanced false-belief tasks across a wide

range of abilities. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 41(4), 483-490.

Hughes, C., Dunn, J., & White, A. (1998). Trick or treat? Uneven understanding of

mind and emotion and executive dysfunction in "hard-to-manage" preschoolers.

Journal of Child Psychology and Psychiatry, 39(7), 981-994.

Hughes, C., & Graham, A. (2002). Measuring executive functions in childhood:

Problems and solutions? Child & Adolescent Mental Health, 7(3), 131-142.

Hughes, C., Leboyer, M., & Bouvard, M. (1997). Executive function in parents of

children with autism. Psychological Medicine, 27, 209-220.

Hughes, C., Plumet, M.-H., & Leboyer, M. (1999). Towards a cognitive phenotype for

autism: Increased prevalence of executive dysfunction and superior spatial span

amongst siblings of children with autism. Journal of Child Psychology &

Hughes, C., & Russell, J. (1993). Autistic children's difficulty with mental

disengagement from an object: Its implications for theories of autism.

Developmental Psychology, 29(3), 498-510.

Hughes, C., Russell, J., & Robbins, T. W. (1994). Evidence for executive dysfunction in

autism. Neuropsychologia, 32(4), 477-492.

Hughes, C., Soares-Boucaud, I., Hochmann, J., & Frith, U. (1997). Social behaviour in

pervasive developmental disorders: Effects of informant, group and "theory of

mind". European Child & Adolescent Psychiatry, 6(4), 191-198.

Hutt, S., & Hutt, C. (1968). Stereotypy, arousal and autism. Human Development,

11(4), 277-286.

Huttenlocher, P. R., & Dabholkar, A. S. (1997). Developmental anatomy of prefrontal

cortex. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.),

Development of the prefrontal cortex: Evolution, neurobiology and behavior

(pp. 69-83). Baltimore, MD: Paul H. Brookes.

Jacques, S., Zelazo, P. D., Kirkham, N. Z., & Semcesen, T. K. (1999). Rule selection

versus rule execution in preschoolers: An error-detection approach.

Jarrold, C., Boucher, J., & Smith, P. K. (1994a). Executive function deficits and the

pretend play of children with autism: A research note. Journal of Child

Jarrold, C., Boucher, J., & Smith, P. K. (1996). Generativity deficits in pretend play in

autism. British Journal of Developmental Psychology, 14(3), 275-300.

Jarrold, C., Butler, D. W., Cottington, E. M., & Jimenez, F. (2000). Linking theory of

mind and central coherence bias in autism and in the general population.

Jarrold, C., Smith, P., Boucher, J., & Harris, P. (1994b). Comprehension of pretense in

children with autism. Journal of Autism & Developmental Disorders, 24(4),

433-455.

Jenkins, J. M., & Astington, J. W. (1996). Cognitive factors and family structure

associated with theory of mind development in young children. Developmental

Psychology, 32(1), 70-78.

Johnson, M. H., Siddons, F., Frith, U., & Morton, J. (1992). Can autism be predicted on

the basis of infant screening tests? Developmental Medicine & Child Neurology,

34(4), 316-320.

Jolliffe, T., & Baron-Cohen, S. (2000). Linguistic processing in high-functioning adults

with autism or Asperger's syndrome. Is global coherence impaired?

Psychological Medicine, 30(5), 1169-1187.

Kain, W., & Perner, J. (2003). Do children with ADHD not need their frontal lobes for

theory of mind? A review of brain imaging and neuropsychological studies. In

M. Brüne, H. Ribbert, & W. Schiefenhövel (Eds.), The social brain: Evolution

and pathology (pp. 197-230). Chichester, UK: John Wiley & Sons.

Kane, M. J., & Engle, R. W. (2002). The role of prefrontal cortex in working-memory

capacity, executive attention, and general fluid intelligence: An individual-

differences perspective. Psychonomic Bulletin & Review, 9(4), 637-671.

Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on

cognitive science. Cambridge, MA: The MIT Press.

Karmiloff-Smith, A. (1997). Crucial differences between developmental cognitive

neuroscience and adult neuropsychology. Developmental Neuropsychology,

13(4), 513-524.

Karmiloff-Smith, A., Scerif, G., & Ansari, D. (2003). Double dissociations in

developmental disorders? Theoretically misconceived, empirically dubious.

Cortex, 39(1), 161-163.

Karmiloff-Smith, A., Scerif, G., & Thomas, M. (2002). Different approaches to relating

genotype to phenotype in developmental disorders. Developmental

Psychobiology, 40(3), 311-322.

Keenan, T. (1998). Memory span as a predictor of false belief understanding. New

Zealand Journal of Psychology, 27(2), 36-43.

Keenan, T., Olson, D. R., & Marini, Z. (1998). Working memory and children's

developing understanding of mind. Australian Journal of Psychology, 50(2), 76-

Kerr, N., Dunbar, R. I., & Bentall, R. P. (2003). Theory of mind deficits in bipolar

affective disorder. Journal of Affective Disorders, 73(3), 253-259.

Kimberg, D. Y., & Farah, M. J. (1993). A unified account of cognitive impairments

following frontal lobe damage: The role of working memory in complex,

organized behaviour. Journal of Experimental Psychology: General, 122, 411-

Kleinman, J., Marciano, P. L., & Ault, R. L. (2001). Advanced theory of mind in high-

functioning adults with autism. Journal of Autism and Developmental Disorders,

31(1), 29-36.

Klin, A., & Volkmar, F. (1993). The development of individuals with autism:

Implications for the theory of mind hypothesis. In S. Baron-Cohen, H. Tager-

Flusberg, & D. J. Cohen (Eds.), Understanding other minds: Perspectives from

autism (pp. 317-331). Oxford: Oxford University Press.

Klin, A., Volkmar, F. R., & Sparrow, S. S. (1992). Autistic social dysfunction: Some

limitations of the theory of mind hypothesis. Journal of Child Psychology &

Klin, A., Volkmar, F., Sparrow, S., Cicchetti, D., & Rourke, B. (1995). Validity and

neuropsychological characterization of Asperger syndrome: Convergence with

nonverbal learning disabilities syndrome. Journal of Child Psychology &

Kochanska, G., Murray, K., & Coy, K. C. (1997). Inhibitory control as a contributor to

conscience in childhood: From toddler to early school age. Child Development,

68(2), 263-277.

Koenig, K., Tsatsanis, K. D., & Volkmar, F. R. (2001). Neurobiology and genetics of

autism: A developmental perspective. In J. A. Burack, T. Charman, N. Yirmiya,

& P. R. Zelazo (Eds.), The development of autism: Perspectives from theory and

research (pp. 81-101). Mahwah, NJ: Lawrence Erlbaum Associates.

Krikorian, R., Bartok, J., & Gay, N. (1994). Tower of London procedure: A standard

method and developmental data. Journal of Clinical & Experimental

Landa, R., Piven, J., Wzorek, M. M., Gayle, J. O., Chase, G. A., & Folstein, S. E.

(1992). Social language use in parents of autistic individuals. Psychological

Medicine, 22(1), 245-254.

Lang, B., & Perner, J. (2002). Understanding of intention and false belief and the

development of self-control. British Journal of Developmental Psychology,

20(1), 67-76.

Le Couteur, A., Bailey, A., Goode, S., Pickles, A., Robertson, S., Gottesman, I., &

Rutter, M. (1996). A broader phenotype of autism: The clinical spectrum in

twins. Journal of Child Psychology & Psychiatry & Allied Disciplines, 37(7),

785-801.

Le Couteur, A., Rutter, M., Lord, C., Rios, P., Robertson, S., Holdgrafer, M., &

McLennan, J. D. (1989). Autism Diagnostic Interview: A semi-structured

interview for parents and caregivers of autistic persons. Journal of Autism &

Leboyer, M., Bellivier, F., Nosten-Bertrand, M., Jouvent, R., Pauls, D., & Mallet, J.

(1998). Psychiatric genetics: Search for phenotypes. Trends in Neurosciences,

21(3), 102-105.

Leboyer, M., Plumet, M.-H., Goldblum, M.-C., Perez-Diaz, F., & Marchaland, C.

(1995). Verbal versus visuospatial abilities in relatives of autistic females.

Developmental Neuropsychology, 11(1), 139-155.

Leekam, S. R., & Moore, C. (2001). The development of attention and joint attention in

children with autism. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo

(Eds.), The development of autism: Perspectives from theory and research (pp.

105-129). Mahwah, NJ: Lawrence Erlbaum Associates.

Leekam, S. R., & Perner, J. (1991). Does the autistic child have a metarepresentational

deficit? Cognition, 40(3), 203-218.

Leekam, S. R., & Prior, M. (1994). Can autistic children distinguish lies from jokes? A

second look at second-order belief attribution. Journal of Child Psychology &

Leslie, A. M. (1987). Pretense and representation: The origins of "theory of mind".

Psychological Review, 94(4), 412-426.

Leslie, A. M. (1991). The theory of mind impairment in autism: Evidence for a modular

mechanism of development? In A. Whiten (Ed.), Natural theories of mind:

Evolution, development and simulation of everyday mindreading (pp. 63-78).

Oxford: Blackwell.

Leslie, A. M. (1994a). Pretending and believing: Issues in the theory of ToMM.

Cognition, 50(1-3), 211-238.

Leslie, A. M. (1994b). ToMM, ToBy, and Agency: Core architecture and domain

specificity. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind:

Domain specificity in cognition and culture (pp. 119-148). Cambridge:

Cambridge University Press.

Leslie, A. M., & Frith, U. (1988). Autistic children's understanding of seeing, knowing

and believing. British Journal of Developmental Psychology, 6, 315-324.

Leslie, A. M., & Happé, F. (1989). Autism and ostensive communication: The relevance

of metarepresentation. Development & Psychopathology, 1(3), 205-212.

Leslie, A. M., & Polizzi, P. (1998). Inhibitory processing in the false belief task: Two

conjectures. Developmental Science, 1(2), 247-253.

Leslie, A. M., & Roth, D. (1993). What autism teaches us about metarepresentation. In S.

Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other

minds: Perspectives from autism (pp. 83-111). Oxford: Oxford University Press.

Leslie, A. M., & Thaiss, L. (1992). Domain specificity in conceptual development:

Neuropsychological evidence from autism. Cognition, 43(3), 225-251.

Levin, H. S., Culhane, K. A., Hartmann, J., Evankovich, K., Mattson, A. J., Harward,

H., Ringholz, G., Ewing-Cobbs, L., & Fletcher, J. M. (1991). Developmental

changes in performance on tests of purported frontal lobe functioning.

Levin, H. S., Fletcher, J. M., Kufera, J. A., Lilly, M. A., Mendelsohn, D., Bruce, D., &

Eisenberg, H. M. (1996). Dimensions of cognition measured by the Tower of

London and other cognitive tasks in head-injured children and adolescents.

Levin, H. S., Mendelsohn, D. B., Lilly, M. A., Fletcher, J. M., Culhane, K. A.,

Chapman, S. B., Harward, H., Kusnerik, L., Bruce, D., & Eisenberg, H. M.

(1994). Tower of London performance in relation to Magnetic Resonance

Imaging following closed head injury in children. Neuropsychology, 8(2), 171-

Levine, B., Stuss, D. T., Milberg, W. P., Alexander, M. P., Schwartz, M., &

MacDonald, R. (1998). The effects of focal and diffuse brain damage on strategy

application: Evidence from focal lesions, traumatic brain injury and normal

aging. Journal of the International Neuropsychological Society, 4(3), 247-264.

Lewis, C., & Mitchell, P. (Eds.). (1994). Children's early understanding of mind:

Origins and development. Hove, UK: Lawrence Erlbaum.

Lewis, V., & Boucher, J. (1988). Spontaneous, instructed and elicited play in relatively

able autistic children. British Journal of Developmental Psychology, 6(4), 325-

Lewis, V., & Boucher, J. (1991). Skill, content and generative strategies in autistic

children's drawings. British Journal of Developmental Psychology, 9(3), 393-

Lewis, V., & Boucher, J. (1995). Generativity in the play of young people with autism.

Lezak, M. D. (1993). Newer contributions to the neuropsychological assessment of

executive functions. Journal of Head Trauma Rehabilitation, 8(1), 24-31.

Lezak, M. D. (1995). Neuropsychological assessment. (3rd ed.). New York: Oxford

University Press.

Liss, M., Fein, D., Allen, D., Dunn, M., Feinstein, C., Morris, R., Waterhouse, L., &

Rapin, I. (2001). Executive functioning in high-functioning children with

autism. Journal of Child Psychology & Psychiatry & Allied Disciplines, 42(2),

261-270.

Lockyer, L., & Rutter, M. (1969). A five- to fifteen-year follow-up study of infantile

psychosis: III. Psychological aspects. British Journal of Psychiatry, 115(525),

865-882.

Lockyer, L., & Rutter, M. (1970). A five- to fifteen-year follow-up study of infantile

psychosis: IV. Patterns of cognitive ability. British Journal of Social & Clinical

Psychology, 9(2), 152-163.

Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Leventhal, B. L., DiLavore, P. C.,

Pickles, A., & Rutter, M. (2000). The Autism Diagnostic Observation Schedule-

Generic: A standard measure of social and communication deficits associated

with the spectrum of autism. Journal of Autism & Developmental Disorders,

30(3), 205-223.

Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism Diagnostic Interview-Revised:

A revised version of a diagnostic interview for caregivers of individuals with

possible pervasive developmental disorders. Journal of Autism & Developmental

Disorders, 24(5), 659-685.

Lorr, M. (1994). Cluster analysis: Aims, methods, and problems. In S. Strack & M. Lorr

(Eds.), Differentiating normal and abnormal personality (pp. 179-195). New

York: Springer Publishing Co.

Lough, S., Gregory, C., & Hodges, J. R. (2001). Dissociation of social cognition and

executive function in frontal variant frontotemporal dementia. Neurocase,

7(2,Pt2), 123-130.

Lough, S., & Hodges, J. R. (2002). Measuring and modifying abnormal social cognition

in frontal variant frontotemporal dementia. Journal of Psychosomatic Research,

53(2), 639-646.

Lowe, C., & Rabbitt, P. (1998). Test/re-test reliability of the CANTAB and ISPOCD

neuropsychological batteries: Theoretical and practical issues.

Luciana, M., & Nelson, C. A. (1998). The functional emergence of prefrontally-guided

working memory systems in four- to eight-year-old children. Neuropsychologia,

36(3), 273-293.

Luna, B., Minshew, N., Garver, K., Lazar, N., Thulborn, K., Eddy, W., & Sweeney, J.

(2002). Neocortical system abnormalities in autism: An fMRI study of spatial

working memory. Neurology, 59(6), 834-840.

Luria, A. R. (1966). Higher cortical functions in man. New York: Basic Books.

Macintosh, K. E., & Dissanayake, C. (2004). Annotation: The similarities and

differences between autistic disorder and Asperger's disorder: A review of the

empirical evidence. Journal of Child Psychology and Psychiatry, 45(3), 421-

MacLean, J. E., Szatmari, P., Jones, M. B., Bryson, S. E., Mahoney, W. J., Bartolucci,

G., & Tuff, L. (1999). Familial factors influence level of functioning in

pervasive developmental disorder. Journal of the American Academy of Child &

Adolescent Psychiatry, 38(6), 746-753.

Manly, T., Anderson, V., Nimmo-Smith, I., Turner, A., Watson, P., & Robertson, I. H.

(2001). The differential assessment of children's attention: The Test of Everyday

Attention for Children (TEA-Ch), normative sample and ADHD performance.

Manly, T., Robertson, I. H., Anderson, V., & Nimmo-Smith, I. (1998). The Test of

Everyday Attention for Children (TEA-Ch). Thames Valley Test Company.

Marlowe, W. B. (1992). The impact of a right prefrontal lesion on the developing brain.

Brain & Cognition, 20(1), 205-213.

Maxwell, S. E., & Delaney, H. D. (1990). Designing experiments and analysing data: A

model comparison perspective. Belmont, CA: Wadsworth.

Mayes, L. C., Klin, A., Tercyak, K. P., Cicchetti, D. V., & Cohen, D. J. (1996). Test-

retest reliability for false-belief tasks. Journal of Child Psychology & Psychiatry

Mazza, M., De Risio, A., Surian, L., Roncone, R., & Casacchia, M. (2001). Selective

impairments of theory of mind in people with schizophrenia. Schizophrenia

Research, 47(2-3), 299-308.

McEvoy, R. E., Rogers, S. J., & Pennington, B. F. (1993). Executive function and social

communication deficits in young autistic children. Journal of Child Psychology

& Psychiatry & Allied Disciplines, 34(4), 563-578.

Mega, M. S., & Cummings, J. L. (1994). Frontal-subcortical circuits and

neuropsychiatric disorders. Journal of Neuropsychiatry & Clinical

Neurosciences, 6(4), 358-370.

Mengelberg, A., & Siegert, R. J. (2003). Is theory-of-mind impaired in Parkinson's

disease? Cognitive Neuropsychiatry, 8(3), 191-209.

Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance.

Journal of Abnormal Psychology, 110, 40-48.

Miller, J. N., & Ozonoff, S. (2000). The external validity of Asperger disorder: Lack of

evidence from the domain of neuropsychology. Journal of Abnormal

Psychology, 109(2), 227-238.

Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of

Neurology, 9, 90-100.

Minshew, N. J., Goldstein, G., Muenz, L. R., & Payton, J. B. (1992).

Neuropsychological functioning in nonmentally retarded autistic individuals.

Journal of Clinical & Experimental Neuropsychology, 14(5), 749-761.

Minshew, N. J., Johnson, C., & Luna, B. (2001). The cognitive and neural basis of

autism: A disorder of complex information processing and dysfunction of

neocortical systems. In L. M. Glidden (Ed.), International review of research in

mental retardation: Autism (Vol. 23, pp. 111-138). San Diego, CA: Academic

Press.

Minshew, N. J., Luna, B., & Sweeney, J. A. (1999). Oculomotor evidence for

neocortical systems but not cerebellar dysfunction in autism. Neurology, 52(5),

917-922.

Minter, M., Hobson, R., & Bishop, M. (1998). Congenital visual impairment and 'theory

of mind'. British Journal of Developmental Psychology, 16(2), 183-196.

Minton, J., Campbell, M., Green, W. H., Jennings, S., & Samit, C. (1982). Cognitive

assessment of siblings of autistic children. Journal of the American Academy of

Child Psychiatry, 21(3), 256-261.

Mitchell, P., & Lacohée, H. (1991). Children's early understanding of false belief.

Cognition, 39, 107-127.

Mitchell, P., & Riggs, K. J. (Eds.). (2000). Children's reasoning and the mind. Hove,

UK: Psychology Press.

Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., & Howerter, A. (2000).

The unity and diversity of executive functions and their contributions to

complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology,

41(1), 49-100.

Moore, C., Jarrold, C., Russell, J., Lumb, A., Sapp, F., & MacCallum, F. (1995).

Conflicting desire and the child's theory of mind. Cognitive Development, 10(4),

467-482.

Morton, J., & Frith, U. (1995). Causal modeling: A structural approach to

developmental psychopathology. In D. Cicchetti & D. J. Cohen (Eds.),

Developmental psychopathology (Vol. 1: Theory and methods, pp. 357-390).

Oxford: John Wiley & Sons.

Morton, J., & Frith, U. (2001). Why we need cognition: Cause and developmental

disorder. In E. Dupoux (Ed.), Language, brain, and cognitive development:

Essays in honor of Jacques Mehler (pp. 263-278). Cambridge, MA: MIT Press.

Moses, L. J., & Flavell, J. H. (1990). Inferring false beliefs from actions and reactions.

Child Development, 61(4), 929-945.

Mundy, P. (2003). The neural basis of social impairments in autism: The role of the

dorsal medial-frontal cortex and anterior cingulate system. Journal of Child

Mundy, P., & Neal, A. (2001). Neural plasticity, joint attention, and a transactional

social-orienting model of autism. In L. M. Glidden (Ed.), International review of

research in mental retardation: Autism (Vol. 23, pp. 139-168). San Diego, CA:

Academic Press.

Mundy, P., & Sigman, M. (1989). Specifying the nature of the social impairment in

autism. In G. Dawson (Ed.), Autism: Nature, diagnosis, and treatment (pp. 3-

21). New York: The Guilford Press.

Mundy, P., Sigman, M., & Kasari, C. (1993). The theory of mind and joint-attention

deficits in autism. In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.),

Understanding other minds: Perspectives from autism (pp. 181-203). Oxford: Oxford University Press.

Murphy, M., Bolton, P. F., Pickles, A., Fombonne, E., Piven, J., & Rutter, M. (2000).

Personality traits of the relatives of autistic probands. Psychological Medicine,

30(6), 1411-1424.

Narayan, S., Moyes, B., & Wolff, S. (1990). Family characteristics of autistic children:

A further report. Journal of Autism & Developmental Disorders, 20(4), 523-535.

Norman, D., & Shallice, T. (1980). Attention to action: Willed and automatic control of

behaviour. Center for Human Information Processing (Technical Report No.

Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control

of behaviour. In R. J. Davidson, G. E. Schwartz, & D. Shapiro (Eds.),

Consciousness and self-regulation (Vol. 4). New York: Plenum Press.

Ohnishi, T., Matsuda, H., Hashimoto, T., Kunihiro, T., Nishikawa, M., Uema, T., &

Sasaki, M. (2000). Abnormal regional cerebral blood flow in childhood autism.

Brain, 123(9), 1838-1844.

Olson, D. R. (1989). Making up your mind. Canadian Psychology, 30, 617-627.

Oosterlaan, J., Logan, G. D., & Sergeant, J. A. (1998). Response inhibition in AD/HD,

CD, comorbid AD/HD + CD, anxious, and control children: A meta-analysis of

studies with the stop task. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 39(3), 411-425.

Ornitz, E. M. (1969). Disorders of perception common to early infantile autism and

schizophrenia. Comprehensive Psychiatry, 10(4), 259-274.

Ornitz, E. M. (1988). Autism: A disorder of directed attention. Brain Dysfunction, 1(5-

6), 309-322.

Oswald, D. P., & Ollendick, T. H. (1989). Role taking and social competence in autism

and mental retardation. Journal of Autism & Developmental Disorders, 19(1),

119-127.

Owen, A. M., Downes, J. J., Sahakian, B. J., Polkey, C. E., & Robbins, T. W. (1990).

Planning and spatial working memory following frontal lobe lesions in man.

Owen, A. M., Roberts, A. C., Hodges, J. R., Summers, B. A., Polkey, C. E., & Robbins,

T. W. (1993). Contrasting mechanisms of impaired attentional set-shifting in

patients with frontal lobe damage or Parkinson's disease. Brain, 116, 1159-1175.

Owen, A. M., Roberts, A. C., Polkey, C. E., Sahakian, B. J., & Robbins, T. W. (1991).

Extra-dimensional versus intra-dimensional set shifting performance following

frontal lobe excisions, temporal lobe excisions or amygdalo-hippocampectomy

in man. Neuropsychologia, 29(10), 993-1006.

Ozonoff, S. (1995a). Executive functions in autism. In E. Schopler & G. B. Mesibov

(Eds.), Learning and cognition in autism (pp. 199-219). New York: Plenum

Press.

Ozonoff, S. (1995b). Reliability and validity of the Wisconsin Card Sorting Test in

studies of autism. Neuropsychology, 9(4), 491-500.

Ozonoff, S. (1997a). Causal mechanisms of autism: Unifying perspectives from an

information-processing framework. In D. J. Cohen & F. R. Volkmar (Eds.),

Handbook of autism and pervasive developmental disorders (2nd ed., pp. 868-

879). New York: John Wiley & Sons.

Ozonoff, S. (1997b). Components of executive function in autism and other disorders.

In J. Russell (Ed.), Autism as an executive disorder (pp. 179-211). Oxford: Oxford University Press.

Ozonoff, S. (2001). Advances in the cognitive neuroscience of autism. In C. A. Nelson

& M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp.

537-548). Cambridge, MA: MIT Press.

Ozonoff, S., & Jensen, J. (1999). Brief report: Specific executive function profiles in

three neurodevelopmental disorders. Journal of Autism and Developmental

Disorders, 29(2), 171-177.

Ozonoff, S., & McEvoy, R. E. (1994). A longitudinal study of executive function and

theory of mind development in autism. Development & Psychopathology, 6(3),

415-431.

Ozonoff, S., Pennington, B. F., & Rogers, S. J. (1991). Executive function deficits in

high-functioning autistic individuals: Relationship to theory of mind. Journal of

Ozonoff, S., Rogers, S. J., Farnham, J. M., & Pennington, B. F. (1993). Can standard

measures identify subclinical markers of autism? Journal of Autism &

Ozonoff, S., South, M., & Miller, J. N. (2000). DSM-IV-defined Asperger syndrome:

Cognitive, behavioral and early history differentiation from high-functioning

autism. Autism, 4(1), 29-46.

Ozonoff, S., & Strayer, D. L. (1997). Inhibitory function in nonretarded children with

autism. Journal of Autism and Developmental Disorders, 27(1), 59-77.

Ozonoff, S., & Strayer, D. L. (2001). Further evidence of intact working memory in

autism. Journal of Autism and Developmental Disorders, 31(3), 257-263.

Ozonoff, S., Strayer, D. L., McMahon, W. M., & Filloux, F. (1994). Executive function

abilities in autism and Tourette syndrome: An information processing approach.

Pantelis, C., Barnes, T. R., Nelson, H. E., Tanner, S., Weatherley, L., Owen, A. M., &

Robbins, T. W. (1997). Frontal-striatal cognitive deficits in patients with chronic

schizophrenia. Brain, 120(10), 1823-1843.

Passler, M. A., Isaac, W., & Hynd, G. W. (1985). Neuropsychological development of

behavior attributed to frontal lobe functioning in children. Developmental

Paterson, S., Brown, J., Gsödl, M., Johnson, M., & Karmiloff-Smith, A. (1999).

Cognitive modularity and genetic disorders. Science, 286(5448), 2355-2358.

Pellicano, E., Maybery, M., Durkin, K., & Maley, A. (2004). Weak central coherence in

children with autism: Its relationship to mindreading and executive functioning.

Manuscript submitted for publication.

Pennington, B. F. (1997). Dimensions of executive functions in normal and abnormal

development. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.),

Development of the prefrontal cortex: Evolution, neurobiology, and behavior

Pennington, B. F., Groisser, D., & Welsh, M. C. (1993). Contrasting cognitive deficits

in attention deficit hyperactivity disorder versus reading disability.

Pennington, B. F., & Ozonoff, S. (1991). A neuroscientific perspective on continuity

and discontinuity in developmental psychopathology. In D. Cicchetti & S. L.

Toth (Eds.), Rochester symposium on developmental psychopathology (Vol. 3:

Models and integrations, pp. 117-159). Rochester, NY: University of Rochester

Press.

Pennington, B. F., & Ozonoff, S. (1996). Executive functions and developmental

psychopathology. Journal of Child Psychology and Psychiatry, 37(1), 51-87.

Pennington, B. F., Rogers, S. J., Bennetto, L., McMahon Griffith, E., Reed, D. T., &

Shyu, V. (1997). Validity tests of the executive dysfunction hypothesis of

autism. In J. Russell (Ed.), Autism as an executive disorder (pp. 143-178).

Oxford: Oxford University Press.

Pennington, B. F., & Welsh, M. (1995). Neuropsychology and developmental

psychopathology. In D. Cicchetti & D. J. Cohen (Eds.), Developmental

psychopathology (Vol. 1: Theory and methods, pp. 254-290). Oxford: John

Wiley & Sons.

Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT

Press.

Perner, J. (1993). The theory of mind deficit in autism: Rethinking the

metarepresentation theory. In S. Baron-Cohen, H. Tager-Flusberg, & D. J.

Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 112-

137). Oxford: Oxford University Press.

Perner, J. (1995). The many faces of belief: Reflections on Fodor's and the child's theory

of mind. Cognition, 57(3), 241-269.

Perner, J. (1998). The meta-intentional nature of executive functions and theory of

mind. In P. Carruthers & J. Boucher (Eds.), Language and thought (pp. 270-

283). Cambridge: Cambridge University Press.

Perner, J. (2000). About + belief + counterfactual. In P. Mitchell & K. J. Riggs (Eds.),

Children's reasoning and the mind (pp. 367-401). Hove, UK: Psychology Press.

Perner, J., Baker, S., & Hutton, D. (1994). Prelief: The conceptual origins of belief and

pretence. In C. Lewis & P. Mitchell (Eds.), Children's early understanding of

mind: Origins and development. Hove, UK: Lawrence Erlbaum.

Perner, J., Frith, U., Leslie, A. M., & Leekam, S. R. (1989). Exploration of the autistic

child's theory of mind: Knowledge, belief, and communication. Child

Development, 60(3), 689-700.

Perner, J., Kain, W., & Barchfeld, P. (2002a). Executive control and higher-order theory

of mind in children at risk of ADHD. Infant & Child Development, 11(2), 141-

Perner, J., & Lang, B. (1999). Development of theory of mind and executive control.

Trends in Cognitive Sciences, 3(9), 337-344.

Perner, J., & Lang, B. (2000). Theory of mind and executive function: Is there a

developmental relationship? In S. Baron-Cohen, H. Tager-Flusberg, & D. J.

Cohen (Eds.), Understanding other minds: Perspectives from developmental

cognitive neuroscience (2nd ed., pp. 150-181). London: Oxford University

Press.

Perner, J., & Lang, B. (2002). What causes 3-year-olds' difficulty on the dimensional

change card sorting task? Infant & Child Development, 11(2), 93-105.

Perner, J., Lang, B., & Kloo, D. (2002b). Theory of mind and self-control: More than a

common problem of inhibition. Child Development, 73(3), 752-767.

Perner, J., Leekam, S. R., & Wimmer, H. (1987). Three-year-olds' difficulty with false

belief: The case for a conceptual deficit. British Journal of Developmental

Psychology, 5(2), 125-137.

Perner, J., Ruffman, T., & Leekam, S. R. (1994). Theory of mind is contagious: You

catch it from your sibs. Child Development, 65(4), 1228-1238.

Perner, J., Stummer, S., & Lang, B. (1999). Executive functions and theory of mind:

Cognitive complexity or functional dependence? In P. D. Zelazo, J. W.

Astington, & D. R. Olson (Eds.), Developing theories of intention: Social

understanding and self-control (pp. 133-152). Mahwah, NJ: Lawrence Erlbaum.

Perner, J., & Wimmer, H. (1985). "John thinks that Mary thinks that . . .": Attribution of

second-order beliefs by 5- to 10-year-old children. Journal of Experimental

Child Psychology, 39(3), 437-471.

Peterson, C. C. (2002). Drawing insight from pictures: The development of concepts of

false drawing and false belief in children with deafness, normal hearing, and

autism. Child Development, 73(5), 1442-1459.

Peterson, C. C., & Siegal, M. (1995). Deafness, conversation and theory of mind.

Peterson, D. M., & Bowler, D. M. (2000). Counterfactual reasoning and false belief

understanding in children with autism, children with severe learning difficulties

and children with typical development. Autism, 4(4), 391-405.

Peterson, D. M., & Riggs, K. J. (1999). Adaptive modelling and mindreading. Mind and

Language, 14(1), 80-117.

Phillips, L. H. (1997). Do "frontal tests" measure executive function? Issues of

assessment and evidence from fluency tests. In P. Rabbitt (Ed.), Methodology of

frontal and executive function (pp. 191-213). Hove, UK: Psychology Press.

Phillips, W., Baron-Cohen, S., & Rutter, M. (1998). Understanding intention in normal

development and in autism. British Journal of Developmental Psychology,

16(3), 337-348.

Pickles, A., Bolton, P., Macdonald, H., Bailey, A., Le Couteur, A., Sim, C. H., &

Rutter, M. (1995). Latent-class analysis of recurrence risks for complex

phenotypes with selection and measurement error: A twin and family history

study of autism. American Journal of Human Genetics, 57(3), 717-726.

Pickles, A., Starr, E., Kazak, S., Bolton, P., Papanikolaou, K., Bailey, A., Goodman, R.,

& Rutter, M. (2000). Variable expression of the autism broader phenotype:

Findings from extended pedigrees. Journal of Child Psychology & Psychiatry &

Pilowsky, T., Yirmiya, N., Arbelle, S., & Mozes, T. (2000). Theory of mind abilities of

children with schizophrenia, children with autism, and normally developing

children. Schizophrenia Research, 42(2), 145-155.

Pilowsky, T., Yirmiya, N., Shalev, R. S., & Gross-Tsur, V. (2003). Language abilities

of siblings of children with autism. Journal of Child Psychology & Psychiatry &

Piven, J. (1999). Genetic liability for autism: The behavioural expression in relatives.

International Review of Psychiatry, 11(4), 299-308.

Piven, J., Arndt, S., Bailey, J., Havercamp, S., Andreasen, N. C., & Palmer, P. (1995).

An MRI study of brain size in autism. American Journal of Psychiatry, 152(8),

1145-1149.

Piven, J., Bailey, J., Ranson, B. J., & Arndt, S. (1997a). An MRI study of the corpus

callosum in autism. American Journal of Psychiatry, 154, 1051-1056.

Piven, J., Berthier, M. L., Starkstein, S. E., Nehme, E., Pearlson, G., & Folstein, S.

(1990a). Magnetic resonance imaging evidence for a defect of cerebral cortical

development in autism. American Journal of Psychiatry, 147(6), 734-739.

Piven, J., Chase, G. A., Landa, R., Wzorek, M., Gayle, J., Cloud, D., & Folstein, S.

(1991). Psychiatric disorders in the parents of autistic individuals. Journal of the

American Academy of Child & Adolescent Psychiatry, 30(3), 471-478.

Piven, J., Gayle, J., Chase, G. A., Fink, B., Landa, R., Wzorek, M. M., & Folstein, S. E.

(1990b). A family history study of neuropsychiatric disorders in the adult

siblings of autistic individuals. Journal of the American Academy of Child &

Piven, J., & Palmer, P. (1997). Cognitive deficits in parents from multiple-incidence

autism families. Journal of Child Psychology & Psychiatry & Allied Disciplines,

38(8), 1011-1021.

Piven, J., & Palmer, P. (1999). Psychiatric disorder and the broad autism phenotype:

Evidence from a family study of multiple-incidence autism families. American

Journal of Psychiatry, 156(4), 557-563.

Piven, J., Palmer, P., Jacobi, D., Childress, D., & Arndt, S. (1997b). Broader autism

phenotype: Evidence from a family history study of multiple-incidence autism

families. American Journal of Psychiatry, 154(2), 185-190.

Piven, J., Palmer, P., Landa, R., Santangelo, S., Jacobi, D., & Childress, D. (1997c).

Personality and language characteristics in parents from multiple-incidence

autism families. American Journal of Medical Genetics (Neuropsychiatric

Genetics), 74(4), 398-411.

Piven, J., Wzorek, M., Landa, R., Lainhart, J., Bolton, P., Chase, G. A., & Folstein, S.

(1994). Personality characteristics of the parents of autistic individuals.

Psychological Medicine, 24(3), 783-795.

Plaisted, K. C. (2000). Aspects of autism that theory of mind cannot explain. In S.

Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding other

minds: Perspectives from developmental cognitive neuroscience (2nd ed., pp.

222-250). London: Oxford University Press.

Plaisted, K. C. (2001). Reduced generalization in autism: An alternative to weak central

coherence. In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The

development of autism: Perspectives from theory and research (pp. 149-169).

Mahwah, NJ: Lawrence Erlbaum Associates.

Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? The

Behavioral and Brain Sciences, 1(4), 515-526.

Price, B. H., Daffner, K. R., Stowe, R. M., & Mesulam, M. M. (1990). The

comportmental learning disabilities of early frontal lobe damage. Brain, 113,

1383-1393.

Prior, M., Dahlstrom, B., & Squires, T.-L. (1990). Autistic children's knowledge of

thinking and feeling states in other people. Journal of Child Psychology &

Prior, M., Eisenmajer, R., Leekam, S., Wing, L., Gould, J., Ong, B., & Dowe, D.

(1998). Are there subgroups within the autistic spectrum? A cluster analysis of a

group of children with autistic spectrum disorders. Journal of Child Psychology

& Psychiatry & Allied Disciplines, 39(6), 893-902.

Prior, M., & Hoffmann, W. (1990). Brief report: Neuropsychological testing of autistic

children through an exploration with frontal lobe tests. Journal of Autism &

Pylyshyn, Z. W. (1978). When is attribution of beliefs justified? The Behavioral and

Brain Sciences, 1, 592-593.

Rabbitt, P. (Ed.). (1997). Methodology of frontal and executive function. Hove, UK:

Psychology Press.

Rapin, I. (1997). Classification and causal issues in autism. In D. J. Cohen & F. R.

Volkmar (Eds.), Handbook of autism and pervasive developmental disorders

(2nd ed., pp. 847-867). New York: John Wiley & Sons.

Razani, J., Boone, K., Miller, B. L., Lee, A., & Sherman, D. (2001).

Neuropsychological performance of right- and left-frontotemporal dementia

compared to Alzheimer's disease. Journal of the International

Reed, T., & Peterson, C. (1990). A comparative study of autistic subjects' performance

at two levels of visual and cognitive perspective taking. Journal of Autism and

Developmental Disorders, 20, 555-568.

Reitan, R. M., & Wolfson, D. (1994). A selective and critical review of

neuropsychological deficits and the frontal lobes. Neuropsychology Review,

4(3), 161-198.

Remmel, E. R. (2003). Theory of mind development in signing deaf children.

Unpublished PhD thesis, Stanford University.

Rinehart, N. J., Bradshaw, J. L., Moss, S. A., Brereton, A. V., & Tonge, B. J. (2001). A

deficit in shifting attention present in high-functioning autism but not Asperger's

disorder. Autism, 5(1), 67-80.

Rinehart, N. J., Bradshaw, J. L., Tonge, B. J., Brereton, A. V., & Bellgrove, M. A.

(2002). A neurobehavioral examination of individuals with high-functioning

autism and Asperger disorder using a fronto-striatal model of dysfunction.

Behavioral & Cognitive Neuroscience Reviews, 1(2), 164-177.

Risch, N., Spiker, D., Lotspeich, L., Nouri, N., Hinds, D., Hallmayer, J., et al. (1999). A

genomic screen of autism: evidence for a multilocus etiology. American Journal

of Human Genetics, 65(2), 493-507.

Roberts, R. J., Hager, L. D., & Heron, C. (1994). Prefrontal cognitive processes:

Working memory and inhibition in the antisaccade task. Journal of

Experimental Psychology: General, 123(4), 374-393.

Roberts, R. J., & Pennington, B. F. (1996). An interactive framework for examining

prefrontal cognitive processes. Developmental Neuropsychology, 12(1), 105-

Robinson, E. J., & Beck, S. (2000). What is difficult about counterfactual reasoning? In

P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the mind (pp. 101-

119). Hove, UK: Psychology Press.

Robinson, E., & Mitchell, P. (1995). Masking of children's early understanding of the

representational mind: Backwards explanation versus prediction. Child

Development, 66(4), 1022-1039.

Robinson, E., Riggs, K., & Samuels, J. (1996). Children's memory for drawings based

on a false belief. Developmental Psychology, 32(6), 1056-1064.

Rogers, S. J. (1999). An examination of the imitation deficit in autism. In J. Nadel & G.

Butterworth (Eds.), Imitation in infancy: Cambridge studies in cognitive

perceptual development (pp. 254-283). New York: Cambridge University Press.

Rogers, S. J., & Pennington, B. F. (1991). A theoretical approach to the deficits in

infantile autism. Development & Psychopathology, 3(2), 137-162.

Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park,

CA: Sage.

Roth, D., & Leslie, A. M. (1998). Solving belief problems: Toward a task analysis.

Cognition, 66(1), 1-31.

Rowe, A. D., Bullock, P. R., Polkey, C. E., & Morris, R. G. (2001). 'Theory of mind'

impairments and their relationship to executive functioning following frontal

lobe excisions. Brain, 124(3), 600-616.

Royall, D. R., Lauterbach, E. C., Cummings, J. L., Reeve, A., Rummans, T. A., Kaufer,

D. I., LaFrance, W., & Coffey, C. (2002). Executive control function: A review

of its promise and challenges for clinical research: A report from the committee

on research of the American Neuropsychiatric Association. Journal of

Neuropsychiatry & Clinical Neurosciences, 14(4), 377-405.

Ruffman, T., Perner, J., Naito, M., Parkin, L., & Clements, W. A. (1998). Older (but not

younger) siblings facilitate false belief understanding. Developmental

Psychology, 34(1), 161-174.

Rumsey, J. M. (1985). Conceptual problem-solving in highly verbal, nonretarded

autistic men. Journal of Autism & Developmental Disorders, 15(1), 23-36.

Rumsey, J. M., & Hamburger, S. D. (1988). Neuropsychological findings in high-

functioning men with infantile autism, residual state. Journal of Clinical &

Experimental Neuropsychology, 10(2), 201-221.

Russell, J. (1996). Agency: Its role in mental development. Hove, UK: Lawrence

Erlbaum.

Russell, J. (Ed.). (1997a). Autism as an executive disorder. Oxford: Oxford University

Press.

Russell, J. (1997b). How executive disorders can bring about an inadequate 'theory of

mind'. In J. Russell (Ed.), Autism as an executive disorder (pp. 215-255).

Russell, J., Hala, S., & Hill, E. (2003). The automated windows task: The performance

of preschool children, children with autism, and children with moderate learning

difficulties. Cognitive Development, 18(1), 111-137.

Russell, J., & Hill, E. L. (2001). Action-monitoring and intention reporting in children

with autism. Journal of Child Psychology & Psychiatry & Allied Disciplines,

42(3), 317-328.

Russell, J., Hill, E. L., & Franco, F. (2001). The role of belief veracity in understanding

intentions-in-action: Preschool children's performance on the transparent

intentions task. Cognitive Development, 16(3), 775-792.

Russell, J., & Jarrold, C. (1998). Error-correction problems in autism: Evidence for a

monitoring impairment? Journal of Autism & Developmental Disorders, 28(3),

177-188.

Russell, J., & Jarrold, C. (1999). Memory for actions in children with autism: Self

versus other. Cognitive Neuropsychiatry, 4(4), 303-331.

Russell, J., Jarrold, C., & Henry, L. (1996). Working memory in children with autism

and with moderate learning difficulties. Journal of Child Psychology and

Psychiatry, 37(6), 673-686.

Russell, J., Jarrold, C., & Hood, B. (1999). Two intact executive capacities in children

with autism: Implications for the core executive dysfunctions in the disorder.

Journal of Autism and Developmental Disorders, 29(2), 103-112.

Russell, J., Jarrold, C., & Potel, D. (1994). What makes strategic deception difficult for

children - the deception or the strategy? British Journal of Developmental

Psychology, 12(3), 301-314.

Russell, J., Mauthner, N., Sharpe, S., & Tidswell, T. (1991). The "windows task" as a

measure of strategic deception in preschoolers and autistic subjects. British

Journal of Developmental Psychology, 9(2), 331-349.

Russell, J., Saltmarsh, R., & Hill, E. (1999). What do executive factors contribute to the

failure on false belief tasks by children with autism? Journal of Child

Rutherford, M., & Rogers, S. J. (2003). Cognitive underpinnings of pretend play in

Rutter, M. (1968). Concepts of autism: A review of research. Journal of Child

Rutter, M. (1970). Autistic children: Infancy to adulthood. Seminars in Psychiatry, 2,

435-450.

Rutter, M. (1983). Cognitive deficits in the pathogenesis of autism. Journal of Child

Rutter, M. (2000). Genetic studies of autism: From the 1970s into the millennium.

Journal of Abnormal Child Psychology, 28(1), 3-14.

Saltzman, J., Strauss, E., Hunter, M., & Archibald, S. (2000). Theory of mind and

executive functions in normal human aging and Parkinson's disease. Journal of

the International Neuropsychological Society, 6(7), 781-788.

Saver, J. L., & Damasio, A. R. (1991). Preserved access and processing of social

knowledge in a patient with acquired sociopathy due to ventromedial frontal

damage. Neuropsychologia, 29(12), 1241-1249.

Scheerer, M., Rothmann, E., & Goldstein, K. (1945). A case of "idiot savant": An

experimental study of personality organization. Psychological Monographs, 58,

Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information

processing: I. Detection, search, and attention. Psychological Review, 84(1), 1-

Scholl, B. J., & Leslie, A. M. (1999). Modularity, development and 'theory of mind'.

Mind & Language, 14(1), 131-153.

Scholl, B. J., & Leslie, A. M. (2001). Minds, modules, and meta-analysis. Commentary

on "Meta-analysis of theory-of-mind development: The truth about false belief."

Schwartz, M. L. (1997). Organization and development of callosal connectivity in

prefrontal cortex. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic

(Eds.), Development of the prefrontal cortex: Evolution, neurobiology and

behavior (pp. 49-67). Baltimore, MD: Paul H. Brookes.

Sergeant, J. A., Geurts, H., & Oosterlaan, J. (2002). How specific is a deficit of

executive functioning for attention-deficit/hyperactivity disorder? Behavioural

Brain Research, 130(1-2), 3-28.

Shah, A., & Frith, U. (1983). An islet of ability in autistic children: A research note.

Journal of Child Psychology and Psychiatry, 24(4), 613-620.

Shah, A., & Frith, U. (1993). Why do autistic individuals show superior performance on

the block design task? Journal of Child Psychology & Psychiatry & Allied

Disciplines, 34(8), 1351-1364.

Shallice, T. (1982). Specific impairments in planning. Philosophical Transactions of the

Royal Society of London B, 298, 199-209.

Shallice, T. (1984). More functionally isolable subsystems but fewer "modules"?

Cognition, 17(3), 243-252.

Shallice, T. (1988). From neuropsychology to mental structure. New York: Cambridge

University Press.

Shallice, T. (2002). Fractionation of the supervisory system. In D. T. Stuss & R. T.

Knight (Eds.), Principles of frontal lobe function (pp. 261-277). London: Oxford

University Press.

Shallice, T., & Burgess, P. (1991). Deficits in strategy application after frontal lobe

damage in man. Brain, 114, 727-741.

Shallice, T., & Burgess, P. W. (1996). Domains of supervisory control and the temporal

organisation of behaviour. Philosophical Transactions of the Royal Society of

London B, 351, 1405-1412.

Shallice, T., Marzocchi, G. M., Coser, S., Del Savio, M., Meuter, R. F., & Rumiati, R. I.

(2002). Executive function profile of children with attention deficit hyperactivity

disorder. Developmental Neuropsychology, 21(1), 43-71.

Sherman, M., Nass, R., & Shapiro, T. (1984). Brief report: Regional cerebral blood flow

in autism. Journal of Autism and Developmental Disorders, 14(4), 439-446.

Siegal, M., & Beattie, K. (1991). Where to look first for children's understanding of

false beliefs. Cognition, 38(1), 1-12.

Sigman, M., & Ruskin, E. (1999). Social competence in children with autism, Down

syndrome and developmental delays: A longitudinal study. Monographs of the

Society for Research in Child Development, 64(Serial No. 256).

Silverman, J. M., Smith, C. J., Schmeidler, J., Hollander, E., Lawlor, B. A., Fitzgerald,

M., Buxbaum, J. D., Delaney, K., & Galvin, P. (2002). Symptom domains in

autism and related conditions: Evidence for familiality. American Journal of

Medical Genetics (Neuropsychiatric Genetics), 114(1), 64-73.

Skuse, D. (2001). Endophenotypes and child psychiatry. British Journal of Psychiatry,

178, 395-396.

Skuse, D., James, R., Bishop, D., Coppin, B., Dalton, P., Aamodt-Leeper, G., Bacarese-

Hamilton, M., Creswell, C., McGurk, R., & Jacobs, P. A. (1997). Evidence from

Turner's syndrome of an imprinted X-linked locus affecting cognitive function.

Nature, 387(6634), 705-708.

Slaats-Willemse, D., Swaab-Barneveld, H., de Sonneville, L., van der Meulen, E., &

Buitelaar, J. (2003). Deficient response inhibition as a cognitive endophenotype

of ADHD. Journal of the American Academy of Child & Adolescent Psychiatry,

42(10), 1242-1248.

Smalley, S. L., & Asarnow, R. F. (1990). Brief report: Cognitive subclinical markers in

Smalley, S. L., McCracken, J., & Tanguay, P. (1995). Autism, affective disorders, and

social phobia. American Journal of Medical Genetics (Neuropsychiatric

Genetics), 60, 19-26.

Smith, I. M., & Bryson, S. E. (1994). Imitation and action in autism: A critical review.

Psychological Bulletin, 116(2), 259-273.

Smith, M. L., Klim, P., & Hanley, W. B. (2000). Executive function in school-aged

children with phenylketonuria. Journal of Developmental & Physical

Disabilities, 12(4), 317-332.

Sodian, B., & Frith, U. (1992). Deception and sabotage in autistic, retarded and normal

children. Journal of Child Psychology & Psychiatry & Allied Disciplines, 33(3),

591-605.

Sparrevohn, R., & Howie, P. M. (1995). Theory of mind in children with autistic

disorder: Evidence of developmental progression and the role of verbal ability.

Stahl, L., & Pry, R. (2002). Joint attention and set-shifting in young children with

autism. Autism, 6(4), 383-396.

Starr, E., Berument, S. K., Pickles, A., Tomlins, M., Bailey, A., Papanikolaou, K., &

Rutter, M. (2001). A family genetic study of autism associated with profound

mental retardation. Journal of Autism & Developmental Disorders, 31(1), 89-96.

Steel, J., Gorman, R., & Flexman, J. E. (1984). Neuropsychiatric testing in an autistic

mathematical idiot-savant: Evidence for nonverbal abstract capacity. Journal of

the American Academy of Child Psychiatry, 23(6), 704-707.

Steele, S., Joseph, R. M., & Tager-Flusberg, H. (2003). Brief report: Developmental

change in theory of mind abilities in children with autism. Journal of Autism and

Steffenburg, S., Gillberg, C., Hellgren, L., Andersson, L., Gillberg, I., Jakobsson, G., &

Bohman, M. (1989). A twin study of autism in Denmark, Finland, Iceland,

Norway and Sweden. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 30(3), 405-416.

Stevens, M. C., Fein, D. A., Dunn, M., Allen, D., Waterhouse, L. H., Feinstein, C., &

Rapin, I. (2000). Subgroups of children with autism by cluster analysis: A

longitudinal examination. Journal of the American Academy of Child &

Stone, V. (2000). The role of the frontal lobes and the amygdala in theory of mind. In S.

Stone, V. E., Baron-Cohen, S., & Knight, R. T. (1998). Frontal lobe contributions to

theory of mind. Journal of Cognitive Neuroscience, 10(5), 640-656.

Stuss, D. T., & Alexander, M. P. (2000). Executive functions and the frontal lobes: A

conceptual view. Psychological Research, 63(3-4), 289-298.

Stuss, D. T., & Benson, D. F. (1984). Neuropsychological studies of the frontal lobes.

Psychological Bulletin, 95(1), 3-28.

Stuss, D. T., & Benson, D. F. (1986). The frontal lobes. New York: Raven Press.

Stuss, D. T., Gallup, G. G., Jr., & Alexander, M. P. (2001). The frontal lobes are

necessary for "theory of mind". Brain, 124(2), 279-286.

Stuss, D. T., & Knight, R. T. (Eds.). (2002). Principles of frontal lobe function. London:

Surian, L., & Leslie, A. M. (1999). Competence and performance in false belief

understanding: A comparison of autistic and normal 3-yr-old children. British

Journal of Developmental Psychology, 17(Pt 1), 141-155.

Swettenham, J., Baron-Cohen, S., Charman, T., Cox, A., Baird, G., Drew, A., Rees, L.,

& Wheelwright, S. (1998). The frequency and distribution of spontaneous

attention shifts between social and nonsocial stimuli in autistic, typically

developing, and nonautistic developmentally delayed infants. Journal of Child

Szatmari, P. (1999). Heterogeneity and the genetics of autism. Journal of Psychiatry &

Neuroscience, 24(2), 159-165.

Szatmari, P., Jones, M. B., Tuff, L., Bartolucci, G., Fisman, S., &

Mahoney, W. (1993). Lack of cognitive impairment in first-degree relatives of

children with pervasive developmental disorders. Journal of the American

Academy of Child & Adolescent Psychiatry, 32(6), 1264-1273.

Szatmari, P., Jones, M. B., Zwaigenbaum, L., & MacLean, J. E. (1998). Genetics of

autism: Overview and new directions. Journal of Autism & Developmental

Disorders, 28(5), 351-368.

Szatmari, P., Merette, C., Bryson, S. E., Thivierge, J., Roy, M.-A., Cayer, M., &

Maziade, M. (2002). Quantifying dimensions in autism: A factor-analytic study.

Journal of the American Academy of Child & Adolescent Psychiatry, 41(4), 467-

Szatmari, P., Tuff, L., Finlayson, A. J., & Bartolucci, G. (1990). Asperger's Syndrome

and autism: Neurocognitive aspects. Journal of the American Academy of Child

& Adolescent Psychiatry, 29(1), 130-136.

Tabachnick, B. G., & Fidell, L. S. (1996). Using multivariate statistics (3rd ed.). New

York: HarperCollins College Publishers.

Tager-Flusberg, H. (1992). Autistic children's talk about psychological states: Deficits

in the early acquisition of a theory of mind. Child Development, 63, 161-172.

Tager-Flusberg, H. (Ed.). (1999a). Neurodevelopmental disorders: Developmental

cognitive neuroscience. Cambridge, MA: The MIT Press.

Tager-Flusberg, H. (1999b). A psychological approach to understanding the social and

language impairments in autism. International Review of Psychiatry, 11(4), 325-

Tager-Flusberg, H. (2000). Language and understanding minds: Connections in autism.

In S. Baron-Cohen, H. Tager-Flusberg, & D. J. Cohen (Eds.), Understanding

other minds: Perspectives from developmental cognitive neuroscience (2nd ed.,

pp. 124-149). London: Oxford University Press.

Tager-Flusberg, H. (2001). A reexamination of the theory of mind hypothesis of autism.

In J. A. Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The

development of autism: Perspectives from theory and research (pp. 173-193).

Tager-Flusberg, H., & Joseph, R. M. (2003). Identifying neurocognitive phenotypes in

autism. Philosophical Transactions of the Royal Society of London B,

358(1430), 303-314.

Tager-Flusberg, H., & Sullivan, K. (1994a). Predicting and explaining behavior: A

comparison of autistic, mentally retarded and normal children. Journal of Child

Tager-Flusberg, H., & Sullivan, K. (1994b). A second look at second-order belief

attribution in autism. Journal of Autism & Developmental Disorders, 24(5), 577-

Tager-Flusberg, H., & Sullivan, K. (1995). Attributing mental states to story characters:

A comparison of narratives produced by autistic and mentally retarded

individuals. Applied Psycholinguistics, 16(3), 241-256.

Tager-Flusberg, H., Sullivan, K., & Boshart, J. (1997). Executive functions and

performance on false belief tasks. Developmental Neuropsychology, 13(4), 487-

Teunisse, J.-P., Cools, A. R., van Spaendonck, K. P. M., Aerts, F. H. T. M., & Berger,

H. J. C. (2001). Cognitive styles in high-functioning adolescents with autistic

disorder. Journal of Autism & Developmental Disorders, 31(1), 55-66.

Thatcher, R. W. (1997). Human frontal lobe development: A theory of cyclical cortical

reorganization. In N. A. Krasnegor, G. R. Lyon, & P. S. Goldman-Rakic (Eds.),

Development of the prefrontal cortex: Evolution, neurobiology and behavior

Thomas, M., & Karmiloff-Smith, A. (2002). Are developmental disorders like cases of

adult brain damage? Implications from connectionist modelling. Behavioral &

Brain Sciences, 25(6), 727-787.

Tranel, D. (2002). Emotion, decision making, and the ventromedial prefrontal cortex. In

D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 338-

352). London: Oxford University Press.

Tranel, D., Anderson, S. W., & Benton, A. (1994). Development of the concept of

'executive function' and its relationship to the frontal lobes. In F. Boller & J.

Grafman (Eds.), Handbook of Neuropsychology (Vol. 9, pp. 125-148).

Amsterdam: Elsevier Science.

Turner, M. A. (1996). Repetitive behaviour and cognitive functioning in autism.

Unpublished PhD thesis, University of Cambridge.

Turner, M. (1997). Towards an executive dysfunction account of repetitive behaviour in

autism. In J. Russell (Ed.), Autism as an executive disorder (pp. 57-100).

Turner, M. A. (1999). Generating novel ideas: Fluency performance in high-functioning

and learning disabled individuals with autism. Journal of Child Psychology and

Psychiatry, 40(2), 189-201.

Tyrer, P. (Ed.). (1988). Personality assessment schedule: In personality disorders:

Diagnosis, management, and course. London: Butterworth.

Veale, D. M., Sahakian, B. J., Owen, A. M., & Marks, I. M. (1996). Specific cognitive

deficits in tests sensitive to frontal lobe dysfunction in obsessive-compulsive

disorder. Psychological Medicine, 26, 1261-1269.

Vecchi, T. (1998). Visuo-spatial imagery in congenitally totally blind people. Memory,

6(1), 91-102.

Volkmar, F. R., Lord, C., Bailey, A., Schultz, R. T., & Klin, A. (2004). Autism and

pervasive developmental disorders. Journal of Child Psychology and Psychiatry,

45(1), 135-170.

Volkmar, F. R., Sparrow, S. S., Goudreau, D., Cicchetti, D. V., Paul, R., & Cohen, D. J.

(1987). Social deficits in autism: An operational approach using the Vineland

Adaptive Behavior Scales. Journal of the American Academy of Child &

Wallach, M. A., & Kogan, N. (1965). Modes of thinking in young children. New York:

Holt, Rinehart, & Winston.

Walsh, K. W. (1978). Neuropsychology: A clinical approach. New York: Churchill

Livingstone.

Waltz, J. A., Knowlton, B. J., Holyoak, K. J., Boone, K. B., Mishkin, F. S., de Menezes

Santos, M., Thomas, C. R., & Miller, B. L. (1999). A system for relational

reasoning in human prefrontal cortex. Psychological Science, 10(2), 119-125.

Waterhouse, L., Fein, D., & Modahl, C. (1996). Neurofunctional mechanisms in autism.

Psychological Review, 103(3), 457-489.

Weinberger, D. (2002). Schizophrenia, the prefrontal cortex, and a mechanism of

genetic susceptibility. European Psychiatry, 17(Suppl. 4), 355-362.

Wellman, H. M. (1990). The child's theory of mind. Cambridge, MA: MIT Press.

Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind

development: The truth about false belief. Child Development, 72(3), 655-684.

Wellman, H. M., & Gelman, S. A. (1998). Knowledge acquisition in foundational

domains. In D. Kuhn & R. Siegler (Eds.), Handbook of child psychology:

Cognition, perception and language (5th ed., pp. 523-573). New York: Wiley.

Wellman, H. M., & Lagattuta, K. H. (2000). Developing understandings of mind. In S.

Wellman, H. M., & Woolley, J. D. (1990). From simple desires to ordinary beliefs: The

early development of everyday psychology. Cognition, 35(3), 245-275.

Welsh, M. C., & Pennington, B. F. (1988). Assessing frontal lobe functioning in

children: Views from developmental psychology. Developmental

Welsh, M. C., Pennington, B. F., & Groisser, D. B. (1991). A normative-developmental

study of executive function: A window on prefrontal function in children.

Welsh, M. C., Pennington, B. F., Ozonoff, S., Rouse, B., & McCabe, E. (1990).

Neuropsychology of early-treated phenylketonuria: Specific executive function

deficits. Child Development, 61(6), 1697-1713.

Welsh, M. C., Satterlee-Cartmell, T., & Stine, M. (1999). Towers of Hanoi and London:

Contribution of working memory and inhibition to performance. Brain &

Cognition, 41(2), 231-242.

Whiten, A. (Ed.). (1991). Natural theories of mind: Evolution, development and

simulation of everyday mindreading. Oxford: Blackwell.

Williams, M. A., Moss, S. A., Bradshaw, J. L., & Rinehart, N. J. (2002). Random

number generation in autism. Journal of Autism & Developmental Disorders,

32(1), 43-47.

Wilson, B. A., Evans, J. J., Emslie, H., Alderman, N., & Burgess, P. (1998). The

development of an ecologically valid test for assessing patients with

dysexecutive syndrome. Neuropsychological Rehabilitation, 8(3), 213-228.

Wimmer, H., & Mayringer, H. (1998). False belief understanding in young children:

Explanations do not develop before predictions. International Journal of

Behavioral Development, 22(2), 403-422.

Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining

function of wrong beliefs in young children's understanding of deception.

Cognition, 13(1), 103-128.

Wing, L., & Gould, J. (1979). Severe impairments of social interaction and associated

abnormalities in children: Epidemiology and classification. Journal of Autism

and Developmental Disorders, 9, 11-29.

Wolff, S., Narayan, S., & Moyes, B. (1988). Personality characteristics of parents of

autistic children: A controlled study. Journal of Child Psychology & Psychiatry

Wong, D., Maybery, M., Bishop, D. V. M., Maley, A., & Hallmayer, J. (2004). Profiles

of executive function performance in parents and siblings of individuals with

autism spectrum disorders. Manuscript in preparation.

World Health Organization. (1992). The ICD-10 classification of mental and behavioral

disorders: Clinical descriptions and diagnostic guidelines. Geneva, Switzerland:

Author.

Yirmiya, N., Erel, O., Shaked, M., & Solomonica-Levi, D. (1998). Meta-analyses

comparing theory of mind abilities of individuals with autism, individuals with

mental retardation, and normally developing individuals. Psychological Bulletin,

124(3), 283-307.

Yirmiya, N., & Shulman, C. (1996). Seriation, conservation, and theory of mind

abilities in individuals with autism, individuals with mental retardation, and

normally developing children. Child Development, 67(5), 2045-2059.

Yirmiya, N., Solomonica-Levi, D., Shulman, C., & Pilowsky, T. (1996). Theory of

mind abilities in individuals with autism, Down syndrome, and mental

retardation of unknown etiology: The role of age and intelligence. Journal of

Yonan, A. L., Alarcon, M., Cheng, R., Magnusson, P. K., Spence, S. J., Palmer, A. A.,

Grunn, A., Juo, S. H., Terwilliger, J. D., Liu, J., Cantor, R. M., Geschwind, D.

H., & Gilliam, T. C. (2003). A genomewide screen of 345 families for autism-

susceptibility loci. American Journal of Human Genetics, 73(4), 886-897.

Zelazo, P. D. (2000). Self-reflection and the development of consciously controlled

processing. In P. Mitchell & K. J. Riggs (Eds.), Children's reasoning and the

mind (pp. 169-189). Hove, UK: Psychology Press.

Zelazo, P. D., Burack, J. A., Benedetto, E., & Frye, D. (1996a). Theory of Mind and

rule use in individuals with Down's Syndrome: A test of the uniqueness and

specificity claims. Journal of Child Psychology & Psychiatry & Allied

Disciplines, 37(4), 479-484.

Zelazo, P. D., Burack, J. A., Boseovski, J. J., Jacques, S., & Frye, D. (2001). A

cognitive complexity and control framework for the study of autism. In J. A.

Burack, T. Charman, N. Yirmiya, & P. R. Zelazo (Eds.), The development of

autism: Perspectives from theory and research (pp. 195-217). Mahwah, NJ:

Zelazo, P. D., Carter, A., Reznick, J. S., & Frye, D. (1997). Early development of

executive function: A problem-solving framework. Review of General

Psychology, 1(2), 198-226.

Zelazo, P. D., & Frye, D. (1998). Cognitive complexity and control: II. The

development of executive function in childhood. Current Directions in

Psychological Science, 7(4), 121-126.

Zelazo, P. D., Frye, D., & Rapus, T. (1996b). An age-related dissociation between

knowing rules and using them. Cognitive Development, 11(1), 37-63.

Zelazo, P. D., Jacques, S., Burack, J. A., & Frye, D. (2002). The relation between theory

of mind and rule use: Evidence from persons with autism-spectrum disorders.

Infant & Child Development, 11(2), 171-195.

Zelazo, P. D., & Müller, U. (2002). Executive function in typical and atypical

development. In U. Goswami (Ed.), Blackwell handbook of childhood cognitive

development (pp. 445-469). Malden, MA: Blackwell Publishers.

Zelazo, P. D., & Reznick, J. (1991). Age-related asynchrony of knowledge and action.

Ziatas, K., Durkin, K., & Pratt, C. (1998). Belief term development in children with

autism, Asperger syndrome, specific language impairment, and normal

development: Links to theory of mind development. Journal of Child

Zilbovicius, M., Garreau, B., Samson, Y., Remy, P., Barthelemy, C., Syrota, A., &

Lelord, G. (1995). Delayed maturation of the frontal cortex in childhood autism.

American Journal of Psychiatry, 152(2), 248-252.

APPENDIX A Repetitive Behaviours Interview – Current Version

Instructions: In this interview I will ask you for details about some of the behaviours covered in the repetitive behaviours questionnaire which you would have completed in regard to each of your children. I’ll just be asking you about questions which you answered ‘yes’ to in that questionnaire. I’ll start by asking if [name] currently displays a particular behaviour, and by this I mean a behaviour s/he has displayed once a week or more over the last 3 months. If s/he has, I’d like you to try to describe the behaviour, and I’ll also ask how often he/she shows this behaviour. [I’ll then ask you whether he/she has ever shown this behaviour at least once a week for a period of three months or more. If he/she has shown this behaviour in the past, I’d like you to try to describe the behaviour to me and if possible to tell me at what age the behaviour was most frequent.]* Ask me questions at any time if things don’t seem clear. I am interested in all the repetitive behaviours shown by [name], so please tell me anything that you think may be of any interest. All the information you give me will be confidential. Any queries before we start?

* [These instructions are only for parents of participants who are over the age of 12.]

STEREOTYPED MANIPULATION OF OBJECTS

1. Does [name] currently manipulate objects repetitively in any way? For example, does he/she spin, twiddle, bang, tap, twist, flick or wave objects or other materials repetitively? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never  (1) 1-2x’s per week  (2) 3-6x’s per week  (3) 1-4x’s per day  (4) 5-14x’s per day  (5) 15-29x’s per day  (6) 30+x’s per day  (7) almost constantly  (9) no information
b) HOW LONG DOES IT LAST?
(1) less 60 secs  (2) 1-3 mins  (3) 4-9 mins  (4) 10-29 mins  (5) 30 mins +  (9) no information  (99) not applicable
Describe objects and actions-

2. Does [name] currently operate light switches, taps, the toilet flush etc, repeatedly? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never  (1) 1-2x’s per week  (2) 3-6x’s per week  (3) 1-4x’s per day  (4) 5-14x’s per day  (5) 15-29x’s per day  (6) 30+x’s per day  (7) almost constantly  (9) no information
b) HOW LONG DOES IT LAST?
(1) less 60 secs  (2) 1-3 mins  (3) 4-9 mins  (4) 10-29 mins  (5) 30 mins +  (9) no information  (99) not applicable
Describe actions-

3. Does [name] currently arrange objects in rows or other patterns? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never  (1) 1-2x’s per week  (2) 3-6x’s per week  (3) 1-4x’s per day  (4) 5-14x’s per day  (5) 15-29x’s per day  (6) 30+x’s per day  (7) almost constantly  (9) no information
b) HOW LONG DOES IT LAST?
(1) less 60 secs  (2) 1-3 mins  (3) 4-9 mins  (4) 10-29 mins  (5) 30 mins +  (9) no information  (99) not applicable
c) DOES [NAME] ALWAYS LINE UP THE SAME OBJECTS IN THE SAME ORDER?
(1) different objects and different order  (2) same objects and different order  (3) same objects and same order  (9) no information  (99) not applicable
d) DOES [NAME] SEEM TO NOTICE INSTANTLY IF AN OBJECT IS MISSING OR MOVED?
(1) no  (2) frequently  (3) always  (9) no information  (99) not applicable
e) DOES [NAME] OBJECT IF THESE ROWS OR PATTERNS ARE MOVED OR PACKED AWAY?
(1) no  (2) frequently  (3) always  (9) no information  (99) not applicable
Describe objects and arrangements-

4. Does [name] currently mouth or suck objects or parts of him/herself repeatedly? For example, does he/she mouth or suck his/her fingers, a favourite object, his/her shirt collar or the like? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never  (1) 1-2x’s per week  (2) 3-6x’s per week  (3) 1-4x’s per day  (4) 5-14x’s per day  (5) 15-29x’s per day  (6) 30+x’s per day  (7) almost constantly  (9) no information
b) HOW LONG DOES IT LAST?
(1) less 60 secs  (2) 1-3 mins  (3) 4-9 mins  (4) 10-29 mins  (5) 30 mins +  (9) no information  (99) not applicable
Describe objects or body parts-

5. Does [name] currently stare closely at objects or his/her body parts? For example, does he/she stare at lights, spinning objects, a certain toy, his/her fingers etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS?
(0) never  (1) 1-2x’s per week  (2) 3-6x’s per week  (3) 1-4x’s per day  (4) 5-14x’s per day  (5) 15-29x’s per day  (6) 30+x’s per day  (7) almost constantly  (9) no information
b) HOW LONG DOES IT LAST?
(1) less 60 secs  (2) 1-3 mins  (3) 4-9 mins  (4) 10-29 mins  (5) 30 mins +  (9) no information  (99) not applicable
Describe objects or body parts-

6. Does [name] currently obsessively collect or hoard items of any sort? Has s/he ever?
(0) no obsessive, or unusually keen, collecting or hoarding
(1) very keen collector of usual items (eg. stamps, football cards etc.)
(2) hoards unusual or odd items (eg. leaflets, jar lids, sticks etc.), irregularly or on occasion and is reticent to throw anything that has been collected away.
(3) hoards unusual or odd items on a very regular basis, which, because of the volume of items hoarded, leads to regular difficulties and conflicts
(9) no information
Details-

STEREOTYPED MOVEMENTS

7. Does [name] currently pace or move around repetitively? For example, does he/she walk to and fro across a room or around the house or garden repetitively? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movement, route and location-

8. Does [name] currently often spin him/herself around and around? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movement-

9. Does [name] currently rock rhythmically backwards and forwards, or side to side, either when sitting or when standing? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe whether sitting or standing-

10. Does [name] currently touch parts of his/her body or clothing repeatedly? For example, does he/she repeatedly rub his/her legs, pull at the buttons on his/her clothing, or touch his/her ear or elbow etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe action and body part or clothing-

11. Does [name] currently make repetitive arm, hand and/or finger movements? For example, does he/she repetitively wave, flick, flap or twiddle his/her hands or fingers? Does he/she repetitively clap or clasp his/her hands? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements and whether this occurs near his/her eyes-

12. Does [name] currently make any repetitive movements with his/her feet or legs? For example, does he/she repetitively tap his/her feet, swing his/her legs or jump etc.? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

TIC-LIKE BEHAVIOURS

13. Does [name] currently make any particular words, noises etc. that he/she uses repeatedly? For example, does he/she repeat single words or nonsense words? Or other sounds such as hums, growls, clicking of the tongue, or clearing the throat? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe words, noises etc-

14. Does [name] currently make any repetitive head or neck movements? For example, does he/she nod or shake his/her head repetitively, or show any jerky tic-like movements? Or does he/she show other repetitive movements of the face muscles such as raising eyebrows or moving the muscles around the lips? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

15. Does [name] currently make any repetitive eye movements? For example, does he/she blink, roll or move his/her eyes repeatedly? Has s/he ever?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

16. Does [name] currently make any repetitive mouth and/or tongue movements? For example, does he/she grind his/her teeth, smack his/her lips, or make sucking movements repetitively? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe movements-

SELF-INJURIOUS BEHAVIOUR

17. Does [name] currently bang his/her head? Does he/she do this repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe what head is banged against-

18. Does [name] currently ever injure himself/herself? For example, does he/she bite, scratch, knock or pick at himself/herself? Does he/she do this repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable

GENERAL

19. Has [name] always shown one or more of these behaviours, or have there been periods when he/she hasn't shown any repetitive behaviours for 3 months or more?
(1) at times has shown no repetitive behaviours for 3 months or more
(2) has always shown one or more behaviours
(3) has always shown at least one repetitive activity
(9) no information
(99) not applicable- items 1-18 all received a (0) rating
Details of time periods-

COMPULSIVE BEHAVIOURS

20. Cleaning/Washing Compulsions: Does [name] currently wash his/her hands, shower, bathe or groom himself/herself, more than is necessary? Is he/she overly concerned about dirt and contamination, or take measures to prevent contact with contaminants? Does s/he clean household items or other objects excessively? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- washes hands at appropriate times (e.g. at meal times, after using the toilet), but does not consistently wash at inappropriate times. Is not unusually concerned about dirt or contamination.
(1) suspicious or mild obsessive or compulsive behaviour- washes hands 10-14 times a day
(2) clear obsessive or compulsive behaviour- washes hands 15+ times a day, or is preoccupied with worry about dirt and contamination
(9) no information
b) IS THIS WASHING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way) (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe cleaning/washing behaviour-

21. Checking Compulsions: Does [name] currently often check repeatedly that things are switched off, locked up or put away etc? Does s/he check other things like that nothing bad has happened, or that s/he did not make a mistake? Does he/she check these things more often than is necessary? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- may check that an item has been switched off etc. once, but is not preoccupied with whether or not items have been checked
(1) suspicious or mild obsessive or compulsive behaviour- checks that one or more items have been turned off etc. on two separate occasions on a daily basis
(2) clear obsessive or compulsive behaviour- checks that one or more items have been switched off etc. on at least three separate occasions on a daily basis, or is preoccupied with items being safely handled in order to avert disaster
(9) no information
b) IS THIS CHECKING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way) (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe checking and items checked-

22. Repeating Rituals: Does [name] currently perform any rituals where s/he has to keep repeating a certain action? For example, does s/he reread or rewrite excessively, or repeat routine activities such as going in and out of a door or getting up and down from a chair?
(0) no obsessive or compulsive behaviour of this type
(1) suspicious or mild obsessive or compulsive behaviour- performs repeating routine 3-10 times a day
(2) clear obsessive or compulsive behaviour- performs routine 10+ times a day
(9) no information
b) IS THIS REPEATING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way) (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe repeating ritual-

23. Counting Compulsions: Does [name] currently count objects repeatedly? Does s/he perform any rituals which involve counting? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type- may count money or other objects but not excessively or inappropriately
(1) suspicious or mild obsessive or compulsive behaviour- counts objects inappropriately less than 5 times a day
(2) clear obsessive or compulsive behaviour- counts objects more than 5 times per day
(9) no information
b) IS THIS COUNTING BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way) (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe counting behaviour-

24. Does [name] currently engage in any other compulsive behaviours? For example, does s/he write lists excessively? Does s/he repeatedly touch, tap or rub certain things? Any other superstitious behaviours? Has s/he in the past?
(0) no obsessive or compulsive behaviour of this type
(1) suspicious or mild obsessive or compulsive behaviour
(2) clear obsessive or compulsive behaviour
(9) no information
b) IS THIS COMPULSIVE BEHAVIOUR CARRIED OUT IN A RITUALISED FASHION? (i.e. is it always carried out in the same order or in the same way) (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe compulsive behaviour-

25. How much time do you think s/he spends on these compulsive behaviours per day?
(1) 0-1 hrs/day (2) 1-3 hrs/day (3) 3-8 hrs/day (4) >8 hrs/day (9) no information (99) not applicable

OBJECT ATTACHMENTS

26. Is [name] currently attached to any particular objects? For example, does he/she carry a teddy, a blanket or a stick etc. around with him/her? Does he/she want to sleep with this item? Does he/she become distressed if it is lost or forgotten? Has s/he in the past? [In order to be considered an object attachment the individual must insist on sleeping with the item, or must carry it with him/her at specific times or in specific situations (e.g. whenever out of the house). The individual should also be concerned or distressed if the item is mislaid.]
(0) no attachments to objects
(1) attachments to objects which are commonly used as comforters (e.g. teddies, blankets etc.)
(2) attachments to unusual objects or junk materials (e.g. sticks, tins etc.). Rate here even if unusual attachments co-exist with more usual object attachments
(9) no information
b) [If score on the previous item is (1) or (2), then also complete the following item]
(1) insists that the object must be in bed every night, but only when the individual is at home
(2) insists that the object must be in bed every night whether the individual is at home or away
(3) insists that the object must be with the individual at times other than when tired or sleeping
(9) no information
Describe objects-

G. INSISTENCE ON SAMENESS OF ENVIRONMENT

27. Does [name] currently insist on things about the house staying the same? For example, does he/she insist on furniture staying in the same place, or curtains being open or closed etc.?
(0) no fixed insistence on furniture, ornaments etc. remaining in the same places simply because he/she doesn't like things to be moved
(1) any relatively inflexible example which does not impact on other family members daily, as it primarily concerns items that belong to, or are used by, the individual only, or if this is not the case, he/she is able to tolerate alterations when others are present
(2) any pervasive example which is very rigid and impacts on the other members of the family on a daily basis (e.g. having to have lounge furniture organised in a particular way, or insisting that everybody's bedroom door must be closed etc, at all times)
(9) no information
Describe items and location-

28. Does [name] currently insist on other items being put out, kept or stored in the same way? For example, does he/she like ornaments, toys or cassette tapes kept in the same places or positions? Has s/he in the past?
(0) no fixed insistence that items must be stored in the same places or the same way
(1) any example which does not interfere with other family members on a daily basis, as it primarily concerns the individual's own personal possessions, although it may be very inflexible (e.g. the arrangement of personal toiletries- he/she will not tolerate others moving them, even when cleaning the bathroom).
(2) any pervasive example which is very rigid and impacts on the other members of the family daily (e.g. insisting that a family video collection must always be stored in precisely the same way.)
(9) no information
Describe items and location-

29. Is there anything else that [name] currently likes to remain just so? Has s/he in the past?
(0) no
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has only a limited impact on the family (i.e. does not impact on the remainder of the family on a daily basis)
(2) yes, any pervasive example which is highly rigid and impacts on the other family members on a daily basis
(9) no information
Describe-

30. Does [name] currently play the same music, game or video, or read the same book repeatedly? Has s/he in the past?
(0) does not have any music, games, videos or books that s/he uses more than normal
(1) plays the same music, game or video or reads the same book (excepting continuing on with a novel) at least once a day
(2) plays the same music, game or video or reads the same book (excepting continuing on with a novel) at least three times a day and prevention or interruption of this activity causes a marked negative reaction
(9) no information
Describe the book, game or music-

31. Does [name] currently insist on using the same objects or items in any other situation? For example, does he/she insist on using the same chair, plate, bed linen or door? Has s/he in the past? [Do not rate insistence on using the same mug or cup]
(0) no fixed insistence on always using precisely the same items (excepting a mug or cup) in any situation- will generally use any item that he/she is given or the first item that is available
(1) any example which is unusually restricted or fixed, but can generally be modified if it is important to do so (e.g. if the item is in the dishwasher, if someone else is using it)
(2) any pervasive example which is very rigid and leads to regular confrontations with others, or requires extra effort on the part of the individual or others (e.g. insisting on using a certain plate etc. even if it is dirty or someone else is using it), on a regular or daily basis
(9) no information
Describe item and situation-

32. Does [name] currently insist on wearing the same clothes or refuse to wear new clothes? Has s/he in the past?
(0) no insistence on wearing the same items of clothes- wears a range of different items and is keen to have new clothes
(1) insists on wearing the same item of clothing (e.g. jumper, trousers), in most situations, including frequently when it is inappropriate. Or refuses, or shows marked reticence, to wear new clothes. Will wear alternative clothing for at least certain, or special, occasions if prompted.
(2) insists on wearing the same (or substantially the same) outfit most or all of the time so that it is difficult for this outfit to be washed and any deviation from this usual outfit causes an extreme negative reaction.
(9) no information
Describe clothing-

33. Does [name] currently insist that certain items of clothing must always be worn, or worn in the same situation or in the same way? For example, does he/she insist on always wearing a vest, or wearing a hat to the shops, or always buttoning a shirt to the collar? Has s/he in the past?
(0) no unusually fixed ways of wearing clothes- will modify clothing and the way in which it is worn etc. as appropriate (e.g. will take off coat if hot, or if wet or dirty etc.)
(1) consistently dresses in the same fixed manner, or wears the same clothes in the same situations, in a manner that is odd or unusual, but can modify this behaviour if it is necessary or important to do so (e.g. generally wears tops done up and with the hood up, but will undo this if it is hot etc.)
(2) has very fixed ways of wearing clothes, or always wears the same clothes in the same situations, and this is adhered to strictly even when it is very odd and impractical (e.g. always wears hat to the shops, always wears a coat outside irrespective of the weather)
(9) no information
Describe clothing and situation-

34. Does [name] currently insist on eating the same foods, or a very small range of foods, at every meal? Has s/he in the past?
(0) eats a range of foods, although there may be a limited number of foods that he/she doesn't like to eat
(1) eats a limited range of foods and it is regularly the case that he/she will eat a different meal to the rest of the family- will not try new foods
(2) eats fewer than five separate food types
(9) no information
Describe foods-

35. How does [name] respond if you introduce him/her to a new activity or place? Would he/she have any objection to trying something new and different? Would he/she be anxious? [Rate usual, or most common, reaction]
(0) participates/will visit without hesitation
(1) will be persuaded, but shows some reticence because the activity/place is new or different
(2) refuses to take part in anything new or different
(3) shows a high degree of stereotyped behaviour when trying something new
(9) no information
Describe reaction-

RIGID ADHERENCE TO ROUTINES AND RITUALS

36. Are there any aspects of routine that [name] currently insists must remain the same? For example, does he/she insist on always bathing before breakfast, on going to the shops every afternoon, or on watching a video after every meal? Has s/he in the past?
(0) has no rigid routine- preferred routines can be modified if it is necessary or appropriate to do so
(1) has a set routine which is inflexible and consistently impacts on other family members because he/she is unable to take "shortcuts" in his/her routine (e.g. the individual is unable to finish early in the bathroom if someone needs it, or take their walk on another day if a family outing is planned etc.)
(2) has a very fixed or inflexible routine which involves not just the self but also other family members and so has a substantial impact on the family (e.g. expects everyone to go swimming on a Saturday morning and is upset if this routine is violated)
(9) no information
Describe routine-

37. Does [name] currently make rituals out of everyday activities such as eating, dressing, getting in the car, walking up stairs etc.? Are these activities always carried out in exactly the same way? Has s/he in the past?
(0) has no regular rituals or set ways of doing things- preferred ways of doing things can be modified if it is appropriate to do so (e.g. may always put socks on first, but if no clean socks are available will put on other items of clothing first)
(1) has set rituals, which are inflexible and impact on other family members to some degree because the individual is unable to modify these rituals when it is important to do so. These rituals concern the individual only and are not excessively time-consuming. They do not incorporate unnecessary or redundant steps and actions.
(2) has very elaborate and inflexible rituals which may, or may not, involve others, but take considerable time (i.e. take significantly more time than the same activity would take in non-ritualised fashion) and cannot be abbreviated. These rituals affect all family members because of the large amounts of time taken up with these rituals on a daily basis (e.g. having to check that everybody's seat belt is fastened and that the glove box contains certain items before setting out on any car journey, no matter how short.)
(9) no information
Describe activity and precise ritual-

38. Does [name] currently have any rituals that are linked to particular occasions or places? For example, does he/she have specific rituals for the supermarket, the Doctor's surgery or a relative's house? Has s/he in the past?
(0) has no fixed rituals for particular places or occasions- preferred ways of doing things can be modified if it is appropriate to do so (e.g. if in a hurry, if the weather is not appropriate etc.)
(1) has certain fixed activities or rituals that he/she insists on at particular occasions or particular places. These rituals concern the individual only and have minimal impact on the remainder of the family (e.g. always rides on the swings in the same fixed order, or always orders the same food in a cafe)
(2) has one or more very fixed and inflexible rituals which have a severe impact on the family as they are highly intrusive or involve other family members (e.g. must always enter certain shops in certain order when shopping)
(9) no information
Describe ritual and occasion or place-

39. Does [name] currently insist on moving or travelling by the same route? For example, does he/she insist on taking the same route when moving about the house, going for a walk, or travelling in the car? Has s/he in the past?
(0) has no set route for moving or travelling- preferred ways of doing things can be modified if it is appropriate to do so
(1) has a set route that he/she will always take to one or more specific locations if on his/her own or if given the choice. Finds it very difficult to accept deviations from this, but will accept an alternative if there is a good reason for doing so.
(2) will take only one route to at least one specific destination and will not tolerate any deviation from this, no matter what the need or justification for the change is.
(9) no information
Describe mode of travelling and journey-

40. Is there anything else that [name] currently likes to be done in a certain way, or at a certain time? Has s/he in the past?
(0) no
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has only a limited impact on the family (i.e. does not impact on the remainder of the family on a daily basis)
(2) yes, any pervasive example which is highly rigid and impacts on the other family members on a daily basis
(9) no information
Describe-

41. Does [name] currently incorporate any unnecessary, or unusual, behaviours as part of any rituals or routines? For example, does he/she tap the plate after every mouthful when eating, or touch specific objects when walking through a room? Has s/he in the past?
(0) no unnecessary, idiosyncratic behaviours incorporated in routines
(1) yes, any relatively inflexible example which is consistently observed by the individual, but has a limited impact on the family- he/she can refrain from the behaviour when asked to do so for at least 10 minutes
(2) yes, any pervasive and unusual example, which is very rigid and is observed by the individual at all times. He/she is unable (or unwilling) to suppress this behaviour.
(9) no information
Describe ritual or routine and unnecessary activity-

REPETITIVE USE OF LANGUAGE

42. Does [name] currently mimic others or repeat speech? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) REPEAT WHAT IS SAID IMMEDIATELY AFTER IT IS SAID OR, (B) REPEAT WHAT HAS BEEN SAID SOME TIME AFTER IT HAS BEEN SAID? (1) A (2) B (3) combination of A and B (9) no information (99) not applicable
Describe the type of speech repeated-

Items 43-45 inclusive specifically address spontaneous language and exclude echolalia, or language that is copied from other sources. If [name] does not have at least good phrase speech, skip items 43-45 (and score (99), not applicable).

43. Does [name] currently say the same things, or sing the same songs, repeatedly? For example, does [name] recite the same thing over and over, or have stock phrases that he/she often uses? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) SAY THE SAME THING OVER AND OVER AGAIN AT ONE POINT IN TIME OR, (B) SAY THE SAME THING AT DIFFERENT TIMES? (1) A (2) B (3) combination of A and B (9) no information (99) not applicable
Describe sentences or songs-

44. Does [name] currently ask the same questions repeatedly? Has s/he in the past?
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
c) DOES [NAME] (A) SAY THE SAME THING OVER AND OVER AGAIN AT ONE POINT IN TIME OR, (B) SAY THE SAME THING AT DIFFERENT TIMES? (1) A (2) B (3) combination of A and B (9) no information (99) not applicable
d) DOES HE/SHE DEMAND THAT OTHERS ALWAYS GIVE THE SAME ANSWERS? (1) no (2) frequently (3) always (9) no information (99) not applicable
Describe questions-

45. Does [name] currently talk about the same topic over and over again? Has s/he in the past? [Rate only repeated attempts to raise the same topic in conversation. These attempts may incorporate some echoed speech, but must also include spontaneous speech and attempts to talk around the topic.]
a) HOW OFTEN DOES HE/SHE DO THIS? (0) never (1) 1-2x’s per week (2) 3-6x’s per week (3) 1-4x’s per day (4) 5-14x’s per day (5) 15-29x’s per day (6) 30+x’s per day (7) almost constantly (9) no information
b) HOW LONG DOES IT LAST? (1) less than 60 secs (2) 1-3 mins (3) 4-9 mins (4) 10-29 mins (5) 30 mins + (9) no information (99) not applicable
Describe topic and whether it’s based on fantasy or reality-

CIRCUMSCRIBED INTERESTS

46. Does [name] have any unusual preoccupations? Does he/she regularly talk about and seek out a particular type of object? Has s/he ever?
(0) no preoccupations or preoccupation with objects that are common in their age group and not to the exclusion of other interests or activities
(1) preoccupation with items common in their age group but to such a degree that it significantly limits involvement in other interests or activities
(2) preoccupation with unusual items
(9) no information
How long has [name] been preoccupied with [interest]?
Please describe the preoccupation-

47a. Does [name] have any particular interests? Is there anything unusual about this interest? Would you describe this interest as particularly keen or obsessional? Does he/she pursue this interest to the exclusion of other interests and hobbies? What other interests and hobbies does [name] have? Has s/he ever had any unusual or obsessional interests?
(0) usual topic of hobby or interest (e.g. computers or football teams)- casual to keen interest
(1) usual topic of hobby or interest (e.g. computers or football teams)- abnormally keen or obsessional interest OR mildly unusual topic of hobby or interest (e.g. road maps or record covers)- casual to keen interest
(2) mildly unusual topic of hobby or interest (e.g. road maps or record covers)- abnormally keen or obsessional interest
(3) abnormally keen or obsessional interest in highly unusual topic of hobby or interest (e.g. DIY tools or street lamps)- abnormally keen or obsessional interest
(9) no information
How long has [name] had this particular interest(s)?
Please describe the interest(s)-

47b. Summary Rating
(0) has a varied pattern of interests, which are pursued meaningfully.
(1) one or more abnormally keen or highly circumscribed interests, but also more usual interests which are pursued meaningfully.
(2) has only obsessional interests which are either pursued to an abnormally keen extent, or are highly circumscribed in nature
(3) has no particular interests or hobbies that he/she will pursue spontaneously (DO NOT RATE WATCHING TELEVISION)
(9) no information

47c. How is this interest or hobby manifested?
(0) usual manifestation of interest- collecting, sorting, reading, playing/using relevant materials
(1) mildly unusual or idiosyncratic manifestation of interest- odd or unusual activity
(2) highly unusual or idiosyncratic manifestation of interest- highly stereotyped or ritualised activity
(9) not applicable- item received a (0) rating
Please describe how it is manifest-

GENERAL ITEMS
Skip items 48-52 inclusive (and score (99), not applicable), if all interview items have received a (0) rating.

48a. Does [name] ever make any attempt to cover up, hide or change any of the behaviours you have described? For example, does he/she leave the room to engage in repetitive activities, or does he/she suppress them if he/she knows that other people are watching? Has s/he in the past?
(0) never
(1) occasionally- but not at specific or predictable times
(2) most often- but not at specific or predictable times
(3) at all times
(4) only, or mainly, when calm and relaxed
(5) only, or mainly, at school
(6) only, or mainly, with new people or in social situations (excluding solely school)
(7) only, or mainly, when likely to be reprimanded
(8) at other times
(9) no information
(99) not applicable- all interview items received a (0) rating
Describe the way in which the behaviour has been covered up-

48b. Which behaviours? [Rate the category that the behaviour that s/he attempts to cover belongs to.]
(1) repetitive movements – (a) stereotypies (b) repetitive use of objects (c) tic like movements (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
Briefly describe the behaviour-

49. Have you, or anyone else, ever made any attempt to reduce any of the behaviours shown by [name] that we have talked about?
(1) no
(2) yes, at different times
(3) yes, continually and consistently
(9) no information
(99) not applicable - all interview items received a (0) rating

50. What was the earliest repetitive activity that you remember [name] showing? How old was he/she when this began? [Rate the category that this activity belongs to and the age at which it began.]
(1) repetitive movements – (a) stereotypies (b) repetitive use of objects (c) tic like movements (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating

[The following two items apply only to repetitive activities which have been evident during the last three months.]

51a. Of the repetitive behaviours and rituals and special interests that we have discussed, which one would you say is the most marked or the most noticeable? [Rate the category that this activity belongs to.]
(1) repetitive movements – (a) stereotypies (b) repetitive use of objects (c) tic like movements (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
b. Which would come second?
c. Which would you think comes third?

52a. Of all of the repetitive behaviours and rituals and special interests etc. that we have talked about, which one would you say causes the greatest problem in day-to-day life?
(1) repetitive movements – (a) stereotypies (b) repetitive use of objects (c) tic like movements (d) self injurious behaviours
(2) object attachments
(3) insistence on sameness of environment
(4) insistence on sameness of activity or item
(5) adherence to routine and rituals
(6) repetitive use of language
(7) circumscribed interests
(8) compulsive behaviours
(9) no information
(99) not applicable- all interview items received a (0) rating
b. Which would you think comes second?
c. Which would you think comes third?

APPENDIX B Correlations between EF task variables in the control group (Study One)

Table B1. Raw correlations between EF variables in the control group (N.B.: Intra-domain correlations are depicted in bold)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1
2 .30*
3 .05 .22
4 .11 .17 .04
5 -.07 -.20 -.42* -.26
6 .11 -.06 .18 .10 -.03
7 .02 -.19 .79** .59**
8 .01 .05 .06 .04 -.06 -.07 -.09
9 -.04 .10 .24 .16 .15 -.11 .06 -.07
10 .14 .14 .28 -.02 -.22 -.19 .05 .45**
11 -.40** -.16 -.25 .19 .04 -.18 -.09 -.33 -.14 -.34*
12 -.11 -.09 -.17 -.16 .16 .12 .20 -.49** -.14 -.19 .29*
13 .35* .15 -.02 -.28 -.02 .09 .04 .04 -.05 .11 -.51** .18
14 -.44** -.32* .20 -.01 -.24 .12 -.12 -.31 -.24 -.14 .47** .45** -.25
15 .29* .17 -.24 -.32 .26 -.04 .19 -.11 .02 .27 -.36* .33* .48** -.45**
16 -.25 -.14 -.42* .12 .08 .11 .13 .07 -.03 -.13 .08 -.02 -.13 .18 -.12
17 -.23 -.40** -.40* -.08 .14 -.01 .10 -.23 -.05 -.22 .30* .24 -.21 .50** -.28 .51**
18 .05 .27 .27 .13 -.73** -.34 -.78** .15 .03 .34* -.02 -.13 -.01 .04 -.07 -.12 -.10
19 .11 .15 .13 -.34 .42* -.38* .02 -.11 .13 .07 .02 -.11 -.09 -.13 -.10 -.53** -.28 a

1 = ToL adjusted extra moves score; 2 = ToL rule violations; 3 = IDED set-shifting task Perseveration Condition EDS stage errors; 4 = IDED set-shifting task Learned Irrelevance Condition EDS stage errors; 5 = RIL task inhibition error difference score; 6 = RIL task load error difference score; 7 = RIL task inhibition + load error difference score; 8 = RIL task shape error score; 9 = Opposite Worlds error difference score; 10 = Opposite Worlds time difference score; 11 = Relational Complexity total score; 12 = Pattern Meanings correct responses; 13 = Pattern Meanings sum of errors; 14 = Uses of Objects correct responses; 15 = Uses of Objects sum of errors; 16 = Stamps task complexity score; 17 = Stamps task originality score; 18 = Stamps task restriction score; 19 = Stamps task rule adherence score.
* p < .05; ** p < .01. All tests were two-tailed.
a = Correlation could not be computed because one of the variables was constant.
Note: The RIL task RT difference scores are not included in this table for the sake of brevity.

Table B2. Partial correlations between EF variables in the control group (N.B.: Intra-domain correlations are depicted in bold)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1
2 .14
3 .03 .18
4 .15 .21 .07
5 -.11 -.23 -.50** -.22
6 .11 -.07 .19 .09 .0
7 -.02 -.22 -.26 .65**
8 -.28 -.15 .02 .03 .03 -.10 -.04
9 .01 .10 .22 .17 .21 -.10 .10 -.09
10 .04 .03 .24 .06 -.27 -.22 -.35 -.01 .50**
11 -.01 .21 -.26 .24 .08 -.24 -.09 -.04 -.23 -.23
12 -.02 .01 -.14 -.17 .16 .12 .20 -.44* -.12 -.14 .21
13 .05 -.07 -.08 -.31 -.07 .09 .0 -.24 -.03 -.04 -.18 .36*
14 -.20 -.11 .33 -.03 -.37* .18 -.17 -.07 -.31 .0 .06 .41** .10
15 .07 .0 -.32 -.34 .32 -.06 .21 -.36* .03 .20 -.07 .49** .31* -.26
16 -.19 -.10 -.43* .10 .18 .11 .21 .12 -.06 -.05 -.05 -.04 -.03 .12 -.05
17 .0 .0 -.38* -.11 .16 -.01 .12 -.03 -.04 -.11 -.09 .14 .05 .31* -.11 .53**
18 .02 .02 .26 .11 -.74** -.36 -.79** .07 .0 .41* .11 -.08 -.07 .16 -.13 -.15 -.01
19 .09 .09 .11 -.31 .37* -.39* -.06 -.12 .15 -.01 .11 -.11 -.16 -.13 -.15 -.51** -.30 a

1 = ToL adjusted extra moves score; 2 = ToL rule violations; 3 = IDED set-shifting task Perseveration Condition EDS stage errors; 4 = IDED set-shifting task Learned Irrelevance Condition EDS stage errors; 5 = RIL task inhibition error difference score; 6 = RIL task load error difference score; 7 = RIL task inhibition + load error difference score; 8 = RIL task shape error score; 9 = Opposite Worlds error difference score; 10 = Opposite Worlds time difference score; 11 = Relational Complexity total score; 12 = Pattern Meanings correct responses; 13 = Pattern Meanings sum of errors; 14 = Uses of Objects correct responses; 15 = Uses of Objects sum of errors; 16 = Stamps task complexity score; 17 = Stamps task originality score; 18 = Stamps task restriction score; 19 = Stamps task rule adherence score.
* p < .05; ** p < .01. All tests were two-tailed.
a = Correlation could not be computed because one of the variables was constant.
Note: The RIL task RT difference scores are not included in this table for the sake of brevity.

APPENDIX C Separate ToM-EF correlations for young and old age subgroups within the control sample (Study One)

Table C1. Raw and partial correlations between ToM and EF variables within “young” control participants (aged 5-8 years)

                                          False belief task
EF task                                   Simple            1st-order         2nd-order
ToL (n = 25):
  Adj. extra move score                   -.34              -.46* (-.48*)     -.39
  Rule violations                         -.11              -.02              -.60** (-.55**)
IDED Set-shifting task condition (n = 13):
  Perseveration EDS stage errors          a                 -.68* (-.90***)   -.64* (-.84**)
  Learned Irrelevance EDS stage errors    a                 -.19              -.08
RIL task (n = 12):
  Error difference scores:
    Inhibition                            a                 .06               .02
    Load                                  a                 -.64* (-.80*)     -.51
    Inhibition + load                     a                 -.69* (-.66)      -.57
  RT difference scores:
    Inhibition                            a                 .18               -.45
    Load                                  a                 .21               .47
    Inhibition + load                     a                 .46               .10
  Shape error score                       a                 -.31              -.12
Opposite Worlds (n = 14):
  Error diff. score                       a                 -.24              -.11
  Time diff. score                        a                 -.11              -.14
Relational Complexity (n = 25):
  Total score                             .14               .38               .40* (.13)
Pattern Meanings (n = 25):
  Correct responses                       -.19              .39               .17
  Sum of errors                           -.56** (-.58**)   -.13              -.14
Uses of Objects (n = 25):
  Correct responses                       .08               .46* (.24)        .46* (.25)
  Sum of errors                           -.48* (-.39)      -.13              -.21
Stamps task (n = 25):
  Complexity score                        -.04              .36               .35
  Originality score                       .05               .41* (.25)        .48* (.36)
  Restriction score                       a                 a                 a
  Rule adherence score                    .13               -.01              .17

* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations (shown in parentheses after the corresponding raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed.
a = No correlation could be calculated as one of the variables was constant.

Table C2. Raw and partial correlations between ToM and EF variables within “old” control participants (aged 9-18 years)

                                          False belief task
EF task                                   Simple            1st-order         2nd-order
ToL (n = 21):
  Adj. extra move score                   -.20              -.15              -.20
  Rule violations                         .09               .13               .09
IDED Set-shifting task condition (n = 21):
  Perseveration EDS stage errors          -.28              -.08              -.28
  Learned Irrelevance EDS stage errors    -.18              -.26              -.18
RIL task (n = 21):
  Error difference scores:
    Inhibition                            .29               .07               .29
    Load                                  -.32              -.05              -.32
    Inhibition + load                     .10               .03               .10
  RT difference scores:
    Inhibition                            -.16              -.26              -.16
    Load                                  .44* (.54*)       .30               .44* (.54*)
    Inhibition + load                     .21               .01               .21
  Shape error score                       -.31              -.20              -.31
Opposite Worlds (n = 21):
  Error diff. score                       .18               -.17              .18
  Time diff. score                        .02               .02               .02
Relational Complexity (n = 21):
  Total score                             .43               .08               .43
Pattern Meanings (n = 21):
  Correct responses                       .20               .14               .20
  Sum of errors                           -.08              .12               -.08
Uses of Objects (n = 21):
  Correct responses                       .10               .27               .10
  Sum of errors                           -.02              -.17              -.02
Stamps task (n = 20):
  Complexity score                        .07               .10               .07
  Originality score                       .59** (.43)       .26               .59** (.43)
  Restriction score                       .05               .08               .05
  Rule adherence score                    .06               .08               .06

* p < .05; ** p < .01; *** p < .001.
Note: Partial correlations (shown in parentheses after the corresponding raw correlation) controlled for age, VIQ and PIQ. All tests were two-tailed.
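The partial correlations in Tables C1 and C2 control for age, VIQ and PIQ. As an illustration only, the following is a minimal sketch of how a partial correlation with several covariates can be computed using the standard recursive formula. The function names (`pearson`, `partial_corr`) and the simulated variables are hypothetical; the thesis does not state which software or procedure was used, and the data below are synthetic, not the study data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y controlling for every sequence in `covariates`.

    Removes one covariate z at a time via
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    where each term is itself partialled on the remaining covariates.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic illustration (hypothetical data): two scores that are related
# only because both depend on age.
rng = random.Random(0)
n = 2000
age = [rng.gauss(0, 1) for _ in range(n)]
viq = [rng.gauss(0, 1) for _ in range(n)]
piq = [rng.gauss(0, 1) for _ in range(n)]
tom_score = [a + 0.5 * v + rng.gauss(0, 1) for a, v in zip(age, viq)]
ef_score = [a + 0.5 * p + rng.gauss(0, 1) for a, p in zip(age, piq)]

raw_r = pearson(tom_score, ef_score)
partial_r = partial_corr(tom_score, ef_score, [age, viq, piq])
print(f"raw r = {raw_r:.2f}, partial r = {partial_r:.2f}")
```

In this simulated case the raw correlation is sizeable while the partial correlation is close to zero, which is the pattern seen in several rows of Table C1 where a significant raw correlation shrinks once age and IQ are controlled.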

APPENDIX D Separate group comparisons for young and old age subgroups on EF tasks (Study One)

Table D1. Group comparisons for “young” (5-8 years) and “old” (9-18 years) participants on inhibition, planning, and generativity tasks

                                    N               Mean (SD)
Age subgroup                        ASD   Control   ASD             Control         t      p
Young participants
  Inhibition:
    Opposite Worlds:
      Error difference score        10    14        2.60 (2.59)     0.43 (1.87)     2.39   .03*
      Time difference score         10    14        15.67 (11.99)   7.01 (4.55)     2.48   .02*
  Planning:
    ToL:
      Adjusted extra moves score    20    25        29.80 (8.17)    25.36 (7.19)    1.94   .06
  Generativity:
    Uses of Objects:
      Correct responses             20    25        16.75 (8.28)    22.00 (9.51)    1.95   .06
    Stamps task:
      Complexity score              20    25        18.25 (3.48)    20.12 (3.15)    1.89   .07
      Originality score             20    25        2.50 (2.16)     3.96 (2.99)     1.83   .07
Old participants
  Inhibition:
    Opposite Worlds:
      Error difference score        19    22        0.89 (1.76)     0.86 (1.13)     .07    .95
      Time difference score         19    22        8.73 (6.09)     6.22 (4.03)     1.57   .12
  Planning:
    ToL:
      Adjusted extra moves score    25    22        23.52 (6.33)    19.32 (6.24)    2.29   .03*
  Generativity:
    Uses of Objects:
      Correct responses             26    23        20.85 (9.26)    31.22 (6.93)    4.39   .00***
    Stamps task:
      Complexity score              21    21        19.00 (2.55)    20.71 (2.80)    2.08   .04*
      Originality score             21    21        3.81 (2.69)     5.76 (2.23)     2.56   .01*

Note: Only continuous variables on which significant overall group differences were found are included. This table is intended to demonstrate that the EF components which are impaired in individuals with ASDs (in comparison with age-matched controls) change with development.
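As a worked check on how the t values in Table D1 relate to the reported group sizes, means and standard deviations, the sketch below computes an independent-samples t statistic from summary statistics. It assumes a pooled-variance (equal variances) test, which the table does not state explicitly; the function name `pooled_t` is hypothetical. The example uses the "young" Opposite Worlds error difference score row.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and degrees of freedom from summary
    statistics, assuming equal variances (pooled-variance t-test)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Young participants, Opposite Worlds error difference score:
# ASD: n = 10, M = 2.60, SD = 2.59; Control: n = 14, M = 0.43, SD = 1.87
t_value, df = pooled_t(2.60, 2.59, 10, 0.43, 1.87, 14)
print(round(t_value, 2), df)  # prints 2.39 22
```

Under this assumption the computed value matches the tabled t of 2.39 for that row. Other rows may differ in the second decimal place because the tabled means and standard deviations are themselves rounded.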
