Fast logic? Examining the time course assumption of dual process
theory in reasoning.
Bence Bago
Cognitive Science MSc
Paris Descartes University
Laboratory for the Psychology of Child Development and Education
2015
Supervisor: Wim De Neys
Co-advisor: Olivier Houdé
Table of Contents

Declaration of Originality
Declaration of Contribution
Abstract
Introduction
Experiment 1
    Methods
        Participants
        Materials
        Procedure
    Results
        Analysis strategy
        Accuracy of final responses
        Direction of change analysis
        Analyses of confidence ratings and response times
        Analysis of response latencies
        Analysis of confidence ratings
        Response stability analysis at individual level
    Discussion
Experiment 2 – 4
    Methods
        Participants
        Materials and procedure
    Results
General discussion
Appendix 1
Appendix 2
Appendix 3
References
Declaration of Originality

The aim of this study was to test the time course assumption behind the classic dual process
theory of reasoning. We wanted to examine whether people can generate logical responses
intuitively. This question was inspired by previous papers, but the available evidence proved
insufficient. In this study, we used a previously developed experimental paradigm, the two
response paradigm, to test the research question. A new analysis (the direction of change
analysis) was applied to ensure a direct test of the temporal assumptions of dual process
theory. In addition, we used a range of experimental methods to validate the two response
paradigm for the first time.
Declaration of Contribution

I would like to gratefully thank my supervisor, Dr. Wim De Neys, for his help with this
thesis. While the literature review, the programming of the experiments, the recruitment of
participants, online testing, and the statistical analysis were fully my own work, he helped me
with several aspects of the research process. He helped me to identify the exact research
question and to determine the exact study design, and he provided feedback on the experiments
before the testing phase started. He also kindly gave me feedback on the first drafts of the
written thesis and helped me to identify grammatical and orthographical errors. Moreover, he
gave me advice concerning the style and structure of the thesis, and he helped me with
theorizing about the results of this study as well. I also want to thank Eric Douglas Johnson
for his help in creating the instructions and stimuli for the cognitive load experiments, and
Bastien Trémolière for his help familiarizing me with the Crowdflower platform. I would like
to express my gratitude to Olivier Houdé for hosting me in the lab. Finally, I would like to
thank the Ecole des Neurosciences de Paris for funding me during this internship.
Abstract

Decades of research on reasoning have indicated that people’s thinking is often biased. The most
influential explanation for biased reasoning is put forward by the default-interventionist
dual process theory; according to this theory, people immediately produce a stereotype- or
belief-based type 1 heuristic response. Subsequently, slower, more deliberative type 2
processing might override the heuristic answer and result in the generation of the correct,
logical response. The aim of this study was to test this time course assumption of classic dual
process theory. We specifically wanted to see whether a logical response is only generated as
a result of slow, deliberative, type 2 processing. For this reason, we used the two response
paradigm, in which participants have to give an immediate, intuitive response, and afterwards
are given as much time as they want to indicate their final response (Experiment 1). Our key
finding is that we frequently observe correct, logical responses as the first, immediate
response. An analysis of reaction times, confidence ratings, and a control reading condition
supported the intuitive nature of the observed initial logical responses. In three additional
experiments we used a range of experimental procedures (Experiment 2: stringent response
deadline; Experiment 3: cognitive load manipulation; Experiment 4: both deadline and load
manipulation) to knock out type 2 processing and further establish that the initial logical
responses were truly intuitive in nature. Results were identical to the results of Experiment 1.
In sum, evidence was found that logical responding can occur as a result of fast, type 1
processing. We sketch a revised dual process model to account for the findings.
Introduction

In the last few decades, a large number of studies have revealed that people do not follow
classic normative logical or probabilistic rules during reasoning. As an example, consider the
following question (De Neys & Glumicic, 2008):
“In a study 1000 people were tested. Among the participants there were 4 men and 996
women. Jo is a randomly chosen participant of this study. Jo is 23 years old and is finishing a
degree in engineering. On Friday nights, Jo likes to go out cruising with friends while
listening to loud music and drinking beer.
What is most likely?
a. Jo is a man
b. Jo is a woman”
On the basis of the base rate probabilities one should choose answer b, because there
are many more women than men in the sample. However, many studies have shown that
people tend to neglect the base rate probabilities. Participants typically base their choice on
the stereotypic information, thus most people believe and choose that Jo is a man. Several
similar biases of reasoning have been revealed and investigated so far in the literature
(Gilovich, Griffin, & Kahneman, 2002).
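The normative force of the base rates here is easy to quantify with Bayes’ rule. The sketch below is purely illustrative: the likelihood ratio for the stereotypic description is an assumed value, not one reported in the study. It shows that even a description ten times more typical of men leaves the posterior odds heavily favoring “woman”:

```python
from fractions import Fraction

def posterior_odds(prior_men, prior_women, likelihood_ratio):
    """Posterior odds of 'man' vs 'woman' after seeing the description.

    likelihood_ratio = P(description | man) / P(description | woman);
    the value used below is an assumption chosen for illustration.
    """
    prior_odds = Fraction(prior_men, prior_women)
    return prior_odds * likelihood_ratio

# 4 men vs 996 women; description assumed 10x more typical of a man:
odds = posterior_odds(4, 996, 10)
print(odds)      # 10/249
print(odds < 1)  # True: 'woman' remains the more likely answer
```

On these assumed numbers the posterior odds of “man” are 10/249, roughly 25:1 against, so neglecting the base rates is costly.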
One of the possible explanations for this effect is presented by dual process theories of
thinking. According to these dual process theories, there are two different types of thinking:
type 1 and type 2 processes. Type 1 processing is fast, autonomous, does not require working
memory, operates unconsciously and immediately triggers an answer. Type 2 processing puts
a heavy load on working memory capacity, operates consciously, and is controlled and relatively
slow. The two types of processes are also often referred to as ‘intuitive’ or ‘heuristic’ and
‘deliberate’ or ‘analytical’ (Stanovich & Toplak, 2012). It is important to note that the above-
mentioned features of the systems are general labels, based on the traditional dual process
theories view. Dual process theory is an umbrella term; several kinds of dual process theories
exist. In this study, I focus on the influential classic default-interventionist view of dual
processes that has been advocated in the seminal work of Evans and Stanovich (2013) and
Kahneman (2011).
The standard assumption in the default-interventionist dual process (DI) framework is
that the automatic and fast type 1 process first produces a heuristic answer. Generation of the
heuristic answer might subsequently be followed by a deliberative, slow type 2 process, which
may result in a correction of the initial heuristic answer. Note that in cases - such as the
introductory reasoning problem - in which the initial heuristic response conflicts with the
correct logical1 response, the corrective type 2 thinking is believed to be critical to arrive at
the correct logical answer. In cases where the type 2 processing fails, the heuristic response
will not be corrected and the reasoner will end up giving the erroneous heuristic answer.
Thus, the expected dual process time course assumption is that reasoners will first generate a
heuristic answer and, if needed, will after additional reflection correct this to arrive at the
correct logical response.
Unfortunately, and perhaps somewhat surprisingly, there is little evidence in the
literature that allows us to directly validate this critical time course assumption. For example,
De Neys (2006a) found that it took participants more time to solve heuristics-and-biases
tasks, such as the introductory base-rate neglect problem, if their answer was correct
than if it was incorrect. One might argue that this finding is in agreement with the time course
assumption. However, this evidence does not entail the conclusion that correct reasoners
generated the incorrect answer first, and then they figured out the correct solution.
Evans and Curtis-Holmes (2005) used an experimental design in which people had to
judge the logical validity of reasoning problems under time pressure; one group of reasoners
were given only 2 seconds to answer, whereas a control group were allowed to take as much
time as they wanted to give an answer. An elevated percentage of incorrect answers was
found in the time pressure group. Hence, this also indicates that giving the correct response
requires time. However, this does not necessarily show that individuals who gave the correct
response in the free time condition generated the heuristic response first and corrected this
subsequently. It might be that reasoners engaged in type 2 thinking right away, without any
need to postulate an initial generation of a heuristic response.
1 Note that I will be using the label “correct” or “logical” response as a handy shortcut to refer to “the response that
has traditionally been considered correct or normative according to standard logic or probability theory”. The
appropriateness of these traditional norms has sometimes been questioned in the reasoning field (e.g., see
Stanovich & West, 2000, for a review). Under this interpretation, the heuristic response should not be labeled as
“incorrect” or “biased”. For the sake of simplicity I stick to the traditional labeling. In the same vein, I use the term
“logical” as a general header to refer both to standard logic and probability theory.
Arguably, the most direct evidence has been given by experiments using the two
response paradigm (Thompson & Johnson, 2014; Thompson et al., 2011). In this paradigm,
participants are presented with a reasoning problem. They are instructed to respond as quickly
as possible with the first, intuitive response that comes to mind. Afterwards, they are
presented with the problem again, and they are given as much time as they want to think
about it and give a final answer. A key observation for our present purposes was that
Thompson and colleagues noted that people spent little time rethinking their answer in the
second stage and hardly ever changed their initial response. Note that the fact that people do
not change an initial heuristic response is not problematic for the dual process framework, of
course. It just implies that people failed to engage the optional type 2 processing. Indeed,
since such failures to engage type 2 are considered a key cause of incorrect responding, a
dominant tendency to stick to incorrect initial responses is hardly surprising from the classic
dual process stance. However, the lack of answer change tentatively suggests that in those
cases where a correct logical response was given as final response, the very same response
was generated from the start. Bluntly put, the logical response might have been generated fast
and intuitively based on mere type 1 processing. This would pose a major challenge for
standard dual process theory. Unfortunately, however, it cannot be excluded that Thompson et
al.’s participants engaged in type 2 processing when they gave their first, initial response.
Although Thompson et al. instructed participants to quickly give the first response that came
to mind, participants might have simply failed to respect the instruction and ended up with a
correct response precisely because they recruited type 2 thinking. Clearly, researchers have to
make sure that only type 1 processing is engaged at the initial response stage.
There is also some indirect evidence that casts doubt on the time course
assumption of dual process theory. For example, De Neys (2015) argued that people
intuitively detect conflict between heuristic answers and basic logical principles. De Neys and
Glumicic (2008) gave a set of reasoning problems to participants, half of which were so-
called conflict problems, in which a heuristically cued response conflicted with the correct
logical response. For the other half of the problems, the heuristically cued response and the
logical response were consistent. For example, the introductory base rate neglect problem that
I presented above was a conflict problem; the heuristically cued response is that Jo is a man
because of the provided stereotypic description, while the logical answer is that Jo is a woman
because of the sample sizes. A no-conflict version of this problem can be constructed by
simply reversing the base rates (i.e., 997 men / 3 women). In this case the logical answer cued
by the base rates and the heuristic answer cued by the stereotype point to the same
answer: Jo is a man. In a set of experiments, De Neys and colleagues observed that the presence
of conflict affected people’s reasoning process. Even biased participants who failed to give
the correct response showed elevated response times (e.g., De Neys & Glumicic, 2008),
decreased post-decision confidence (e.g., De Neys, Cromheeke, & Osman, 2011; De Neys,
Rossi, & Houdé, 2013), and elevated skin resistance (De Neys, Moyens, & Vansteenwegen,
2010). De Neys argued that people detect the conflict between the heuristic answer and
logical principles intuitively. This would mean that there are two cued type 1 responses; one
is driven by beliefs or common stereotypes, and one is driven by logical or probabilistic
principles.
Related work by Handley and colleagues (e.g. Handley, Newstead, & Trippas, 2011;
Pennycook, Trippas, Handley, & Thompson, 2014) suggests that stereotypical beliefs can
interfere with logic at an early stage of the decision process. According to the standard
dual-process model, people should process the belief-based heuristic response first and
the logic-based information only later. In contrast, they found that the logical
information is available to the reasoning process from the beginning, just like the stereotypic
information. These results also imply that type 1 processing can produce not only a
heuristic-based answer but also a logic-based response. Furthermore, Banks and Hope
(2014) collected event-related potential data during a set of reasoning tasks. They found that
people process the logical validity and the believability of the problems simultaneously, at the
very beginning.
Additionally, Villejubert (2009) examined reasoning under time pressure. As people
have no time to engage in more analytical thinking, they were expected to rely mostly on their
cued heuristic answers and to produce more incorrect answers. This expected result was not
supported by the data; no significant difference was found between the time pressured
group and the free time group. This finding could also question the time course assumption,
and it points to the possibility that logical answers can be generated by type 1 processing.
Nonetheless, Villejubert (2009) used a 12 sec time limit in total, which might still have
allowed participants to engage in some deliberation. Moreover, this time limit was not based
on reading times or other empirical data, which weakens this evidence.
However, the findings that support the logical intuition model have been challenged (Klauer &
Singmann, 2013; Mata, Schubert, & Ferreira, 2014; Pennycook, Fugelsang, & Koehler, 2012;
Singmann, Klauer, & Kellen, 2014). Klauer and Singmann (2013) rightly emphasized that the
idea of logical intuition is at odds with the classical view of dual process theories, in
which a logical response must be the result of slow and effortful type 2 deliberation. Klauer
and Singmann (2013) pointed out that researchers have to be sure that empirical evidence is
not driven by any confound before one should be persuaded to revise the classic theory. Even
supporters of the logical intuition theory admit that this area is in its infancy and further
validation and experimentation are required (De Neys, 2014).
Overall, the previous literature has not provided sufficient evidence for the time course
assumption of dual process theories, and recent evidence has challenged this assumption as well.
In this study we aimed to provide a direct test of the time course assumption of default-
interventionist dual process models. For this purpose, we used the two response paradigm.
Participants were asked to give an immediate first answer, and then they were allowed to take
as much time as they needed to give a final answer. Participants were also asked to indicate
their confidence level after both responses.
Default-interventionist (DI) dual process theory would predict that people always give
the heuristic answer for the first response, which is the incorrect answer in the case of conflict
problems. Afterwards, when sufficient time is allotted for type 2 processing to occur, they
might be able to correct their initial response and arrive at the correct answer. In sum,
according to standard DI theory there should be only two answer types: either incorrect for
first response – incorrect for second response or incorrect for first response – correct for
second response. Our key question is whether generation of a correct final response is indeed
preceded by generation of an initial incorrect response or whether people can generate the
correct logical answer for the first answer as well. This latter pattern would provide direct
evidence for the existence of fast, logical type 1 reasoning.
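To make this prediction concrete, each item can be labeled by the accuracy of its two responses. The sketch below is ours, introduced only for illustration; the function name and the 0/1 coding are not taken from the study:

```python
def change_pattern(first_correct, final_correct):
    """Label a trial by the accuracy of its initial and final responses.

    Under default-interventionist theory, conflict problems should yield
    only '00' (stayed incorrect) and '01' (corrected after deliberation);
    '11' trials (correct from the very first response) are the critical
    evidence for fast, intuitive logical responding.
    """
    return f"{int(first_correct)}{int(final_correct)}"

# Hypothetical (first_correct, final_correct) pairs for four trials:
trials = [(False, False), (False, True), (True, True), (True, True)]
patterns = [change_pattern(a, b) for a, b in trials]
print(patterns)  # ['00', '01', '11', '11']
```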
Critically, we wanted to make sure and validate that the first response that participants
gave only reflected the output of type 1 processing. For this reason, in four experiments we
used a combination of correlational and experimental techniques that allowed us to minimize
or control the impact of type 2 processing. In Experiment 1, we contrasted performance in a
reasoning condition with a baseline control condition in which participants merely read the
problems. In Experiment 2-4 we knocked out type 2 processing experimentally by imposing a
challenging response deadline (Experiment 2), a cognitive load task (Experiment 3), and even
a combination of both a response deadline and cognitive load (Experiment 4).
Finally, to check the generality of the findings, two different reasoning tasks were
used: a syllogistic reasoning task and a base rate task. These were selected for two reasons:
first, these tasks are highly popular in the research community and have inspired much of the
theorizing in the field. Second, the tasks differ in the sense that different
normative systems are required to solve them correctly (standard logic for syllogistic
reasoning, and probability theory for base rate task). The differences or similarities between
the tasks will give us an indication of the generality of the findings.
Experiment 1
Methods
Participants
A total of 101 participants were tested (61 female, Mean age = 38.95, SD = 12.69) in
the actual experiment (i.e., reasoning condition). In a pretest, an additional 52 participants (31
female, Mean age = 44.13, SD = 13.2) were tested (i.e., reading condition; see further). The
participants were recruited via the Crowdflower platform, and received $0.30 for their
participation in the reasoning condition, and $0.11 in the reading condition. Only native
English speakers from the USA or Canada were allowed to participate in the study. The
distribution of the highest educational level of the samples can be found in Table 1.
Table 1. Distribution of highest educational level of participants across experimental conditions. Exact number of
people in parentheses.

                        Experiment 1    Reading pretest
Less than high school   0% (0)          0% (0)
High school             48% (48)        40.4% (21)
Bachelor degree         41% (41)        46.2% (24)
Masters degree          7% (7)          9.6% (5)
Doctoral degree         4% (4)          3.8% (2)
Did not provide info.   0.9% (1)        0% (0)
Materials
Base rates. Participants solved a total of eight base-rate problems. All problems were
taken from Pennycook et al. (2014). In the base rate task, participants always received a
description of a sample that included two groups, for example, nurses/doctors,
women/men, or librarians/high school students. In addition, participants received a stereotypic
description of a randomly drawn individual from the sample. Three kinds of base rates were
used: 997/3, 996/4, 995/5. The task was to indicate to which group the person most likely
belonged. Two kinds of items were used, conflict and no-conflict items. In no-conflict items
the base rate probabilities and the stereotypic information referred to the same group, while in
conflict items the stereotypic information referred to the smaller base rate group. Participants
solved a total of four conflict and four no-conflict items.
To minimize reading time influences on the reaction times, we used the “rapid response
base rate paradigm” introduced by Pennycook et al. (2014). In this paradigm, the base rates
and descriptive information were presented serially, one by one. First, participants received the
names of the two groups, for example “this study contains clowns and accountants”; then they
received the stereotypic information, a single word such as “kind”, “funny”, or
“strong”. Note that Pennycook et al. (2014) selected these specific words on the basis of
extensive pretesting to make sure that they were strongly associated with a member of the
group in question. Finally, participants received the base rate information. The items were
counterbalanced by changing the base rates (in one question set the incongruent items were
the congruent ones and vice versa). The following illustrates the problem format:
This study contains clowns and accountants.
Person 'L' is funny.
There are 995 clowns and 5 accountants.
Is Person 'L' more likely to be:
o A clown
o An accountant
Each problem started with the presentation of a fixation cross for 1000 ms. After the
fixation cross disappeared, the sentence which specified the two groups appeared for 2000
ms. Then the stereotypic information appeared for another 2000 ms, while the first
sentence remained on the screen. Finally, the last sentence specifying the base rates appeared
together with the question and two response alternatives. Once the question was presented
participants were able to select their answer by clicking on it. The position of the correct
answer alternative (i.e., first or second response option) was randomly determined for each
item. The eight items were presented in random order. An overview of the problem set can be
found in Appendix 1.
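For readers who want the timing at a glance, the fixed part of the trial schedule described above can be written out as cumulative onsets. This is only a sketch of the timing logic; the actual experiment ran online and its code is not reproduced here:

```python
# Durations (ms) of the fixed display events in a base-rate trial,
# following the schedule described above; a sketch, not the actual code.
TRIAL_EVENTS = [
    ("fixation cross", 1000),  # shown for 1000 ms
    ("group names",    2000),  # e.g. "this study contains clowns and accountants"
    ("stereotype",     2000),  # single word; earlier text stays on screen
]

def event_onsets(events):
    """Return the onset time (ms) of each event, plus the question onset."""
    onsets, t = [], 0
    for name, duration in events:
        onsets.append((name, t))
        t += duration
    onsets.append(("question + response options", t))  # stays up until a click
    return onsets

print(event_onsets(TRIAL_EVENTS))
# [('fixation cross', 0), ('group names', 1000), ('stereotype', 3000),
#  ('question + response options', 5000)]
```

The question and response options thus appeared 5000 ms after trial onset, which is why reaction times were measured only from that point.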
Syllogistic reasoning. Participants were given eight syllogistic reasoning problems. In
four of these there was a conflict between the believability and the validity of the conclusion
(conflict items), and this conflict was not present for the other four (no-conflict items). The
conclusion of a syllogism can be valid or invalid; if the conclusion follows the rules of
deductive logic, the reasoning is called ‘valid’. In addition, the conclusion might be believable or
unbelievable, depending on whether it reflects common stereotypical beliefs; for example, “Puppies
have four legs” is a believable conclusion, while “Boats have wheels” is unbelievable. Two
items were unbelievable-valid, two were believable-invalid, two were believable-valid and
two were unbelievable-invalid. The problems used in this study were taken from De Neys,
Moyens, and Vansteenwegen (2010). Participants had to indicate whether the conclusion
follows logically from the presented premises or not. We used the following format:
All dogs have four legs
Puppies are dogs
Puppies have four legs
Does the conclusion follow logically?
o 1. yes
o 2. no
To minimize reading time influences on reaction times, the premises were presented
one by one. Before each question, a fixation cross was presented for 1000 ms. After the
fixation cross disappeared, the first sentence (i.e., the major premise) was presented for 2000
ms. Next, the second sentence (i.e., minor premise) was presented under the first premise for
2000 ms. After this interval was over, the conclusion together with the question “does the
conclusion follow logically?” and two response options (yes/no) was presented right under the
premises. Once the conclusion and question were presented, participants could give their
answer by clicking on the corresponding bullet point. The eight items were presented in a
randomized order. An overview of the problem set can be found in Appendix 2.
Procedure
The experiment was run online. Participants were clearly instructed that we were interested in
their first, initial response to the problem. The instructions stressed that it was important to give the
initial response as fast as possible and that participants could afterwards take additional time
to reflect on their answer. The literal instructions that were used stated the following:
General instructions.
“Welcome to the experiment! Please read these instructions carefully!
This experiment is composed of 16 questions and a couple of practice questions. It will take
about 20 minutes to complete and it demands your full attention. You can only do this
experiment once.
In this task we'll present you with a set of reasoning problems. We want to know what your
initial, intuitive response to these problems is and how you respond after you have thought
about the problem for some more time. Hence, as soon as the problem is presented, we will
ask you to enter your initial response. We want you to respond with the very first answer
that comes to mind. You don't need to think about it. Just give the first answer that
intuitively comes to mind as quickly as possible. Next, the problem will be presented again
and you can take all the time you want to actively reflect on it. Once you have made up
your mind you enter your final response. You will have as much time as you need to
indicate your second response.
After you have entered your first and final answer we will also ask you to indicate your
confidence in the correctness of your response. In sum, keep in mind that it is really crucial
that you give your first, initial response as fast as possible. Afterwards, you can take as
much time as you want to reflect on the problem and select your final response. You will
receive $0.30 for completing this experiment. Please confirm below that you read these
instructions carefully and then press the "Next" button.”
All participants were presented with both the syllogistic reasoning and base-rate task
in a randomly determined order. After the general instructions were presented the specific
instructions for the upcoming task (base-rates or syllogisms) were presented. These
instructions were used:
Syllogistic reasoning.
“In this part of this experiment you will need to solve a number of reasoning problems. At
the beginning you are going to get two premises, which you have to assume being true. Then
a conclusion will be presented. You have to indicate whether the conclusion follows logically
from the premises or not. You have to assume that the premises are all true. This is very
important.
Below you can see an example of the problems.
Premise 1: All dogs have four legs
Premise 2: Puppies are dogs
Conclusion: Puppies have four legs
Does the conclusion follow logically?
1. yes
2. no
The two premises and the conclusion will be presented on the screen one by one. Once the
conclusion is presented you can enter your response.
As we told you we are interested in your initial, intuitive response. First, we want you to
respond with the very first answer that comes to mind. You don't need to think about it. Just
give the first answer that intuitively comes to mind as quickly as possible. Next, the problem
will be presented again and you can take all the time you want to actively reflect on it. Once
you have made up your mind you enter your final response. After you made your choice and
clicked on it, you will be automatically taken to the next page. After you have entered your
first and final answer we will also ask you to indicate your confidence in the correctness of
your response. Press "Next" if you are ready to start the practice session!”
Base rate task.
“In a big research project a large number of studies were carried out where a psychologist
made short personality descriptions of the participants. In every study there were participants
from two population groups (e.g., carpenters and policemen). In each study one participant
was drawn at random from the sample. You’ll get to see one personality trait of this
randomly chosen participant. You’ll also get information about the composition of the
population groups tested in the study in question. You'll be asked to indicate to which
population group the participant most likely belongs. As we told you we are interested in
your initial, intuitive response. First, we want you to respond with the very first answer that
comes to mind. You don't need to think about it. Just give the first answer that intuitively
comes to mind as quickly as possible. Next, the problem will be presented again and you can
take all the time you want to actively reflect on it. Once you have made up your mind you
enter your final response. After you made your choice and clicked on it, you will be
automatically taken to the next page. After you have entered your first and final answer we
will also ask you to indicate your confidence in the correctness of your response.
Press "Next" if you are ready to start the practice session!”
After the task specific instructions, participants were familiarized with the task and
solved two practice problems. Then they were able to start the experiment. For the first
response people were instructed to give a quick, intuitive response. After they clicked on the
answer, they were asked to give their confidence in their answer on a scale from 0 to 100, with
the following question: “How confident are you in your answer? Please type a number from 0
(absolutely not confident) to 100 (absolutely confident)”. Next, they were presented with the
problem again, and they were told that they could take as much time as they needed to give a
final answer. As a last step, they were asked to give the confidence in their final answer. The
colour of the question and answer options was green during the first response phase and blue during the second response phase, to visually remind participants which question they were answering at that moment. For the same purpose, right under the question a
reminder sentence was placed: “Please indicate your very first, intuitive answer!” and “Please
give your final answer.” respectively.
The order of presenting base rate and syllogistic reasoning tasks was randomized.
After participants finished the first task they could briefly pause, were presented with the
instructions and practice problems of the second task, and started the second task. For both the
base-rate and syllogistic reasoning task two different problem sets were used. The conflict
items in one set were the no-conflict items in the other, and vice-versa. This was done by
reversing the base-rates (base-rate task) or by switching the conclusion and minor premise
(syllogisms). Each of the two sets was used for half of the participants. Appendix 2 gives an
overview of all problems in each of the sets. This counterbalancing ruled out the possibility
that mere content or wording differences between conflict and no-conflict items could
influence the results. At the end of the study participants were asked to answer demographic
questions.
Reading pretest
In the reading pretest, participants were presented with each of the syllogistic reasoning and base-rate problems but were simply asked to read them. The basic goal of this
reading condition was to have a raw baseline against which the intuitive or type 1 nature of
the response times for the first response in the actual reasoning condition could be evaluated2.
Participants were instructed that the goal of the study was to determine how long people
needed to read item materials. They were instructed that there was no need for them to try to
solve the problems and simply needed to read the items in the way they typically would.
When they were finished reading, they were asked to randomly click on one of the presented
response options to advance to the next problem. Presentation format was the same as in the
actual reasoning condition presented above. The only difference was that the problem was not
presented a second time and participants were not asked for a confidence rating. To make sure
that participants would be motivated to actually read the material we told them that we would
present them with two (for both tasks, four in sum) very easy verification questions at the end
of the study to check whether they read the material. The literal instructions were as follows:
General introduction.
“Welcome to the experiment! Please read these instructions carefully! This experiment is
composed of 16 questions and 4 practice questions. It will take 5 minutes to complete and it
demands your full attention. You can only do this experiment once. In this task we'll present
you with a set of problems we are planning to use in future studies. Your task in the current
study is pretty simple: you just need to read these problems. We want to know how long
people need on average to read the material. In each problem you will be presented with two
answer alternatives. You don’t need to try to solve the problems or start thinking about them.
Just read the problem and the answer alternatives and when you are finished reading you
randomly click on one of the answers to advance to the next problem. The only thing we ask
of you is that you stay focused and read the problems in the way you typically would. Since
we want to get an accurate reading time estimate please avoid whipping your nose, taking a
phone call, sipping from your coffee, etc. before you finished reading. At the end of the study
we will present you with some easy verification questions to check whether you actually read
the problems. This is simply to make sure that participants are complying with the
instructions and actually read the problems (instead of clicking through them without paying
attention). No worries, when you simply read the problems, you will have no trouble at all at
answering the verification questions.
2 Note that as many critics have argued, dual process theories are massively underspecified in this respect. The
theory only posits that type 1 processes are relatively faster than type 2 processes. However, no criterion is
available that would allow us to a priori characterize a response as a type 1 response in an absolute sense (i.e.,
faster than x seconds = type 1). Our reading baseline provides a practical validation criterion.
You will receive $0.11 for completing this experiment. Please confirm below that you read
these instructions carefully and then press the "Next" button.”
Specific instructions before the syllogistic items started:
“In the first part of this experiment you will need to read a specific type of reasoning
problems. At the beginning you are going to get two premises, which you have to assume
being true. Then a conclusion, question and answer alternatives will be presented. We want
you to read this information and click on any one of the two answers when you are finished.
Again, no need to try to solve the problem. Just read it. Below you can see an example of the
problems.
Premise 1: All dogs have four legs
Premise 2: Puppies are dogs
Conclusion: Puppies have four legs
Does the conclusion follow logically?
1. yes
2. no
The two premises and the conclusion will be presented on the screen one by one. Once the
conclusion is presented, you simply click on one of the answer alternatives when you
finished reading and the next problem will be presented. Press "Next" if you are ready to
start a brief practice session!”
Specific instructions before the base rate items started:
“This is the first part of this experiment. In a big research project a large number of studies
were carried out where a psychologist made short personality descriptions of the participants.
In every study there were participants from two population groups (e.g., carpenters and
policemen). In each study one participant was drawn at random from the sample. You’ll get
to see one personality trait of this randomly chosen participant. You’ll also get information
about the composition of the population groups tested in the study in question. Then, a
question to indicate to which population group the participant most likely belongs will
appear. We simply want you to read this question and the two answer alternatives. Once you
finished reading this, you simply click on either one of the answer alternatives and then the
next problem will be presented. Again no need to try to solve the problem, just read the
question and simply click on either one of the answers when you are finished. Press "Next" if
you are ready to start a brief practice session!”
An example of the verification question for syllogistic reasoning:
“We asked you to read the conclusions of a number of problems. Which one of the following
conclusions was NOT presented during the task:
Whales can walk
Boats have wheels
Roses are flowers
Waiters are tired”
An example of the verification question for the base rate task:
“We asked you to read problems about a number of population groups. Which one of the
following combination of two groups was NOT presented during the task:
Nurses and artists
Man and woman
Scientists and assistants
Cowboys and Indians”
The verification questions were constructed such that a very coarse reading of the
problems would suffice to recognize the correct answer. Note that 94% of the verification
questions were solved correctly, which indicates that, by and large, participants were at least minimally engaged in the reading task. Only participants who correctly solved both verification questions for a given task were included in the analyses for that task.
In sum, the reading condition should give us a baseline against which the reasoning
response times for the initial response can be evaluated. Any type 1 response during reasoning
also minimally requires that the question and response alternatives are read and participants
move the mouse to select a response. The reading condition allows us to partial out the time
needed for these two components. In other words, the reading condition gives us a raw
indication of how much time a type 1 response should minimally take. That is, if participants
in the reasoning condition do not comply with the instructions and engage in time-consuming
additional type 2 thinking before giving a first response, the response times for the initial
response in the reasoning condition will be substantially longer than the reading time.
However, if the initial reasoning responses do not differ from mere reading times, this will
provide a strong validation of the intuitive nature of the initial responses.
Results3 showed that the average first response time in the reasoning condition did not differ from the average reading time for the syllogistic reasoning problems (b = 0.02, t(484) = 0.996, p = 0.32). Regarding the base-rate problems, there was a significant difference between
3 Note that for this analysis, we used mixed effect models, where participant ID was entered as a random effect factor. The rationale for this analysis can be found under the subsection “Analysis of
confidences and response times”.
the two conditions (b = 0.07, t(495) = 2.89, p = 0.004), but mean reading times were longer (M = 3.02 s, SD = 1.99) than mean reasoning times (M = 2.62 s, SD = 1.97).
Results
Analysis strategy
Our primary interest in the present study is what we will refer to as “Direction of
Change” analysis for the conflict items. By direction of change, we mean the way or direction
in which a given person in a specific trial changed (or didn’t change) her initial answer during
the rethinking phase. More specifically, people can give a correct or incorrect response in
each of the two response stages. Hence, in theory this can result in four different types of
answer change patterns: 1) a person could either give the incorrect (heuristic) answer as the
first response, and then change to the correct (logical) answer as the final response (we will
use the label “01” to refer to this type of change pattern), 2) one can give the incorrect answer
as the first response and final response (we use the label “00” for this type of pattern), 3) one
can give the correct answer as the first response and change to the incorrect response as the
final response (we use the label “10” for this type of pattern), and 4) one can give the correct
answer for the first and final response (we use the label “11” for this pattern). To recap, we
will use the following labels to refer to these four types of potential answer change patterns:
“01” (i.e., response 1 incorrect, response 2 correct), “00” (i.e., response 1 incorrect, response
2 incorrect), “10” (i.e., response 1 correct, response 2 incorrect), and “11” (i.e., response 1
correct, response 2 correct).
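The four-way coding above can be sketched as a small helper function (an illustrative reconstruction; the function and variable names are our own and not taken from the thesis's analysis scripts):

```python
def change_pattern(first_correct: bool, final_correct: bool) -> str:
    """Map a pair of accuracies to the two-digit label used in the text:
    '1' = correct, '0' = incorrect; first digit = initial response,
    second digit = final response."""
    return f"{int(first_correct)}{int(final_correct)}"

# An initially incorrect answer corrected after reflection:
print(change_pattern(False, True))   # "01"
# A correct intuitive answer that is kept:
print(change_pattern(True, True))    # "11"
```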
This presentation strategy allows us to look at the frequency of each type of direction
of change pattern and analyse response times and response confidence for each of them.
Accuracy of final responses
However, for consistency with previous work we first present the response accuracies
for the final response. Table 2 gives an overview of the results. As the table indicates,
accuracies are in line with previous studies that adopted a single response paradigm. Both for
the base-rate task, χ2(1) = 179.67, p < .0001, and the syllogistic reasoning task, χ2(1) = 21.73, p < .0001,
performance was significantly better in the no-conflict than in the conflict problems. By and
large, this indicates that the two response paradigm did not alter the nature of the reasoning
task. Final response accuracies are in line with what can be expected to be observed in a
classic single response paradigm.
Table 2. Percentage of correct final responses in each of the two reasoning tasks.
Final response
Base rate Conflict 36.36%
No-conflict 94.5%
Syllogistic reasoning Conflict 52%
No-conflict 68%
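A conflict versus no-conflict accuracy contrast of the kind reported above can be sketched as a Pearson chi-square test on a 2x2 table of trial counts. This is a minimal standard-library sketch with made-up counts, not the thesis's actual data or analysis code:

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for a 2x2 table of trial counts:
    rows = conflict / no-conflict, columns = correct / incorrect."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column factors
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts only (not the thesis data):
print(round(pearson_chi2([[30, 70], [80, 20]]), 2))   # 50.51
```

The statistic would then be referred to a chi-square distribution with 1 degree of freedom to obtain the p-values reported in the text.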
Direction of change analysis
Table 3. Total frequency of each of the four direction of change types. Number of trials of each category can be
found in parenthesis.
11 00 10 01
Base rate No-conflict 90.25% (361) 3.5% (14) 2% (8) 4.25% (17)
Conflict 27% (108) 61% (244) 2.75% (11) 9.25% (37)
Syllogistic reasoning No-conflict 63.12% (255) 28.9% (117) 3.22% (13) 4.7% (19)
Conflict 45.79% (185) 44.1% (178) 4.46% (18) 5.69% (23)
Table 3 shows how frequent each of the four types of direction of change was. For both reasoning tasks there are a number of general trends that clearly support the DI dual process view: First, with respect to the no-conflict problems, the 11 responses are the
most dominant category. This can be predicted given that the heuristic type 1 processing is
expected to cue the correct response here. Similarly, the high prevalence of the 00 category in
the conflict problems also supports DI theory. Just as with the no-conflict problems, people
will often tend to stick to the heuristic response on the conflict problems which results in an
erroneous first response that is subsequently not corrected. Finally, we also observe a small
number of trials in the 01 category. In line with standard DI, sometimes an initial erroneous
response will be corrected after additional reflection, but these cases are quite rare. By and
large, these trends fit the standard DI predictions. However, a key challenge for the standard
DI model is the high frequency of “11” answers (as Table 3 shows, 27% and 46% of conflict-problem responses for base rates and syllogisms, respectively). Indeed, both for the base-rate and
syllogistic reasoning task, in the majority of trials in which the final response was correct, the correct response had already been given as the initial response (i.e., 75% and 88%
of the final correct response trials in the base-rate and syllogistic reasoning task, respectively).
Hence, in these cases the correct logical response was given immediately.
Note that, taken together, these results also support Thompson et al.'s (2011) earlier observations indicating that people mostly stick to their initial response and rarely change their answer, regardless of whether it was correct or not. However, the key finding of the present direction of change analysis is the high prevalence of “11” responses. This tentatively suggests
that in those cases where people arrive at a correct final response, the correct response was
already generated intuitively.
Analyses of confidence ratings and response times
By examining response latencies and confidence for the four different types of
direction of change categories, one can get some further insight into the reasons behind people's answer changes (or lack thereof).
Results are presented in Figures 1-4. Visual inspection of the figures suggests a clear and very
similar pattern across both tasks.
With respect to syllogistic reasoning, the results for the first response
latencies suggest that response times for 11, 00, and 01 were very similar and fast (within 3 s).
Latencies for the few 10 responses were clearly deviant and considerably longer,
suggesting that participants did not respect the instruction and that the correct initial response
resulted from slower type 2 thinking. Latencies for the second responses were similar and fast
for the 00, 11 and 10 cases. This indicates that participants spent little time rethinking their
answer. The clear exception here were the 01 cases. Response 2 latencies were clearly much
longer here. This fits with the assumption that the correct final response results from additional
type 2 processing.
These results are further supported by the confidence findings. As Figure 3 and Figure
4 show, the 00 and 11 cases display very high confidence for both Response 1 and Response 2.
Confidence for the 01 and 10 cases is much lower for both responses. Hence, the initial
response in the 00 and 11 cases was given fast and with very high confidence. The 01 first
response was also given quickly, but with much lower confidence, and was subsequently
changed after considerable additional processing time.
Regarding the base-rate task, the confidence and latency patterns were very similar, but
one might note that in this task the 11 latencies for the first response tended to take slightly
longer than those for 00 responses.
Note that the horizontal dotted lines in Figure 1 and Figure 2 represent the average
reading time of participants in our reading condition baseline. As the figures indicate, except for the rare 10 case in the syllogistic reasoning task, response 1 latencies were about equal to or faster than the mere reading times4. By and large, this supports the claim that
participants respected the instructions and gave the first response purely intuitively without
additional time-consuming deliberation.
We used the nlme statistical package in R to create mixed effect multi-level models
(Pinheiro, Bates, DebRoy, & Sarkar, 2015). This allows us to analyse the data on a trial-by-trial basis while accounting for the random effect of subjects (Baayen, Davidson, & Bates, 2008). Mixed effect models have increased statistical power due to the inclusion of random effects and can handle data that violate the assumption of homoscedasticity (Baayen et al., 2008). The direction of change category and the response number (first or final response) were entered into the model as fixed effect factors, and participant ID was entered as
a random factor.
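The χ2 model-fit comparisons reported for these mixed models follow the usual likelihood-ratio logic for nested models: twice the difference in log-likelihood between the model with and without a term, referred to a chi-square distribution. A minimal sketch of that statistic (the log-likelihood values below are placeholders; the thesis obtained the actual values from nlme in R):

```python
def likelihood_ratio_stat(loglik_reduced: float, loglik_full: float) -> float:
    """2 * (logLik_full - logLik_reduced); under the null hypothesis this
    follows a chi-square distribution with df equal to the number of
    extra parameters in the full model."""
    return 2.0 * (loglik_full - loglik_reduced)

# Placeholder log-likelihoods for a model without vs. with the
# direction-of-change fixed effect:
print(likelihood_ratio_stat(-512.0, -502.0))   # 20.0
```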
Analysis of response latencies
We analysed latencies as a function of response number (first or final response) and direction
of change category (00, 11, 01, 10). We ran a separate analysis for each of the two reasoning
tasks. Means and standard deviations of confidence ratings and response latencies can be
found in Appendix 3.
Regarding syllogistic reasoning response latencies, the main effect of direction of change, χ2(9) = 19.97, p = .0002, and the interaction between direction of change and response number, χ2(12) = 17.86, p = .0005, significantly improved model fit (Figure 1), but the main effect of response number did not, χ2(6) = 0.026, p = .8716. As Figure 1 and our visual inspection suggested, this indicates that participants did not think for an equal time across the direction of change categories (main effect of direction of change). Moreover, the significant interaction means that the difference between response 1 and response 2 is not equal across the direction of change categories. This confirms our visual impression that the final response took longer than the first response in the 01 category, but not in the 11 or 00 categories.
4 As we noted in the method section, average response 1 latencies and reading times did not differ significantly for the syllogisms; for the base-rate problems, reading actually took longer than reasoning.
Figure 1. Average response times (logarithmically transformed) for the first and second response for each of the four types of
direction of change categories in syllogistic reasoning. The dashed horizontal line represents the average reading time for
these problems in the reading baseline condition.
Analysis of the base-rate response times revealed that only direction of change significantly improved model fit, χ2(9) = 10.43, p = .015; response number, χ2(6) = 0.08, p = .731, did not. The interaction between the two factors reached marginal significance, χ2(12) =
7.23, p = .0648 (Figure 2). This pattern also confirms the visually observed trends.
Figure 2. Average response times (logarithmically transformed) for the first and second response for each of the four types of
direction of change categories in base rate task. The dashed horizontal line represents the average reading time for these
problems in the reading baseline condition.
Finally, it was important for us to compare 11 and 00 response latencies at time 1,
given that it is possible that correct reasoners did not follow the instructions and arrived at a
correct initial response because they simply took additional time to deliberate and engage type
2 processing. Our reading time control condition already minimized this possibility, but an additional test is to contrast the response 1 reasoning times for the 00 and 11 responses directly. A simple effect analysis indicated that there was no significant difference in the syllogistic reasoning task (b = -0.05, t(296) = -1.44, p = 0.15), but a significant difference was found in the base-rate task (b = -0.1, t(188) = -2.26, p = .02). Hence, this indicates that in the base-rate task, initial 11 answers are somewhat slower than 00 answers. Despite the
reading findings, this might imply that initial correct responses in the 11 base rate case
resulted from some minimal type 2 thinking. However, it might also be the case that logical
type 1 base-rate intuitions are slightly slower than heuristic type 1 base-rate intuitions. The
experimental controls in Experiment 2-4 will allow us to eliminate any potential impact of
type 2 processing completely.
Analysis of confidence ratings
Only the main effect of direction of change had a significant effect on syllogistic reasoning confidence ratings, χ2(9) = 39.08, p < .0001; neither the main effect of response number, χ2(6) = 3.38, p = .066, nor the interaction, χ2(12) = 4.19, p = .241, significantly improved model fit. Hence, as the visual inspection suggested, in trials where the answer was changed, participants were less confident in both their initial and their final response than in the 11 and 00 categories.
Figure 3. Average confidences for the first and second responses for each of the four types of direction of change categories
in the syllogistic reasoning task.
Regarding the base-rate confidence ratings, direction of change had a significant main effect, χ2(9) = 8.29, p = .043, but there was no main effect of response number, χ2(6) = 0.37, p = .546, and no interaction effect, χ2(12) = 1.34, p = .72 (Figure 4).
Figure 4. Average confidences for the first and second response for each of the four types of direction of change categories in
the base rate task.
Response stability analysis at individual level
One might argue that there are individual differences between people regarding the
direction of change; one person might have the ability to figure out the answer on the first
response, one might have the ability to change from the incorrect to the correct, etc. Thus,
one can compute a dominant category index, which refers to the direction of change category that is most frequent for each individual. For example, if an individual showed a 11 pattern on three of the four conflict problems she solved and a 00 pattern on one, she would be labelled a “dominant 11” individual. If no single pattern dominated (i.e., two or more patterns were observed equally often), the individual was labelled a “no dominant category” individual. Table 4 shows the results. As the table indicates, we replicate the pattern
we observed in the item based analysis at the individual level. The most frequent categories
are the 00 and 11 categories. Table 5 further indicates that the type of change is pretty stable
at the individual level. The table shows the percentage of participants who displayed the
same direction of change type on 100% (4/4), 75%, 50%, or 25% of trials. As the table shows
the majority of participants displayed the exact same type of change on 3 out of 4 conflict
problems. In addition, we also computed a stability index: the percentage of each individual's trials belonging to his or her dominant category. These calculations were applied to the base-rate and syllogistic reasoning tasks separately, and only the conflict problems were used.
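The dominant category and stability computations described above can be sketched as follows (an illustrative reconstruction; the function names are our own, and ties are treated as "no dominant category"):

```python
from collections import Counter

def dominant_category(patterns):
    """Most frequent direction-of-change label across one participant's
    conflict trials; None when the top labels tie ('no dominant category')."""
    counts = Counter(patterns).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

def stability_index(patterns):
    """Proportion of trials belonging to the dominant category."""
    dom = dominant_category(patterns)
    if dom is None:
        return None
    return patterns.count(dom) / len(patterns)

# Three 11 trials and one 00 trial -> a 'dominant 11' individual
# whose stability index is 3/4:
print(dominant_category(["11", "11", "11", "00"]))   # 11
print(stability_index(["11", "11", "11", "00"]))     # 0.75
```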
Table 4. Frequencies of dominant categories for each direction of change category.
11 00 10 01 No dominant category
Base rate 25.74% 60.39% 0.99% 4.95% 7.92%
Syllogistic reasoning 27.72% 40.59% 0.99% 2.97% 27.72%
Table 5. Total frequency of stability indexes for each direction of change category.
25% 50% 75% 100%
Base rate 2% 14% 21% 63%
Syllogistic reasoning 1.98% 40.59% 39.6% 17.82%
Discussion
In Experiment 1, the time course assumption of DI dual process theory was tested. On the one hand, some of the results were in agreement with the classic DI account: 00 and 01
response patterns were found. On the other hand, other results did not support the classic DI
model; we observed a high proportion of 11 responses. Furthermore, the analysis of response
latencies revealed that people produce an equally quick first answer in 11, 00 and 01 groups in
the syllogistic reasoning task, while in 11 cases, they were very confident in their initial
answer. These results suggest that in the 11 condition, participants produced the logical answer intuitively. Furthermore, the stability and dominant category analyses revealed that people
produce stable answer patterns, which can be interpreted as stable individual differences.
The results of Thompson and Johnson (2014) and Thompson et al. (2011) were also replicated, since people were able to produce the normative answer intuitively. However, Pennycook and Thompson (2012) examined only the base-rate task and argued that people are as likely to change from the heuristic to the logical answer as to change from the logical to the heuristic answer, which is not supported by our base-rate data but is supported by the syllogistic reasoning results. Similarly, Pennycook and Thompson (2012) found that 53.8% of participants changed their answers on the conflict problems, whereas in this study only 11% of participants changed their answer, a difference that may be due to the difference in experimental strategy, as Pennycook and Thompson asked participants to
give probability estimations instead of asking them to directly choose between the two
options.
However, the reaction time results in the syllogistic reasoning task were not
completely replicated in the base rate task, where we only found a marginally significant
interaction between response times and direction of change category and observed a
difference between 11 and 00 responses at time 1. This could mean that 11 responders engaged in type 2 thinking at time 1. Admittedly, reading times were longer than, or at least not different from, the reasoning times, which suggests that participants did not engage in deliberation during time 1. Nevertheless, this does not fully rule out the possibility of type 2 engagement at response 1. For this reason, the results of this study have to be validated before drawing any conclusions. In Experiments 2-4 we further examined this research question by adding time pressure at response 1 and introducing cognitive load.
Experiment 2 – 4
Methods
Participants
The same recruitment procedure as in Experiment 1 was used. In Experiment 2, 120
participants were recruited (63 female, M = 39.9 years, SD = 13.31 years). In Experiment
3, 112 participants were recruited (44 female, M = 39.28, SD = 13.28). Finally, in Experiment
4, 115 participants were recruited (53 female, M = 38.85 years, SD = 12.12 years).
Participants were allowed to take part in only one experiment. The distribution of the samples
regarding the highest educational level can be found in Table 6.
Table 6. Frequencies of highest educational level of participants across experimental conditions. Exact number of
people in parenthesis.
Experiment 2 Experiment 3 Experiment 4
Less than high school 1.6% (2) 0% (0) 2.6% (3)
High school 30% (36) 38.4% (43) 40% (46)
Bachelor degree 36.7% (44) 33.9% (38) 47.8% (55)
Masters degree 11.7% (14) 7.1% (8) 7.8% (9)
Doctoral degree 5.8% (7) 0% (0) 0.9% (1)
Did not provide inf. 14.2% (17) 20.5% (23) 0.9% (1)
Materials and procedure
The same tasks and problems as in Experiment 1 were used. The procedure was
similar except for the following modifications:
Experiment 2: Time pressure. A time limit was imposed on the first question in order to ensure intuitive answering. The limit was based on the mean reading times collected in Experiment 1; thus, in both the base-rate and the syllogistic reasoning tasks it was 3 seconds. Once the question was presented, participants had 3000 ms to click on one of the answer alternatives, and after 2 seconds the background colour turned yellow to remind them to pick an answer immediately. If participants did not produce an answer within 3000 ms, they received feedback reminding them that they had not answered within the deadline and were told to make sure to respond faster on subsequent trials.
Participants were given 3 practice problems before starting each task to familiarize them with
the deadline procedure. During the actual reasoning task, participants failed to provide a first
response within the deadline on 12% of the trials. These missed trials were discarded and
were not included in the reported data.
Experiment 3: Load condition. In this condition, we used a visuospatial working
memory load task, the dot memorization task (Miyake, Friedman, Rettinger, Shah, & Hegarty,
2001), to burden participants' executive cognitive resources. The idea behind the load manipulation is that, as Evans and Stanovich (2013) argued, the defining feature of type 1 processing is that it does not demand working memory resources; hence, by burdening these resources, we reduce the possibility of engaging in analytic thinking (De Neys, 2006; De Neys & Schaeken,
2007; Franssens & De Neys, 2009).
On every trial, after the fixation cross disappeared, participants were shown a three-by-three matrix in which 4 dots were presented (see Figure 5) for 2000 ms. Participants were
instructed to memorize this pattern. After the matrix disappeared, they had to indicate their
first response and their first confidence. After this, they were shown 4 matrices with different
dot patterns and they had to select the correct, to-be-memorized matrix. Participants were
given feedback as to whether they recalled the correct matrix or not. There was no time limit
on neither responses. Trials on which an incorrect matrix was selected (11% of trials) were
removed from the analysis.
Before the actual experiment, participants had to solve a set of practice questions. First, they
received a reasoning problem (base rate or syllogistic reasoning) identical to the
practice question used in Experiment 1. Next, they were presented with a cognitive load
practice question: they were simply shown a dot pattern for 2000 ms and, after it
disappeared, had to identify the pattern among the four presented options. As a last step,
they were presented with two more practice reasoning problems, which combined the cognitive
load task and the reasoning problem in the two response paradigm format.
Experiment 4: Load and time pressure. In this condition, the same working memory
load task was applied as in Experiment 3, the only exception being that participants had to
answer both questions and both confidence ratings under load.
In this experiment, the same time limit applied to the first answer as in Experiment 2. Trials
on which an incorrect matrix was selected (12% of trials) were removed from the analysis.
Practice problems were identical to those of Experiment 3, the only exception being the time
limit, which was applied to the first response of each practice problem. In the actual task,
6.3% of the first responses remained unanswered due to the time limit. In sum, 16.7% of the
trials were excluded from the analysis.
Figure 5. An example of the dot pattern that participants had to memorize.
Results
The analysis was identical to the one applied in Experiment 1 and was repeated in the
next three experiments. For Experiments 3 and 4, only trials on which the participant
correctly solved the dot memorization task were analysed. In the load + time pressure
condition, 88% of the memorization answers were correct, while in the load condition 89%
were correct. For ease of presentation, we focus here on the key results concerning the
direction of change analysis.
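As an illustration, the direction of change classification used throughout the analysis can be sketched in a few lines of Python. The function names and the sample data below are our own illustrations, not taken from the original analysis scripts; the labels follow the convention in the text (first digit = accuracy of the initial response, second digit = accuracy of the final response).

```python
# Sketch of the direction-of-change classification (names and data are illustrative).
from collections import Counter

def change_category(first_correct: bool, final_correct: bool) -> str:
    """Label a trial '11', '00', '10' or '01' (1 = correct, 0 = incorrect)."""
    return f"{int(first_correct)}{int(final_correct)}"

def tally(trials):
    """Count trials per category; `trials` is a list of (first, final) pairs."""
    return Counter(change_category(first, final) for first, final in trials)

# Made-up example: two non-changing logical responders, one non-changing
# heuristic responder, and one heuristic-to-logical change.
example = [(True, True), (True, True), (False, False), (False, True)]
print(tally(example))  # Counter({'11': 2, '00': 1, '01': 1})
```

Proportions per category (as reported in Table 7) then follow by dividing each count by the total number of analysed trials.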
Table 7. Total frequency of each of the four direction of change types, presented for each experiment. The
number of trials in each category is shown in parentheses.

                                 11            00            10           01
Base rate
  Experiment 1                   27% (108)     61% (244)     2.75% (11)   9.25% (37)
  Time pressure                  25.1% (92)    59.9% (217)   4.4% (16)    11.2% (41)
  Load                           38.8% (122)   52.5% (179)   2.3% (8)     9.4% (32)
  Time pressure + Load           32.7% (127)   53.4% (207)   3.1% (13)    10.7% (41)
Syllogistic reasoning
  Experiment 1                   45.79% (185)  44.1% (178)   4.46% (18)   5.69% (23)
  Time pressure                  49% (175)     40.1% (143)   5.6% (20)    5.3% (19)
  Load                           54.6% (185)   35.4% (120)   4.1% (14)    5.9% (20)
  Time pressure + Load           56.8% (219)   35.5% (137)   3.1% (12)    4.7% (18)

The results of Experiments 2-4 are very clear: we observe essentially the same pattern as in
Experiment 1. Although we used three different methods to minimize the impact of type 2
thinking at the first response stage, the key finding that challenged DI theory in Experiment 1,
a considerable proportion of 11 responses, is still observed. Indeed, if anything, the proportion
of 11 responses tended to be slightly higher in Experiments 2-4 than in Experiment 1. This
strongly suggests that the initial correct logical responses identified in this set of experiments
result from purely intuitive, type 1 processing.

General discussion

In this study, we examined the standard time course assumption of classic DI theory.
DI theory suggests that people produce a heuristic-based type 1 response by default, and may
then override the heuristic answer and produce a logic-based type 2 response. Hence, a key
claim is that logical responding must originate from slow, deliberative type 2 processing. Our
results did not fully support this expected time course pattern. In four experiments, we found
evidence for the existence of intuitively generated logical responses. Our direction of change
analysis revealed that 11 responses occurred very frequently, and people rarely changed their
initial answer. Answers in the 11 and 00 categories were given
very quickly both at time 1 and time 2, with relatively high confidence. The rare 01
responses were given quickly but with low confidence at time 1. The final response in this 01
category was given more slowly, also with relatively low confidence. The existence of 00 and 01
responses was expected under the classic dual process view. However, the robust 11 responses
are problematic for classic DI theory. How might one interpret these results?
One possible explanation is offered by the logical intuition model (De Neys, 2012,
2014). This account suggests that people intuitively detect the conflict between heuristic
responses and standard logical principles. The original idea is that conflict is caused by two
simultaneously activated type 1 responses, one based on normative rules, the other based
on heuristic cues. Furthermore, De Neys (2014) suggested that this theory does not entail that
the two type 1 responses are similar in strength. More specifically, the idea is that people
are biased because their belief-based heuristic response is more salient or has a higher
activation level (i.e., is "stronger") than the intuitive logical response. The results of this study
can be interpreted in light of this logical intuition theory: it is possible that for 11
responses the logic-based type 1 response gained more strength, while in the case of 00
responses the heuristic-based response was stronger. That is, different individuals might
differ in the relative strength of the two types of intuitions. Although everyone generates both
intuitions, some people will have a stronger heuristic intuition (00 cases) and others will
have a stronger logical intuition (11 cases). Evidence consistent with this account is that these
answers were equally quick, and people were similarly highly confident in their responses.
Interestingly, Thompson and Johnson (2014) previously found that IQ was correlated
with normative responding at time 1. This could imply, for example, that for high capacity
reasoners the logical type 1 response is stronger, whereas for low capacity reasoners the
heuristic-based response is stronger. This is somewhat supported by our individual
level analysis, which found that most participants could be categorized into one
direction of change category and were very consistent in their answers
across items. However, more research will be required to reveal other possible psychological
factors responsible for individual differences in the strength of the different
type 1 answers. Similarly, the "strength" of a response is an underspecified concept at the
moment; more theorizing will be necessary to develop this model.
Another important aspect of this research is that we used various research
designs to ensure the validity of the two response paradigm. A limitation of the original two
response paradigm was that researchers could not be sure that there was no type 2 engagement
during time 1. By using cognitive load and time pressure manipulations, we virtually
eliminated the possibility of type 2 engagement at the initial response stage. Note, however,
that ultimately one can never be sure that a first response is really intuitive, given that we do
not know how much time (or cognitive resources) is specifically required to produce a type 1
response. But at the very least, one can be assured that the probability of type 2 engagement at
time 1 was strongly reduced by the above-mentioned manipulations.
One might wonder how these findings could help us revise the DI dual
process model. As we noted above, one possibility is to adopt De Neys' proposal
that at the beginning of the reasoning process two type 1 responses are generated, which
may then be inhibited and overridden by type 2 processing. The present findings validate the
existence of an intuitive logical response. However, it should be clear that this model leaves
many questions unanswered. For example, how does the inhibition process work? Is it equally
easy to inhibit a logic-based and a heuristic-based type 1 response (i.e., do people need to
block the logical intuitive response to give the heuristic response)? Likewise, as we already
noted, how do we operationalize and directly measure the "strength" of different competing
intuitions?
Clearly a lot of future work will be needed to answer these fundamental questions.
Appendix 1. Base rate problems
This study contains scientists and assistants.
Person 'C' is intelligent.
There are 4 scientists and 996 assistants.
(No-conflict)
This study contains lawyers and gardeners.
Person 'W' is argumentative.
There are 3 lawyers and 997 gardeners.
(Conflict)
This study contains clowns and accountants.
Person 'L' is funny.
There are 995 clowns and 5 accountants.
(No-conflict)
This study contains high school students and librarians.
Person 'M' is loud.
There are 995 high school students and 5 librarians.
(Conflict)
This study contains lab technicians and aerobics instructors.
Person 'D' is active.
There are 5 lab technicians and 995 aerobics instructors.
(No-conflict)
This study contains I.T. technicians and boxers.
Person 'F' is strong.
There are 997 I.T. technicians and 3 boxers.
(Conflict)
This study contains nurses and artists.
Person 'S' is creative.
There are 3 nurses and 997 artists.
(No-conflict)
This study contains businessmen and firemen.
Person 'K' is brave.
There are 996 businessmen and 4 firemen.
(Conflict)
Appendix 2. Syllogistic reasoning problems
Type "A" questionnaire and Type "B" questionnaire. (The two questionnaire columns are interleaved below: each problem from questionnaire A is followed by its counterpart from questionnaire B.)
All flowers need light
Roses are flowers
Roses need light
(No-conflict: Valid/Believable)
All flowers need light
Roses need light
Roses are flowers
(Conflict: Invalid/Believable)
All things made of wood can be used as fuel
Trees can be used as fuel
Trees are made of wood
(Conflict: Invalid/Believable)
All things made of wood can be used as fuel
Trees are made of wood
Trees can be used as fuel
(No-conflict: Valid/Believable)
All mammals can walk
Spiders can walk
Spiders are mammals
(No-conflict: Invalid/Unbelievable)
All mammals can walk
Whales are mammals
Whales can walk
(Conflict: Valid/Unbelievable)
All vehicles have wheels
Boats are vehicles
Boats have wheels
(Conflict: Valid/Unbelievable)
All vehicles have wheels
Trolley suitcases have wheels
Trolley suitcases are vehicles
(No-conflict: Invalid/Unbelievable)
All birds have wings
Crows are birds
Crows have wings
(No-conflict: Valid/Believable)
All birds have wings
Crows have wings
Crows are birds
(Conflict: Invalid/Believable)
All cannons fire bullets
Water cannons are cannons
Water cannons fire bullets
(Conflict: Valid/Unbelievable)
All cannons fire bullets
Guns fire bullets
Guns are cannons
(No-conflict: Invalid/Unbelievable)
All flowering plants have leafs
Bracken has leafs
Bracken is a flowering plant
(No-conflict: Invalid/Unbelievable)
All flowering plants have leafs
Cacti are flowering plants
Cacti have leafs
(Conflict: Valid/Unbelievable)
All dogs have snouts
Labradors have snouts
Labradors are dogs
(Conflict: Invalid/Believable)
All dogs have snouts
Labradors are dogs
Labradors have snouts
(No-conflict: Valid/Believable)
Appendix 3. Table 8. Mean confidence ratings and standard deviations for Experiment 1.

                          First response       Second response
                          Mean     SD          Mean     SD
Base rate       C-C       87.43    21.32       91.17    16.38
                I-I       86.21    23.98       85.54    25.46
                C-I       76.54    24.22       77.91    30.72
                I-C       69.19    24.74       73.97    27.39
Syllogistic     C-C       90.64    18.43       93.59    16.71
reasoning       I-I       91.69    20.36       92.67    19.08
                C-I       73.29    27.86       82.71    28.35
                I-C       79.74    26.07       76.52    33.06
Table 9. Means and standard deviations of reaction times by task, response number, and direction of change
category. Only the conflict items were analysed. Back-transformed response times (in seconds) are shown in
parentheses.

                          First response                   Second response
                          Mean           SD                Mean           SD
Base rate       11        0.535 (3.43)   0.279 (1.9)       0.467 (2.93)   0.364 (2.31)
                00        0.383 (2.42)   0.275 (1.88)      0.394 (2.48)   0.279 (1.9)
                10        0.45 (2.82)    0.376 (2.38)      0.384 (2.42)   0.259 (1.82)
                01        0.38 (2.4)     0.327 (2.12)      0.596 (3.94)   0.392 (2.47)
Syllogistic     11        0.431 (2.7)    0.262 (1.828)     0.432 (2.7)    0.32 (2.09)
reasoning       00        0.388 (2.44)   0.282 (1.914)     0.397 (2.39)   0.338 (2.18)
                10        0.732 (5.4)    0.413 (2.588)     0.44 (2.75)    0.483 (3.04)
                01        0.404 (2.535)  0.262 (1.828)     0.705 (5.07)   0.429 (2.69)
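As a reading aid for Table 9: the transform applied to the reaction times is not restated here, but the back-transformed values in parentheses are consistent with a base-10 logarithm (e.g., 10 raised to 0.535 is about 3.43 s). Assuming that transform (our inference, not a statement from the analysis code), the conversion is a one-liner; the function name is illustrative.

```python
# Assumed back-transformation for Table 9: the tabled means appear to be
# log10-transformed reaction times (this is our inference from the values).
def back_transform(log_rt: float) -> float:
    """Convert a mean log10 reaction time back to seconds, rounded as in the table."""
    return round(10 ** log_rt, 2)

print(back_transform(0.535))  # 3.43, matching the base rate '11' first-response mean
print(back_transform(0.467))  # 2.93, matching the base rate '11' second-response mean
```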
References

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed
random effects for subjects and items. Journal of Memory and Language, 59(4), 390–
412.
Banks, A. P., & Hope, C. (2014). Heuristic and analytic processes in reasoning: An event-
related potential study of belief bias. Psychophysiology, 51(3), 290–297.
De Neys, W. (2006). Automatic–heuristic and executive–analytic processing during
reasoning: Chronometric and dual-task considerations. The Quarterly Journal of
Experimental Psychology, 59(6), 1070–1100.
De Neys, W. (2012). Bias and conflict: A case for logical intuitions. Perspectives on
Psychological Science, 7(1), 28–38.
De Neys, W. (2014). Conflict detection, dual processes, and logical intuitions: Some
clarifications. Thinking & Reasoning, 20(2), 169–187.
De Neys, W. (2015). Heuristic Bias and Conflict Detection During Thinking. Psychology of
Learning and Motivation.
De Neys, W., Cromheeke, S., & Osman, M. (2011). Biased but in doubt: Conflict and
decision confidence. PloS One, 6(1), e15954.
De Neys, W., & Glumicic, T. (2008). Conflict monitoring in dual process theories of thinking.
Cognition, 106(3), 1248–1299.
De Neys, W., Moyens, E., & Vansteenwegen, D. (2010). Feeling we’re biased: Autonomic
arousal and reasoning conflict. Cognitive, Affective, & Behavioral Neuroscience,
10(2), 208–216.
De Neys, W., Rossi, S., & Houdé, O. (2013). Bats, balls, and substitution sensitivity:
Cognitive misers are no happy fools. Psychonomic Bulletin & Review, 20(2), 269–273.
De Neys, W., & Schaeken, W. (2007). When people are more logical under cognitive load.
Experimental Psychology (formerly Zeitschrift Für Experimentelle Psychologie),
54(2), 128–133.
Evans, J. S. B., & Curtis-Holmes, J. (2005). Rapid responding increases belief bias: Evidence
for the dual-process theory of reasoning. Thinking & Reasoning, 11(4), 382–389.
Evans, J. S. B., & Stanovich, K. E. (2013). Dual-process theories of higher cognition
advancing the debate. Perspectives on Psychological Science, 8(3), 223–241.
Franssens, S., & De Neys, W. (2009). The effortless nature of conflict detection during
thinking. Thinking & Reasoning, 15(2), 105–128.
Gilovich, T., Griffin, D. W., & Kahneman, D. (2002). Heuristics and biases: The psychology
of intuitive judgment. Cambridge University Press.
Handley, S. J., Newstead, S. E., & Trippas, D. (2011). Logic, beliefs, and instruction: A test
of the default interventionist account of belief bias. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 37(1), 28.
Kahneman, D. (2011). Thinking, fast and slow. Macmillan.
Klauer, K. C., & Singmann, H. (2013). Does logic feel good? Testing for intuitive detection
of logicality in syllogistic reasoning. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 39(4), 1265.
Mata, A., Schubert, A.-L., & Ferreira, M. B. (2014). The role of language comprehension in
reasoning: How “good-enough” representations induce biases. Cognition, 133(2),
457–463.
Miyake, A., Friedman, N. P., Rettinger, D. A., Shah, P., & Hegarty, M. (2001). How are
visuospatial working memory, executive functioning, and spatial abilities related? A
latent-variable analysis. Journal of Experimental Psychology: General, 130(4), 621.
Pennycook, G., Cheyne, J. A., Barr, N., Koehler, D. J., & Fugelsang, J. A. (2014). Cognitive
style and religiosity: The role of conflict detection. Memory & Cognition, 42(1), 1–10.
Pennycook, G., Fugelsang, J. A., & Koehler, D. J. (2012). Are we good at detecting conflict
during reasoning? Cognition, 124(1), 101–106.
Pennycook, G., Trippas, D., Handley, S. J., & Thompson, V. A. (2014). Base rates: Both
neglected and intuitive. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 40(2), 544.
Pinheiro, J., Bates, D., DebRoy, S., & Sarkar, D. (2015). nlme: Linear and nonlinear mixed
effects models. R package.
Singmann, H., Klauer, K. C., & Kellen, D. (2014). Intuitive logic revisited: new data and a
Bayesian mixed model meta-analysis. PloS One, 9(4), e94223.
Stanovich, K. E., & Toplak, M. E. (2012). Defining features versus incidental correlates of
Type 1 and Type 2 processing. Mind & Society, 11(1), 3–13.
Stanovich, K. E., & West, R. F. (2000). Advancing the rationality debate. Behavioral and
Brain Sciences, 23(05), 701–717.
Thompson, V. A., & Johnson, S. C. (2014). Conflict, metacognition, and analytic thinking.
Thinking & Reasoning, 20(2), 215–244.
Thompson, V. A., Prowse Turner, J. A., & Pennycook, G. (2011). Intuition, reason, and
metacognition. Cognitive Psychology, 63(3), 107–140.
Villejoubert, G. (2009). Are representativeness judgments automatic and rapid? The effect of
time pressure on the conjunction fallacy. In Proceedings of the Annual Meeting of the
Cognitive Science Society (Vol. 30, pp. 2980–2985). Cognitive Science Society.