
Psychological Review  VOLUME 85  NUMBER 2  MARCH 1978

A Theory of Memory Retrieval

Roger Ratcliff
University of Toronto, Ontario, Canada

A theory of memory retrieval is developed and is shown to apply over a range of experimental paradigms. Access to memory traces is viewed in terms of a resonance metaphor. The probe item evokes the search set on the basis of probe-memory item relatedness, just as a ringing tuning fork evokes sympathetic vibrations in other tuning forks. Evidence is accumulated in parallel from each probe-memory item comparison, and each comparison is modeled by a continuous random walk process. In item recognition, the decision process is self-terminating on matching comparisons and exhaustive on nonmatching comparisons. The mathematical model produces predictions about accuracy, mean reaction time, error latency, and reaction time distributions that are in good accord with experimental data. The theory is applied to four item recognition paradigms (Sternberg, prememorized list, study-test, and continuous) and to speed-accuracy paradigms; results are found to provide a basis for comparison of these paradigms. It is noted that neural network models can be interfaced to the retrieval theory with little difficulty and that semantic memory models may benefit from such a retrieval scheme.

At the present time, one of the major deficiencies in cognitive psychology is the lack of explicit theories that encompass more than a single experimental paradigm. The lack of such theories and some of the unfortunate consequences have been discussed recently by Allport (1975) and Newell (1973). Two important points are made by Newell: First, research in cognitive psychology is motivated

This research was supported by Research Grants APA 146 from the National Research Council of Canada and OMHF 164 from the Ontario Mental Health Foundation to Bennet B. Murdock, Jr. I would like to thank Ben Murdock for his support, help, and criticism and the use of data from his published and unpublished experiments. I would also like to thank the following: Ron Okada and David Burrows for use of their data; Peter Liepa for programming Experiment 2; David Andrews for help with the mathematical development; Gail McKoon, Bill Hockley, and Bob Lockhart for useful discussion; and Howard Kaplan for general programming assistance.

Requests for reprints should be sent to Roger Ratcliff, who is now at the Department of Psychology, Dartmouth College, Hanover, New Hampshire 03755.

and guided by phenomena (e.g., those of the Sternberg paradigm and the Brown-Peterson paradigm). Second, attempts to theorize result in the construction of binary oppositions (e.g., serial vs. parallel, storage vs. retrieval, and semantic vs. episodic). Thus, an appreciable portion of research in cognitive psychology can be described as testing binary oppositions within the framework of a particular experimental paradigm. This style of research does not emphasize or encourage theory construction; but without theory, it is almost impossible to relate experimental paradigms or to substantiate claims that the same processes underlie different experimental paradigms. Furthermore, the concern with binary oppositions tends to obscure the more interesting aspects of data, such as the form of functional relations.

The theory presented in the present article is concerned with providing an account of processes underlying retrieval from memory. Perhaps the most intensively investigated class

Copyright 1978 by the American Psychological Association, Inc. All rights of reproduction in any form reserved.

59


60 ROGER RATCLIFF

of memory retrieval paradigms has been the item recognition paradigms, for example, the Sternberg paradigm (Sternberg, 1969b), the study-test paradigm (Murdock & Anderson, 1975), the continuous recognition memory paradigm (Okada, 1971), and the prememorized list paradigm (Burrows & Okada, 1975). In this area of research, several models have been developed to deal with results from each paradigm, but each model is inconsistent with some data (Ratcliff & Murdock, 1976; Sternberg, 1975). Also, the models are paradigm specific and do not generalize across paradigms. The theory presented here has been designed to deal with all aspects of the data in each paradigm (accuracy, response latency, and latency distributions) and has been designed to apply over a range of paradigms. In the first section, a qualitative account of the theory is presented; in the second section, the mathematical model is introduced and briefly described. The article is written so that an understanding of the mathematics is not necessary for comprehension of the theory. A complete development of the mathematical model is presented in the Appendix. In the third section, the theory is applied to five experimental paradigms, in which both response latency and accuracy are measured, and the same set of processing assumptions is used to account for performance in all of the paradigms. The fourth section relates the theory to neural network models, semantic memory models, and propositional models.

A Description of the Theory

In this section, I present an overview of what I term the retrieval theory. I have not presented any mathematical development in this section in order to give a qualitative account of the theory, which can be understood independently of the mathematical formulation.

Let us consider a typical item recognition task in which a group of items in memory has been designated the memory search set and a single probe item is presented for testing. According to the theory, the probe is encoded and then compared with each item in the search set simultaneously (i.e., in parallel). Each individual comparison is assumed to be accomplished by a random walk (more specifically, a diffusion) process. A decision is made

when any one of the parallel comparisons terminates with a match (self-terminating on positives) or when all of the comparisons terminate with a nonmatch (exhaustive on negatives). When a decision has been made, a response is initiated and performed. Figure 1 illustrates the overall retrieval scheme.

Search or Evoked Set

In some models developed to account for item recognition data, it is assumed that the only memory items accessed in the comparison process are the items in the experimenter-designated search set, that is, the positive set (Schneider & Shiffrin, 1977; Sternberg, 1966). There is a problem with such models because it can be shown that items outside this positive set may be accessed. For example, Atkinson, Herrmann, and Wescourt (1974) report two experiments in which the recency of negative probes was manipulated. The first used a Sternberg (1966) procedure. Subjects were presented with a memory set of either 2, 3, 4, or 5 words, which was followed by a probe word. It was found that latency of response to a negative probe word was greater and its accuracy lower the more recently it had occurred. The second experiment used a 25-word study list and a 100-word test list with no repeated negatives. After 50 test words, subjects were required to read written instructions; 10 words in the instructions served as negative items in the remainder of the test list. Latencies for the negative items that had been in the instruction set were significantly longer than latencies for other negative items. Therefore, to explain the processes involved in item recognition, a model must include some mechanism that allows items from outside the experimenter-designated search set to be accessed in the comparison process.

I will adopt the notion of search set suggested by a resonance metaphor (Neisser, 1967, p. 65). Suppose each item in memory corresponds to a tuning fork and the informational basis on which comparisons are made corresponds to frequency. Then, the comparison process can be viewed as the probe tuning fork ringing and evoking sympathetic vibrations in memory tuning forks of similar frequencies. Whether or not the probe evokes sympathetic vibrations in a memory-set item depends only


[Figure 1 flowchart: probe input → evocation of the search set → evoked set (potentially very large) → parallel comparison (diffusion processes; upper boundary = match, lower boundary = nonmatch) → decision process (self-terminating on match, OR gate; exhaustive on nonmatch, AND gate) → response output.]

Figure 1. An overview of the item recognition model.

on the similarity of frequencies and is independent of whether or not the memory item is in the experimenter-designated memory set. The amplitude of resonance drives the random walk comparison process: The greater the amplitude (better match), the more bias there is toward a "yes" response; the smaller the amplitude (poorer match), the more bias toward a "no" response (see Figure 2).

So far, I have not indicated how many memory items can be contacted by the probe. It turns out that predicted performance will be little affected even if the number of items contacted is large, perhaps comparable to the number of items in memory. The reason is that "no" responses are based on exhaustive processing of items in the search set, so the decision time for a "no" response is a function of the group of slowest individual nonmatch comparisons. Thus, many fast finishing processes may enter the decision process without affecting performance.

I will designate items giving rise to this slower group of latency- and accuracy-determining processes as the search set. In later applications, the search set will usually be approximated by the memory set. A demonstration that the evoked set (defined as all items contacted by the probe) can theoretically be very large is presented later in this section. There, it is shown that many probe-item comparisons with very small resonant amplitudes can be added to the few performance-determining processes (probe-nontarget item comparisons with larger resonances) without changing latency or accuracy.

[Figure 2 schematic: diffusion (random walk) processes running from a starting point toward a match boundary (above) or a nonmatch boundary (below). 1. Close probe-item match: fast terminating process. 2. Moderate match: slow terminating process. 3. Moderate nonmatch: slow terminating process. 4. Extreme nonmatch: fast terminating process.]

Figure 2. The relation between relatedness and amount of match in the diffusion (random walk) process. (Relatedness varies from high [Process 1] to low [Process 4].)

It is worth noting at this point that the retrieval theory outlined in Figure 1 in conjunction with the resonance metaphor can be considered an implementation (for item recognition) of content-addressable memory models or broadcast models (Bobrow, 1975). The notion of resonance also suggests similarities to the pandemonium model (discussed in Neisser, 1967) for feature encoding.

Information Entering the Comparison Process

Here I examine in more detail factors that influence the "amplitude of resonance." Seward (1928, cited in Woodworth, 1938) performed an experiment in which subjects were presented with 30 patterns ("fancy papers"). Following 10 minutes of unrelated activity, a yes-no recognition test was given in which 10 test items were identical to the study items, 10 rather similar, 10 slightly similar, and 10 very different. It was found that as similarity decreased, the proportion of "yes" responses decreased and latency of "yes" responses increased. Similarly, as similarity decreased, the proportion of "no" responses increased and latency of "no" responses decreased.

More recently, Juola, Fischler, Wood, and Atkinson (1971) performed an experiment with words as stimulus and probe items. Stimulus words were memorized to a criterion of perfect recall, and test words were presented sequentially at a fixed rate. There were three types of negative items: homophones and synonyms of target words and neutral words. It was found that synonyms were 60 msec slower than neutral words, and homophones were 120 msec slower; although when homophones were broken down into two types, visually similar and visually dissimilar to target items, differences were 200 msec and 40 msec, respectively. These results show that semantic, visual, and phonemic similarities between probe and memory items affect recognition performance.

Schulman (1970) studied a complementary task, that is, one requiring the subject to make the judgment as to whether a probe word was identical to, a homonym of, or a synonym of a presented word. Ten words were presented to the subject (Sternberg varied-set procedure), the judgment condition was cued, and the probe word was presented. Accuracy, as measured by d', and latency behaved in much the same way: d' was highest and latency shortest for the identical condition, d' was lower and latency longer for the homonym condition, and d' was lowest and latency longest for the synonym condition.

Some similarity effects can be quite unexpected. For example, Morin, DeRosa, and Stultz (1967) and Marcel (1977) have shown that the more numerically remote a negative probe is from the memory set, the faster the response (e.g., Probe 1 is faster than Probe 6 to Memory Set 798). Thus, numerical relatedness can enter the recognition process.

All of these results suggest that similarity between probe and memory-set items is a major determinant of recognition performance, and that performance is best conceived in terms of discriminability between memory-set items and distractors.

Further, the fact that so many factors influence item recognition performance raises the question, How do we conceptualize the structure of the memory trace? Tulving and Bower (1974) have suggested that the most generally accepted conception of the trace is that of a collection of features or a bundle of information. In addition, Tulving and Bower describe methods that have been used to study the features or combinations of information in the trace; some of the methods are similar to those described earlier in this section. However, they conclude by stressing that inferences about trace structure can only be drawn in the context of a process model. The theory presented here specifies a process model in great detail but does little more than describe the memory trace as a bundle of information while allowing some quantitative assessment of the extent to which certain (possibly featural) information is represented in the memory trace.

It seems that many qualitatively different types of information are contained in the memory trace. In order to determine how well


a probe matches an item in the memory set, the interaction of the trace and probe information must be assessed. I will use the single term relatedness to describe the amount of match or size of resonance. Thus, in Figure 2, Processes 1 to 4 have relatedness varying from high to low. Relatedness will be assumed to vary over items because whenever a group of nominally equivalent items is memorized, some items are better remembered than others (cf. strength theory and attribute theory; Murdock, 1974). In quantification of the mathematical model, relatedness has variance (a normal distribution of relatedness is assumed), which allows the calculation of a d' measure of discriminability. Variability in relatedness is shown later to be necessary to ensure that asymptotic d' is not infinite. It is possible to assess the extent to which different kinds of information contribute to the memory trace, for example, by calculating d' discriminability values for different kinds of negative items in the experiment by Juola et al. (1971) described earlier.
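Given normal relatedness distributions, the discriminability measure can be written down directly. Assuming match relatedness mean u, nonmatch mean v, and a common relatedness standard deviation η (the symbols used with Figures 3 and 6 later in the article), the asymptotic measure is

```latex
d' = \frac{u - v}{\eta}
```

Note that if η were zero (no variability in relatedness), this ratio would be unbounded, which is why variability in relatedness is needed to keep asymptotic discriminability finite.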

So far, I have made two main points. First, the resonance metaphor is used in order to emphasize that we are dealing with a parallel interaction between the probe and the representation of the memory-set items. The retrieval scheme suggested by the resonance metaphor allows performance to be affected by items outside the memory set, although this feature will not be used here in any of the applications of the mathematical model and is included only for sufficiency. Second, all trace information is mapped onto a unidimensional variable, that of relatedness, and relatedness is the dimension on which discriminability between positive and negative items is assessed.

Comparison Process

Comparison of a probe to a memory-set item proceeds by the gradual accumulation of evidence, that is, information representing the goodness of match, over time. It is easiest to conceptualize the comparison process as a feature-matching process in which probe and memory-set item features are matched one by one. A count is kept of the combined sum of the number of feature matches and nonmatches, so that for a feature match, a counter is incremented, and for a feature nonmatch, the

counter is decremented. The counter begins at some starting value z, and if a total of a counts is reached, the probe is declared to match the memory-set item (a − z more feature matches than nonmatches). But if a total of zero counts is reached, an item nonmatch is declared. This process is called a random walk (see top panel of Figure 3).

Relatedness corresponds to the ratio of the number of matching features to the total number of features. Two probe-memory-set item comparisons involving identical relatedness may still have different comparison times. Suppose that in one process, the features are ordered so that the matching features are processed first, whereas in the second process, nonmatching comparisons are carried out first. Then, if the number of feature matches required for an item match is sufficiently less than the total number of features, the two comparison processes will give quite different comparison times (the first fast and the second slow) or perhaps will give different results (item match and nonmatch, respectively). In general, there is variation in comparison time, depending on order of feature comparisons and the distribution of matching and nonmatching features within that order. Therefore, two sources of variance have now been identified: variance in relatedness and variance in the comparison process.
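The counter scheme above can be sketched as a small simulation. The boundary values (a = 20 counts, starting value z = 10) and the feature-match probabilities are illustrative assumptions, not fitted parameters from the article.

```python
import random

def feature_match_walk(p_match, a=20, z=10, rng=None):
    """Random walk comparison: the counter starts at z, moves +1 on a
    feature match (probability p_match) and -1 on a feature nonmatch,
    and is absorbed at a (item match) or at 0 (item nonmatch)."""
    rng = rng or random.Random()
    count, steps = z, 0
    while 0 < count < a:
        count += 1 if rng.random() < p_match else -1
        steps += 1
    return ("match" if count == a else "nonmatch"), steps

# High relatedness (mostly matching features) should be absorbed at the
# match boundary; low relatedness at the nonmatch boundary.
rng = random.Random(1)
close = [feature_match_walk(0.8, rng=rng) for _ in range(500)]
far = [feature_match_walk(0.2, rng=rng) for _ in range(500)]
p_match_close = sum(d == "match" for d, _ in close) / 500
p_match_far = sum(d == "match" for d, _ in far) / 500
```

Because the step outcomes are random, two walks with the same p_match (same relatedness) still take different numbers of steps, which is the within-comparison variance described above.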

Diffusion Process

In the mathematical formulation of the theory, the comparison process is represented by the diffusion process, the continuous version of the random walk in both temporal and spatial coordinates. The diffusion process has the advantage that the mean rate of accumulation of information and the variance in that rate are independent parameters. In the feature-matching model, the process is essentially binomial, and mean and variance of a binomial are correlated; whereas the diffusion process is essentially normal, and mean and variance are independent. The third panel of Figure 3 illustrates the diffusion process.

[Figure 3 schematic, three panels. Top: the random walk process between a match boundary and a nonmatch boundary, with the distribution of the number of steps to the match boundary (hits) and the distribution of the number of steps to the nonmatch boundary (misses). Middle: relatedness distributions for target items (signal) and nontarget items (noise), each with variance η², plotted on the relatedness dimension. Bottom: the diffusion process for target items with relatedness distribution mean u and variance η², showing the drift parameter, the variance of drift s², and the distributions of first passages to the match boundary (hits) and to the nonmatch boundary (misses).]

Figure 3. An illustration of the random walk and diffusion process, together with relatedness distributions that drive the diffusion process.

Diffusion and relatedness. The critical assumption made is that drift rate in the diffusion process (in Figure 3, average distance traveled vertically per unit time) is equal to the relatedness value. Thus, any particular probe-memory-set item comparison has drift equal to the relatedness value of that comparison, so that, for example, the greater the probe-memory-set item relatedness, the faster the match boundary is reached. The second panel of Figure 3 shows two distributions of relatedness, one for probe-target item comparisons and one for probe-nontarget item comparisons.
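A single comparison can be sketched as a simulated diffusion process whose drift is a sampled relatedness value. The boundary values a = .12 and z = .03 echo values quoted with Table 1; the within-trial noise s, the time step, and the relatedness mean and standard deviation are illustrative assumptions, not the article's fitted parameters.

```python
import random
import statistics

def diffusion_trial(u, eta, s=0.1, a=0.12, z=0.03, dt=0.001, rng=None):
    """One probe-item comparison. Drift is a relatedness value sampled
    from Normal(u, eta); evidence starts at z and diffuses until it
    crosses a (match) or 0 (nonmatch). Returns (matched, decision_time)."""
    rng = rng or random.Random()
    drift = rng.gauss(u, eta)            # relatedness varies over items
    x, t, noise_sd = z, 0.0, s * dt ** 0.5
    while 0 < x < a:
        x += drift * dt + rng.gauss(0.0, noise_sd)
        t += dt
    return x >= a, t

rng = random.Random(2)
trials = [diffusion_trial(u=0.3, eta=0.1, rng=rng) for _ in range(300)]
hit_rate = sum(matched for matched, _ in trials) / len(trials)
hit_times = [t for matched, t in trials if matched]
```

With a positive mean drift most trials terminate at the match boundary, and because drift varies from trial to trial (the relatedness distribution), both the outcome and the decision time are random variables, as required for the accuracy and latency predictions that follow.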

Parameters for the comparison process. Figure 3 shows the six parameters of the comparison process. Over the five paradigms examined, variance in relatedness (η²) and variance of drift in the diffusion process (s²) were kept constant. Thus, there were four free comparison process parameters to be fitted to data: relatedness parameters u and v and diffusion process boundary parameters z and a. These, plus an encoding and response stage parameter TER, summarize performance in item recognition experiments.

Variable criteria. For any particular item recognition task, there are three variable criteria to be set by the subject: the zero point on the relatedness dimension (corresponding to β in signal detection theory) and the two boundary positions in the diffusion process. Note that an item with relatedness zero, that is, at the relatedness criterion, will have an average rate of accumulation of evidence that leads to no systematic drift toward either the match or nonmatch boundaries. Changes in boundary positions are the basis of speed-accuracy trade-off and are discussed in a later section.

Reaction time distributions. One feature that distinguishes the retrieval theory from most other models in this area of research is that reaction time distribution shapes are predicted and fitted. Figure 4 illustrates how the normal distribution of relatedness with mean u maps into a skewed reaction time distribution. Also shown in Figure 4 is an illustration of the way in which the distribution changes as u changes: When u decreases, the fastest responses slow a little, and there is a large elongation of the tail of the distribution. Thus, a prediction of the model is that if relatedness decreases, the mean and mode of the distribution will diverge. In general, the mathematical model will be manipulated to yield expressions for reaction time distributions and accuracy for both hits and correct rejections that can then be fitted to data.

Further aspects of the comparison process are discussed later in relation to speed-accuracy trade-off and in relation to other models of the comparison process.

Decision Process

The decision process can be viewed as a combination stage in which the products of the many comparisons are combined to produce either a "yes" or "no" response. The decision process is self-terminating on a match, so that if enough evidence is accumulated in just one comparison process, a "yes" response is produced. In contrast, the decision process is exhaustive on nonmatch comparisons, so that a "no" response is produced only when all comparison processes terminate in nonmatches.

[Figure 4 schematic: reaction time distributions produced by mapping the relatedness distribution through the diffusion process, plotted against time.]

Figure 4. A geometrical illustration of the mapping from a normal relatedness distribution to a skewed reaction time distribution (with variance in drift s² = 0). (Note that as relatedness decreases, the distribution tail skews out. a represents the distance between the bottom and top boundaries of the diffusion process, z represents the distance between the bottom boundary and the starting point, and u represents the mean of the normal relatedness distribution.)

It has already been noted that the finishing time distribution for a single comparison is similar to observed reaction time distributions. It turns out that the distribution of finishing times of the maximum of a number of diffusion processes (each with a skewed finishing time distribution) is again shaped very much like observed reaction time distributions. Thus, the mathematical model produces acceptable shapes for both positive and negative response time distributions.

In most item recognition studies, "no" responses are almost as fast as "yes" responses, and sometimes this finding has provided problems for certain models (see Collins & Quillian, 1970; Smith, Shoben, & Rips, 1974). This is not a problem for the retrieval theory, although it may be hard to see how "no" responses, which are based on exhaustive processing of the search set, can be fast enough. In fits of the mathematical model, it is found that the separation of the starting point and nonmatch boundary (z) is smaller than the separation of the starting point and match boundary (a − z). Therefore, it is even possible to have "no" responses much faster than "yes" responses (as found by Rips, Shoben, & Smith, 1973).

Table 1
Accuracy and Latency as a Function of the Number of Extra Fast Finishing Processes for Correct Rejections, with v = −.4, a = .12, z = .03, and Search Set Size 16

No. extra processes m    No. standard deviations r         Proportion    Mean reaction
(evoked set = m + 16)    (of the m-process distribution)   correct       time (in msec)
                         below the nontarget distribution
      0                      0                                .839           358
    100                      2                                .838           363
    500                      3                                .838           362
  5,000                      4                                .838           358
 25,000                      4.75                             .838           363
100,000                      5.5                              .835           358
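The OR/AND decision rule, together with the asymmetry z < a − z, can be sketched in simulation. Only a = .12 and z = .03 echo values quoted with Table 1; the drift values and set size below are illustrative assumptions.

```python
import random
import statistics

def comparison(drift, s=0.1, a=0.12, z=0.03, dt=0.001, rng=None):
    """Single diffusion comparison from z to a (match) or 0 (nonmatch)."""
    rng = rng or random.Random()
    x, t, noise_sd = z, 0.0, s * dt ** 0.5
    while 0 < x < a:
        x += drift * dt + rng.gauss(0.0, noise_sd)
        t += dt
    return x >= a, t

def recognize(drifts, rng):
    """Self-terminating on matches (OR gate), exhaustive on nonmatches
    (AND gate): "yes" at the first match, "no" only after every
    comparison has terminated at the nonmatch boundary."""
    results = [comparison(d, rng=rng) for d in drifts]
    match_times = [t for matched, t in results if matched]
    if match_times:
        return "yes", min(match_times)
    return "no", max(t for _, t in results)

rng = random.Random(3)
# Positive probes: one target among three nontargets; negative probes:
# four nontargets. Because z < a - z, each nonmatch termination is fast,
# so even the exhaustive "no" need not be slow.
yes_rts = [t for r, t in (recognize([0.3, -0.3, -0.3, -0.3], rng)
                          for _ in range(200)) if r == "yes"]
no_rts = [t for r, t in (recognize([-0.3] * 4, rng)
                         for _ in range(200)) if r == "no"]
```

The short distance from the starting point to the nonmatch boundary is what lets the maximum of several nonmatch finishing times remain comparable to, or even smaller than, a single match finishing time.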

Resonance Metaphor

It was argued earlier that the search set could be made very large, but that reaction time and error rates would still be determined only by the slower comparison processes. Suppose that in a typical fit to data from the study-test paradigm, m extra processes are added to the 16 probe-memory-search-set comparison processes. Table 1 shows accuracy and reaction time as a function of m and r, where r is the number of standard deviations the extra-process distribution is placed below the nontarget item distribution. Note that the number of extra processes is adjusted so that the change in latency and accuracy is small for the particular value of r used; thus, more processes than m would produce significant changes in accuracy and latency. The results in Table 1 show that many probe-item comparisons with small relatedness may enter the decision process without affecting accuracy or latency. These results suggest that the resonance metaphor has the right sort of properties to be useful in theorizing about item recognition processes.
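Table 1's point can be illustrated in a small simulation: nonmatch comparisons with relatedness far below the nontarget distribution finish quickly, so adding many of them barely moves the exhaustive maximum. The drift values and the number of extras below are illustrative assumptions, not the Table 1 fits.

```python
import random
import statistics

def nonmatch_time(drift, s=0.1, a=0.12, z=0.03, dt=0.001, rng=None):
    """Time for one diffusion comparison to terminate (almost always at
    the nonmatch boundary for the negative drifts used here)."""
    rng = rng or random.Random()
    x, t, noise_sd = z, 0.0, s * dt ** 0.5
    while 0 < x < a:
        x += drift * dt + rng.gauss(0.0, noise_sd)
        t += dt
    return t

def no_rt(n_extra, rng):
    """Exhaustive "no" latency: maximum over 16 search-set nonmatch
    comparisons plus n_extra very-low-relatedness extra comparisons."""
    times = [nonmatch_time(-0.3, rng=rng) for _ in range(16)]
    times += [nonmatch_time(-0.9, rng=rng) for _ in range(n_extra)]
    return max(times)

rng = random.Random(4)
base = [no_rt(0, rng) for _ in range(40)]
with_extras = [no_rt(300, rng) for _ in range(40)]
shift = statistics.mean(with_extras) - statistics.mean(base)
# Hundreds of extra fast-finishing processes leave the "no" latency
# essentially unchanged, as in Table 1.
```

The extras terminate so quickly that the slow tail of the 16 search-set comparisons still determines the maximum on nearly every trial.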

Speed-Accuracy Trade-off

In many models of item recognition, accuracy and latency are not explicitly related, even though it is well known that a subject can sacrifice accuracy to produce a faster response (e.g., Pachella, 1974). There are several methods for studying speed-accuracy trade-off; for example, Wickelgren (1977) lists six methods. His six methods fall into three basic classes: (a) methods that induce the subject, for example, by use of instructions or payoffs, to adjust the amount of information required for a response; (b) methods that force the subject, for example, by use of deadlines, response signals, or time bands, to respond within some time limit; and (c) methods that partition reaction times into groups (e.g., those between 420 msec and 440 msec, those between 440 msec and 460 msec, and so on) and compare accuracy values within each partition.

The method of partitioning reaction times has been discussed by Pachella (1974) and Wickelgren (1977). Pachella claims that the speed-accuracy function obtained from this procedure is probably independent of the speed-accuracy functions obtained from the other two methods. In fact, the speed-accuracy functions obtained from partitioning simply tell us about the latency density functions for error and correct responses and, in terms of the retrieval theory, are largely unrelated to the other speed-accuracy measures and so are not considered further.

In the retrieval theory, the relation between speed and accuracy is central, and the diffusion comparison process (plus the decision process) may be viewed as a transformation from a relatedness discriminability scale to observed speed and accuracy measures. The first speed-accuracy method, in which the amount of information required for a response is manipulated, and the second method, in which the time of response is manipulated, can be viewed as complementary in terms of the theory. When instructions or payoffs are manipulated, the process will be referred to as information controlled; when response signals or time windows are used, the process will be referred to as time controlled.

Information-controlled processing. To demonstrate the way in which speed-accuracy trade-off is modeled when instructions or payoffs are manipulated, let us consider two probe-memory-set item comparisons: one match and one nonmatch. The assumption is that payoffs or instructions induce the subject

[Figure 5 appeared here; panels (top to bottom): original boundaries; nonmatch boundary increased; match boundary increased and nonmatch boundary increased further.]

Figure 5. An example of the change in comparison time distributions and correct response rates (p_H is the probability of a hit and p_CR is the probability of a correct rejection) for changes in boundary positions for one match process and one nonmatch process, with all other parameters constant. (The numerical values and positions and shapes of the latency distributions illustrate the qualitative relations between the three speed-accuracy conditions but are not quantitatively exact. [Note that in the middle panel, there will be a slight decrease in p_CR.] Processes 1 and 2 illustrate comparisons that would have terminated at the wrong boundary if the boundaries had been close as in the top panel. a is the distance between the two boundaries in the random walk comparison process, and z is the bottom boundary to starting point distance.)

[Figure 6 appeared here; curves for matching and nonmatching comparisons.]

Figure 6. Spread of evidence as a function of time in the unrestricted diffusion process for one matching and one nonmatching process. (At time t1, there is a large amount of overlap; at t3, the overlap has reached asymptote, with asymptotic d' = (u − v)/η [see middle panel of Figure 3]. u is the mean of the match relatedness distribution, and v is the mean of the nonmatch relatedness distribution.)

to adjust the positions of the match and nonmatch boundaries and so to adjust the amount of information required for a decision. For example, Figure 5 illustrates the effect of increasing the position of the nonmatch boundary (middle panel) and increasing the position of both boundaries (bottom panel). When the position of the nonmatch boundary is increased, "no" responses are made slower and the number of misses reduced. When both boundaries are moved apart, latency and accuracy of both hits and correct rejections increase.

As the boundaries move apart, the reaction time distributions both shift (the fastest responses or leading edge of the distribution become slower) and spread (the mode and mean of the distribution diverge), as shown in Figure 5. The bottom panel of Figure 5 illustrates the time course of evidence accumulation for four comparison processes. The two processes marked "1" and "2" are examples of processes that would terminate incorrectly with narrow boundaries but that terminate correctly with the relatively wide boundaries shown in the bottom panel. It is the correct termination of processes such as Processes 1 and 2 that is responsible for the increase in accuracy as the diffusion process boundaries are moved apart.
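The boundary manipulation just described can be illustrated with a small simulation. This is my own sketch, not code from the article; the function name `simulate_diffusion` and the parameter values (drift .2, s = .08, boundary separations .08 vs. .16) are assumptions chosen to lie near the fitted values reported later in the article. Widening the boundaries should slow responses and raise the proportion of correct (match) terminations.

```python
import numpy as np

def simulate_diffusion(drift, a, z, s=0.08, dt=0.001, n_trials=2000, seed=0):
    """Simulate first-passage times of a diffusion between a lower
    (nonmatch) boundary at 0 and an upper (match) boundary at a,
    starting at z, with the given drift and within-trial noise s.
    Returns (proportion reaching the match boundary, mean finishing time)."""
    rng = np.random.default_rng(seed)
    sd = s * np.sqrt(dt)
    hits, times = 0, []
    for _ in range(n_trials):
        x, t = z, 0.0
        while 0.0 < x < a:
            x += drift * dt + sd * rng.standard_normal()
            t += dt
        hits += x >= a
        times.append(t)
    return hits / n_trials, float(np.mean(times))

# A matching comparison (positive drift) under two boundary settings:
p_narrow, rt_narrow = simulate_diffusion(drift=0.2, a=0.08, z=0.04)
p_wide, rt_wide = simulate_diffusion(drift=0.2, a=0.16, z=0.08)
print(p_narrow, rt_narrow)  # narrow boundaries: faster, less accurate
print(p_wide, rt_wide)      # wide boundaries: slower, more accurate
```

The Euler discretization is crude, but it reproduces the qualitative pattern of Figure 5: moving the boundaries apart trades speed for accuracy within a single comparison process.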

Time-controlled processing. Time-controlled processing is modeled in a different way than information-controlled processing. Figure 6 shows the way evidence accumulates from two distributions of relatedness, (a) a probe-memory-set item match and (b) a nonmatch (see Figure 3, middle panel), as a function of time. The decision assumption is that if a particular comparison is stopped at time t1 and the amount of evidence is greater than the starting value, then the comparison is called a match, and if the amount of evidence is less than the starting value, the comparison is called a nonmatch. (For mathematical tractability later, it is assumed that the match and nonmatch boundaries are far apart and that the random walk diffusion process can be considered unrestricted.) Figure 6 shows the way evidence has accumulated at times t1, t2, and t3. At early times (t1), there is a large amount of overlap, and d' is rather small. At later times (t2 and t3), target and nontarget distributions have diverged and are approaching an asymptotic form that corresponds to the relatedness distributions shown in the middle panel of Figure 3. Thus d' as a function of time approaches an asymptote. (Mathematical expressions for d' as a function of time and fits to data are shown later.)

Thus, we see that the random walk comparison process can account for results from both information- and time-controlled speed-accuracy experiments.
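The time course of d' under the unrestricted-diffusion assumption can be made explicit. The following derivation is my reconstruction from the quantities defined above (drift equal to relatedness, within-trial variance s², relatedness variance η²); it recovers the asymptote quoted in the caption of Figure 6.

```latex
% Accumulated evidence at time t for one comparison, given drift \xi:
%   X(t) \mid \xi \sim N(\xi t,\, s^2 t),
% with \xi \sim N(u, \eta^2) for a match and \xi \sim N(v, \eta^2) for a
% nonmatch. Marginally, X(t) \sim N(ut,\, \eta^2 t^2 + s^2 t) for a match, so
\[
  d'(t) \;=\; \frac{(u - v)\,t}{\sqrt{\eta^2 t^2 + s^2 t}}
        \;=\; \frac{u - v}{\sqrt{\eta^2 + s^2/t}}
        \;\longrightarrow\; \frac{u - v}{\eta} \quad (t \to \infty).
\]
```

At small t the s²/t term dominates and d' grows roughly as √t; at large t the between-trial relatedness variance η² caps discriminability, which is why d' approaches an asymptote rather than growing without bound.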

Model Freedom

In the description of the retrieval theory, I have stressed the flexible nature of the processing system, and this flexibility shows up as apparent model freedom. I now discuss the problem of model freedom and indicate which functional relations provide reasonably constrained tests of the mathematical model.

When applied to an experiment, the mathematical model has five variable parameters, namely, T_ER, z, a, v, and u. The encoding and response parameter T_ER is fixed, so that in order to evaluate the effect of independent variables on performance, T_ER can be considered constant. In any situation where experimental conditions change between trials and the subject can know which condition is being tested (e.g., set size in the Sternberg paradigm), it is possible that criteria (z, a, v, and u) will change across conditions. So, there are four parameters free to vary in fitting the data. There is a rather weak constraint on the value of these parameters in that if performance exhibits regular behavior, then changes in z, a, v, and u should be regular. More useful tests of the mathematical model can be found in experiments where, on any particular trial, it is not possible for the subject to know which condition is being tested prior to retrieval (e.g., serial position in the Sternberg paradigm). In such situations, criteria cannot be adjusted to vary systematically with the independent variable. Thus, the only parameter free to vary is relatedness u, and with only u varying, accuracy and latency must be simultaneously fitted.

One major aim in developing the mathematical model is to fit reaction time distributions. In fact, reaction time distributions are fitted without the need to add any further parameters. This account contrasts sharply with that given by the serial scanning model for the Sternberg paradigm. It is generally agreed that the exhaustive serial scanning model is simple and easily testable, with only three parameters: one slope parameter and two intercept parameters. It is not generally realized, however, that to fit reaction time distributions (using additive-factors logic and the Pearson system of frequency curves), the serial scanning model suddenly requires nine parameters (Sternberg, Note 1). If accuracy performance were to be fitted also, then still more parameters would be required. Thus, the number of parameters needed for the retrieval theory to fit results from the Sternberg paradigm may turn out to be much the same as the number of parameters needed by the serial scanning model. This discussion shows that there are certain constraints that are rather hard to evaluate, for example, how many degrees of freedom are taken up in representing reaction time distributions.

It was argued earlier that many kinds of information enter the comparison process, including recency information. The decay of recency information (or strength) has been modeled several times, and various simple models with few parameters have been developed (Atkinson & Shiffrin, 1968; Murdock, 1974; Norman & Rumelhart, 1970; Wickelgren & Norman, 1966). The retrieval theory describes performance as a function of recency with a series of relatedness values u. Thus, for example, there will be a separate value of relatedness u at each serial position; that is, there will be as many free parameters as there are serial positions. If some assumptions about decay in relatedness were made, then this series of relatedness values could possibly be fitted by a simple few-parameter model (as in the models mentioned above), thus reducing the number of free parameters.

In conclusion, it seems that the problem of model freedom is more complex than it would appear at first sight. Simple few-parameter models require the addition of many more parameters to account for anything more than crude summary statistics, and complex models may be quite tightly constrained in nonobvious ways.

Mathematical Model

In this section, some of the principal param-eters, functions, and expressions for the

Page 12: Psychological Review - Stanford University

70 ROGER RATCLIFF

mathematical model are introduced. A much more complete discussion of the random walk and diffusion processes, with derivations of combination rules for exhaustive and self-terminating parallel processing, is presented in the Appendix. Later in the present section, the method of fitting the theory to data is described, checks on the fitting program are described, and relations to other models are discussed.

Parameters of the Mathematical Model

The parameters used in the mathematical model arise from different theoretical sources and may be summarized as follows: (a) for the comparison (diffusion) process: lower boundary zero, starting point in the diffusion process z, and upper boundary a; drift in the diffusion process (equal to relatedness) and variance in the drift s²; (b) for probe-item relatedness: a normal distribution N(u, η) for matching comparisons and a normal distribution N(v, η) for nonmatching comparisons; and (c) an encoding and response output parameter T_ER. Note that s² and η² are kept constant throughout the fits presented in this article, so that the only variable parameters are a, z, u, v, and T_ER. Thus, u and v alone represent input from memory into the decision system.

Diffusion Process

The diffusion process is used to represent the accumulation of evidence for a single probe-memory-set item comparison. As discussed earlier, one has to select aspects of the data (summary statistics) as points of contact between theory and data. I have chosen to use mean error rate and reaction time distribution statistics, and so it is necessary to derive expressions for error rate and finishing time density functions for the diffusion process. In the Appendix, expressions for error rate and finishing time density functions are presented for the discrete random walk. Taking the limit as the number of steps becomes large and the size of each step becomes small allows us to obtain expressions for error rate and finishing time density functions for the diffusion process (as shown in the Appendix).

Consider a single probe-memory-set item comparison with relatedness ξ; as noted earlier, drift in the diffusion process is set equal to relatedness. Let the nonmatch boundary be at zero, the starting point of the process be at z, and the match boundary be at a. Then, if s² is the variance in the drift in the diffusion process, the probability of a nonmatch is given by

\[
\gamma_-(\xi) \;=\; \frac{e^{-2\xi a/s^2} - e^{-2\xi z/s^2}}{e^{-2\xi a/s^2} - 1}. \tag{1}
\]

(Throughout this section the subscript minus will refer to a nonmatch and the subscript plus will refer to a match.) The finishing time density function for a nonmatch is given by

\[
g_-(t, \xi) \;=\; \frac{\pi s^2}{a^2}\, e^{-\xi z/s^2} \sum_{k=1}^{\infty} k \sin\!\left(\frac{k\pi z}{a}\right) \exp\!\left[-\frac{1}{2}\!\left(\frac{\xi^2}{s^2} + \frac{k^2 \pi^2 s^2}{a^2}\right) t\right]. \tag{2}
\]

The equivalent expressions for a match [γ₊(ξ) and g₊(t, ξ)] can be found by setting ξ = −ξ and z = a − z in Equations 1 and 2. Note that g₋(t, ξ) is not a probability density function because some proportion of comparisons end up as matches. However, g₋(t, ξ)/γ₋(ξ) is a probability density function.
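Equations 1 and 2 can be evaluated numerically. The sketch below is my own illustration, not the author's program; the parameter values (a = .116, z = a/4, ξ = .2, s = .08) are assumptions in the range of the fits reported later. A useful internal check of the kind the article recommends is that the nonmatch probability and the reflected match probability sum to 1.

```python
import numpy as np

def gamma_minus(xi, a, z, s=0.08):
    """Probability that a diffusion with drift xi, boundaries 0 and a,
    and starting point z terminates at the lower (nonmatch) boundary
    (Equation 1)."""
    e = lambda x: np.exp(-2.0 * xi * x / s**2)
    return (e(a) - e(z)) / (e(a) - 1.0)

def g_minus(t, xi, a, z, s=0.08, n_terms=400):
    """Defective first-passage-time density at the lower boundary
    (Equation 2, series truncated); it integrates to gamma_minus, not 1."""
    k = np.arange(1, n_terms + 1)[:, None]                 # series index
    t = np.atleast_1d(np.asarray(t, dtype=float))[None, :]
    lam = 0.5 * (xi**2 / s**2 + k**2 * np.pi**2 * s**2 / a**2)
    series = k * np.sin(k * np.pi * z / a) * np.exp(-lam * t)
    return (np.pi * s**2 / a**2) * np.exp(-xi * z / s**2) * series.sum(axis=0)

a, z, xi = 0.116, 0.029, 0.2              # illustrative values
p_nonmatch = gamma_minus(xi, a, z)
p_match = gamma_minus(-xi, a, a - z)      # reflection gives the match probability
print(p_nonmatch, p_match, p_nonmatch + p_match)  # the two probabilities sum to 1
```

Integrating `g_minus` over t recovers `p_nonmatch`, which is a cheap external check on both formulas.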

Decision Process

The decision process is self-terminating for matches and exhaustive for nonmatches. Thus, in order for a "no" response to be made, all the comparison processes must finish, and the decision time is the maximum of the individual comparison process times. Let G₋(t, ξᵢ) be the finishing time distribution function

\[
G_-(t, \xi_i) \;=\; \int_0^t g_-(t', \xi_i)\, dt'
\]

for a nonmatch of process i, and G_max(t) be the finishing time distribution function for the maximum of n nonmatch processes. Then,

\[
G_{\max}(t) \;=\; \prod_{i=1}^{n} \frac{G_-(t, \xi_i)}{\gamma_{i-}(\xi_i)}, \tag{3}
\]

and also

\[
\gamma_- \;=\; \prod_{i=1}^{n} \gamma_{i-}(\xi_i),
\]

where γᵢ₋(ξ) is the probability of a nonmatch of the ith comparison, and γ₋(ξ) is the probability of obtaining a nonmatch from the n comparisons. The expressions for the match decision (minimum of the comparison processes that terminate in a match) are more complicated and are shown in Equations A19 to A22 in the Appendix.

Equations 1, 2, 3, and A20 form the basis for computation of predicted reaction time distributions and error rates.

Probe-Memory-Set Item Relatedness

Because relatedness ξ has a normal distribution [N(u, η) and N(v, η) for matching and nonmatching comparisons, respectively], it is necessary to average G(t, ξ) and γ(ξ) over relatedness before obtaining the final theoretical expressions for error rates and reaction time distributions.

A complete description of the mathematical model, together with all the equations necessary to obtain reaction time distributions and error rates, is presented in the Appendix.

Fitting the Mathematical Model to Data

Despite the recent popularity of reaction time research, reaction time distributions have been largely ignored as a source of information. This is surprising in view of the fact that distributional information can be useful in evaluating models (Ratcliff & Murdock, 1976; Sternberg, Note 2). However, one problem in using reaction time distributions concerns the choice of statistics used to describe a distribution. The traditional method that uses moments (Sternberg, Note 1) turns out to be of little practical use because many thousands of observations per experimental condition are required before stable estimates can be obtained. Furthermore, higher moments (third and fourth) describe the behavior of the extreme tail of the distribution (Ratcliff, Note 3) and, therefore, are of little practical interest. Ratcliff (Note 3) has suggested that the parameters of the probability density function of the convolution of a normal distribution [N(μ, σ)] and an exponential distribution [f(t) = (1/τ)e^{−t/τ}] provide a good summary of empirical reaction time distributions (see also Ratcliff & Murdock, 1976). I have decided to use the convolution distribution

\[
f(t) \;=\; \frac{1}{\tau \sigma \sqrt{2\pi}} \int_{-\infty}^{t} e^{-(y-\mu)^2/2\sigma^2}\, e^{-(t-y)/\tau}\, dy \tag{4}
\]

as a meeting point of theory and data. The convolution model is fitted to data, and the parameter values obtained serve to summarize that data. Theoretical reaction time distributions are computed, and the convolution model is fitted to the theoretical distribution, giving theoretical values of the parameters. Figure 7 shows some fits of the convolution model to the theoretical distributions, and it can be seen that the curves are similar.
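The convolution in Equation 4 has the standard closed form of the normal-exponential ("ex-Gaussian") density. The sketch below is my own illustration, not the author's fitting code; the parameter values are assumptions loosely in the range of the fits reported in this article. The density should integrate to 1 with mean μ + τ.

```python
import math

def ex_gaussian_pdf(t, mu, sigma, tau):
    """Density of the sum of a normal N(mu, sigma) variable and an
    independent exponential with mean tau (the convolution of Equation 4),
    using the standard closed form with the normal cdf Phi."""
    phi = 0.5 * (1.0 + math.erf(((t - mu) / sigma - sigma / tau) / math.sqrt(2.0)))
    return (1.0 / tau) * math.exp(sigma**2 / (2 * tau**2) - (t - mu) / tau) * phi

# Illustrative parameter values (sec).
mu, sigma, tau = 0.50, 0.037, 0.20
dt = 0.001
ts = [i * dt for i in range(0, 4000)]
area = sum(ex_gaussian_pdf(t, mu, sigma, tau) for t in ts) * dt
mean = sum(t * ex_gaussian_pdf(t, mu, sigma, tau) for t in ts) * dt
print(area, mean)   # area close to 1, mean close to mu + tau
```

In this parameterization μ and σ govern the leading edge of the distribution and τ the skewed tail, which is why the pair (μ, τ) carries most of the information about shifts versus spreads of empirical reaction time distributions.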

In order to fit the theory to data, the equations derived earlier were used to give values of correct response rate p_t and reaction time distribution parameters μ_t and τ_t. (The parameter σ of the convolution model can be left out because generally σ does not, theoretically or empirically, change appreciably with experimental condition, and the values for both theoretical and empirical estimates are similar.) The corresponding three empirical parameters p_e, μ_e, and τ_e are estimated from the data (averaging across subjects), and the function

\[
\sum \left[ \frac{(p_e - p_t)^2}{\sigma_p^2} + \frac{(\mu_e - \mu_t)^2}{\sigma_\mu^2} + \frac{(\tau_e - \tau_t)^2}{\sigma_\tau^2} \right]
\]

(summed over both hits [H] and correct rejections [CR]) is minimized as a function of the theoretical parameters. The variances σ_μ² and σ_τ² are asymptotic variance estimates for the convolution model (see Table 2 of Ratcliff & Murdock, 1976), and σ_p² = p_e(1 − p_e)/N, where N is the number of observations on which p_e is based.

Because the theoretical reaction time distributions are best fits to the average reaction time distributions across subjects (μ_e and τ_e are averages over subjects), it is difficult to get any direct idea of how well the theory accounts for the data. Ratcliff (Note 3) has developed a method of producing group reaction time distributions. Essentially, reaction

[Figure 7 appeared here; panels for hits and correct rejections, horizontal axis time (sec).]

Figure 7. Some sample theoretical reaction time distributions (solid lines) for theoretical parameters u, a, z, and v, and fits of the convolution model parameters μ, σ, and τ to theoretical distributions (dots). (These distributions illustrate the adequacy of the convolution approximation to the theoretical distributions. ms = msec.)

time quantiles are derived for each subject cell, and these quantiles are averaged over subjects to give group quantiles. The group quantiles are then plotted on the horizontal axis of a graph. If, for example, 5% quantiles were used, then 5% of the probability density would lie between adjacent quantiles. Therefore, a probability density function can be constructed by drawing equal-area rectangles between adjacent quantiles. The theoretical probability density function can be compared with the empirical function by plotting the two functions on the same graph. This method is used to show how well the theoretical reaction time distributions fit the data. It should also be noted that the convolution parameters derived by fitting the convolution model to the reaction time quantiles are almost identical to the convolution parameters that are the average, across subjects, of convolution fits to individual subject cells (Ratcliff, Note 3).
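The group-distribution construction just described can be sketched as follows. The data here are simulated (five hypothetical subjects); only the quantile-averaging and equal-area-rectangle steps follow the description above.

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated RTs (sec) for 5 hypothetical subjects: normal + exponential.
subjects = [rng.normal(0.5, 0.05, 200) + rng.exponential(0.2, 200)
            for _ in range(5)]

probs = np.arange(0.05, 1.0, 0.05)                 # 5% quantiles
q = np.mean([np.quantile(s, probs) for s in subjects], axis=0)  # group quantiles

# Equal-area rectangles between adjacent group quantiles: each bar
# holds 5% of the probability mass, so height = 0.05 / width.
widths = np.diff(q)
heights = 0.05 / widths
print(np.sum(widths * heights))  # total area between 5% and 95% quantiles: 0.9
```

By construction every rectangle has area .05, so the histogram heights trace out a density estimate whose shape can be overlaid directly on the theoretical density.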

Checks on the Estimation Method

In fitting moderately complex models, such as this diffusion process model, there are many places where errors might occur. Therefore, two kinds of checks should be carried out: internal and external checks. For example, an internal check that should be carried out when numerically integrating to infinity involves printing out the integrand over the range of integration and checking the values by hand calculation. External checks can be somewhat more involved. The idea here is to compare values obtained from the program with some known or established values. A program to compute accuracy and reaction time distributions was implemented on an IBM 370-165 computer in WATFIV FORTRAN and, without the integration over relatedness (Equations A24 and A25), was implemented on a PDP-12A computer in FOCAL language. Letting η → 0 in the FORTRAN program, the two programs were found to give identical results. The FOCAL program was checked by calculating values of the first passage time distribution for a ≫ z. These results were then compared with values obtained from the one-barrier diffusion process (see Cox & Miller, 1965, p. 221; Feller, 1968, p. 368), with probability density function

\[
g_-(t) \;=\; \frac{z}{s\sqrt{2\pi t^3}}\, \exp\!\left[-\frac{(z + \xi t)^2}{2 s^2 t}\right]. \tag{6}
\]

These values were identical, and so the evaluation of the model can be considered satisfactory.

Relations to Other Models

In this section, I compare and contrast the present theory with three classes of models and theories. The first class of models assumes exponentially distributed processing stages; some of these models attempt to deal with the same experimental domain as the retrieval theory. The second and third classes are random walk and counter models; these have been developed to account for both accuracy and latency in choice reaction time and research in perception.

Models with exponential processing stages. In several models of memory search and memory retrieval, exponential distributions have been used to represent the finishing time distributions of processing stages (Anderson & Bower, 1973; Atkinson, Holmgren, & Juola, 1969; Townsend, 1971, 1972). The exponential distribution is useful because it is a one-parameter distribution and has a skewed positive tail, just as observed reaction time distributions do. Furthermore, the exponential is easy to work with and quite often allows exact solutions to be obtained for complicated expressions (e.g., Anderson, 1974). However, there are several problems in using the exponential distribution. First, Sternberg (1975) argues that the Markov (no memory) property of exponential distributions is inconsistent with what is meant by processing over time. For an exponential process, the expected time to termination is independent of the time already elapsed, whereas a more reasonable view of cognitive processing would suppose that the longer a process has been running, the greater the probability of its termination. Second, the use of exponentials to represent stage distributions may produce good fits to mean reaction time but may not lead to theoretical distributions that match the shape of observed reaction time distributions. Furthermore, when the additive-factors method has been applied and component stage distributions extracted using Pearson's method (Sternberg, Note 1), the underlying distributions are far from exponential. Thus, a good strategy may be to develop a model using exponentials and then test whether in fact the stage distributions are exponential. Third, simply assuming that processing stages have certain finishing time distributions avoids the crucial fact that both latency distributions and errors arise from common stages or mechanisms, and that perhaps the major theoretical effort should focus on relating accuracy and latency.

Exponential distributions are an extremely useful tool for beginning an investigation but can be somewhat misleading theoretically if they represent processing stage distributions in a mature model.
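The Markov property that Sternberg appeals to can be stated in one line. For an exponential finishing time T with rate λ,

```latex
\[
  P(T > s + t \mid T > s) \;=\; \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}}
  \;=\; e^{-\lambda t} \;=\; P(T > t),
\]
```

so the expected time remaining is always 1/λ no matter how long the stage has already been running, which is exactly the property argued above to be implausible for cognitive processing.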

Random walk models. Several models of choice reaction time based on the random walk comparison process have been developed recently (Laming, 1968; Link, 1975; Link & Heath, 1975; Stone, 1960). Here, I consider a recent formulation termed relative judgment theory (Link, 1975; Link & Heath, 1975) and note some of the similarities and differences to the retrieval theory.

In relative judgment theory, a physiologically transduced value of the stimulus is compared to an internal referent or standard. During a unit of time, the difference obtained is added to an accumulator of differences. When the accumulated difference exceeds one of two subject-controlled thresholds, a response is made.

Relative judgment theory has several important similarities to the theory developed in this article. First and foremost, the decision process is modeled by a random walk, and information discriminating the two stimuli is mapped onto a single dimension. Second, there are three criteria to be set by the subject: the value of the internal referent, corresponding to the zero point of relatedness in the retrieval theory, and the two boundary positions. Thus, there are the same number of variable criteria in the two mathematical models. Third, the two theories are both concerned with the joint behavior of accuracy and latency as a function of independent variables.

It is the differences between the theories that provide most insight into their relation. First, the random walk in relative judgment theory has discrete steps, and the difference accumulated in a particular interval has some distribution, whereas the retrieval theory assumes continuous processing. Mathematical expressions for the first passage time (latency) distributions have not yet been obtained for relative judgment theory, except in the case of binomial (or trinomial) and normal distributions of differences, as they have for the continuous random walk. For choice reaction time, error responses are faster than correct responses when accuracy is high, and relative judgment theory accounts for this by assuming that the moment-generating function for differences is nonsymmetric. This assumption is equivalent to supposing that the individual step distribution is positively skewed in the direction of an incorrect response. In contrast, in the retrieval theory, the continuous random walk has the normal distribution underlying performance [essentially, drift is distributed N(u, s)]. Second, the internal referent in relative judgment theory is assumed to have no systematic drift over the course of testing. Thus, there is only one source of noise or variance, and that is in the comparison process. In the retrieval theory, there is a further source of variance, and that is in memory-set item-probe relatedness. Therefore, relative judgment theory predicts that if the response boundaries are set at infinity, discriminability d' increases as a function of the square root of the number of steps. This prediction is quite reasonable if, for example, the choice is between two well-discriminated tones. In contrast, variance in relatedness in the retrieval theory ensures that d' asymptotes as a function of retrieval time. Third, relative judgment theory has been developed to deal with the simplest case, namely, a two-stimulus, two-response discrimination procedure. On the other hand, the retrieval theory has been developed to deal with the problem of matching one item against many items in memory and discriminating one match against many nonmatches. Fourth, relative judgment theory has provided somewhat cleaner experimental tests than will be shown in the next section. However, it is somewhat difficult to compare the power of the two sets of tests of the two theories.

From this discussion, it can be seen that there is potential for making explicit a continuity between memory retrieval and perceptual discrimination.

Counter models. The class of counter models provides a serious competitor to random walk models. Perhaps the major difference between counter and random walk models is that random walk models assume that positive and negative differences cancel, whereas counter models assume separate counters for positive and negative evidence. Thus, for counter models, termination occurs when one of the two counters exceeds a criterial count.

Counter models have been examined in some detail by Pike (1973), with specific application to signal detection. In Pike's article, many of the main properties of counter models are listed and discussed. Anderson (1973) combines a counter model with a neural network model to account for results from choice reaction time studies and the Sternberg paradigm. Huesmann and Woocher (1976) have applied a counter model, together with parallel processing assumptions, to the Sternberg paradigm.

One major problem with counter models concerns the behavior of latency distributions as a function of count criteria. When count criteria are low (e.g., 1 or 2), latency distributions are reasonably skewed. However, as the count criteria increase (e.g., 4 to 10), the distributions become more normal (Pike, 1973, p. 61). This contrasts with the prediction of random walk models that the mode and mean of the latency distribution should diverge as the random walk boundaries (criteria) are moved apart.
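Pike's observation can be illustrated with a toy counter model. This sketch is my own, not Pike's formulation: if counts arrive as a Poisson process, the time to reach a criterion of k counts is Erlang distributed with skewness 2/√k, so finishing time distributions become more normal as the count criterion grows.

```python
import numpy as np

def decision_time_skew(k, rate=10.0, n=50000, seed=0):
    """Sample skewness of the time for a Poisson counter (intensity
    `rate`) to accumulate k counts, i.e., of an Erlang(k) variable
    built as a sum of k independent exponential inter-count times."""
    rng = np.random.default_rng(seed)
    times = rng.exponential(1.0 / rate, size=(n, k)).sum(axis=1)
    d = times - times.mean()
    return float((d**3).mean() / (d**2).mean() ** 1.5)

low, high = decision_time_skew(k=2), decision_time_skew(k=10)
print(low, high)   # theory: 2/sqrt(2) ~ 1.41 vs 2/sqrt(10) ~ 0.63
```

The declining skewness with higher count criteria is the opposite of the random walk prediction that widening the boundaries makes the mode and mean diverge, which is what makes distribution shape a diagnostic between the two model classes.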

At the moment, counter models as well as random walk models have not been investigated in enough detail, and further research may provide useful competitive theories for both memory retrieval and perceptual discrimination.

Experimental Paradigms

Up to this point, I have set up the basic framework of the theory and derived the necessary mathematical expressions. In the following sections, the theory is applied to five experimental paradigms.

Study-Test Paradigm

In a typical study-test experiment, 16 stimulus words are presented for study, and then 32 words (16 old and 16 new in random order) are tested for recognition, one at a time. The study phase is paced at about 1 sec per item, and the test phase is self-paced, with each test item staying in view until a response is made. Responses are made on a 6-point confidence scale, and accuracy and latency are recorded.

The study-test paradigm has been studied recently in some detail (Murdock, 1974; Murdock & Anderson, 1975; Ratcliff & Murdock, 1976), primarily as the empirical basis for the conveyor belt model (Murdock, 1974) for item recognition. The conveyor belt model assumes that a record of the temporal order of both study and test items is stored in memory. In order to decide whether a test item was in the study list or not, a high-speed self-terminating backward serial scan is carried out over the temporal record of both test and study items. The conveyor belt model deals with retrieval from supraspan lists only and brings coherence to a number of empirical results. However, the model has some problems and limitations, and these are discussed in Ratcliff and Murdock (1976). Because of the large amount of data collected in Murdock's experiments and because of the stability of the empirical effects, the study-test paradigm provides a very strong empirical foundation for attempts to model recognition memory processes. Ratcliff and Murdock (1976) provided a summary of empirical effects, and this summary (see Figure 8) is the basis for much of the following discussion. It should be noted that the data actually come from a confidence judgment procedure but are modeled as though they came from a yes-no procedure. It turns out that results from the two procedures are similar (Murdock, Hockley, & Muter, 1977), so this approximation seems reasonable.

To model the study-test paradigm, I will assume that a test (or probe) item is encoded, and parallel comparisons are made between the encoded test item and members of the search set. The search set is approximated by the 16-item study set, with the same normal probability density function N(v, η) for each nonmatching comparison (i.e., it is assumed that only study items are accessed by the test item and all nonmatching comparisons have equal relatedness). The probability density function of relatedness for a matching comparison is N(u, η).

Serial and Test Position Functions

The main empirical effects to be modeled are the behavior of latency and accuracy as a function of study (or input) and test (or output) position for old items and of test position for new items (see Figures 8b and 8c). Data are taken from Ratcliff and Murdock's (1976) Experiment 1.

Parameter variation. As noted earlier, when variation in some experimental variable can be known to the subject prior to retrieval, there is the possibility of changes in criteria as a function of that experimental variable. Test position is one such variable. As test position increases, forgetting of study items increases, and the subject may set more lenient criteria (increase a and z) in an effort to improve accuracy. Thus, relatedness of probe-nontarget item comparisons (v) and diffusion process boundary parameters (z and a) can change as a function of test position.

Relatedness of probe-target item comparisons (u) varies as a function of test position, as do v, a, and z. However, unlike v, a, and z, u varies as a function of study (or serial) position. The subject cannot alter criteria as a function of serial position because knowledge of the

[Figure 8 appeared here; panels: (a) confidence judgments, (b) test position, (c) primacy effect (input position), (d) rate of presentation, (e) number of stimulus presentations, (f) list length.]

Figure 8. Schematic representation of functional relations obtained in the study-test paradigm (from Ratcliff & Murdock, 1976). (1P and 2P represent once-presented and twice-presented stimuli, respectively.)

study position of a probe item can only result position must both be predicted by changes infrom the retrieval of that item. Thus, changes (the single parameter u.in accuracy and latency as a function of serial The variance parameters s* and if are kept


constant, as mentioned earlier, at (.08)² and (.18)², respectively. A single constant time TER is assumed for probe encoding, preparation for comparison and decision processes, execution of the decision process, and response output processes. These processes could be assumed to have some distribution of processing times with small variance, without altering the fits to distributions shown later.
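The comparison process just described can be sketched as a short simulation. The parameter values below (s = .08, boundaries a = .1 and z = .02, TER = 350 msec) are those reported in this section; the discrete time step, the function name, and the treatment of s as the standard deviation of within-comparison noise per unit time are implementation assumptions of this sketch, not claims about the original derivations.

```python
import random

def diffusion_comparison(xi, s=0.08, a=0.1, z=0.02, t_er=0.350,
                         dt=0.001, rng=random):
    """Simulate one probe-memory item comparison as a diffusion process.

    The process starts at z and accumulates evidence with drift xi
    (the relatedness of the comparison) until it reaches the match
    boundary a (returns +1) or the nonmatch boundary 0 (returns -1).
    The constant time t_er for encoding and response output is added
    to the decision time.
    """
    x, t = z, 0.0
    while 0.0 < x < a:
        x += xi * dt + s * dt ** 0.5 * rng.gauss(0.0, 1.0)
        t += dt
    return (1 if x >= a else -1), t_er + t
```

Sampling xi from N(u, η) for a target comparison and from N(v, η) for a nontarget comparison, with η = .18, then supplies the between-comparison relatedness variability assumed by the theory.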

Correct rejections. Figure 9 shows accuracy, mean reaction time, and reaction time distribution parameters μ and τ for correct rejections as a function of test position, together with theoretical fits and theoretical parameters v, a, and z. The fits are quite acceptable, and it can be seen that, indeed, criteria change as a function of test position to compensate for the overall drop in discriminability (forgetting).

Hits. Figure 10 shows accuracy and mean reaction time as a function of study (or serial) position, together with theoretical fits and values of relatedness u. From these fits, it can be seen that changes in both accuracy and latency can be modeled by changes in the single parameter u. Inspection of Figure 10 shows that u behaves in a very regular manner as a function of study and test

[Figure 9 appears here: accuracy, mean latency (sec), and distribution parameters plotted against test position (2-8, 9-16, 17-24, 25-32).]

Figure 9. Accuracy, mean reaction time, and reaction time distribution parameters (μ and τ) as a function of test position for the study-test paradigm, together with the theoretical parameters v, a, and z, which are used to generate the theoretical fits. (Data are from Ratcliff and Murdock's, 1976, Experiment 1.)


[Figure 10 appears here: latency (msec), accuracy, and relatedness u as a function of study or serial position, in separate panels for output position blocks 2-8, 9-16, 17-24, and 25-32.]

Figure 10. Accuracy and mean latency as a function of study position blocked by test, or output, position (O/P), together with the values of relatedness u used to produce the fits. (Other theoretical parameters used in the derivations are shown in Figure 9.)

position. There is a consistent primacy effect across test (or output) position, and the strong recency seen at early test positions has dissipated by later test positions. Therefore, it seems likely that one could reduce the number of degrees of freedom by modeling the pattern of behavior of u just as the pattern of accuracy performance in free recall has been modeled as a function of serial position and delay (i.e., test position; Murdock, 1974, chap. 8). I have not attempted to fit such a functional relation because it adds little theoretically. Changes in reaction time distributions as a function of serial position are mainly changes in spread of the distribution or the τ parameter in the convolution model (Ratcliff & Murdock, 1976), and this pattern of results is predicted by the theory.

Reaction time distributions. In the section on fitting the mathematical model to data, I argued that the most satisfactory way to fit the theoretical reaction time distributions to data (in the case of this theoretical model) is through the use of the convolution model as an empirical summary of shape. Further, it was also argued that the best way to display fits of the theory to data averaged over subjects is by the use of group reaction time distributions (see also Ratcliff, Note 3). Figure 9 shows both the theoretical and empirical values of μ (leading edge) and τ (tail), parameters of the convolution model, as a function of test position for correct rejections. It is these data that posed problems for the backward serial scanning (conveyor belt) model (Ratcliff & Murdock, 1976). The conveyor belt model predicts that most of the increase in mean latency will be accounted for by an increase in latency of the whole distribution (μ parameter) rather than the tail (τ parameter). Although the theoretical distributions are fitted to empirical data by using the μ and τ parameters (i.e., minimizing a function of μ and τ) of the convolution model, the retrieval theory makes the strong prediction that changes in mean latency will be manifest mainly as changes in τ (i.e., slower responses getting slower; see Figure 4). In fact, patterns of results with changes in μ greater than changes in τ as a function of test position would be impossible to accommodate without large changes in random walk boundary positions a and z. Large changes in a and z would change error rates and hit latencies in such a way that the set of data could not be fitted simultaneously.
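The convolution model used here as an empirical summary describes an RT distribution as the convolution of a normal (μ, σ) and an exponential (τ). Below is a minimal sketch of that summary; the moment-based estimator is an assumption of this example (the paper fits μ and τ by other means), chosen because the convolution has mean μ + τ, variance σ² + τ², and third central moment 2τ³.

```python
import math
import random

def convolution_sample(mu, sigma, tau, rng=random):
    """One RT from the convolution of a normal(mu, sigma) and an
    exponential with mean tau."""
    return rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau)

def fit_convolution_moments(rts):
    """Moment-based estimates of (mu, sigma, tau).

    tau is recovered from the skew (third central moment = 2 * tau**3),
    then mu and sigma from the mean and variance.
    """
    n = len(rts)
    mean = sum(rts) / n
    m2 = sum((x - mean) ** 2 for x in rts) / n
    m3 = sum((x - mean) ** 3 for x in rts) / n
    tau = max(m3 / 2.0, 0.0) ** (1.0 / 3.0)
    sigma = math.sqrt(max(m2 - tau ** 2, 0.0))
    return mean - tau, sigma, tau
```

Changes that load on τ stretch the tail of the distribution while leaving the leading edge (μ) nearly fixed, which is the diagnostic used in the text against the conveyor belt model.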

Figure 11 shows fits of the theoretical distributions to the group reaction time distributions for both hits and correct rejections as a function of test position. From these graphs, it can be seen that the theory does a good job of fitting the reaction time distributions.

Error reaction times. It was noted earlier that the data used in fitting the theory came from a confidence judgment procedure rather than a yes-no procedure. For correct reaction times, it was reasonable to use the data from high-confidence correct responses because results were not changed much by addition of lower confidence reaction times. Another reason to use only the high-confidence responses is that many lower confidence responses may be contaminated by other processes such as more than one comparison or a slow guess. In contrast, there are relatively few error responses, and high-confidence responses are typically 150 msec to 600 msec faster than low-confidence responses. The theory makes predictions (using parameter values obtained earlier in this section for Ratcliff & Murdock's, 1976, Experiment 1) about error reaction times that are not too unreasonable. For example, the theory predicts that miss reaction times increase with output position at about 5 msec per item (by linear regression), with an intercept of about 860 msec. The data for high-confidence responses show a slope of about 11 msec per item, with an intercept of 820 msec. All miss responses combined (high, medium, and low confidence) give reaction times about 200 msec longer. For false alarms, predicted slope is about 5 msec per item, and the intercept is 1,200 msec. High-confidence

[Figure 11 appears here: distributions for correct rejections and for hits at test positions 2-8, 9-16, 17-24, and 25-32.]

Figure 11. Group reaction time distributions and theoretical fits to those distributions from Ratcliff and Murdock's (1976) Experiment 1.


Table 2
Fits to Accuracy and Latency for Repeated and Nonrepeated Study Items in the Study-Test Paradigm (Data are from Ratcliff & Murdock's, 1976, Experiment 4)

                                         Theoretical                  Experimental
Response               u     v     a   Accuracy   μ    σ    τ     Accuracy   μ    σ    τ
Correct rejection      —   -.38   .1     .74     472   37  216      .74     470   40  220
Once-presented hit    .18  -.38   .1     .69     455   34  190      .70     450   40  190
Twice-presented hit   .18  -.38   .1     .87     445   26  170      .84     450   40  160

Note. μ, σ, and τ are measured in msec; TER = 350 msec and z = .02.

false alarms have slopes of 8 msec per item and intercepts of 830 msec, so the theory overpredicts high-confidence false alarms. However, adding lower confidence false alarms increases reaction time by about 600 msec. Thus, the theoretical error reaction times come near to the data in some cases but are not so near in other cases.

Number of Stimulus Presentations

When items are repeated in the study list, I assume that separate representations are established. The assumption of separate representations is made because of the problems that a single-representation strength theory appears to have (Anderson & Bower, 1972; Wells, 1974). With two representations, there are two comparison processes with high relatedness values, and these two processes race to match. Thus, reaction time will be faster and accuracy higher than for singly presented items, as is seen in the data (see Figure 8e). To model these data, representative values of accuracy and latency for once-presented items are chosen, model parameters for those values are deduced, and then accuracy and latency for the two racing processes are computed. To compute fits for twice-presented items, it was assumed that the two racing processes had equal u values, and Equation A22 was modified appropriately. This approximation is reasonable, taking into account the input or serial position functions shown in Figure 10, because in general, u values are not too different at different serial positions. Table 2 shows theoretical fits and experimental results for accuracy and latency distributions. The only discrepant result is the slight theoretical overestimation of accuracy of twice-presented items.
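The race assumption can be illustrated with a small simulation: two comparison processes with equal relatedness run in parallel, and the earliest match terminates the trial. The boundary values (a = .1, z = .02) and u = .18 follow Table 2; the step size, the function names, and the restriction to the racing target copies (ignoring nontarget comparisons) are simplifying assumptions of this sketch.

```python
import random

def first_passage(xi, s=0.08, a=0.1, z=0.02, dt=0.001, rng=random):
    """Decision time and boundary (+1 match, -1 nonmatch) for one comparison."""
    x, t = z, 0.0
    while 0.0 < x < a:
        x += xi * dt + s * dt ** 0.5 * rng.gauss(0.0, 1.0)
        t += dt
    return t, 1 if x >= a else -1

def repeated_item_trial(u, n_copies, rng=random):
    """Outcome of the race among n_copies separate representations.

    The trial is self-terminating at the earliest match, so repeated
    items should be both faster and more accurate; a miss requires
    every copy to terminate at the nonmatch boundary.
    Returns (hit, decision_time).
    """
    runs = [first_passage(u, rng=rng) for _ in range(n_copies)]
    matches = [t for t, b in runs if b == 1]
    if matches:
        return True, min(matches)
    return False, max(t for t, b in runs)
```

Two copies give a higher hit rate and a shorter mean match time than one copy, which is the qualitative pattern shown for once- and twice-presented items in Table 2.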

Rate of Presentation: Dangers Inherent in "Between" Designs in Reaction Time Experiments

A major problem for strength- or familiarity-based theories of retrieval is the result shown in Figure 8d. As rate of presentation of study items decreases, accuracy increases; but reaction time increases instead of decreasing, as a strength theory would predict. In the experiment from which Figure 8d was derived (Ratcliff & Murdock's, 1976, Experiment 2), rate of presentation was a between-sessions variable. The retrieval theory would predict the same result as a strength theory if all criteria were constant across conditions. However, there is a way out for the retrieval theory, and that is to suppose that criteria change with conditions.

As discussed earlier, one of the stronger tests of the retrieval theory is made when criteria cannot be adjusted to experimental conditions. Thus, a strong test of the theory can be made if presentation time can be made a within-trials variable. Then, only relatedness u can vary as a function of presentation time per item, and variation in both accuracy and latency must be predicted by variation in u alone. If the results shown in Figure 8d hold up when presentation time per item is made a within-trials variable, then the theory in its present form must be rejected. In Experiment 1, presentation time per item is made a within-trials variable.


Experiment 1

Method. A standard recognition memory procedure was employed. A list of 16 study items was presented, followed by a test list containing all the study items and an equal number of new items in random order. The lists were random samples from the University of Toronto word pool, a collection of 1,024 two-syllable common English words not more than eight letters long, with homophones, contractions, and proper nouns excluded. Each trial consisted of a random selection from the word pool. Each session had 32 lists, and there were no repetitions within a session. List generation, display, and response recording were controlled by a PDP-12A laboratory computer. In each study list, there were four different study times (.5, .8, 1.2, and 1.8 sec per item), and four study items (in random presentation order) were assigned to each of the four study time conditions per list. The study list was terminated by an instruction asking the subject to press a response key to start the test phase. The test phase was self-paced, and items stayed in view until a response was made. A confidence judgment procedure was employed, and the subjects had to respond on a 6-point scale from − − − ("sure new") to + + + ("sure old") by pressing the appropriate response key. For each item tested, input and output position (output position only for new items), confidence judgment (i.e., key pressed), and latency (stimulus onset to response key depression) were recorded. A 5-msec time base was used. The four subjects were undergraduates in psychology at the University of Toronto and were paid $30 for the 12 sessions plus 1 practice session.

Results. Figure 12 shows the main result: When presentation time per item is a within-trials variable, the longer an item is presented, the more accurate and faster is the response to that item. This result contrasts with the result found in Ratcliff and Murdock's (1976) Experiment 2 (shown in Figure 8d), where responses became more accurate but slower as duration of presentation increased. Also shown in Figure 12 are values of u as a function of presentation time per item and predicted values of accuracy and reaction time. Thus, the theory makes the correct prediction and is supported by the experimental results.

Implications for reaction time research. Besides providing a critical test of the retrieval theory, these results on presentation time per item have important consequences for the methodology of reaction time research. It has often been argued that subjects in reaction time experiments can adopt different speed-accuracy criteria under different experimental conditions, and subjects have been induced to do so in many experiments (e.g., Banks & Atkinson, 1974; Lively, 1972). However, in most reaction time experiments, there does not seem to be much evidence of criteria being

[Figure 12 appears here: accuracy, mean latency (msec), and relatedness u plotted against presentation time per item (0.5 to 2.0 sec), with parameter values v = -0.402, a = 0.126, z = 0.032, and TER = 297 msec, and predicted versus empirical accuracy and latency for correct rejections.]

Figure 12. Accuracy, mean latency, and relatedness u as a function of presentation time per item for Experiment 1. (Theoretical parameters v, a, z, and TER are shown in the figure. RT represents mean latency; ms = msec.)


adjusted as a function of experimental condition. For example, Wickelgren (1977) states,

To be completely fair, we do not know for certain that subjects' capability to achieve any degree of speed-accuracy tradeoff (from chance to asymptotic accuracy) under speed-accuracy tradeoff instructions implies that this capacity is used under reaction time instructions. Perhaps in reaction time experiments, all subjects use the same speed-accuracy criterion and use it under all conditions. (p. 81)

The above results on presentation time show that a theoretically important functional relation (latency as a function of presentation time per item) contradicts theories when studied in a between design yet supports the same theories when studied in a within design. There are two important consequences: (a) In any reaction time experiment where criteria can be changed as a function of experimental condition, any differences in reaction time as a function of condition may be interpretable only as a change in speed-accuracy criteria and nothing more. (b) Theories of reaction time should be flexible enough to deal with criteria changes such as those demonstrated above.

Study-Test Paradigm: Summary

The study-test paradigm provides a firm empirical basis for modeling item recognition processes. The retrieval theory models results by assuming that parallel comparisons between the probe (test item) and members of the study list are made. The decision process is self-terminating on matches and exhaustive on nonmatches. Test or output position functions are well fitted by assuming that v, relatedness of nontarget comparisons, decreases and diffusion process boundaries increase a little with increasing test position. Latency and accuracy as a function of study position show serial position functions with both primacy and recency. For fixed output position, only the relatedness parameter u can vary with serial position, and fits to both accuracy and latency with just this one parameter varying are good. Latency distributions and error latencies are also well described by theoretical predictions.

Ratcliff and Murdock (1976) presented an experiment in which number of stimulus presentations was varied. The theory was able to account for results from this experiment by assuming separate representations, so that there were two processes with high relatedness racing to the match boundary. Both latency and accuracy were well fitted by results derived from this assumption.

An investigation into rate-of-presentation effects exposed important methodological dangers in between designs involving the use of latency measurements. Ratcliff and Murdock (1976) showed that both accuracy and latency increased as rate of presentation was decreased in a between-sessions design. The latency effect is contrary to many theories, including the theory developed here, if it is assumed that speed-accuracy criteria are kept constant. The present Experiment 1 varied presentation time per item within trials. Accuracy increased as presentation time increased, as before, but reaction time decreased. Therefore, in between designs, it is possible for the subject to change speed-accuracy criteria, and between-condition latency comparisons are thus subject to misinterpretation.

Sternberg Paradigm

The Sternberg paradigm is the most studied paradigm in memory research that employs latency measures. The method is simple: A small number of items (digits, words, pictures, and so on) within memory span is presented to the subject one at a time. A probe item is then presented, and the subject has to decide whether that item was in the set of study items. The subject pushes one button for a "yes" response, another for a "no" response, and reaction time and accuracy are recorded. The result that has been of most interest theoretically has been the reaction-time-set-size function. Often, reaction time has been a linear function of set size, with equal slopes for "yes" and "no" responses. This result led Sternberg (1966) to develop a serial exhaustive scanning model. The model is easily testable, and several serious problems have been found; for example, the model is unable to deal with serial position effects (Corballis, 1967; Corballis, Kirby, & Miller, 1972), repetition effects (Baddeley & Ecob, 1973), stimulus and response probability effects (Theios et al., 1973), and nonlinear set-size effects (Briggs, 1974). There are alternative simple models that have been set up to account for subsets of the empirical findings, but it seems that each of these is inconsistent with some data (Sternberg, 1975). Thus, we find an area of research in which experiments are relatively easy to perform and in which each of the competing models (in an unelaborated form at least) is falsified.

There are more serious problems though. First, it is generally found that error rate and mean reaction time covary (Banks & Atkinson, 1974; Forrin & Cunningham, 1973; Miller & Pachella, 1973), but little attempt has been made to relate accuracy and latency theoretically (Wickelgren, 1977). Second, it seems likely that each of the models could be elaborated to account for most of the empirical effects (Townsend, 1974), as long as mean reaction time is the only statistic considered, so that no resolution at this level of analysis can be expected. Third, the main statistic used to describe reaction time is mean reaction time. Because of this emphasis, few of the models are able to account for properties of reaction time distributions (Sternberg, 1975). This is important because distributional properties may prove to be decisive in evaluating models (Ratcliff & Murdock, 1976; Sternberg, 1975, Note 1, Note 2). For reviews of this area of research, see Corballis (1975), Nickerson (1972), and Sternberg (1969a, 1969b, 1975).

To model the Sternberg paradigm, I will use the same assumptions used in modeling the study-test paradigm. The probe is encoded, and parallel comparisons are made between the encoded probe and members of the search set. The search set is approximated by the memory set, each item having the same normal probability density function N(v, η) of relatedness for nonmatching comparisons. The probability density function of relatedness for a match comparison is N(u, η), where u varies with serial position and set size; v, a, and z may vary with set size; and η and s are kept constant at the same values used in the study-test paradigm. As before, a single constant time is assumed for probe encoding, response output, and so on.
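These assumptions can be put together in a sketch of a single trial. The parameter values are those reported for Experiment 2 in Figure 14 (u = .435, v = -.471, η = .18, a = .237, z = .099, TER = 259 msec); the discretization, the function names, and the use of s = .08 as within-comparison noise are assumptions of the sketch.

```python
import random

def comparison(xi, s=0.08, a=0.237, z=0.099, dt=0.001, rng=random):
    """First-passage time and boundary (+1 match, -1 nonmatch) of one comparison."""
    x, t = z, 0.0
    while 0.0 < x < a:
        x += xi * dt + s * dt ** 0.5 * rng.gauss(0.0, 1.0)
        t += dt
    return t, 1 if x >= a else -1

def item_recognition_trial(set_size, probe_old, u=0.435, v=-0.471,
                           eta=0.18, t_er=0.259, rng=random):
    """One trial: parallel comparisons of the probe with the search set.

    Drift for each comparison is drawn from the relatedness
    distribution: N(u, eta) for the target (when the probe is old) and
    N(v, eta) for nontargets.  The decision is self-terminating on
    matches ("yes" at the first comparison to reach the match boundary)
    and exhaustive on nonmatches ("no" only after every comparison has
    reached the nonmatch boundary).  Returns (response, latency).
    """
    drifts = [rng.gauss(u, eta)] if probe_old else []
    drifts += [rng.gauss(v, eta) for _ in range(set_size - len(drifts))]
    runs = [comparison(xi, rng=rng) for xi in drifts]
    matches = [t for t, b in runs if b == 1]
    if matches:
        return "yes", t_er + min(matches)
    return "no", t_er + max(t for t, b in runs)
```

Because a "no" waits for the slowest of set_size nonmatching comparisons, mean correct-rejection latency grows with set size even though the comparisons run in parallel.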

Experiment 2 was performed to demonstrate the way in which the theory is applied to the Sternberg paradigm. The experiment was designed to replicate typical patterns of results found in the varied-set procedure while maximizing conditions for serial position effects (fast presentation rate and short probe delay). Serial position effects were sought because changes in accuracy and latency as a function of serial position can only be the result of changes in the single relatedness parameter u, as all criteria are fixed. Two subjects were each tested for eight experimental sessions in order to obtain reasonable estimates of error rates and reaction time distributions.

Experiment 2

Method. Two paid student volunteers served as subjects. Sequence generation, display, and response recording were controlled by a PDP-12A laboratory computer. The memory set was either three, four, or five randomly selected digits, from the set zero to nine, with no repetitions. The memory set was presented sequentially at .5 sec per digit. One-tenth second following the presentation of the last digit, a fixation point was displayed, which remained on for .4 sec. The probe digit immediately followed and remained on until a response was made. Each list length was presented equally often, and each serial position within each list was probed equally often. Also, there were equal numbers of old and new probe items. There was a 3-sec delay between trials, with a fixation point appearing for the last .5 sec of that period. Responses were made by depressing one of two buttons on a response box, right hand for "yes" responses and left hand for "no" responses. The subjects served in 8 sessions preceded by 1 practice session. Each session consisted of 10 blocks of 48 trials. Subjects were instructed to be as fast as possible while maintaining high accuracy. Feedback on accuracy performance was presented at the end of each session.

Results. Figure 13 shows mean latency as a function of set size for hits and correct rejections, together with linear least-squares regression lines. These show that the basic finding, namely, parallel, linear set-size-latency functions, is replicated. Figure 13 also shows serial-position-latency functions, which have a large recency effect and replicate Burrows and Okada (1971), Corballis (1967), Corballis et al. (1972), and Forrin and Cunningham (1973).

Figure 14 shows detailed fits of the mathematical model to the data. As noted earlier, the mathematical model was fitted to the summary statistics (proportion correct and


[Figure 13 appears here: latency-serial-position functions for set sizes 3, 4, and 5, and latency-set-size functions for hits and correct rejections with fitted regression lines RT = 22s + 586 and RT = 18s + 531.]

Figure 13. Latency-set-size functions and latency-serial-position functions for Experiment 2. (Error bars are one standard deviation. RT represents mean latency in msec.)

latency distribution parameters μ and τ) by weighted least squares for hits by set size and serial position and for correct rejections by set size. The encoding and response output parameter TER was kept constant across all these conditions, but it also turned out that, to a good approximation, random walk boundary criteria a and z and relatedness of probe-nontarget item comparisons v were constant across set size. Thus, latency and accuracy for correct rejections changed as a function of number of items in the memory set (no additional degrees of freedom once a, z, and v were fitted). Latency and accuracy for hits varied only as a function of probe-target item relatedness u (and memory set size); thus, across serial position, the single parameter u varied to produce fits to the three parameters determined from the data, that is, latency distribution parameters μ and τ and accuracy. The fits to the data, although not perfect, show no more than four deviations of theory from data greater than about two standard deviations out of 45 fitted statistics.

Figure 15 shows fits of the theoretical distributions to the group reaction time distributions. The theoretical distributions capture the main features of the empirical distributions, and changes in distribution shape as a function of set size and serial position are well modeled by changes in the size of the memory set and relatedness u.

Model Freedom

At this point, it may be argued that the Sternberg exhaustive serial scanning model has far fewer parameters than the random walk model presented above. If mean reaction time as a function of set size is to be fitted, then the Sternberg model has 3 parameters and the retrieval theory (for the above data) has 16 parameters (TER, v, a, z, and 12 values of u), or 7 if serial position effects are averaged. However, if distributions are to be modeled, the number of parameters in the Sternberg model jumps to 9 (Sternberg, Note 1): 4 parameters for the comparison stage distribution and 4 parameters for the base (encoding, decision, and response output) distribution, plus 1 parameter for the yes-no intercept difference. Further, if error rates as a function of set size are to be fitted, then perhaps as many as 6 more parameters have to be added. The number of parameters in the two models is now comparable. If serial position functions are considered, then as noted earlier, the Sternberg model is falsified, but the retrieval theory does a good job. Thus, the problems of model freedom and model comparison are not as simple and straightforward as might be imagined.

Perhaps the major difference between the retrieval theory and the Sternberg model can be summarized as follows: The retrieval theory (unlike the Sternberg model) provides an intrinsic tie-up between reaction time and accuracy, whereas the Sternberg model (unlike the retrieval theory) can make the advance prediction that the reaction-time-set-size function is linear.

Further reduction in the number of parameters in the retrieval theory might be accomplished if the relation between relatedness u and serial position could be described by a few-parameter functional relation, for example, exponential decay. I have not attempted to fit such a functional relation because at present it adds nothing theoretically. However, if a theory of recency effects were developed, then the prediction of relatedness as a function of serial position would be an important test of the conjunction of that theory with the retrieval theory.

Error Reaction Times

In typical reports of Sternberg varied-set experiments, error reaction times are rarely

[Figure 14 appears here: panels for hits at list lengths 3, 4, and 5 and for correct rejections, with parameter values v = -0.471, a = 0.237, z = 0.099, and TER = 259 msec.]

Figure 14. Accuracy, latency distribution parameters (from the convolution model) μ and τ, and relatedness u as a function of serial position and list length for hits, together with accuracy and latency distribution parameters μ and τ as a function of set size for correct rejections in Experiment 2. (The filled circles represent the theoretical values. For correct rejections, only number of processes in the search set changes with set size; for hits, only relatedness u changes with serial position. All other theoretical parameters [v, a, z, and TER] are fixed at the values shown in the bottom right-hand panel. Error bars are one standard deviation. ms = msec.)


[Figure 15 appears here: reaction time distributions (.4 to 1.1 sec) for correct rejections at set sizes 3-5 and for hits by set size and serial position.]

Figure 15. Group reaction time distributions (bar graphs) and theoretical fits (dots) for correct rejections by set size and for hits by set size and serial position (S.P.) for Experiment 2. (These data and fits should be compared with the distribution parameters μ and τ in Figure 14.)

presented because error rates are usually low, so that error latencies are unreliable, and also because error latencies are not of theoretical interest. However, error latencies can be useful in deciding between theories (Corballis, 1975; Murdock & Dufty, 1972). Corballis (1975) reported that in his experiments using the Sternberg procedure, error latencies were generally smaller than latencies for correct responses. In Experiment 2 above, error latencies for Set Sizes 3, 4, and 5 were 940 msec, 760 msec, and 1,005 msec for misses


(average of 902 msec) and 1,307 msec, 909 msec, and 1,055 msec for false alarms (average of 1,090 msec), respectively. Predictions of mean reaction time from the theory are far too large. For typical values of parameters used to fit Experiment 2 (u = .435, v = -.437, a = .237, z = .099, and Set Size 4), reaction time for false alarms is 2.05 sec; for misses, it is 1.59 sec.

Error latency depends to a large extent on the tails of the relatedness distributions. Therefore, if the choice of normal distributions were inappropriate, everything else in the theory might still be tenable. For example, suppose the relatedness distributions have extremely long tails, so that about 1% to 2% of the total probability density resides in these tails. Then, theoretical values of latency and accuracy of correct responses may not be altered much, but error latency may change quite significantly.

Figure 16 demonstrates the effect of the shape of the distribution and tails of the distribution on latency and accuracy. For reference, the bottom panel of Figure 16 shows the normal distribution of relatedness for a nonmatching comparison with correct and error latencies and error rate. The middle panel shows the same normal distribution, with 3% of the probability density removed and added as a rectangular distribution over the range of relatedness -2 to +1. The error rate is increased a little; correct latency is the same as correct latency in the bottom panel, but error latency is reduced to 1.3 sec, which is near the experimental result.

The top panel of Figure 16 shows just how far it is possible to change the assumption of normal relatedness distributions without affecting correct reaction time or accuracy. The large rectangle contains 97% of the probability density; and as in the middle panel, the small rectangle, between -2 and +1 relatedness, contains 3% of the probability density. With this assumed relatedness function, mean reaction time for correct responses decreases to .62 sec, with the leading edge (μ parameter in the convolution model) of the reaction time distribution increasing by 30 msec and the τ parameter in the convolution model decreasing by 50 msec. The reaction time distribution is still very similar in shape to those shown in

[Figure 16 appears here: three panels of relatedness distributions. Bottom: correct RT = .66 sec, error rate = 2%, error RT = 2.05 sec. Middle: correct RT = .66 sec, error rate = 3%, error RT = 1.30 sec. Top: correct RT = .62 sec, error rate = 1%, error RT = .62 sec.]

Figure 16. Effect of the shape of the relatedness distribution on correct and error reaction time (RT) and error rate. (The bottom panel shows a normal distribution of relatedness; the middle panel shows a normal distribution, with 3% of the probability density forming a rectangular distribution between -2 and +1 relatedness; the top panel shows a rectangular distribution of relatedness with the same 3% tail as in the middle panel.)

Figure 15. The error rate drops to 1%, all errors coming from the tail of the small rectangular distribution that extends into positive relatedness. But now error latency has decreased to 620 msec. The assumption of rectangular distributions has some precedent in that Luce (1963) and Krantz (1969) have both used rectangular distributions in high-threshold models to account for signal detection.
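The exercise above can be sketched numerically. Below is a minimal Monte Carlo sketch (mine, not the fitting procedure used in the article) of a single nonmatching comparison: each trial draws a drift (relatedness) value either from a normal distribution or from a 97%/3% normal-rectangular mixture like the middle panel of Figure 16, then Euler-steps a diffusion between two absorbing boundaries; trials absorbed at the match boundary count as errors. All numerical values are illustrative, not the fitted parameters.

```python
import random

def simulate(mixture_tail=False, n_trials=2000, seed=1):
    """Crude Euler simulation of one nonmatching comparison process.

    Illustrative parameters: boundaries at 0 (nonmatch) and a (match),
    start z, within-trial noise s, relatedness (drift) ~ N(v, eta),
    optionally with 3% of the density moved into a flat tail over
    [-2, +1] as in the middle panel of Figure 16.
    """
    rng = random.Random(seed)
    a, z, s = 0.24, 0.10, 0.08           # illustrative boundary/noise values
    v, eta = -0.47, 0.18                 # nonmatch relatedness distribution
    dt = 0.002
    correct_t, error_t = [], []
    for _ in range(n_trials):
        if mixture_tail and rng.random() < 0.03:
            drift = rng.uniform(-2.0, 1.0)   # rectangular tail, -2 to +1
        else:
            drift = rng.gauss(v, eta)
        x, t = z, 0.0
        while 0.0 < x < a:
            x += drift * dt + s * (dt ** 0.5) * rng.gauss(0.0, 1.0)
            t += dt
        (error_t if x >= a else correct_t).append(t)
    mean = lambda xs: sum(xs) / len(xs) if xs else float("nan")
    return len(error_t) / n_trials, mean(correct_t), mean(error_t)
```

With the flat tail in place, a small fraction of nonmatching trials carries strongly positive drift and terminates quickly at the match boundary; that is the mechanism behind the fast errors in the middle panel of Figure 16 relative to the slow errors (2.05 sec) in the bottom panel.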

There are three conclusions that can be drawn from this exercise. First, reaction time distributions for correct responses are rather insensitive to the shape of the relatedness distributions so long as there is a lump of probability density around the appropriate relatedness value. Second, error rates depend on the amount of probability density near and greater than the zero point of relatedness. Third, error latency depends critically on the


Table 3
Theoretical Predictions for Latency and Accuracy of Hits for Repeated Items for Parameter Values in Experiment 2

Response type           u      Accuracy   Mean latency    μ     σ     τ
Once-presented hit      .35      .966         686        433    47   253
Twice-presented hit     .35      .999         566        429    44   137
Once-presented hit      .435     .988         605        420    42   185
Twice-presented hit     .435    1.0           514        415    38    99
Once-presented hit      .55      .998         529        404    36   125
Twice-presented hit     .55     1.0           465        399    32    66

Note. Mean latency, μ, σ, and τ are in msec. Other parameter values are as follows: TER = 259 msec, v = −.471, a = .237, z = .099, and memory set size = 4.

shape of the tail of the relatedness distribution near the zero point of relatedness. Thus, reaction time distributions and error rates are reasonably independent of relatedness shape so long as there is a lump of probability density near the mean relatedness value of the normal distribution used to fit the experimental data and so long as there is a reasonable amount of overlap. This shape-independent property suggests that the correct focus of the model is on error rates and correct response latencies rather than on error latencies.

Reed (Note 4) has some evidence that the signal distribution (in a signal detection analysis of bilingual recognition) changes shape over the time course of recognition. The signal distribution starts off as multimodal, with a signal peak in the middle of the noise distribution, and then quite quickly becomes unimodal. This kind of change in relatedness distributions (the theoretical equivalents of signal and noise distributions in signal detection analysis) would give rise to fast errors in the retrieval theory and so may be an alternative to non-normal unimodal relatedness distributions. Although this finding is a little tentative, little is known about the time course of recognition, and we should at least keep such possibilities in mind.

One further point is that just as very long outlier reaction times are often assumed to be spurious, a certain proportion of errors may also be spurious. For example, subjects often report that they simply hit the wrong button. Therefore, to get meaningful and stable error latency data, it is better to work with a paradigm with reasonably large error rates, such as the study-test paradigm, rather than a paradigm with error rates around 1% of the total responses.

Repetition Effects

Baddeley and Ecob (1973) studied performance on memory sets containing repetitions using a varied-set Sternberg procedure and found faster recognition for repeated items. The effects on reaction time (50 msec to 100 msec) were larger than those observed in the study-test paradigm (30 msec to 40 msec; see Table 2). Changes in accuracy were not reported, probably because accuracy was near 100%, and there were not enough observations to allow the detection of significant differences.

There are two possible ways to model the effect of repetitions. For repetition effects in the study-test paradigm, I assumed two matching racing processes. The alternative is to assume one representation twice as strong, that is, a compressed memory set (presentation of Digits 3238 would lead to Memory Set 238 with 3 stronger than 2 or 8). However, Baddeley and Ecob (1973) ruled out this possibility by noting that nonrepeated items from sequences with repetitions were no faster than items from sequences (of the same size) with no repetitions.

To model the repetition result, I took parameter values used in Experiment 2 and computed accuracy and latency for one and then two racing matching processes. Results are presented in Table 3 and show that twice-presented items are between 64 msec and 120 msec faster than once-presented items. Thus, the


theory gives quite good quantitative agreement with the result presented by Baddeley and Ecob (1973).
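The race between one and two matching processes can be illustrated with a rough simulation. This is a sketch only, under the Table 3 parameter values (u = .35, a = .237, z = .099, with η = .18 and s = .08, the values fixed elsewhere in the article): it Euler-steps each matching diffusion and takes the fastest match, ignoring the nonmatching comparisons for brevity, so it approximates hit rate and mean hit (decision) time only.

```python
import random

def hit_stats(n_matching, n_trials=1500, seed=11):
    """Monte Carlo sketch of the race among matching comparison
    processes. Each process diffuses from z between boundaries 0 and a
    with drift drawn from N(u, eta); a 'yes' occurs when any process
    reaches a, at the time of the first such process. Step size dt and
    trial counts are chosen for speed, not precision."""
    rng = random.Random(seed)
    a, z, u, eta, s, dt = 0.237, 0.099, 0.35, 0.18, 0.08, 0.002
    hits, hit_times = 0, []
    for _ in range(n_trials):
        best = None
        for _p in range(n_matching):
            drift = rng.gauss(u, eta)
            x, t = z, 0.0
            while 0.0 < x < a:
                x += drift * dt + s * (dt ** 0.5) * rng.gauss(0.0, 1.0)
                t += dt
            if x >= a and (best is None or t < best):
                best = t
        if best is not None:
            hits += 1
            hit_times.append(best)
    mean_t = sum(hit_times) / len(hit_times) if hit_times else float("nan")
    return hits / n_trials, mean_t
```

With these values the single-process hit rate should come out close to the .966 of Table 3, and the two-process race on the order of 100 msec faster, consistent with the 64 msec to 120 msec range in the text.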

Discussion and Summary of the Sternberg Paradigm

There are many variations on the Sternberg paradigm published in the literature, but none provide qualitative problems for the theory. For example, stimulus probability effects in the fixed-set procedure have provided problems for the serial exhaustive scanning model (Miller & Pachella, 1973; Sternberg, 1975; Theios et al., 1973; Theios & Walter, 1974). In Link's (1975) random walk model for choice reaction time, stimulus probability effects provide the basis of a successful test of the theory. In effect, stimulus probability affects the random walk boundary positions. Therefore, it seems likely that the memory retrieval theory could account for stimulus probability effects in much the same way.

There have been several speed-accuracy studies using the Sternberg paradigm, but because the question of speed-accuracy trade-off is central to the retrieval theory, the topic is discussed later in a separate section.

To summarize, the Sternberg paradigm is modeled in much the same way as the study-test paradigm. Parallel comparisons between the probe and members of the search set are made. The decision process is self-terminating on matches and exhaustive on nonmatches. The search set is approximated by the memory set, even though representations of items from earlier trials (with probe-item relatedness lower than probe-memory-set item relatedness) may have to be included to account for effects of recency of negative probes (Atkinson et al., 1974). An experiment was performed to demonstrate fits of the mathematical model to data. All criteria were fixed as a function of set size, and changes in accuracy and latency of correct rejections as a function of set size were well fitted as a result of search set size changing (no free parameters to produce those changes). The covariation of accuracy and the shape of latency distributions as a function of serial position was well accounted for by changes in the single parameter u (relatedness for a probe-memory-set item comparison).

The fits to error latency were poor, and this led to an investigation of the dependence of fits of the model on the form of the relatedness distributions. Results showed that mean latency and the shape of latency distributions for correct responses are rather insensitive to the shape of the relatedness distribution so long as there is a lump of probability density around the appropriate relatedness value; further, error rate does not depend on the shape of the relatedness distribution but on the amount of probability density around the zero point of relatedness; finally, error latency is dependent on the tail of the relatedness distribution near the criterion. Although the fits to error latency were poor, it was shown that better fits could be obtained by changing the shape of the relatedness distribution, and such changes could be made without affecting predictions about correct latency or error rate. Therefore, until a theory of the structure of the memory trace can be developed that will specify the relatedness distribution, the most useful data for testing the mathematical model appear to be error rates and latency distributions for correct responses.

Supraspan Prememorized List Paradigm

It has been argued that in order to compare reaction time results from experiments using long lists (above memory span) with results from experiments using short lists (subspan), error rates should be low and approximately the same. To produce low error rates with long lists, prememorized lists have been employed. An experiment that shows the way in which the theory applies to the prememorized list paradigm is the Burrows and Okada (1975, Experiment 1) study. This experiment used the Sternberg fixed-set procedure, with prememorized memory-set items and list length varying from 2 to 20. Each word in the positive set was tested once in the test sequence for that list, together with an equal number of new words. Subjects had to press one button to indicate that the test word was in the study list, another button to indicate a new word.

The results showed parallel slopes for positive and negative responses as a function of set size and low error rates. Reaction time as a function of set size showed an apparently discontinuous


function. Subspan reaction times were fitted by a linear function with a slope approximating that typically found in the Sternberg paradigm (35 msec to 55 msec per item). Supraspan reaction times were fitted by a linear function with a much lower slope (13 msec per item). It was argued that the discontinuity was evidence for separate long-term and short-term retrieval processes. However, in the same article, it was shown that a continuous function (logarithmic) did almost as good a job of fitting the data as the two linear functions.

I now demonstrate that the data from the Burrows and Okada (1975) study can be well fitted by the random walk retrieval theory, which assumes only one kind of retrieval process. It should be noted that in terms of the retrieval theory, it is not necessary to have low error rates in studying retrieval from long lists. The study-test paradigm uses supraspan lists, and error rates can be as high as 25%.

In order to fit the mathematical model to the data, it is necessary to estimate reaction time distribution parameters using the empirical convolution model, as was done with other paradigms. There were between 70 and 90 observations per set size per subject, which is barely enough to allow stable estimations of distribution parameters. The fits of the convolution model were performed, and it was immediately apparent that the σ parameter (the standard deviation in the normal component of the convolution) was far larger than in any fits to any other paradigm: The average value of σ across subjects and set size was 104 msec. The large value of σ (and inspection of the distributions) showed that the reaction time distributions had slowly rising leading edges, sometimes with short outlier reaction times.

I was at a loss to explain why these reaction time distributions were much less skewed than usual until I noted that the test words were presented verbally. A recent article by Morton, Marcus, and Frankish (1976) suggests dangers in this procedure. They demonstrated that if the physical onset of a series of words is fixed, the time for perception of the words can be extremely variable. For example, the delay between onset and perceptual center of the spoken digits "seven" and "eight" differs by 80 msec. Burrows and Okada used two-syllable words, and it is likely that the perceptual centers of those words could vary even more than 80 msec. Of course, if the main reaction time statistic considered is mean reaction time, then there is no problem (just an addition of a little noise), but if distributions are considered, then the use of verbal probes can lead to fast (as well as slow) outliers.

To obtain better estimates of distribution parameters, I trimmed off 5% of the fast reaction times and refitted the convolution model. The average value of σ was reduced by 47 msec to 57 msec. The main effect on the other distribution parameters was to increase τ, leaving μ about the same as before trimming. The retrieval theory was applied as in the study-test and Sternberg paradigms. Parallel comparisons are made between the encoded probe and members of the search set. The search set is approximated by the experimental memory set, and each item has a normal distribution of relatedness N(v, η) for nonmatching comparisons. Each memory-set item has a normal distribution of relatedness N(u, η) for matching comparisons, and the variance parameters η and s are kept constant at the values assigned earlier.
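The convolution model referred to here is the normal-exponential (ex-Gaussian) convolution with parameters μ, σ, and τ. As a rough illustration of the trimming step (the article's actual fits use a different estimation procedure), the sketch below fits the model by the method of moments and shows the effect of removing the fastest 5% of a sample contaminated by fast outliers; all names and numerical values are mine.

```python
import random

def fit_ex_gaussian(rts):
    """Method-of-moments fit of the convolution model: for a
    normal(mu, sigma) convolved with an exponential(tau),
    mean = mu + tau, variance = sigma^2 + tau^2, and the third
    central moment = 2*tau^3."""
    n = len(rts)
    m = sum(rts) / n
    var = sum((x - m) ** 2 for x in rts) / n
    m3 = sum((x - m) ** 3 for x in rts) / n
    tau = max(m3 / 2.0, 0.0) ** (1.0 / 3.0)
    s2 = var - tau * tau
    return m - tau, (s2 ** 0.5 if s2 > 0 else 0.0), tau   # mu, sigma, tau

def trim_fast(rts, prop=0.05):
    """Drop the fastest `prop` proportion of the reaction times."""
    rts = sorted(rts)
    return rts[int(len(rts) * prop):]

# Synthetic example: an ex-Gaussian sample plus fast outliers of the
# kind produced by variable spoken-word onsets (values illustrative).
rng = random.Random(3)
clean = [rng.gauss(420, 50) + rng.expovariate(1 / 150) for _ in range(4000)]
contaminated = clean + [rng.uniform(100, 240) for _ in range(200)]
mu_c, sigma_c, tau_c = fit_ex_gaussian(contaminated)
mu_t, sigma_t, tau_t = fit_ex_gaussian(trim_fast(contaminated))
```

Fast outliers inflate the fitted σ far more than μ or τ; trimming the leading 5% brings σ back down, the same pattern as the drop from 104 msec to 57 msec reported above.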

The fits of the theory to the data are shown in Figures 17 and 18. Figure 17 shows fits to the empirical reaction time distribution parameters μ and τ for hits and correct rejections. The error bars represent the maximum likelihood standard deviation estimates in parameters but do not take into account any subject-to-subject variability. Figure 18 shows fits of the theoretical error rates to data, together with behavior of theoretical parameters u, v, a, and z as a function of set size. The fits of the theory to data are by no means perfect. The fact that the theory cannot be made to give better fits demonstrates quite clearly that there are fairly powerful nonobvious constraints on the theory (before the fast reaction times were trimmed, the best fit was much worse; predicted miss rates were half the size of those shown in Figure 18).

Figure 18 shows values of u and v that allow a d' measure of discriminability to be calculated from d' = (u − v)/η. d' starts at a value of 5.3 at Set Size 2, decreases to 4.8 by Set Size 10, and remains at that value to Set Size 20. The constancy of d' for longer lists


can be interpreted as showing that the list prememorization criterion produced equivalent relatedness levels for the larger set sizes. The continuous change in d' as a function of set size shows quantitative continuity between subspan and supraspan retrieval in item recognition.

Much of the research done using this paradigm has been performed within the framework of the Atkinson and Juola (1973) model. In the experimental procedure used by Atkinson and Juola (1973), a study list was memorized prior to the experimental session. In any block of trials, test items consisted of both new positive items and new negative items, together with all previously presented items. Results showed that repetition of positive items made them faster and more accurate, whereas repetition of negative items made them slower and less accurate. The way in which the theory would deal with these results can be seen by referring to Figure 2. First presentations of positives and negatives


would correspond to Processes 2 and 4, respectively, and repeated presentations would correspond to Processes 1 and 3, respectively, thus leading to the observed pattern of results.

Several studies have investigated the situation in which there are different sets of memory-set items stored in long- and short-term memory (Forrin & Morin, 1969; Scheirer & Hanley, 1974; Sternberg, 1969b; Wescourt & Atkinson, 1973). The general consensus seems to favor models that postulate parallel access to short- and long-term sets (Wescourt & Atkinson, 1976). For example, Wescourt and Atkinson (1973) used a procedure in which a 30-word long-term set was memorized preexperimentally. In an experimental trial, subjects were presented with a short-term set of one to four words (or zero in a control condition) before each test word. Test words came from either the short- or long-term set, with each having a probability of .25, or the test word was a negative. The set-size-latency function for probes from the short-term set


Figure 17. Reaction time distribution parameters (μ and τ) as a function of set size in Experiment 2 (Burrows & Okada, 1975), together with theoretical fits (continuous lines). (Note that TER = 420 msec. Error bars are one standard deviation.)



Figure 18. Error rates as a function of set size in Experiment 2 (Burrows & Okada, 1975), together with theoretical fits (continuous lines). (Also shown are theoretical parameters u, v, a, and z as a function of set size. Note that TER = 420 msec. Error bars are one standard deviation.)

was roughly linear with a slope of 21 msec per item. In contrast, the latency function for probes from the long-term set (as a function of short-term set size and excluding Set Size 0) was nearly flat, 1 msec per item slope, and about 100 msec slower than responses to short-term probes. Negatives showed the same pattern of results with a slope of about 5 msec per item and responses 40 msec slower than responses to long-term probes. The specific model proposed by Wescourt and Atkinson (1973) is similar to the decision component of the retrieval theory, namely, parallel independent searches of the short-term and long-term sets, self-terminating on positives and exhaustive on negatives. To make the two models identical, it is only necessary to assume independent and parallel probe-memory-set item comparisons within both the long-term and short-term stores. Therefore, because processing is the same for both sets of items, there is no basis for separating short- and long-term stores in terms of processing.
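The shared decision rule can be made concrete in a few lines. This sketch (function and variable names are mine) takes hypothetical finishing times for the parallel comparisons; searching the short- and long-term sets together then amounts to pooling their comparison times into one race.

```python
def decide(match_times, nonmatch_times):
    """Decision rule used throughout the theory: self-terminate on the
    first matching comparison to finish; respond 'no' only after every
    comparison has finished as a nonmatch (exhaustive). Times are
    hypothetical comparison finishing times in msec."""
    if match_times:                          # any match ends the race
        return "yes", min(match_times)
    return "no", max(nonmatch_times)         # must wait for the slowest

# Parallel search of short- and long-term sets: pool the comparisons.
short_term_nonmatches = [410, 380]           # illustrative times
long_term_nonmatches = [460, 430, 440]
print(decide([], short_term_nonmatches + long_term_nonmatches))  # -> ('no', 460)
```

Because a "no" waits on the slowest comparison while a "yes" takes the fastest match, the rule produces exactly the self-terminating/exhaustive asymmetry described in the text.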

Another finding that is consistent with the theory is the result that when material in the short-term set is discriminable from material in the long-term set, there is no effect of long-term set size on reaction time to short-term items and vice versa (Scheirer & Hanley, 1974). In contrast, when material in the two sets is not highly discriminable, latency is a joint function of short- and long-term set size.

Summary

Results from prememorized list procedures are modeled in the same way as results from the Sternberg and study-test procedures. Application of the theory to the Burrows and Okada (1975) study, in which list length was varied from subspan to supraspan, showed there is no need to assume separate retrieval processes for subspan and supraspan item recognition. Furthermore, in experiments where short-term and long-term sets are searched simultaneously, it seems that the best model for the data assumes parallel search of both sets in line with the retrieval schemes developed in this article.

Continuous Recognition Memory Paradigm

In the continuous recognition memory paradigm, single words are presented, and the subject has to press one button if the test word is new and another if the test word was presented earlier in the test list. This task is somewhat different from the three paradigms


discussed up to this point in that items change from negative to positive on presentation, and the search set is continually increasing in size. The principal independent variable is lag between successive presentations of an item, and accuracy and latency are measured.

Okada (1971) performed two experiments that investigated performance on this task. In both experiments, mean reaction time as a function of lag was well approximated by a negatively accelerating exponential. I have fitted the mathematical model to Okada's (1971) Experiment 2 to show how the theory is applied. In his Experiment 2, items were presented once, twice, or three times, but only once- and twice-presented items were fitted because there were not sufficient data to perform distributional analyses on thrice-presented items. Even though the paradigm is different in many respects from the study-test and Sternberg paradigms, the same processing assumptions are made. The memory set changes from 0 to 150 in the course of a block of trials, but in order to approximate over the whole block, the search set size was set at 40 (though values between 30 and 60 would

not change the fits very much). At any test position in the sequence, the test item can be either new or old with lag values of 0, 1, 2, 4, or 6. Therefore, in the theory, all criteria are fixed and only relatedness u may vary with lag. Thus, any single value of u must fit both accuracy and reaction time distributions (through the parameters μ and τ). Empirical results are presented in Figure 19 for proportion correct and reaction time distribution parameters μ and τ, together with theoretical fits, as a function of lag. In these fits, s and η are fixed at .08 and .18 as before. The three experimental statistics (accuracy and latency distribution parameters μ and τ) are adequately fit with just one theoretical parameter u, which varies as a function of lag.

Thus, the pattern of results found in the continuous recognition memory task can be modeled in the same way as results from the study-test and Sternberg paradigms.

Speed-Accuracy Trade-off Paradigms

Experiments to study speed-accuracy trade-off are usually performed within the context

[Figure 19 appears here. In-figure parameter values: v = −0.52, a = 0.13, z = 0.025.]

Figure 19. Proportion correct, reaction time distribution parameters μ and τ, and relatedness u for hits as a function of lag for Okada's (1971) Experiment 2. (Theoretical fits are shown by open circles; also shown are correct rejection data [CR]. The number of processes assumed to be in the search set was 40, and TER = 520 msec. Theoretical parameters v, a, and z are as shown in the figure.)


of a particular paradigm, for example, choice reaction time and the Sternberg paradigm (Pachella, 1974), the Brown-Peterson paradigm (Reed, 1973), sentence processing (Dosher, 1976), cued recall (Murdock, 1968), paired associate recall, and recognition (Wickelgren & Corbett, 1977). Because the topic is central to the theory developed in this article, it is dealt with in this separate section.

In the introduction, speed-accuracy trade-off methods were partitioned into two classes (excluding latency partitioning). Information-controlled processing refers to the usual reaction time method in which the subject sets some limit on the amount of information required for a response, and time of response is not limited. Experimentally, the amount of evidence required for a response can be manipulated using instructions (e.g., speed vs. accuracy) or payoffs. Time-controlled processing refers to the class of experimental methods where the time of response is experimentally controlled using methods such as response signals, deadlines, or time windows.

Information-Controlled Processing

Each of the preceding item recognition paradigms is an example of information-controlled processing. To check that the retrieval theory is capable of accounting for results obtained when speed-accuracy criteria are changed, I shall show that changes in boundary criteria are sufficient to account for the kinds of changes observed in experimental data when speed-accuracy conditions are changed. Banks and Atkinson (1974) used the method of payoffs to investigate speed-accuracy effects in the Sternberg varied-set procedure. They found that with payoffs stressing accuracy, error rates were about 1%; whereas with payoffs stressing speed, error rates were between 10% and 30%, the slope of the latency-set-size function was halved, and the intercept was reduced by about 200 msec. To model this pattern of results, it is assumed that under speed conditions, the random walk boundaries are moved close together; whereas under accuracy conditions, the boundaries are moved relatively far apart (see Figure 5 for an illustration).

To check that the numerical values generated

from the retrieval theory are compatible with the size of the effects obtained by Banks and Atkinson (1974), typical parameter values used in Experiment 2 were taken, and random walk boundary position parameters a and z were reduced to .04 and .015, respectively. Latency for both hits and correct rejections decreased by 300 msec, and average error rates increased from 2% to 18%. Thus, the theory produces effects that are quite consistent with results from the payoff method.
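The direction of this boundary effect on errors can be checked analytically for a single nonmatching comparison, using the standard absorption probability for a Wiener process between two boundaries and integrating it over the relatedness distribution. The sketch below is my quadrature, with the boundary values quoted above and η = .18, s = .08 as fixed elsewhere in the article; it computes false alarm rates only (latencies would need the first-passage-time formulas).

```python
import math

def p_yes(xi, a, z, s2):
    """P(absorb at the match boundary a before 0 | drift xi), for a
    Wiener process starting at z with variance s2 (standard formula)."""
    if abs(xi) < 1e-9:
        return z / a
    k = 2.0 * xi / s2
    return (1.0 - math.exp(-k * z)) / (1.0 - math.exp(-k * a))

def false_alarm_rate(a, z, v=-0.471, eta=0.18, s=0.08, n=2000):
    """'Yes' errors on a nonmatching comparison: integrate the
    absorption probability over relatedness ~ N(v, eta), midpoint rule."""
    s2, total = s * s, 0.0
    lo, hi = v - 6.0 * eta, v + 6.0 * eta
    dx = (hi - lo) / n
    for i in range(n):
        xi = lo + (i + 0.5) * dx
        dens = math.exp(-0.5 * ((xi - v) / eta) ** 2) / (eta * math.sqrt(2.0 * math.pi))
        total += dens * p_yes(xi, a, z, s2) * dx
    return total

fa_accuracy = false_alarm_rate(a=0.237, z=0.099)   # accuracy-payoff boundaries
fa_speed = false_alarm_rate(a=0.04, z=0.015)       # speed-payoff boundaries
```

Shrinking the boundaries raises the false alarm rate severalfold, the direction of the Banks and Atkinson result; matching the quoted 2% and 18% averages exactly would require the full model, with matching comparisons and misses included.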

Payoff or instruction is usually a between-sessions variable; therefore, criteria can change between conditions, and such experiments do not provide a strong test of the theory. Nevertheless, the demonstration that speed and accuracy can covary over a wide range of values is important in evaluating models of retrieval processes. For further discussion of these methods, see Pachella (1974) and Wickelgren (1975, 1977).

Time-Controlled Processing

In this section, it is shown that time-controlled processing methods produce data that provide a strong test of the retrieval theory. One example of time-controlled processing is the response signal method that was developed to avoid the problem that criteria may change between conditions. In essence, the subject is given the probe followed by a signal to respond presented a variable amount of time after probe onset (Reed, 1973, 1976; Schouten & Bekker, 1967). Reed (1976) used the response signal method to investigate the time course of recognition in the Sternberg varied-set procedure. One, two, or four letters comprising the memory set were presented sequentially. Following probe onset, a signal to respond was presented at a lag varying from 7 msec to 4,104 msec. Subjects were required to respond "yes" or "no" as quickly as possible after the signal and to make a confidence judgment on how confident they felt at the time of response. There were two analyses of major interest: d' as a function of response signal lag and response latency as a function of signal lag. It was argued that the first of these indicated the time course of accumulation of evidence on whether the probe was a member of the memory set, and the


second gave evidence that the decision is based on the termination of a discrete process. Before discussing the results, I will derive an expression for the increase in accuracy as a function of signal lag from the theory.

In order to make the problem tractable, let us assume that the match and nonmatch boundaries are placed far from the starting point, so that in the range of signal lags considered here, the diffusion process can be considered unrestricted. Also, let us consider only Memory Set Size 1. In an unrestricted diffusion process, the probability density function h(x, t) at position x and time t with drift ξ and variance in the drift s² is given by

h(x, t) = [1/(2πs²t)^½] exp[−(x − ξt)²/(2s²t)]   (7)

(Feller, 1968, p. 357). In the theory, however, relatedness ξ has a normal distribution (see Equation A24). Therefore, in order to calculate the distribution of evidence as a function of time, we must integrate over the distribution of relatedness N(u, η) and find

y(x, t) = ∫ h(x, t) [1/(2πη²)^½] exp[−(ξ − u)²/(2η²)] dξ
        = [1/(2π(η²t² + s²t))^½] exp[−(x − ut)²/(2(η²t² + s²t))].   (8)

Thus, the distribution of evidence is given by a normal distribution N[ut, (η²t² + s²t)^½]. For two distributions of relatedness, N(u, η) for matches and N(v, η) for nonmatches, d' is given by

d' = (u − v)/(variance)^½ = (u − v)t/(η²t² + s²t)^½ = (u − v)/{η[1 + s²/(η²t)]^½}.   (9)

Now as t → ∞, the value of d' asymptotes (ASY) at d'ASY = (u − v)/η, that is, the d' of the original relatedness (as expected, given a large amount of time). Thus,

d' = d'ASY/[1 + s²/(η²t)]^½.   (10)
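Because the unrestricted process at time t is exactly ξt plus normal noise of variance s²t, the d' growth in Equation 10 can be checked by direct sampling. The sketch below is my check, using the illustrative parameter values u = .35 and v = −.471 from earlier fits, with η = .18 and s = .08.

```python
import random

ETA, S = 0.18, 0.08     # variance parameters fixed in the earlier fits

def dprime_eq10(t, u=0.35, v=-0.471):
    """Equation 10: d'(t) = d'ASY / [1 + s^2/(eta^2 t)]^(1/2)."""
    return ((u - v) / ETA) / (1.0 + S * S / (ETA * ETA * t)) ** 0.5

def dprime_sampled(t, u=0.35, v=-0.471, n=50_000, seed=5):
    """Sample the evidence x = xi*t + N(0, s*sqrt(t)) for match and
    nonmatch relatedness distributions and compute d' directly."""
    rng = random.Random(seed)
    def sample(mean):
        return [rng.gauss(mean, ETA) * t + rng.gauss(0.0, S * t ** 0.5)
                for _ in range(n)]
    xm, xn = sample(u), sample(v)
    mm, mn = sum(xm) / n, sum(xn) / n
    var = (sum((x - mm) ** 2 for x in xm)
           + sum((x - mn) ** 2 for x in xn)) / (2 * n)
    return (mm - mn) / var ** 0.5
```

The sampled and analytic values should agree closely at any lag, and both approach d'ASY = (u − v)/η as t grows.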

Equation 10 gives an expression for the growth of d' as a function of time for Memory Set Size 1. For Set Sizes 2 and 4, the decision rule must be taken into account: If at the time of the response signal, there is evidence from at least one process greater than the starting value z, then a "yes" response is produced; otherwise, a "no" response is produced (see Equation A23). Therefore, to calculate d' values for Set Sizes 2 and 4, it is necessary (a) to calculate d' values from Equation 10, (b) to use the d' values to give values of hit rate and false alarm rate, (c) to combine the hit rate and false alarm rate according to the decision rule, and (d) to thus produce a hit rate and false alarm rate for the combined process from which a d' value can then be calculated.

Let p be the hit rate and q the false alarm rate for Set Size 1 at some time t (derived from the d' calculated from Equation 10). Then, for n processes, the false alarm rate w is given by

w = 1 − (1 − q)^n,   (11)

and the probability of a hit r is given by

r = 1 − (1 − p)(1 − q)^(n−1),   (12)

by analogy with Equation A19. Thus, r and w depend on the particular values of p and q, so that for a particular discriminability value d' = (u − v)/η, the discriminability for n processes depends on the criterion chosen, and subjects could maximize the d' value by adjusting the criterion. The dependence of discriminability d' on criterion value is rather an undesirable property; however, the dependence is rather small. Consider d' = 1.85; values of p and q that can give this value are p = .86, .75, and .54 and q = .22, .12, and .04, respectively. Using Equations 11 and 12, values of d' for n = 2 are 1.51, 1.52, and 1.54, respectively, and for n = 4 are 1.14, 1.20, and 1.27. Thus d' is reasonably robust to the choice of criterion. In Reed's study, hit rates were usually high (80% to 90%), so that these larger p values were used in the following fits.
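Steps a through d can be written out directly. The sketch below uses the standard-normal functions in Python's statistics module and treats the choice of p exactly as in the robustness check above.

```python
from statistics import NormalDist

_N = NormalDist()

def dprime_n(d1, n, p):
    """Combine n racing processes under the decision rule (Equations 11
    and 12), given the single-process d' (d1) and a chosen
    single-process hit rate p; returns the combined d'."""
    q = _N.cdf(_N.inv_cdf(p) - d1)               # single-process false alarm rate
    w = 1.0 - (1.0 - q) ** n                     # Equation 11
    r = 1.0 - (1.0 - p) * (1.0 - q) ** (n - 1)   # Equation 12
    return _N.inv_cdf(r) - _N.inv_cdf(w)         # step d: back to d'
```

For d1 = 1.85 and p = .86 this gives about 1.50 at n = 2 and about 1.17 at n = 4, close to the 1.51 and 1.14 quoted in the text (the small differences presumably reflect rounding in the original calculations).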

Figure 20 shows a plot of the accuracy-response-signal-lag data obtained by Reed (1976), together with fits from Equation 10 and d' values for Set Sizes 2 and 4, calculated using Equations 11 and 12. The encoding and response output parameter was 220 msec, and



Figure 20. d' as a function of signal lag for a speed-accuracy experiment by Reed (1976) using response signals in the Sternberg varied-set procedure. (The broken line represents data, and the smooth curves represent fits [derived from Equations 10, 11, and 12].)

d'ASY for Set Size 1 was set at 3.2. It should be noted that each of the three theoretical curves was fixed in shape by the initial choice of s² and η² made at the start of fitting the study-test paradigm, so that shape is described with no free parameters.

Reed (1976) has argued that the response-latency-signal-lag function is particularly useful in discriminating models. These curves show a minimum at some particular lag, depending on type of response (yes-no) and list length. It was argued that these provide evidence for the termination of a discrete process that leads to an accelerated response latency for a short period. The difference between the minimum for "yes" and "no" responses was used to argue for differential termination of the "yes" and "no" decision process. These two results are exactly what the theory predicts if it is now assumed that the diffusion process boundaries are close to the starting point of the process but not close enough to significantly affect the shape of the theoretical accuracy-signal-lag curves. It should be noted that the latency minima correspond to a significant increase in response latency rejections (100 msec < T < 500 msec only accepted; see Table 1 of Reed, 1976), and this may be a problem for interpretation of these results.

Confidence Judgment Procedures

The study-test paradigm employs a 6-point confidence judgment procedure for response recording. Up to this point, the task has been modeled as though it used a yes-no procedure because results from a yes-no procedure show the same pattern of data. In order to model (qualitatively) the confidence judgment procedure, it is necessary to assume a variable internal temporal deadline. In terms of the theory presented here, the only information available about the "strength" of the item during the recognition process is the comparison time. Thus, if a comparison is taking a long time, the subject may reduce his confidence and respond on a lower confidence key. It should be noted that subjects are quite good at responding to external deadlines (Reed, 1976) and so may be able to set internal deadlines. Thus, confidence judgment results may be explained in terms of subject-set internal deadlines on the comparison process.
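The internal-deadline idea can be caricatured in a few lines: confidence is set by how many subject-set deadlines have elapsed before the comparison terminates. The deadline values below are arbitrary illustrations, not estimates from any data.

```python
def confidence(comparison_time_ms, deadlines=(400, 500, 600, 750, 950)):
    """Map a comparison time onto a 6-point confidence scale: each
    internal deadline that passes before the comparison finishes
    lowers confidence by one step (6 = most confident)."""
    passed = sum(1 for d in deadlines if comparison_time_ms > d)
    return 6 - passed
```

A fast comparison (say 300 msec) gets the highest confidence key, while a comparison that drags past all five deadlines gets the lowest, which is the qualitative pattern the deadline account requires.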

Summary

Speed-accuracy trade-off is a phenomenon that can probably be induced in all reaction time experiments. Therefore, any retrieval model that deals adequately with both latency and accuracy must have mechanisms at the heart of the model to deal with this trade-off. The theory developed here was designed with this in mind, and so speed-accuracy relations are fundamental. The theory is quite consistent with results from several speed-accuracy procedures and fits results from the response signal procedure (Reed, 1976) especially well. In the theory, two variance parameters (η² and s²) were fixed throughout fits to previous paradigms. The shape of the accuracy-signal-lag function was determined solely by the ratio of those two parameters (s²/η²), and the fit was excellent.

Discussion

Comparison and Summary of Paradigms

Each of the experimental paradigms considered so far has been discussed more or less in isolation from the others. Therefore, it is appropriate to give a brief summary and comparison of experimental parameters between paradigms. The study-test, Sternberg, prememorized list, and continuous recognition paradigms are all assumed to have the same processing stages (see Figure 1). The encoded representation of the probe is compared in parallel with members of the memory search set. The search set is defined as those items determining processing rate and is to be conceptualized as the set of resonating elements in the framework of a resonance metaphor.

Each comparison process is formalized as a diffusion process. In the diffusion process, evidence for a probe-memory-set item match is accumulated over time: the greater the evidence (the greater the resonant amplitude), the more likely a match; the lower the evidence (the smaller the resonant amplitude), the more likely a nonmatch (see Figure 2). The rate of accumulation of evidence is determined by the relatedness of the probe to a memory-set item (see Figure 3), where relatedness is a single-dimension variable onto which all structural, semantic, and phonemic (etc.) information is mapped. The decision process is self-terminating on a comparison process match, leading to a "yes" response. For a "no" response, all comparison processes must have terminated with a nonmatch.
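The decision rule can be illustrated with a discrete random walk approximation to the diffusion process. All numerical values below (boundaries, starting point, step probabilities) are hypothetical, chosen only to exhibit the self-terminating/exhaustive logic, not fitted parameters:

```python
import random

random.seed(2)
A, Z = 12, 5  # hypothetical match boundary and starting point (0 = nonmatch boundary)

def comparison(p):
    """One probe-memory-set item comparison as a random walk:
    step +1 (toward the match boundary A) with probability p, else -1.
    Returns (matched, n_steps)."""
    x, t = Z, 0
    while 0 < x < A:
        x += 1 if random.random() < p else -1
        t += 1
    return x == A, t

def decide(step_probs):
    """Parallel comparisons: self-terminating on the first match,
    exhaustive (all comparisons must finish) for a "no" response."""
    results = [comparison(p) for p in step_probs]
    match_times = [t for matched, t in results if matched]
    if match_times:
        return "yes", min(match_times)        # first matching comparison terminates
    return "no", max(t for _, t in results)   # slowest nonmatching comparison governs

# One related (target) item among three unrelated items:
response, latency = decide([0.7, 0.4, 0.4, 0.4])
print(response)
```

Because "yes" needs only the fastest matching comparison while "no" waits for the slowest nonmatching one, the sketch reproduces the qualitative asymmetry between positive and negative response latencies.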

The search set was assumed to be the memory set for the study-test, Sternberg, and prememorized list paradigms, though it was noted that this is only an approximation because there is evidence that items from earlier lists enter the comparison process. For the continuous recognition memory paradigm, a search set size of 40 was assumed.

Table 4 shows parameter values for the four item recognition paradigms for the two conditions in each experiment in which the largest value of d' and the smallest value of d' were obtained. The theoretical value of d' represents the difference in the size of resonance between an item in memory and the probe when the probe matches and when the

Table 4
Parameter Values for Four Item Recognition Paradigms for the Two Conditions Giving the Largest and Smallest d' Values

                               Largest d' value              Smallest d' value
Paradigm                     a    z    u    v     d'      a    z    u    v     d'      TER
Study-test (Experiment 1)   .15  .04  .45  -.44  5.0     .16  .04  .19  -.38  3.2(a)  360 msec
Sternberg (Experiment 2)    .24  .1   .55  -.47  5.7     .24  .1   .35  -.47  4.6     260 msec
Prememorized lists          .16  .09  .4   -.55  5.3     .33  .11  .36  -.52  4.9     420 msec(b)
Continuous recognition      .13  .03  .6   -.52  6.2     .13  .03  .35  -.52  4.8     520 msec(c)

Note. (a) In the response signal study by Reed (1976), TER = 230 msec and d' = 3.2. (b) The test words were spoken, and spoken words have a larger stimulus-onset-to-perception time than visually presented words. (c) Stimuli were presented on an IBM typewriter advanced by a solenoid, the pulse triggering the solenoid, which started a timer. There is a delay of approximately 250 msec from timer start to stimulus onset with this apparatus (compare Experiments 1 and 2 of Murdock et al., 1977, with stimuli presented on a computer and stimuli presented on a typewriter).


probe does not match the item. Therefore, to calculate asymptotic d' for any experiment, it is necessary to obtain a hit and false alarm rate from the d' value, to use Equations 11 and 12 to derive hit and false alarm rates for the combination of n processes in the memory set, and then to use the derived hit and false alarm rates to estimate asymptotic d'. It turns out that in all four experiments, asymptotic accuracy is not much more than the accuracy obtained in the experiment, which suggests that subjects are working with an emphasis on accuracy.
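The final step of this calculation, converting a pair of rates back to a d' value, uses the standard equal-variance signal detection relation d' = z(hit rate) - z(false alarm rate). Equations 11 and 12 are not reproduced in this section; the sketch below illustrates only the rate-to-d' conversion, with made-up rates:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse standard normal (the z-transform)

def d_prime(hit_rate, fa_rate):
    """Asymptotic d' from a hit rate and a false alarm rate under the
    standard equal-variance signal detection model."""
    return z(hit_rate) - z(fa_rate)

# Hypothetical derived rates, for illustration only:
print(round(d_prime(0.95, 0.05), 2))  # 3.29
```

A hit rate of .95 against a false alarm rate of .05 gives d' of about 3.3, the same order as the smallest values in Table 4.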

Parameters show no more variation within a paradigm than a factor of two; and between paradigms, variations are no more than a factor of three. Values of v, that is, probe-memory-set item nonmatch relatedness, are relatively invariant across paradigms, showing that subjects tend to set a relatively constant relatedness criterion with respect to the nontarget item relatedness (noise) distribution.

The encoding and response output parameter (TER) seems to fall in the range of 250 msec to 350 msec, so that the other 250 msec to 500 msec is comparison and decision time. This division of processing time seems much more reasonable than that proposed by other models, for example, 35-msec comparison and 400-msec encoding, decision, and response output (Sternberg, 1966) or 5-msec comparison and 600-msec encoding, decision, and response output (Murdock & Anderson, 1975).

It must be apparent that such cross-paradigm comparisons may prove most interesting when the same group of subjects is used in each of the paradigms. Furthermore, such comparisons may prove useful in studying the locus of individual differences.

Relation to Neural Network Models

A large part of this article has been concerned with the development of a theory that integrates a number of experimental paradigms. In this and the following section, the complementary problem is treated, that of the relation between this theory and other more global theories. I consider neural network models in this section and semantic network models in the next section.

Perhaps one of the simpler neural network models is the model developed by J. A. Anderson (1973), which deals with retrieval from short memorized lists. It should be stated at the outset that this model is not to be viewed as definitive but rather as the first step of a continuing and developing research program (Anderson, 1972, 1976; Anderson, Silverstein, Ritz, & Jones, 1977; Borsellino & Poggio, 1973; Cavanagh, 1976; Kohonen, 1976; Kohonen & Oja, 1973). The basic structure of the model is fairly simple. It is assumed that what is of importance is the simultaneous pattern of individual neuron activities in a large group of neurons. The patterns interact in the process of storage, so that memory is simply the sum of past activities in the system. In the recognition process, the probe input trace is matched to the memory. If a certain level of positive evidence is accumulated, a "yes" response is made, and if a certain level of negative evidence is accumulated, a "no" response is produced. The memory is formally represented by a vector of N elements, and a memory trace f_i is represented as a specific vector of length N. If there are K traces, then the memory vector s is constructed by

s = Σ_{i=1}^{K} f_i.    (13)

Then, the match between the input probe f and the memory vector s is given by the dot product f·s (or matched filter); that is, each element of the input vector is multiplied with the corresponding element of the memory vector, and the sum of these values represents the amount of match.
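A toy version of this storage and matching scheme can be written directly from Equation 13. The vector length, trace count, and ±1 element coding below are arbitrary illustrative choices, not values from Anderson's model:

```python
import random

random.seed(1)
N = 1000  # elements per trace vector (arbitrary)
K = 5     # number of stored traces (arbitrary)

def random_trace(n):
    """A trace as a random pattern of +/-1 neuron activities."""
    return [random.choice((-1.0, 1.0)) for _ in range(n)]

def dot(x, y):
    """Dot product (matched filter): sum of elementwise products."""
    return sum(a * b for a, b in zip(x, y))

traces = [random_trace(N) for _ in range(K)]
# Equation 13: the memory vector s is the sum of the stored traces f_i.
s = [sum(elements) for elements in zip(*traces)]

old_match = dot(traces[0], s)        # probe is a studied item
new_match = dot(random_trace(N), s)  # probe is an unstudied item
print(old_match > new_match)
```

Because the stored traces are nearly orthogonal, a studied probe recovers roughly its own squared length from the sum, while an unstudied probe yields only cross-term noise around zero, so old probes give the larger match on average.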

If this description is rephrased slightly, then the recognition model shows remarkable similarity to the retrieval model developed in this article. For example, the comparison process can be viewed as resonance, and if each element of the dot product is accumulated successively through one filter (not two), then the resulting process is a random walk. Thus, the two theories are closely related.

However, there are some problems with Anderson's (1973) model. The model suffers many of the problems of strength theory in that there are no separate records of each memorized item, and activities simply sum, as in strength theory. Therefore, the model is not capable of the performance levels shown by subjects in judgments of frequency (Wells, 1974) and list discrimination (Anderson & Bower, 1972).

I noted earlier that this model is part of a continuing research program, which can be seen in J. A. Anderson (1976) and Anderson et al. (1977). In these articles, it is shown how network models can deal with associations, probability learning, feature analysis and decomposition, and categorical perception. Therefore, it may be a useful future exercise to attempt to relate in some detail these two theories: neural network theory and the retrieval theory presented here.

Relation to Semantic Network Models and Propositional Models

Semantic processing models. There is no doubt that the human memory system is highly structured. One approach to modeling the structural complexity of the system has involved the use of semantic networks. The use of a network representation has been in part a consequence of the appealing digital computer metaphor for the human information processor. However, just because this form of representation is appealing and most useful in setting out structural relations, it does not mean that process models suggested by a network representation are true or useful. This point can be illustrated as follows.

Rips, Shoben, and Smith (1973) developed a set theoretic model of semantic representation. Later, Hollan (1975) pointed out that set theoretic representations can be translated into networks; that is, there exists an isomorphism between the two representations. However, Rips, Shoben, and Smith (1975), in a rejoinder, made the important point that the choice of representation can lead to substantive processing differences. For example, network models explain reaction time effects in terms of the time for retrieval of pathways between nodes, whereas for set theoretic models, reaction time effects are determined by comparisons of semantic elements. Thus, for any network model, there is at least one other isomorphic representation that suggests alternative processing assumptions.

Much of the data used to support semantic processing models makes use of reaction time and, in particular, the statistic mean reaction time. One of the main thrusts of this article has been to demonstrate that mean reaction time is of limited use as a statistic. Also, mean reaction time can often be misleading because models based on mean latency may be contradicted by further analyses of reaction time data (e.g., Ratcliff & Murdock, 1976). Furthermore, in many item recognition studies, it is found that accuracy and latency covary, and this seems to be true in many semantic memory studies (e.g., Collins & Quillian, 1969; Meyer, 1970; Rips et al., 1973). Therefore, any model that purports to account for latency effects should also account for accuracy effects and be capable of dealing with speed-accuracy trade-off.

Process models of retrieval latency that are derived from network models assume that latency is a function of the number of links traversed. Latency for "yes" responses is well modeled using this assumption, but latency for "no" responses poses serious problems. For example, in verification of statements such as "a robin is a mammal," latency can be as much as 200 msec slower than for statements in which the two nouns are more unrelated semantically, for example, "a robin is a car" (Rips et al., 1973). In attempting to verify the latter, one might expect that more links would be searched than in the former (e.g., Collins & Quillian, 1969), but as shown above, the derived prediction is incorrect. It is interesting to note that the relation between latency and accuracy as a function of "semantic relatedness" is similar to the relation between latency and accuracy as a function of relatedness in the retrieval theory developed here; namely, extreme relatedness leads to fast reaction times and high accuracy.

Collins and Loftus (1975) have proposed a comparison process for semantic processing, wherein evidence is accumulated until either a positive or negative criterion is reached, whereupon a response is initiated. This comparison process belongs to the class of random walk processes (as Collins and Loftus point out) that forms the basis of the retrieval theory developed here. Collins and Loftus (1975, Table 1) listed qualitatively different kinds of evidence that can contribute to a decision. I argued earlier that qualitatively different kinds of information contribute to relatedness in item recognition. Therefore, it seems that the two points of view converge, and the theory developed earlier at least qualitatively extends into the domain of semantic processing.

Propositional models. Propositional models (J. R. Anderson, 1976; Anderson & Bower, 1973; Kintsch, 1974; Rumelhart, Lindsay, & Norman, 1972; Schank, 1976; see also Cofer, 1976) are close cousins of semantic network models but have slightly different aims in that the central task is to model sentence or text representation. The ancestry of these models is in linguistic theory, automata theory, and the theory of formal languages (Chomsky, 1963). Therefore, it is not surprising that a major goal of these theories has been sufficiency, that is, the ability of the theory (at least in principle) to deal with and represent the complexities of human language and knowledge. Thus, there has been less attention paid to processes underlying comprehension and retrieval, although the work of J. R. Anderson (1976), Anderson and Bower (1973), and Kintsch (1974, 1975) provides important exceptions. Even so, Anderson and Bower (1973, p. 6), in their formulation, selected processing assumptions on the basis of their ease of implementation on a digital computer. In particular, serial search of network structures was assumed for matching or comparison processes.

J. R. Anderson (1974) developed two mathematical models for retrieval of propositional information. One model assumed independence of processing stages (the usual search model), whereas the other assumed "complete dependence," in which search time was exponentially distributed with the time constant equal to the sum of time constants for individual link search times. Anderson claimed that the two models gave similar predictions for mean latency, but it is easy to see that they give opposite predictions for the behavior of the skewness of the latency distributions. As the number of links increases, for complete independence, skewness decreases, whereas for complete dependence, skewness increases. In general, in word recognition experiments, as mean latency increases, skewness of the distribution increases. Therefore, the complete-dependence assumption seems most consistent with data. The assumption of complete dependence certainly does not represent serial processing. Rather, it is more consonant with the view that the greater the number of links, the less evidence there is for the required relation holding (for in a network theory, eventually a pathway can be found between any pair of nodes). Thus, the assumption of complete dependence in processing seems halfway toward adoption of a decision process similar to the random walk class discussed earlier.

J. R. Anderson (1976) has gone a stage beyond earlier models in that the production system is the first attempt to model control processes in a model that makes close contact with experimental data. However, the processing assumptions underlying comparison processes still reflect the influence of the computer metaphor, and predictions for latency and accuracy come from separate sources (e.g., fan size and deadline, respectively) and are not closely interrelated. This is an important problem because most of the data the model makes contact with are reaction time data, and it can be seen that accuracy and latency covary in many of the data presented (J. R. Anderson, 1976).

This discussion suggests that the retrieval theory presented in this article may be capable of representing the retrieval and decision processes involved in processing semantic and more highly structured information (propositional and text). Thus, a task for the future is the integration of the retrieval theory and theories of semantic and text memory structure.

General Conclusions

I have presented a theory of memory retrieval that not only applies over a range of paradigms but also deals with experimental data in greater depth and more detail than competing models. The theory provides a rationale for relating accuracy, mean reaction time, error latency, and reaction time distributions. Such relations have not been dealt with explicitly before. Because the theory applies over a range of experimental paradigms, it provides a basis for the claim that the same processes are involved in each of these paradigms, allowing theoretical comparisons to be made between paradigms, and thus sets the stage for experimental comparison of paradigms. Also, the theory makes contact with several other more general theories, namely, neural network, semantic memory, and propositional models. The success of the theory so far, and the potential shown for integration with other general models, leads to the hope that some measure of theoretical unification may soon be achieved.

Reference Notes

1. Sternberg, S. Estimating the distribution of additive reaction-time components. Paper presented at the meeting of the Psychometric Society, Niagara Falls, Ontario, Canada, October 1964.

2. Sternberg, S. Evidence against self-terminating memory search from properties of RT distributions. Paper presented at the meeting of the Psychonomic Society, St. Louis, November 1973.

3. Ratcliff, R. Group reaction time distributions and an analysis of distribution statistics. Paper presented at the Mathematical Psychology meeting, Los Angeles, August 1977.

4. Reed, A. V. Discriminability spectra in bilingual recognition. Paper presented at the Mathematical Psychology meeting, Los Angeles, August 1977.

References

Allport, D. A. Critical notice: The state of cognitive psychology. Quarterly Journal of Experimental Psychology, 1975, 27, 141-152.

Anderson, J. A. A simple neural network generating an interactive memory. Mathematical Biosciences, 1972, 14, 197-220.

Anderson, J. A. A theory for the recognition of items from short memorized lists. Psychological Review, 1973, 80, 417-438.

Anderson, J. A. Neural models with cognitive implications. In D. LaBerge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension. Hillsdale, N.J.: Erlbaum, 1976.

Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 1977, 84, 413-451.

Anderson, J. R. Retrieval of propositional information from long-term memory. Cognitive Psychology, 1974, 6, 451-474.

Anderson, J. R. Language, memory and thought. Hillsdale, N.J.: Erlbaum, 1976.

Anderson, J. R., & Bower, G. H. Recognition and retrieval processes in free recall. Psychological Review, 1972, 79, 97-123.

Anderson, J. R., & Bower, G. H. Human associative memory. Washington, D.C.: Winston, 1973.

Atkinson, R. C., Herrmann, D. J., & Wescourt, K. T. Search processes in recognition memory. In R. L. Solso (Ed.), Theories in cognitive psychology: The Loyola Symposium. Hillsdale, N.J.: Erlbaum, 1974.

Atkinson, R. C., Holmgren, J. E., & Juola, J. F. Processing time as influenced by the number of elements in a visual display. Perception & Psychophysics, 1969, 6, 321-327.

Atkinson, R. C., & Juola, J. F. Factors influencing speed and accuracy of word recognition. In S. Kornblum (Ed.), Attention and performance IV. New York: Academic Press, 1973.

Atkinson, R. C., & Shiffrin, R. M. Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 2). New York: Academic Press, 1968.

Baddeley, A. D., & Ecob, J. R. Reaction time and short-term memory: Implications of repetition effects for the high-speed exhaustive scan hypothesis. Quarterly Journal of Experimental Psychology, 1973, 25, 229-240.

Banks, W. P., & Atkinson, R. C. Accuracy and speed strategies in scanning active memory. Memory & Cognition, 1974, 2, 629-636.

Bobrow, D. G. Dimensions of representation. In D. G. Bobrow & A. M. Collins (Eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press, 1975.

Borsellino, A., & Poggio, T. Convolution and correlation algebras. Kybernetik, 1973, 13, 113-122.

Briggs, G. E. On the predictor variable for choice reaction time. Memory & Cognition, 1974, 2, 575-580.

Burrows, D., & Okada, R. Serial position effects in high-speed memory search. Perception & Psychophysics, 1971, 10, 305-308.

Burrows, D., & Okada, R. Memory retrieval from long and short lists. Science, 1975, 188, 1031-1033.

Cavanagh, J. P. Holographic and trace strength models of rehearsal effects in the item recognition task. Memory & Cognition, 1976, 4, 186-199.

Chomsky, N. Formal properties of grammars. In R. D. Luce, R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 2). New York: Wiley, 1963.

Churchill, R. V. Fourier series and boundary value problems (2nd ed.). New York: McGraw-Hill, 1963.

Cofer, C. N. The structure of human memory. San Francisco: Freeman, 1976.

Collins, A. M., & Loftus, E. F. A spreading-activation theory of semantic processing. Psychological Review, 1975, 82, 407-428.

Collins, A. M., & Quillian, M. R. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 240-247.

Collins, A. M., & Quillian, M. R. Does category size affect categorization time? Journal of Verbal Learning and Verbal Behavior, 1970, 9, 432-438.

Corballis, M. C. Serial order in recognition and recall. Journal of Experimental Psychology, 1967, 74, 99-105.

Corballis, M. C. Access to memory: An analysis of recognition times. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention and performance V. New York: Academic Press, 1975.

Corballis, M. C., Kirby, J., & Miller, A. Access to elements of a memorized list. Journal of Experimental Psychology, 1972, 94, 185-190.

Cox, D. R., & Miller, H. D. The theory of stochastic processes. London: Methuen, 1965.

Dosher, B. A. The retrieval of sentences from memory: A speed-accuracy study. Cognitive Psychology, 1976, 8, 291-310.

Feller, W. An introduction to probability theory and its applications (Vol. 1, 3rd ed.). New York: Wiley, 1968.

Forrin, B., & Cunningham, K. Recognition time and serial position of probed item in short-term memory. Journal of Experimental Psychology, 1973, 99, 272-279.

Forrin, B., & Morin, R. E. Recognition times for items in short- and long-term memory. Acta Psychologica, 1969, 30, 126-141.

Hollan, J. D. Features and semantic memory: Set-theoretic or network model? Psychological Review, 1975, 82, 154-155.

Huesmann, L. R., & Woocher, F. D. Probe similarity and recognition of set membership: A parallel-processing serial-feature-matching model. Cognitive Psychology, 1976, 8, 124-162.

Juola, J. F., Fischler, I., Wood, C. T., & Atkinson, R. C. Recognition time for information stored in long-term memory. Perception & Psychophysics, 1971, 10, 8-14.

Kintsch, W. The representation of meaning in memory. Hillsdale, N.J.: Erlbaum, 1974.

Kintsch, W. Memory representations of text. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Hillsdale, N.J.: Erlbaum, 1975.

Kohonen, T. Associative memory: A system-theoretical approach. Berlin, West Germany: Springer-Verlag, 1976.

Kohonen, T., & Oja, E. Fast adaptive formation of orthogonalizing filters and associative memory in recurrent networks of neuron-like elements. Biological Cybernetics, 1973, 21, 85-95.

Krantz, D. H. Threshold theories of signal detection. Psychological Review, 1969, 76, 308-324.

Laming, D. R. Information theory of choice reaction time. New York: Wiley, 1968.

Link, S. W. The relative judgment theory of choice reaction time. Journal of Mathematical Psychology, 1975, 12, 114-135.

Link, S. W., & Heath, R. A. A sequential theory of psychological discrimination. Psychometrika, 1975, 40, 77-105.

Lively, B. L. Speed/accuracy trade off and practice as determinants of stage durations in a memory-search task. Journal of Experimental Psychology, 1972, 96, 97-103.

Luce, R. D. A threshold theory for simple detection experiments. Psychological Review, 1963, 70, 61-79.

Marcel, A. J. Negative set effects in character classification: A response-retrieval view of reaction time. Quarterly Journal of Experimental Psychology, 1977, 29, 31-48.

Meyer, D. E. On the representation and retrieval of stored semantic information. Cognitive Psychology, 1970, 1, 242-300.

Miller, J. O., & Pachella, R. G. Locus of the stimulus probability effect. Journal of Experimental Psychology, 1973, 101, 227-231.

Morin, R. E., DeRosa, D. V., & Stultz, V. Recognition memory and reaction time. In A. F. Sanders (Ed.), Attention and performance I. Amsterdam: North-Holland, 1967.

Morton, J., Marcus, S., & Frankish, C. Perceptual centers (P-centers). Psychological Review, 1976, 83, 405-408.

Murdock, B. B., Jr. Response latencies in short-term memory. Quarterly Journal of Experimental Psychology, 1968, 20, 79-82.

Murdock, B. B., Jr. Human memory: Theory and data. Potomac, Md.: Erlbaum, 1974.

Murdock, B. B., Jr., & Anderson, R. E. Encoding, storage and retrieval of item information. In R. L. Solso (Ed.), Theories in cognitive psychology: The Loyola Symposium. Hillsdale, N.J.: Erlbaum, 1975.

Murdock, B. B., Jr., & Dufty, P. O. Strength theory and recognition memory. Journal of Experimental Psychology, 1972, 94, 284-290.

Murdock, B. B., Jr., Hockley, W. E., & Muter, P. M. Two tests of the conveyor belt model for item recognition. Canadian Journal of Psychology, 1977, 31, 71-89.

Neisser, U. Cognitive psychology. New York: Appleton-Century-Crofts, 1967.

Newell, A. You can't play 20 questions with nature and win: Projective comments on the papers of this symposium. In W. G. Chase (Ed.), Visual information processing. New York: Academic Press, 1973.

Nickerson, R. S. Binary-classification reaction time: A review of some studies of human information-processing capabilities. Psychonomic Monograph Supplements, 1972, 4(17, Whole No. 65).

Norman, D. A., & Rumelhart, D. E. A system for perception and memory. In D. A. Norman (Ed.), Models of human memory. New York: Academic Press, 1970.

Okada, R. Decision latencies in short-term recognition memory. Journal of Experimental Psychology, 1971, 90, 27-32.

Pachella, R. G. An interpretation of reaction time in information processing research. In B. Kantowitz (Ed.), Human information processing: Tutorials in performance and cognition. Hillsdale, N.J.: Erlbaum, 1974.

Pike, R. Response latency models for signal detection. Psychological Review, 1973, 80, 53-68.

Ratcliff, R., & Murdock, B. B., Jr. Retrieval processes in recognition memory. Psychological Review, 1976, 83, 190-214.

Reed, A. V. Speed-accuracy trade-off in recognition memory. Science, 1973, 181, 574-576.

Reed, A. V. List length and the time course of recognition in immediate memory. Memory & Cognition, 1976, 4, 16-30.

Rips, L. J., Shoben, E. J., & Smith, E. E. Semantic distance and the verification of semantic relations. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 1-20.

Rips, L. J., Shoben, E. J., & Smith, E. E. Set-theoretic and network models reconsidered: A comment on Hollan's "Features and semantic memory." Psychological Review, 1975, 82, 156-157.

Rumelhart, D. E., Lindsay, P. H., & Norman, D. A. A process model for long-term memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.

Schank, R. C. The role of memory in language processing. In C. Cofer (Ed.), The structure of human memory. San Francisco: Freeman, 1976.

Scheirer, C. J., & Hanley, M. J. Scanning for similar and different material in short- and long-term memory. Journal of Experimental Psychology, 1974, 102, 343-346.

Schneider, W., & Shiffrin, R. M. Controlled and automatic human information processing. Psychological Review, 1977, 84, 1-66.

Schouten, J. F., & Bekker, J. A. M. Reaction time and accuracy. Acta Psychologica, 1967, 27, 143-153.

Schulman, H. G. Encoding and retention of semantic and phonemic information in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 499-508.

Seward, G. H. Recognition time as a measure of confidence. Archives of Psychology, 1928, 99, 723. (Cited in Woodworth, R. S. Experimental psychology. New York: Holt, 1938.)

Smith, E. E., Shoben, E. J., & Rips, L. J. Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 1974, 81, 214-241.

Sternberg, S. High-speed scanning in human memory. Science, 1966, 153, 652-654.

Sternberg, S. The discovery of processing stages: Extensions of Donders' method. In W. G. Koster (Ed.), Attention and performance II. Amsterdam: North-Holland, 1969. (a)

Sternberg, S. Memory-scanning: Mental processes revealed by reaction-time experiments. American Scientist, 1969, 57, 421-457. (b)

Sternberg, S. Memory scanning: New findings and current controversies. Quarterly Journal of Experimental Psychology, 1975, 27, 1-32.

Stone, M. Models for choice reaction time. Psychometrika, 1960, 25, 251-260.

Theios, J., Smith, P. G., Haviland, S. E., Traupmann, J., & Moy, M. C. Memory scanning as a serial self-terminating process. Journal of Experimental Psychology, 1973, 97, 323-336.

Theios, J., & Walter, D. G. Stimulus and response frequency and sequential effects in memory scanning reaction times. Journal of Experimental Psychology, 1974, 102, 1092-1099.

Tikhonov, A., & Samarskii, A. A. Equations of mathematical physics. New York: Macmillan, 1963.

Townsend, J. T. A note on the identifiability of parallel and serial processes. Perception & Psychophysics, 1971, 10, 161-163.

Townsend, J. T. Some results concerning the "identifiability" of parallel and serial processes. British Journal of Mathematical and Statistical Psychology, 1972, 25, 168-199.

Townsend, J. T. Issues and models concerning the processing of a finite number of inputs. In B. H. Kantowitz (Ed.), Human information processing: Tutorials in performance and cognition. Hillsdale, N.J.: Erlbaum, 1974.

Tulving, E., & Bower, G. H. The logic of memory representations. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8). New York: Academic Press, 1974.

Wells, J. E. Strength theory and judgments of recency and frequency. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 378-392.

Wescourt, K. T., & Atkinson, R. C. Scanning for information in short-term and long-term memory. Journal of Experimental Psychology, 1973, 98, 95-101.

Wescourt, K. T., & Atkinson, R. C. Fact retrieval processes in human memory. In W. K. Estes (Ed.), Handbook of learning and cognitive processes (Vol. 4). Hillsdale, N.J.: Erlbaum, 1976.

Wickelgren, W. A. Dynamics of retrieval. In D. Deutsch & J. A. Deutsch (Eds.), Short-term memory. New York: Academic Press, 1975.

Wickelgren, W. A. Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica, 1977, 41, 67-85.

Wickelgren, W. A., & Corbett, A. T. Associative interference and retrieval dynamics in yes-no recall and recognition. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 189-202.

Wickelgren, W. A., & Norman, D. A. Strength models and serial position in short-term recognition memory. Journal of Mathematical Psychology, 1966, 3, 316-347.

Appendix

In this section, the mathematical model used to relate the theory to data is formulated. The organization is as follows: First, the theory of the random walk (gambler's ruin problem) is described as an introduction to the diffusion process. Second, the theory of the diffusion process is outlined, and two ways of deriving first-passage time distributions and error rates are indicated. Third, equations for the maximum of n processes (exhaustive processing) and the minimum of n processes (self-terminating processing) are derived.

In the development of any model, one has to select aspects of the data, that is, summary statistics, as points of contact between the model and data. In item recognition, the statistic generally used is mean reaction time, but as argued earlier, mean error rate and reaction time distribution statistics are more useful. Thus, the following development is aimed primarily at deriving expressions for distributions and error rates.

Random Walk

Traditionally, the theory of this process is centered around the example of a gambler who wins or loses a dollar with probabilities p and q, respectively. His initial capital is Z and his opponent's capital is A − Z. The game terminates when his capital becomes A or 0 dollars (Feller, 1968, p. 342).

It is simple to rephrase the problem in terms of a feature-matching, memory comparison process (cf. Huesmann & Woocher, 1976). Suppose the features of a probe item are compared with the features of a memory item, and the probability of a match is p and of a nonmatch is q (= 1 − p). Then, if Z more nonmatches than matches are required for a probe-memory item nonmatch, and A − Z more matches than nonmatches are required for a probe-memory item match, this process is isomorphic to the gambler's ruin problem. I shall discuss the random walk within the feature-match framework.

One of the main methods for finding solutions for stochastic processes consists of deriving difference equations for the process and solving these equations with respect to certain boundary conditions. In our feature comparison process, let q_Z be the probability of a probe-item nonmatch, given starting point Z, match at A, and nonmatch at 0. Then, after the first trial, either Z − 1 or Z + 1 steps are required for termination with a nonmatch. Thus, the probability of a nonmatch is given by

q_Z = p q_{Z+1} + q q_{Z−1},   (A1)

provided 1 < Z < A − 1. For Z = 1 and Z = A − 1, the first trial may lead to termination, so we define

q_0 = 1   and   q_A = 0   (A2)

to be boundary conditions on the random walk. Thus, the probability of nonmatch q_Z satisfies Equations A1 and A2.

Using the method of particular solutions, the solution to Equations A1 and A2 for the probability of a nonmatch is given by

q_Z = [(q/p)^A − (q/p)^Z] / [(q/p)^A − 1],   when q ≠ p,

q_Z = 1 − Z/A,   when q = p.   (A3)

This is the required expression for error rate. The next result needed is the first-passage time distribution for a nonmatch (the distribution of the number of steps to a nonmatch).
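The error-rate expression can be checked against a direct simulation of the feature-matching walk. The following is a minimal modern sketch (Python); the function names and parameter values are illustrative, not part of the original development:

```python
import random

def nonmatch_prob(p, Z, A):
    """Closed-form probability that the walk is absorbed at 0 (Equation A3)."""
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        return 1.0 - Z / A          # symmetric case, q = p
    r = q / p
    return (r ** A - r ** Z) / (r ** A - 1.0)

def simulate_nonmatch(p, Z, A, trials=50_000, seed=1):
    """Monte Carlo estimate of the same absorption probability."""
    rng = random.Random(seed)
    absorbed_at_zero = 0
    for _ in range(trials):
        z = Z
        while 0 < z < A:            # step +1 with probability p, else -1
            z += 1 if rng.random() < p else -1
        absorbed_at_zero += (z == 0)
    return absorbed_at_zero / trials
```

With p = .6, Z = 3, A = 6, the closed form gives about .229, and the simulated proportion agrees closely.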

Let g_{Z,n} denote the probability that the process ends with the nth step at the barrier 0 (nonmatch at the nth feature comparison). After the first step, the position is Z + 1 or Z − 1 for 1 < Z < A − 1 and n ≥ 1; thus,

g_{Z,n+1} = p g_{Z+1,n} + q g_{Z−1,n}.   (A4)

Boundary conditions for this difference equation are given by

g_{0,n} = g_{A,n} = 0,   for n ≥ 1,

and   (A5)

g_{0,0} = 1,   g_{Z,0} = 0, when Z > 0.

Then, the difference Equation A4, with boundaries given by Equation A5, holds for all 0 < Z < A and all n ≥ 0.

The solution to Equations A4 and A5 is given by

g_{Z,n} = (2^n / A) p^{(n−Z)/2} q^{(n+Z)/2} Σ_{k=1}^{A−1} cos^{n−1}(πk/A) sin(πk/A) sin(πkZ/A).   (A6)

The derivation of Equation A6 is shown in Feller (1968, chap. 14) and involves deriving the probability generating function for first-passage times, then performing a partial fraction expansion on that expression.

In order to obtain the expressions equivalent to Equations A3 and A6 for a probe-memory item match, q and p are interchanged and Z becomes A − Z. It should also be noted that g_{Z,n} is not a probability density function, but the sum of the first-passage time probabilities for matches and nonmatches is a probability density function.
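The series solution can be checked against the difference equations themselves by propagating occupation probabilities. A sketch (Python, illustrative only; the spectral sum is taken over k = 1, ..., A − 1):

```python
import math

def g_series(Z, n, p, A):
    """Spectral series for first passage to barrier 0 at step n (Equation A6)."""
    q = 1.0 - p
    s = sum(math.cos(math.pi * k / A) ** (n - 1)
            * math.sin(math.pi * k / A)
            * math.sin(math.pi * k * Z / A)
            for k in range(1, A))
    return (2.0 ** n / A) * p ** ((n - Z) / 2) * q ** ((n + Z) / 2) * s

def g_recursion(Z, n, p, A):
    """Propagate interior occupation probabilities (Equations A4, A5)."""
    q = 1.0 - p
    occ = [0.0] * (A + 1)
    occ[Z] = 1.0
    absorbed = 0.0
    for _ in range(n):
        new = [0.0] * (A + 1)
        for z in range(1, A):
            new[z + 1] += p * occ[z]
            new[z - 1] += q * occ[z]
        absorbed = new[0]          # mass reaching 0 on exactly this step
        new[0] = new[A] = 0.0      # absorbed mass does not move again
        occ = new
    return absorbed
```

For example, with Z = 1, A = 3, the only path absorbing at 0 on step 3 is up-down-down, with probability p q², and both functions return that value.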

Diffusion Process

It was noted earlier that the discrete random walk could be used to represent a feature comparison process. However, I have chosen to use the continuous random walk (Wiener diffusion) process for several reasons. First, my conceptual bias is to think of information or evidence accumulation as a continuous process rather than a discrete process. Second, for the continuous random walk, drift and variance in drift are independent, and because these and other parameters of the model are relatively decoupled, manipulation is simpler. Third, it is faster and cheaper to integrate numerically over continuous functions than to sum over discrete functions.

Before proceeding, I will introduce the following convention: F and G will represent first-passage time distribution functions, and f and g will represent first-passage time density functions. These are not normalized (i.e., are not probability distributions or density functions), but they may be normalized by division by the appropriate error or correct probability γ. On the other hand, H and h will represent proper probability distribution and density functions. Subscripts + and − will refer to the boundaries giving match and nonmatch, respectively.

It was noted earlier that the Wiener diffusion process (Brownian motion) is the result of taking the limit as step size tends toward zero in the random walk process. This limiting process will now be described.

In order to apply the random walk in the case where the number of events per unit time is large, but the effect of each event is small (e.g., molecular collisions in Brownian motion), one can take the discrete expressions and find the limit directly. Let the length of the individual steps δ be small, the number of steps per unit time r be large, and (p − q) be small, with the constraint that the following expressions approach finite limits:

ξ = lim (p − q)δr   and   s² = lim 4pq δ²r.   (A7)

It is necessary for the two expressions to have finite limits. For example, if s² → 0, then the process is deterministic; if s² → ∞, then the process is infinitely variable, and the probability of terminating at each boundary is 1/2 with zero time delay. If ξ → 0, there is no drift; if ξ → ±∞, there is no time delay before the boundary is reached. The probability of a nonmatch is obtained by taking the limit in Equation A3 and is given by

γ₋(ξ) = [e^{−2ξa/s²} − e^{−2ξz/s²}] / [e^{−2ξa/s²} − 1],   (A8)

where Z → z/δ and A → a/δ. γ₊(ξ), the probability of a match, can be obtained by replacing z with a − z and ξ with −ξ. Then, γ₊ + γ₋ = 1.
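The limiting expression can be verified against the discrete formula: with a small step δ, setting p = ½(1 + ξδ/s²), Z = z/δ, and A = a/δ, Equation A3 approaches γ₋(ξ). A sketch (Python; parameter values are illustrative):

```python
import math

def gamma_minus(xi, z, a, s2):
    """Probability of terminating at the nonmatch boundary (Equation A8)."""
    if abs(xi) < 1e-12:
        return 1.0 - z / a          # zero-drift limit
    e = math.exp
    return (e(-2 * xi * a / s2) - e(-2 * xi * z / s2)) / (e(-2 * xi * a / s2) - 1.0)

# Discrete gambler's-ruin approximation (Equation A3) with step size delta
xi, z, a, s2, delta = 0.1, 1.0, 2.0, 1.0, 0.01
p = 0.5 * (1.0 + xi * delta / s2)   # small per-step bias giving drift xi
r = (1.0 - p) / p
Z, A = round(z / delta), round(a / delta)
q_Z = (r ** A - r ** Z) / (r ** A - 1.0)
```

Here q_Z and gamma_minus(0.1, 1, 2, 1) agree to several decimal places, as the limiting argument requires.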

The limiting form of the first-passage time distribution is given by

g₋(t, ξ) = (πs²/a²) e^{−ξz/s²} Σ_{k=1}^{∞} k sin(πkz/a) exp[−½(ξ²/s² + π²k²s²/a²)t].   (A9)

g₋(t, ξ) is a function of t, a, z, ξ, and s², but it is only necessary to express it as a function of t and ξ for further calculations in this section. The first-passage time distribution for a match g₊(t, ξ) can be found by setting ξ = −ξ and z = a − z in Equation A9. Again, note that g₋(t, ξ) and g₊(t, ξ) are not probability density functions, but g₊(t, ξ) + g₋(t, ξ), g₋(t, ξ)/γ₋(ξ), and g₊(t, ξ)/γ₊(ξ) are probability density functions.
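One quick numerical check on Equation A9 is that the defective density, integrated over time, recovers the nonmatch probability of Equation A8. A sketch (Python; truncation point, grid, and parameter values are illustrative choices):

```python
import math

def g_minus(t, xi, z, a, s2, kmax=200):
    """Defective first-passage density at the nonmatch boundary (Equation A9)."""
    pref = (math.pi * s2 / a ** 2) * math.exp(-xi * z / s2)
    tot = 0.0
    for k in range(1, kmax + 1):
        lam = 0.5 * (xi ** 2 / s2 + k ** 2 * math.pi ** 2 * s2 / a ** 2)
        tot += k * math.sin(math.pi * k * z / a) * math.exp(-lam * t)
    return pref * tot

def gamma_minus(xi, z, a, s2):
    """Closed-form nonmatch probability (Equation A8), for comparison."""
    e = math.exp
    return (e(-2 * xi * a / s2) - e(-2 * xi * z / s2)) / (e(-2 * xi * a / s2) - 1.0)

# Midpoint-rule integration of the density over (0, 30]
xi, z, a, s2 = 0.1, 1.0, 2.0, 1.0
h = 0.01
ts = [0.005 + h * i for i in range(3000)]
area = sum(h * g_minus(t, xi, z, a, s2) for t in ts)
```

With these values the integrated density and the closed-form γ₋ agree to about three decimal places, as they should.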

It was stated earlier that Equation A9, the first-passage time distribution, can be derived by solving the diffusion equation. The diffusion equation can be derived from Equation A4, using the limiting expressions in Equation A7 for ξ and s². Equation A4 becomes

∂g₋(t, ξ)/∂t = (s²/2) ∂²g₋(t, ξ)/∂z² + ξ ∂g₋(t, ξ)/∂z.   (A10)

Methods for the solution of this equation (Cox & Miller, 1965) lead into the enormous literature on the solution of partial differential equations (Churchill, 1963; Tikhonov & Samarskii, 1963). By substituting g₋(t, ξ) from Equation A9 in Equation A10, it can be seen that g₋(t, ξ) is a solution of Equation A10.

The first-passage time density function g₋(t, ξ) turns out not to be the best expression to use in evaluating the maximum and minimum of a number of similar processes. Instead, the first-passage time distribution function

G₋(t, ξ) = ∫₀ᵗ g₋(η, ξ) dη   (A11)

proves most useful. Now, integrating Equation A9 term by term,

G₋(t, ξ) = (πs²/a²) e^{−ξz/s²} Σ_{k=1}^{∞} 2k sin(πkz/a) [1 − exp(−½(ξ²/s² + π²k²s²/a²)t)] / (ξ²/s² + π²k²s²/a²);

however, this series converges slowly, and a more useful alternative is given by

G₋(t, ξ) = γ₋(ξ) − (πs²/a²) e^{−ξz/s²} Σ_{k=1}^{∞} 2k sin(πkz/a) exp[−½(ξ²/s² + π²k²s²/a²)t] / (ξ²/s² + π²k²s²/a²).   (A12)

From this expression, it can be seen that G₋(t, ξ) tends to the nonmatch probability γ₋(ξ) as t becomes large.
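The distribution-function series behaves as a defective distribution should: it increases in t and saturates at γ₋. A sketch (Python; truncation and parameters illustrative):

```python
import math

def G_minus(t, xi, z, a, s2, kmax=200):
    """Defective first-passage distribution function (Equation A12)."""
    e = math.exp
    gamma = (e(-2 * xi * a / s2) - e(-2 * xi * z / s2)) / (e(-2 * xi * a / s2) - 1.0)
    pref = (math.pi * s2 / a ** 2) * e(-xi * z / s2)
    tail = 0.0
    for k in range(1, kmax + 1):
        lam2 = xi ** 2 / s2 + k ** 2 * math.pi ** 2 * s2 / a ** 2  # = 2 * lambda_k
        tail += 2 * k * math.sin(math.pi * k * z / a) * e(-0.5 * lam2 * t) / lam2
    return gamma - pref * tail
```

At t = 0 the truncated series is near zero, and by t = 30 (with z = 1, a = 2, s² = 1, ξ = .1) it is indistinguishable from γ₋(ξ).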

Relatedness and Diffusion

The main assumption made to relate diffusion and relatedness is that relatedness is represented by drift ξ in the diffusion process. Thus, the greater the relatedness, the greater is the value of drift; the less the relatedness, the less the drift. The scale on which relatedness is measured is not absolute; rather, it is only sensible to talk about the degree of discriminability. The subject sets the zero point (or criterion) of relatedness, so that the average drift for a matching comparison is toward the match boundary, and the average drift for a nonmatching comparison is toward the nonmatch boundary. The scale of relatedness is linear in drift, and therefore, the scale is defined in terms of the comparison process. There are many other possible scales and relations between relatedness and drift, but this one seems the most natural. I will use ξ to represent relatedness of probe-memory-set item match comparisons and v to represent relatedness of probe-memory-set item nonmatch comparisons, where v usually has the opposite sign to ξ.

Decision Process

The decision process is self-terminating on comparison process matches and exhaustive on nonmatches. The distributions of "yes" and "no" responses therefore correspond to the distributions of minimum and maximum completion times of the n comparison processes.

Maximum of n processes. Let T₁, . . . , Tₙ be independent observations on n processes, with probability density functions h₁(t), . . . , hₙ(t) and distribution functions H₁(t), . . . , Hₙ(t). Let us assume each process terminates with probability one. Then, the probability distribution function of T = max_{1≤i≤n} Tᵢ is

H_max(t) = Pr(max_{1≤i≤n} Tᵢ ≤ t)
= Pr(T₁ ≤ t, . . . , Tₙ ≤ t)
= Pr(T₁ ≤ t) . . . Pr(Tₙ ≤ t)   (assuming independence)
= Π_{i=1}^{n} Hᵢ(t).   (A13)

Then, the probability density function of the maximum of n processes is given by

h_max(t) = (d/dt)H_max(t) = [Π_{i=1}^{n} Hᵢ(t)] Σ_{i=1}^{n} [hᵢ(t)/Hᵢ(t)].   (A14)
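The product rule of Equation A13 can be illustrated with any convenient completion-time distributions; exponentials are used below purely as an arbitrary example (Python sketch):

```python
import math
import random

def H_max(t, rates):
    """Distribution function of the max of independent exponentials (Equation A13)."""
    prod = 1.0
    for lam in rates:
        prod *= 1.0 - math.exp(-lam * t)   # H_i(t) for rate lam
    return prod

def simulate_max_cdf(t, rates, trials=50_000, seed=2):
    """Monte Carlo estimate of Pr(max T_i <= t)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        m = max(rng.expovariate(lam) for lam in rates)
        count += (m <= t)
    return count / trials
```

The simulated proportion and the product of the component distribution functions agree within sampling error.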

For misses and correct rejections, we are not now dealing with probability distribution functions, because sometimes a process may terminate at the wrong boundary. Therefore, it is necessary to find the maximum of n processes, given that there are different probabilities of termination associated with each process. Let G_{i−}(t) be the first-passage time distribution function for a nonmatch of process i, and G_max(t) be the first-passage time distribution function for the maximum of the n nonmatch processes. Then,

G_max(t) = Π_{i=1}^{n} G_{i−}(t),   (A15)

from Equation A13. Also,

γ₋ = Π_{i=1}^{n} γ_{i−},   (A16)

where γ_{i−} is the probability of the ith process nonmatch, and γ₋ is the probability that all of the comparison processes produce nonmatches.

For correct rejections, the model will be applied with each comparison assumed to have the same first-passage time distribution G₋(t, v). Thus, for correct rejections (CR), the first-passage time distribution function F_CR(t) is given by

F_CR(t) = G₋ⁿ(t, v).   (A17)

Also, for the single nonmatching process when a memory-set item is probed, let the first-passage time distribution function be G₋(t, ξ); thus, for misses (M),

F_M(t) = G₋(t, ξ) G₋^{n−1}(t, v).   (A18)

Minimum of n processes. For "yes" responses (hits and false alarms), only one comparison process has to terminate with a match; therefore, all combinations of the number of match terminations have to be considered. For example, if only one match occurs, then the time for that comparison is the time for the decision; if two matches occur, then the decision time is the minimum of those two comparison times. Therefore, it is necessary to sum over all possible combinations of processes finishing. First, let us consider the minimum of n processes finishing. Then, the distribution function H_min(t) of T = min_{1≤i≤n} Tᵢ is found as follows:

Pr(T > t) = Pr(T₁ > t, . . . , Tₙ > t)
= Pr(T₁ > t) . . . Pr(Tₙ > t).

Therefore,

H_min(t) = 1 − Π_{i=1}^{n} [1 − Hᵢ(t)].   (A19)

Let F_min(t) be the first-passage time distribution function (not a probability distribution function) for the self-terminating process, and let pᵢ be the probability of a match for process i. Then,

Pr(T > t) = Σ_{j=0}^{n} Σ_{all subsets of size j} Pr(T > t | processes i₁, . . . , i_j match) Π_{k ∈ subset} p_k Π_{k ∉ subset} (1 − p_k)

= Σ_{j=0}^{n} Σ_{all subsets of size j} Π_{k ∈ subset} [p_k − G_{k+}(t)] Π_{k ∉ subset} (1 − p_k).   (A20)

(The j = 0 term, Π_k (1 − p_k), is the probability that no comparison ever matches, in which case no "yes" response occurs; the first-passage time distribution function for the "yes" response is then F_min(t) = 1 − Pr(T > t).)

using Equation A19. As before, when the probe was not a memory-set item, all nonmatch comparison first-passage time distribution functions G₊(t, v) are assumed equal. Then, for false alarms (FA), Equation A20 collapses binomially to give

F_FA(t) = 1 − [1 − G₊(t, v)]ⁿ = nG₊(t, v) − [n(n − 1)/2]G₊²(t, v) + · · · ,   (A21)

where G₊(t, v) can be found by setting z = a − z and v = −v in Equation A12.
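The binomial collapse of the subset sum can be verified by brute-force enumeration. A sketch (Python; the p and G values are generic placeholders, not fitted quantities):

```python
from itertools import combinations

def survivor_subset_sum(p_list, G_list):
    """Right side of Equation A20: sum over all subsets of matching processes."""
    n = len(p_list)
    total = 0.0
    for j in range(n + 1):                  # j = number of processes that match
        for S in combinations(range(n), j):
            term = 1.0
            for k in range(n):
                term *= (p_list[k] - G_list[k]) if k in S else (1.0 - p_list[k])
            total += term
    return total
```

Algebraically the sum factors as Π_k [1 − G_k(t)], so for n identical nonmatch processes it reduces to [1 − G₊(t, v)]ⁿ, the survivor form behind Equation A21.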

Let the first-passage time distribution of a probe being compared with a memory-set item (i.e., for a match) be G₊(t, ξ). Then, for hits (H), the first-passage time distribution function is

F_H(t) = 1 − [1 − G₊(t, ξ)][1 − G₊(t, v)]^{n−1} = G₊(t, ξ) + (n − 1)G₊(t, v) − · · · .   (A22)

At this stage, we should note that both γ₊(v) and γ₋(ξ) are usually small (.1 or less) and G±(t) tends to γ± as t becomes large; therefore, Equations A21 and A22 are adequately approximated by the first few terms in the series.

The probabilities of a correct rejection and miss are given by

P_CR = γ₋ⁿ(v) = 1 − P_FA

and   (A23)

P_M = γ₋(ξ) γ₋^{n−1}(v) = 1 − P_H.

The approximation in Equation A22 was checked by calculating F_H(∞) and making sure that the value is equal to P_H.
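The error-rate predictions of Equation A23 follow directly from γ₋. A sketch computing the four response probabilities for a small study list (Python; drift and boundary values are arbitrary illustrations, not parameter estimates from the paper):

```python
import math

def gamma_minus(xi, z, a, s2):
    """Nonmatch probability for a single comparison (Equation A8); assumes xi != 0."""
    e = math.exp
    return (e(-2 * xi * a / s2) - e(-2 * xi * z / s2)) / (e(-2 * xi * a / s2) - 1.0)

def response_probs(xi, v, n, z=1.0, a=2.0, s2=1.0):
    """Equation A23: correct rejection, false alarm, miss, and hit probabilities."""
    gm_match = gamma_minus(xi, z, a, s2)   # matching comparison, drift xi > 0
    gm_non = gamma_minus(v, z, a, s2)      # nonmatching comparison, drift v < 0
    p_cr = gm_non ** n                     # all n comparisons end in nonmatch
    p_fa = 1.0 - p_cr
    p_m = gm_match * gm_non ** (n - 1)     # the match process also ends in nonmatch
    p_h = 1.0 - p_m
    return p_cr, p_fa, p_m, p_h
```

One qualitative consequence is visible immediately: holding the drifts fixed, the false-alarm rate grows with set size n, since each added nonmatch comparison is another chance for a spurious match.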

Relatedness Distributions

A critical assumption is that relatedness enters the diffusion process as the drift parameter ξ. In the last section, it was noted that there should be a distribution of relatedness and therefore a distribution over drift values (different from the variance of the diffusion process, s²). The simplest choice, and the one with the most precedent (Murdock, 1974, p. 28), is the normal distribution. This means that the distribution function G(t) and accuracy γ for the diffusion process have to be averaged over the relatedness distribution before being used to calculate distributions and proportions of hit, miss, correct rejection, and false alarm responses.
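The averaging over the relatedness distribution can be sketched with a simple quadrature against the normal density (Python; the grid width and spacing are illustrative choices, not part of the model):

```python
import math

def gamma_minus(xi, z, a, s2):
    """Nonmatch probability for fixed drift (Equation A8)."""
    if abs(xi) < 1e-12:
        return 1.0 - z / a
    e = math.exp
    return (e(-2 * xi * a / s2) - e(-2 * xi * z / s2)) / (e(-2 * xi * a / s2) - 1.0)

def gamma_minus_avg(u, eta, z, a, s2, half_width=6.0, m=2001):
    """Average gamma_minus over drift xi ~ N(u, eta^2), in the manner of Equation A24."""
    h = 2.0 * half_width * eta / (m - 1)
    total = 0.0
    for i in range(m):
        xi = u - half_width * eta + i * h
        w = math.exp(-(xi - u) ** 2 / (2.0 * eta ** 2)) / math.sqrt(2.0 * math.pi * eta ** 2)
        total += w * gamma_minus(xi, z, a, s2) * h
    return total
```

As a sanity check, when the relatedness variability eta is made very small, the averaged value collapses to the point value at the mean drift.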

Let relatedness ξ be distributed N(u, η) and relatedness v be distributed N(v, η). Then, we can compute first-passage time distribution functions G±(t, u) and G±(t, v), using

G±(t, u) = ∫_{−∞}^{∞} G±(t, ξ) [1/√(2πη²)] e^{−(ξ−u)²/2η²} dξ   (A24)

and

G±(t, v) = ∫_{−∞}^{∞} G±(t, ξ) [1/√(2πη²)] e^{−(ξ−v)²/2η²} dξ.   (A25)

Similarly, γ±(u) and γ±(v) can be calculated using Equations A24 and A25, respectively, substituting γ±(ξ) for G±(t, ξ) in the integrand. The equations developed to this point are sufficient for the calculation of error rates and reaction time distributions.

Received December 17, 1976
Revision received July 12, 1977
