
Cognition, 22 (1986) 93-136

Word recognition: Context effects without priming*

DENNIS NORRIS

MRC Applied Psychology Unit, Cambridge

Abstract

A model of the effects of context and frequency on word recognition is presented. By employing a post-access checking mechanism which is argued to be essential for resolving lexical and perceptual ambiguity, the model accounts for both facilitatory and inhibitory effects of word recognition without the use of either lexical priming or an attentional mechanism. The post-access checking mechanism is assumed to operate by modifying the recognition criteria for a subset of lexical entries determined on the basis of perceptual analysis. The behaviour of criterion bias models of this form is discussed in some detail and it is shown that models of this form provide an accurate account of the effect of context on both speed and accuracy of word recognition in reaction time tasks.

Two of the most reliable findings in the word recognition literature are that the ease of recognition of a word is influenced by its frequency of occurrence in the language (Rubenstein, Garfield, & Millikan, 1970; Solomon & Howes, 1951), and by the context in which it appears (Meyer & Schvaneveldt, 1971; Tulving & Gold, 1963). Although the literature abounds with theoretical treatments of these phenomena there is very little general agreement about the nature of the underlying mental processes. However, during the last few years a great deal of research has been directed towards reducing the variety of models of word recognition by attempting to eliminate certain broad classes of model from consideration. For example, very recently criterion bias models have come under strong attack. Antos (1979), O'Connor and Forster (1981) and Schvaneveldt and McDonald (1981) have all reported data which, they claim, rule out the possibility that the influence of context on word recognition is entirely the result of the operation of a criterion bias mechanism. If these authors are correct in their interpretation of these results then criterion bias models such as Morton's (1969) logogen model are clearly inadequate, and there is one less class of model which we need to consider.

*Preparation of this paper was supported in part by grant HR7147 from the Social Science Research Council of Great Britain. I should like to thank Anne Cutler and Chuck Clifton for their valuable comments on earlier drafts of this paper. Requests for reprints should be addressed to Dennis Norris, MRC Applied Psychology Unit, 15 Chaucer Road, Cambridge CB2 2EF, United Kingdom.

0010-0277/86/$13.70 ©Elsevier Sequoia/Printed in The Netherlands

Other workers have made strong claims that serial models of word recognition can be rejected. Marslen-Wilson and his colleagues (Marslen-Wilson, 1975; Marslen-Wilson & Tyler, 1980; Marslen-Wilson & Welsh, 1978) and Rumelhart (1978) have all argued that both spoken and written word recognition consist of a set of highly interactive processes. However, Norris (1980, 1982) has outlined proposals for a model of word recognition which is both serial, insofar as it has a strictly bottom-up flow of information, and in which the effect of context is mediated by a criterion bias mechanism. The present paper develops the model to provide a more general account of the influence of context on word recognition and extends the model to account for the relation between context and word frequency. Arguments are also presented to demonstrate that the criticisms of criterion bias models are based on a misunderstanding of the properties of these models. Far from being inadequate, criterion bias models actually provide a better account of context effects than do the alternative search and verification models.

The paper begins by presenting a brief account of four of the most influential models of word recognition: Morton's (1969) logogen model, Forster's (1976) search model, Becker's (1976) verification model, and the cohort model of Marslen-Wilson and Welsh (1978). The central characteristics of each model are described and the ability of each model to handle current data on context and frequency effects is reviewed. The second part of the paper is devoted to developing the new theory and to discussing the behaviour of criterion-bias models in some detail. It will be argued that the new theory provides a better account of the data than do the alternatives. However, more important than the theory's ability to account for any particular set of data is the fact that this new model possesses a number of interesting theoretical properties. First, the theory accounts for the effects of context within a system in which context has no influence on lexical access, and where the information flow within the model is completely bottom-up. Second, the theory accounts for both facilitatory and inhibitory effects of context by means of a single criterion bias mechanism. Inhibition is explained without recourse to attentional processes. Finally, context effects are seen as being a side-effect of a mechanism which has evolved primarily to resolve ambiguity. Viewed in this way, inhibition is seen as a process which is not simply an unfortunate consequence of facilitation, but is in itself a valuable aid to fluent comprehension.


The logogen model

Throughout the last decade the most influential model of word recognition has been Morton's (1969) logogen model. The essential characteristic of the logogen model is that words are recognized by a set of feature counters, or logogens, each corresponding to a word in the language. Each logogen is sensitive to both perceptual and semantic features and a word is recognized whenever the feature count of the corresponding logogen exceeds a preset criterion value. If a logogen is activated by semantic features from a related word, then fewer perceptual features will be required for it to exceed the criterion. Therefore words which have been primed by semantically related words will be recognized faster than unprimed words. The word frequency effect is explained by assuming that logogens corresponding to high frequency words have a higher initial resting level than low frequency words. Such a brief description of the model does little justice to the richness of more recent formulations of the model (Morton, 1979), but it should be sufficient to highlight the major functional characteristics of the model to contrast them with the theory to be presented here.
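The counting scheme just described can be made concrete with a minimal sketch. It is an illustration only and not Morton's formulation: the class name, the integer feature counts, the common criterion of 10 and the resting levels are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Logogen:
    """A feature counter for one word (all numeric values are illustrative)."""
    word: str
    resting_level: int      # assumed: higher for high frequency words
    criterion: int = 10     # assumed: the same criterion for every word
    count: int = 0

    def __post_init__(self) -> None:
        # The count starts at the resting level rather than at zero.
        self.count = self.resting_level

    def add_features(self, n: int) -> None:
        # Semantic and perceptual features increment the same counter;
        # there is no operation that decrements it.
        self.count += n

    def recognized(self) -> bool:
        # The word is recognized once the count exceeds the criterion.
        return self.count > self.criterion


def perceptual_features_needed(logogen: Logogen) -> int:
    """Further features required for the count to exceed the criterion."""
    return max(0, logogen.criterion - logogen.count + 1)


unprimed = Logogen("butter", resting_level=2)
primed = Logogen("butter", resting_level=2)
primed.add_features(4)                      # semantic features from a related word
high_freq = Logogen("the", resting_level=6)

print(perceptual_features_needed(unprimed))   # 9
print(perceptual_features_needed(primed))     # 5
print(perceptual_features_needed(high_freq))  # 5
```

In this sketch priming and a high resting level have the same kind of effect: both reduce the number of perceptual features still required, and neither can increase it.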

Although the logogen model fared reasonably well in accounting for data from threshold experiments, recent attempts to extend the model to account for data from reaction time paradigms have proved less successful. Perhaps the most serious problem for the original model was the finding that context could have an inhibitory effect on the time taken to recognize a word as well as a facilitatory one (Neely, 1976, 1977). As each logogen is simply a device for counting features, context can only have the effect of incrementing the count. There is no facility for context to decrement the count of contextually improbable words. This, in conjunction with the fact that the system simply responds with the first word to exceed its threshold, means that the logogen model predicts that context can only have a facilitatory effect on recognition latency (although, insofar as it can increase the error rate, inappropriate context can inhibit recognition in threshold tasks).

To overcome this difficulty the model has been supplemented with an attentional mechanism (Neely, 1976; Posner & Snyder, 1975) which focuses attention on contextually probable words in the lexicon at the expense of the less probable words. However, exactly how this attentional mechanism is supposed to operate on the lexicon has never been made clear.

A further problem for the logogen model is that some of its predictions concerning the relation between context, frequency and stimulus quality appear to be wrong. As interpreted by Becker and Killion (1977), the logogen model predicts that both context and frequency should produce comparable interactions with stimulus quality. However, using a lexical decision task Becker and Killion found that whereas context and stimulus quality interacted, frequency and stimulus quality did not. The relation between context, frequency and stimulus quality will be discussed in greater detail later in the paper.

A further problem for the logogen model is shared with other attempts to account for context effects entirely in terms of lexical priming. How can lexical priming explain the influence of sentential context on word recognition? Coping with the effects of sentential context poses such a problem that Forster (1979) has actually denied that sentences produce any effects of context beyond that produced by the individual words in a sentence. However, there is a growing body of evidence which makes such a standpoint increasingly difficult to justify. Early studies of sentential context (Tulving & Gold, 1963) employed tachistoscopic recognition tasks. However, in these experiments context could have been acting simply by facilitating correct guessing rather than by exerting any control over recognition itself. More recent studies of sentential context have employed reaction time tasks in an effort to gain a more accurate picture of the on-line effects of context. The context provided by sentence fragments has been shown to influence pronunciation latency (Stanovich & West, 1979, 1981, 1983a, 1983b; West & Stanovich, 1978), lexical decision (Fischler & Bloom, 1979, 1980; Schuberth & Eimas, 1977; Schuberth, Spoehr & Lane, 1981) and phoneme monitoring (Foss, 1982; Foss & Blank, 1980; Foss & Ross, 1978; Morton & Long, 1976). However, it is not entirely clear whether these studies are providing a demonstration of context effects at the sentence level, or whether they are simply recording the effect of associative relations between the target word and other words in the sentence. While associative relations may have had a significant influence on the results of some of these studies, the phoneme monitoring studies by Morton and Long¹ and by Foss and Ross provide a clear indication that there is a true effect of sentential context which extends beyond any influence of associative context. Morton and Long's study used sentences such as (1)

1. On the train he looked at his paper/plan ...

The word 'paper' was found to be highly predictable in this sentence and reaction times to detect the initial phoneme of 'paper' were found to be faster than those to detect the corresponding phoneme in the less predictable word 'plan'. In this experiment only about 25% of the high transition probability target words were preceded by a semantically related word, and in no case was the target preceded by more than one related word. Had the context effect here been entirely due to associative priming in the few sentences where there were associatively related words, the materials analysis of the results would almost certainly not have been significant. Foss and Ross incorporated a control for the effects of simple associations in their experiment; they used sentences where there were strong associative relations between some of the words in the sentence and the target, and compared these with scrambled versions of the same sentences which were meaningless, but where the serial position of the associated words relative to the target was maintained. Any effects simply due to association should therefore have been identical in both the normal and the scrambled sentences. Foss and Ross found context effects with both types of materials, but the effects were far stronger in the normal sentences. This suggests that although there can be an associative component in sentence context effects, at least part of the effect is due to processes operating at the sentence level rather than the lexical level. Forster's claim that word recognition can only be influenced by context at the lexical level can therefore be rejected. But having established that sentential context is important, how can the effects of sentential context be explained in terms of lexical priming?

¹Foss and Gernsbacher (1983) have claimed that context effects observed in the phoneme monitoring task are an artifact of post-target vowel length. However, Mehler and Segui (unpublished manuscript) have recently demonstrated sentence context effects in an experiment in which vowel length was controlled.

Consider what must happen if 'paper' is to be primed by the preceding context in sentence 1. There are no obvious associative relationships between 'paper' and any other words in the sentence, so the priming must be derived from some higher level representation of the sentence. But what features or attributes of this representation might prime 'paper' but not 'plan'? The appropriate attribute, or intersecting set of attributes, must be capable of delimiting a very specific set of words. It is difficult to see how anything much less specific than 'thing typically looked at while in a train' could produce the necessary differential priming of 'paper' and 'plan'. The trouble with making logogens sensitive to such complex features is that the logogens themselves become extremely complicated devices. The system responsible for interpreting sentences will also become more complex as it must operate in such a way that these features become available as a fairly automatic consequence of understanding the sentence.

A similar argument also applies to attempts to provide an explanation of facilitatory and inhibitory effects of context in terms of attentional processes. What is the mechanism which allows attention to be focused on a very specific set of entries in the lexicon? Quite possibly the lexicon might be organized along a number of simple dimensions such as concrete-abstract, animate-inanimate. In that case one could readily imagine a system which could deploy attention at the concrete end of the concrete-abstract dimension at the expense of a reduced attentional capacity available for the abstract end of the dimension. Such a simple system would probably do well in accounting for context effects in simple priming experiments, but how would the system operate with sentential context? Where the logogen model's featural explanation requires complex features, the attentional explanation requires a lexicon organized along complex dimensions. Models which are wholly dependent on priming or attentional mechanisms are clearly some way from giving a satisfactory explanation of how sentential context can influence word recognition in the way it does.

Forster's search model

The major alternative to direct access models such as the logogen model has been Forster's (1976) search model. Forster assumes that lexical access takes place by means of a frequency ordered search through an orthographically or phonologically defined subset of the lexicon. Low frequency words will therefore tend to be recognized more slowly than high frequency words since their lexical entries will be searched later than those of the high frequency words. In addition to the frequency ordered search, words can also be accessed by means of a semantically driven search operating in parallel with the frequency ordered search. A word will be recognized as soon as either of these searches is successful. The semantic search operates by means of semantic cross-referencing between individual lexical entries. A subject seeing the word 'BREAD' would search through semantically related words such as 'KNIFE' and 'BUTTER'. Therefore either of these semantically related words should be recognized faster than an unrelated word of equivalent frequency because the semantic search should typically be shorter than the frequency ordered search. The mechanism explains Becker's (1980) finding that context effects are larger for low frequency words than for high. The recognition of anything but the highest frequency primed words will be largely determined by the duration of the semantic search. Therefore frequency will have very little influence on contextually primed words.
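The two parallel searches can be sketched as follows. This is a simplified illustration and not Forster's specification: it assumes that each entry examined costs one step, that both searches proceed at the same rate, and that recognition time is the step count of whichever search finds the target first. The contents and ordering of the bin and of the cross-reference list are hypothetical.

```python
from typing import Optional, Sequence


def search_steps(target: str,
                 frequency_bin: Sequence[str],
                 semantic_list: Optional[Sequence[str]] = None) -> int:
    """Entries examined before `target` is found by whichever search ends first."""
    # Frequency ordered search: position in the bin, most frequent entry first.
    steps = list(frequency_bin).index(target) + 1
    # Semantic search: runs in parallel through the cross-referenced entries.
    if semantic_list is not None and target in semantic_list:
        steps = min(steps, list(semantic_list).index(target) + 1)
    return steps


# Hypothetical bin of orthographically similar entries, most frequent first.
B_BIN = ["BUT", "BACK", "BOOK", "BREAD", "BUTTER", "BUTLER"]
# Entries cross-referenced from the prime 'BREAD' (the order is an assumption).
BREAD_RELATED = ["KNIFE", "BUTTER"]

print(search_steps("BUTTER", B_BIN))                 # 5
print(search_steps("BUTTER", B_BIN, BREAD_RELATED))  # 2
print(search_steps("BUTLER", B_BIN, BREAD_RELATED))  # 6
```

Under these assumptions a primed word's step count is bounded by its position in the short semantic list, so its position in the frequency ordered bin matters little, while an unrelated word gains nothing from the prime.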

In Forster's overall model of language comprehension lexical access is a completely autonomous subcomponent of a strictly bottom-up system. High level factors such as sentential context can therefore have no influence on the operation of the lexical processor. The semantic search is driven entirely by cross-references between individual words. Context provided by syntax or the semantic interpretation of the sentence is not permitted to affect the search process. So, whereas the logogen model can be criticized for failing to provide a detailed account of how a priming mechanism could explain priming by sentence contexts, the search model suffers the disadvantage that it predicts that sentence contexts should have no influence on word recognition whatsoever. Forster therefore finds himself in the rather awkward position of trying to prove the null hypothesis. Any data which appear to suggest that sentential context does affect recognition must be explained away by invoking the operation of some higher level mechanism.

Although Forster acknowledges that facilitatory effects due to sentence contexts are to be found in the lexical decision task, he claims that such effects are attributable to decision processes rather than to word recognition processes themselves. All context effects observed in naming, Forster claims, are in fact inhibitory. If a predictable sentence completion is named faster than an unpredictable but acceptable completion, then this must be because the unpredictable completion is inhibited relative to the predictable one, and not because of any facilitation of the predictable item. Ultimately, such a view boils down to a dispute over the choice of a neutral baseline. If one chooses a fast baseline, context effects will appear inhibitory. Choose a slow baseline, and they will appear facilitatory.

Forster (1981b) advances two possible accounts of sentence context effects, an 'integration theory' and an 'error checking theory'. Forster acknowledges that both of these theories have shortcomings. However, like any theory compatible with Forster's view of an autonomous lexical processor, both models assume that sentence context effects are purely inhibitory and take place after recognition. Such a view would seem to suggest that the effects of sentence context and stimulus quality should be additive. If the context effects take place after recognition then there is no reason why they should be influenced by stimulus quality. Stimulus quality would be expected to influence the recognition processes themselves but should have no effect on subsequent processes. However, Stanovich and West (1981, 1983a) have shown that the influence of sentence context increases with degraded or difficult stimuli. These data therefore provide very strong evidence against any account of sentence context effects which assumes that the context effect takes place after the word has been recognized.

With or without the integration or error checking mechanism the search model also faces problems in explaining the results of a lexical decision experiment by Antos (1979). Antos carried out a speed-accuracy trade-off study using category members to prime category names. He found that context had an effect on both signal detectability and response bias. The basic search model contains no mechanism by which context could influence response bias. Context can only influence word recognition by reducing the duration of the search process or increasing the duration of the post-recognition processes. It might appear that this result is equally damaging to criterion bias models since they would be expected to predict an effect of context on response bias but not on signal detectability. However, in a more detailed analysis of speed-accuracy trade-off effects it will be shown that criterion bias models can, in fact, handle these data very well.
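For readers unfamiliar with the distinction between detectability and bias, the sketch below shows how the two measures are conventionally separated. It assumes the standard equal-variance Gaussian signal detection model, which is an assumption of this illustration and not necessarily the analysis Antos used; the function name and the hit and false alarm rates are hypothetical.

```python
from math import exp
from statistics import NormalDist


def detectability_and_bias(hit_rate: float, false_alarm_rate: float):
    """Return (d', beta) under the equal-variance Gaussian model."""
    z = NormalDist().inv_cdf            # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa              # separation of the two distributions
    # Likelihood ratio of signal to noise at the response criterion.
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta


# Two hypothetical conditions with the same detectability but different bias.
cdf = NormalDist().cdf
unbiased = detectability_and_bias(cdf(1.0), cdf(-1.0))
liberal = detectability_and_bias(cdf(1.5), cdf(-0.5))
print(unbiased)
print(liberal)
```

In the second condition the criterion has been lowered, so hits and false alarms both rise: beta falls below 1 while d' is unchanged. A pure criterion shift of this kind is what a simple reading of criterion bias models would lead one to expect from context.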

A single list search model such as Forster's is also unable to account for the results of frequency blocking experiments by Glanzer and Ehrenreich (1979) and Gordon (1983). These studies have examined how lexical decision latencies are influenced by the frequency range of other words appearing in the experiment. Both experiments have shown that high frequency words are responded to faster when the experimental list contains only high frequency words and non-words than when the list also contains lower frequency words. Glanzer and Ehrenreich have also reported a similar advantage for low frequency words presented alone rather than in association with high frequency words. However, the latter result was not statistically significant, and no such effect was observed in Gordon's experiment.

Gordon has shown that the speeding of responses to high frequency words appearing in pure high frequency lists is readily explained by a criterion bias model. The response criterion is simply lowered in pure high frequency lists relative to lists of high and low frequency words. However, Forster's search model could only explain an advantage for pure low frequency lists over low frequency words in mixed lists. This would be possible if subjects started their search some way down the list instead of at the top when there were no high frequency words present. Low frequency words would therefore be recognized faster in the absence of high frequency words. However, with experimental lists of either pure high frequency words, or a mixture of high and low frequency words, the search must always begin at the top. Therefore the model incorrectly predicts that recognition should be equally fast in both conditions. To account for their findings Glanzer and Ehrenreich have proposed a search model with two search lists, one containing only high frequency words and the other containing all words. However, this model makes the rather implausible claim that high frequency words should often be searched twice, albeit in different lists, when low frequency words are present in the experiment. This double search of high frequency words is completely redundant and there seems to be no motivation for this claim other than the need to account for the data in terms of a search model.

Interestingly, Forster (1981a) has failed to obtain any frequency blocking effects in a naming task. However, since we do not know whether Forster's stimuli would show blocking effects in a lexical decision task, it is difficult to know whether this null result reflects an important difference between the two tasks, or is simply a consequence of Forster's stimulus set. Also, given that frequency effects tend to be somewhat smaller in naming than lexical decision (Forster & Chambers, 1973; Frederikson & Kroll, 1976; Richardson, 1976; Scarborough, Cortese & Scarborough, 1977), the naming paradigm would be expected to be less sensitive than lexical decision.

A further drawback with the search model is that it contains no mechanism for using context to resolve perceptual or lexical ambiguity. The only way for the system to resolve ambiguity is for the lexicon to pass on alternative analyses or readings of the input to some higher level process which can then disambiguate the analysis on the basis of contextual information. However, later in the paper it will be argued that such a mechanism alone is sufficient to explain semantic context effects. Therefore if the system possesses such a mechanism the semantic cross referencing in the search model is superfluous, since the cross referencing exists only to permit priming.

Becker's verification model

Becker's (1976) verification model represents a synthesis of concepts from both the logogen model and the search model. The presentation of a word results in the generation of a perceptually based candidate set; the sensory set. Candidates in the sensory set are ordered in terms of their frequency, with high frequency words being verified before low frequency words.

The effect of context is to generate a second candidate set; the semantic set. The semantic set contains contextually probable words and is presumably equivalent to the set of primed words in the logogen model. Like priming in the logogen model the semantic set becomes available before the word to be recognized is presented. A new word can be verified against the semantic set as soon as it is presented and before the perceptual set can be generated. Words which appear in the semantic set will therefore be recognized faster than words which appear only in the sensory set.

Perhaps the most interesting aspect of the verification model is that it gives an account of inhibition which does not depend on the operation of an attentional mechanism or some higher level process. Inhibition is a consequence of the fact that the semantic set must be searched before the sensory set. Therefore, if there is a large semantic set, verification of items which appear only in the sensory set will be delayed relative to a neutral context. If we were also to assume that the semantic set tended to increase over time, then the model could also explain Neely's (1976) finding that inhibition increases with SOA. However, in order to explain the Antos (1979) finding of inhibition at short SOAs the model would also have to assume that a very large semantic set can sometimes be developed very quickly. A further interesting property of the model is that it predicts that the magnitude of the inhibition will simply be a function of the size of the semantic set. A word which is highly improbable in a given context will take no longer to recognize than one which is acceptable in the context but not probable enough to appear in the semantic set. Recognition time for each word will simply be determined by its position in the sensory set. So if both words are of equal frequency they will take equally long to recognize.
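The prediction that inhibition depends only on the size of the semantic set can be illustrated with a small sketch. It is a simplification and not Becker's specification: it assumes one step per candidate verified, an empty semantic set in a neutral context, and hypothetical set contents.

```python
from typing import Sequence


def verification_steps(target: str,
                       semantic_set: Sequence[str],
                       sensory_set: Sequence[str]) -> int:
    """Candidates verified before `target` is accepted."""
    # The semantic set is exhausted first; the sensory set then follows
    # in frequency order (assumed already sorted, most frequent first).
    for steps, candidate in enumerate([*semantic_set, *sensory_set], start=1):
        if candidate == target:
            return steps
    raise ValueError(f"{target!r} is in neither candidate set")


# Hypothetical sensory set generated by the stimulus 'DOCTOR'.
SENSORY = ["DOOR", "DOCTOR", "DONOR"]

neutral = verification_steps("DOCTOR", [], SENSORY)
related = verification_steps("DOCTOR", ["DOCTOR", "HOSPITAL"], SENSORY)
unrelated = verification_steps(
    "DOCTOR", ["BUTTER", "KNIFE", "TOAST", "JAM"], SENSORY)

print(neutral, related, unrelated)   # 2 1 6
print(unrelated - neutral)           # 4, the size of the semantic set
```

Whatever the target, the delay after a misleading prime equals the number of semantic candidates that must be rejected first, which is why the sketch yields the same inhibition for any word that appears only in the sensory set.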


In support of this theory Becker (1980) has presented data from priming experiments using different kinds of prime-target relationships. Priming a word with its antonym was found to produce a large facilitatory effect but very little inhibition. This is what would be predicted from the model if the antonym prime gave rise to only a small semantic set. Also consistent with the model's predictions was the finding that with category name-category member relationships there was a large inhibitory effect but only a small effect of priming. Presumably the category name prime results in a far larger contextual set than does the antonym prime. Antonym primes enable subjects to make quite specific predictions as to the identity of the decision word on validly primed trials. In the case of category name primes the subject only knows that validly primed decision words will be one of the many words in the primed category. However, although the data for the antonym primes follows the predicted pattern for primes eliciting a small contextual set, other prime-target relationships which would also be expected to lead to a small contextual set show a different pattern of facilitation and inhibition. In the study by Antos already described, category members primed category names. Like the antonym primes, category member primes enable the subject to make very specific predictions as to the identity of the decision word on validly primed trials. This should lead to a very small contextual set (possibly a single word) and a similar pattern of facilitation and inhibition to the antonym primes in Becker's experiment. However, in the Antos experiment the inhibition was slightly greater than the facilitation. This is the pattern of results which would be expected from a prime producing a large contextual set. It would appear that consideration of the size of the contextual set is not in itself sufficient to provide an accurate account of the relative amount of facilitation and inhibition.

Since Becker's account of context effects is based on the existence of some generalized priming mechanism to generate the semantic set it is subject to the same criticisms which Forster levels against the logogen model: how such a model can account for sentence context effects is not clear. However, since the model is based on a search and verification process it also suffers from some of the shortcomings of the search model. As with the search model, the effect of context is simply to speed the process of locating the correct word. Therefore context should only influence d' and not beta. The verification model also faces exactly the same problem as the search model in accounting for the frequency blocking effects found by Glanzer and Ehrenreich and by Gordon.


The cohort model

Marslen-Wilson and Welsh (1978) presented a model of auditory word recognition which differs radically from the other three models described. Their cohort model assumes that the initial selection of word candidates is based entirely on perceptual, and not contextual evidence. The cohort model assumes that there is a separate active processing element associated with each word in the lexicon. The lexicon in the cohort model is viewed as a set of processing structures; each corresponding to a word in the language. Each element in the lexicon is sensitive to acoustic input, and whenever input is received which matches its perceptual specifications, semantic and syntactic processing procedures associated with the structure are brought into operation. For example, on hearing the initial /s/ in the word "strike" the memory elements corresponding to all words beginning with /s/ would be activated. As more of the word was heard, the size of the set of elements whose perceptual specifications matched the input would rapidly decrease. Marslen-Wilson and Welsh refer to the set of elements whose perceptual specifications match the input as the "cohort". Whenever an element detects a mismatch between its perceptual specifications and the input it immediately drops out of the cohort and the input can therefore be uniquely identified when the cohort has been reduced to a single element.
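The way the cohort shrinks as the input unfolds can be sketched as follows. The sketch is an illustration under simplifying assumptions and not Marslen-Wilson and Welsh's implementation: letters stand in for phonemes, the six-word lexicon is hypothetical, and context is ignored.

```python
from typing import List, Optional, Sequence, Tuple


def cohort_history(spoken: str, lexicon: Sequence[str]) -> List[List[str]]:
    """Cohort membership after each successive segment of the input."""
    cohort = list(lexicon)
    history = []
    for position, segment in enumerate(spoken):
        # An element drops out as soon as it mismatches the input.
        cohort = [w for w in cohort
                  if len(w) > position and w[position] == segment]
        history.append(cohort)
    return history


def recognition_point(spoken: str,
                      lexicon: Sequence[str]) -> Optional[Tuple[str, int]]:
    """Word and number of segments heard when the cohort first has one member."""
    for heard, cohort in enumerate(cohort_history(spoken, lexicon), start=1):
        if len(cohort) == 1:
            return cohort[0], heard
    return None


LEXICON = ["strike", "stripe", "string", "stand", "sit", "take"]

print([len(c) for c in cohort_history("strike", LEXICON)])  # [5, 4, 3, 3, 1, 1]
print(recognition_point("strike", LEXICON))                 # ('strike', 5)
```

With this toy lexicon "strike" is identified on its fifth segment, before the word is complete, because that is the point at which its last competitors drop out.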

As well as detecting mismatches with the perceptual input, each memory element can also detect mismatches with the semantic and syntactic context of the word. In this way contextual factors can facilitate word recognition by eliminating members of the cohort which do not fit with the context. The presence of the context therefore results in the cohort being reduced to a single element faster than would be the case in its absence.

One of the most significant features of the cohort model is that all decisions in the system are of an all or none nature; the system cannot compare the relative merits of two competing analyses. For example, if the cohort model heard a drunk pronounce "cigarette" as "shigarette" the word could not be recognized at all as the cohort would only contain words beginning with /sh/. The element corresponding to "cigarette" would never be activated and could therefore not identify the word, even in the presence of highly constraining context. In effect the failure of the cohort model lies in its inability to make an overall comparison between the input and the perceptual specification of words in the lexicon. Therefore it has no means of determining which entry provides the best fit with the input. The main source of this difficulty resides with the fact that as soon as a mismatch is detected the element drops out of the cohort, leaving no record of its ever having been present. Therefore as soon as all elements have dropped out of the cohort the system loses all knowledge of whatever analysis has already been performed. One way in which the system could determine which element corresponded most closely to the input would be to perform a second analysis of the input, ignoring the first mismatch. A number of such analyses could be performed ignoring one more mismatch on each pass until one element remained in the cohort at the end of a pass. This element would then correspond most closely to the input. This would appear to be a rather unwieldy approach to the problem of determining the best match to the input. Why not simply count up the number of mismatches all of the time and select the memory element with the fewest mismatches? However, this would undermine the claim that the model is primarily data driven, as all elements would be active all of the time and would continuously be able to monitor for mismatches with the input. There would no longer be preselection of a set of candidate items on the basis of a perceptual analysis alone.
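The contrast between all or none elimination and counting mismatches can be sketched for the "shigarette" example. The sketch is an illustration only: the phoneme strings are informal stand-ins, the three-word lexicon is hypothetical, and the position-by-position mismatch count is one assumed way of scoring fit among many.

```python
from typing import Dict, List, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical lexicon with informal phoneme transcriptions.
LEXICON: Dict[str, Phonemes] = {
    "cigarette": ("s", "i", "g", "a", "r", "e", "t"),
    "signal": ("s", "i", "g", "n", "a", "l"),
    "shiver": ("sh", "i", "v", "er"),
}


def strict_cohort(spoken: Phonemes, lexicon: Dict[str, Phonemes]) -> List[str]:
    """All or none cohort: an entry survives only if it matches every segment."""
    return [word for word, form in lexicon.items()
            if form[:len(spoken)] == spoken]


def mismatch_count(spoken: Phonemes, form: Phonemes) -> int:
    """Position-by-position mismatches; unmatched extra segments also count."""
    differing = sum(a != b for a, b in zip(spoken, form))
    return differing + abs(len(spoken) - len(form))


def best_match(spoken: Phonemes, lexicon: Dict[str, Phonemes]) -> str:
    """Entry with the fewest mismatches against the whole input."""
    return min(lexicon, key=lambda word: mismatch_count(spoken, lexicon[word]))


SHIGARETTE: Phonemes = ("sh", "i", "g", "a", "r", "e", "t")

print(strict_cohort(SHIGARETTE, LEXICON))   # []
print(best_match(SHIGARETTE, LEXICON))      # cigarette
```

The strict cohort is empty by the end of the input, whereas the mismatch count recovers "cigarette" with a single mismatch. The cost, as noted above, is that every entry in the lexicon must be scored, so there is no longer any perceptual preselection of candidates.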

The cohort model's reliance on all or none decisions has similarly disastrous consequences with the detection of contextual as well as perceptual mismatches. The model should be completely unable to recognize a word presented in inappropriate context as the corresponding memory element should drop out of the cohort as soon as the mismatch is detected. The system therefore fails to account for inhibitory effects as context can only act so as to speed the process of reducing the cohort to a single element. Given an inappropriate context, recognition will not be slowed down; it will simply be prevented altogether. This highlights yet again the cohort model's inability to compare the merits of alternative analyses. The system has no capacity to trade off perceptual information against contextual information. Memory elements will drop out of the cohort whenever any mismatch is encountered, no matter how strong any other evidence in favour of that analysis might be.

The checking model

The conventional view of context effects on word recognition expressed in the four models described above is that context somehow exerts direct control over the lexicon so as to speed the process of word recognition. In the logogen model context speeds access by incrementing the feature count of logogens related to the context, and in a search or verification model context can facilitate recognition by bypassing part of the normal access process. It is usual in these models to think of lexical access and word recognition as being synonymous. However, accessing the information contained in a lexical entry need not, of itself, lead to recognition. In the process of recognizing a single word, the recognition system might well access information contained in the lexical entries for many different words. Indeed, in the strictest sense, this is what happens in the search or verification models. The search process involves accessing orthographic or phonetic information in each lexical entry searched in order to perform a comparison. To resolve lexical or perceptual ambiguity these models will also need to access semantic information before a word can be recognized. As was argued earlier, the only way the search model can resolve perceptual ambiguity is by accessing the alternative analyses and passing the information on to a higher level process which can make use of contextual information to identify the word. In this case, semantic as well as perceptual information must be accessed before a decision can be made and the word recognized. This process of accessing a number of lexical entries and deciding which best fits the perceptual and contextual constraints provides a very powerful mechanism which can be extended to explain how context influences even the recognition of unambiguous words. The most interesting characteristic of such a mechanism is that by locating the effect of context between access and recognition, it allows us to dispense with the traditional notion of lexical priming altogether.

In this alternative view of word recognition, lexical access is assumed to be an entirely data-driven process completely unaffected by any higher level process or any information associated with the context. It is assumed that during the early stages of perceptual processing a number of words is accessed, each of which is roughly consistent with the current perceptual analysis. While the analysis is still proceeding each of these words can be evaluated against the context, and the plausibility of each word in the context can be used to modify its recognition criterion.

Consider what might happen when a subject in a word recognition experiment has been presented with the word 'BREAD' and is now presented with the word 'BUTTER'. At some very early point in the perceptual analysis of the word, the analysis may reach a stage where its output is compatible with a small subset of lexical entries such as 'BATHER', 'BATTER', 'BUTLER' and 'BUTTER'. In the absence of further information, unique identification of the word would obviously have to await further refinement of the analysis. However, if members of this candidate set of words consistent with the perceptual analysis of the word are now examined for their relationship to the context, 'BUTTER' will be found to be more closely related than any of the other words. This information about the relation between members of the candidate set and the context can be used to modify the recognition criteria for these words. Words which are found to be very probable in the context will have their recognition criteria lowered, while the criteria for very improbable words will be raised. In the present example the close relationship between 'BUTTER' and 'BREAD' will lead to a reduction in the criterion for recognizing 'BUTTER', so that 'BUTTER' will be identified on the basis of less perceptual information than it would have been in the absence of the context. As is the case with the logogen model, the less perceptual information required to recognize a word the faster it can be identified. In this model it is assumed that the initial stage (or stages) of perceptual analysis operate continuously, outputting the most up to date results of their analysis to the processes responsible for checking the relationship between the candidate set and the context. In this way the number of words in the candidate set will decrease rapidly as further perceptual analysis of the input generates newer and smaller sets. Effectively these processes are assumed to operate like the cascade processes described by McClelland (1979).

It is proposed that each word in the candidate set is tagged with some index of the amount of perceptual evidence in its favour. Words are also assumed to have an indication of their frequency associated with them. In the absence of context, recognition will normally occur when the weighted combination of the frequency index and the perceptual evidence in favour of a word exceeds its recognition criterion. Even if there is only one word remaining in the set, it is assumed that recognition will still not take place until the criterion is reached. Towards the end of perceptual analysis, the candidate set will therefore tend to contain only a single word which has not yet exceeded its recognition criterion. Therefore, although the candidate set could possibly contain several words, the checking process will still be able to modify the recognition criterion when there is only a single word in the set. Later, this feature of the model will be shown to have important consequences for the treatment of word frequency effects and non-words.

Word recognition in the checking model consists of five main processes.

1. Perceptual analysis identifies the perceptual characteristics of the stimulus. (This could be letter features or some global analysis of the word. The theory is neutral with respect to the exact form of the analysis.)

2. Perceptual information is used to delineate a subset of lexical entries whose perceptual specifications are roughly consistent with the perceptual analysis of the stimulus. Each word in the set is weighted according to its frequency and how well its perceptual specification fits the perceptual analysis.

3. Each word in the set is accessed and checked to determine its plausibility in the context. The set used by the checking process is determined by the most recent output from 1 and 2. It is assumed that the checking process waits till the set has been reduced to a manageable size before it starts to operate since at the instant a word has been presented the set size would encompass the whole lexicon.

4. As a result of 3 the recognition criterion is reduced for plausible words and increased for implausible words.

5. The system outputs (recognizes) the first word whose combined perceptual and frequency weightings exceed its (continuously modified) recognition criterion.

It is important to note that, although the information flow within the system is completely bottom-up, that is, no stage receives output from a stage with a higher number, all stages are assumed to operate simultaneously and to operate on the most recent output from the previous stage. For example, while the checking process (3) is operating, stages 1 and 2 will continue to operate and will output a refined perceptual analysis and a diminishing candidate set.
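The last three of the five processes can be sketched for a single snapshot of the candidate set. This is a minimal illustration under stated assumptions, not a specification of the model: the class and function names, the numerical weightings, the plausibility scale (from -1 to +1) and the `shift` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    frequency: float   # frequency weighting
    perceptual: float  # perceptual evidence accumulated so far
    criterion: float   # recognition criterion


def check(candidates, plausibility, shift=1.5):
    """Processes 3 and 4: lower the criterion of plausible words, raise it for implausible ones."""
    for c in candidates:
        # Words with no known relation to the context are treated as neutral (0.0).
        c.criterion -= shift * plausibility.get(c.word, 0.0)


def recognize(candidates):
    """Process 5: return a word whose frequency + perceptual evidence exceeds its criterion."""
    over = [c for c in candidates if c.frequency + c.perceptual > c.criterion]
    if not over:
        return None
    return max(over, key=lambda c: c.frequency + c.perceptual - c.criterion).word


def candidate_set():
    # Hypothetical early snapshot: four words, equal evidence, equal criteria.
    return [Candidate(w, 1.0, 3.0, 5.0)
            for w in ("BATHER", "BATTER", "BUTLER", "BUTTER")]


neutral = candidate_set()
print(recognize(neutral))            # None: no word has reached its criterion

primed = candidate_set()
check(primed, {"BUTTER": 1.0})       # assumed plausibility after 'BREAD'
print(recognize(primed))             # BUTTER
```

Nothing in the sketch feeds back into the perceptual evidence or the composition of the set; context acts only on the criterion, which is the sense in which the flow of information remains bottom-up.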

The contents of the candidate set at different points in recognition are represented schematically in Figure 1. Each word (i) in the set has associated with it an index of its frequency (Fi-n), a count of the perceptual evidence in its favour (Pi-n), and its recognition criterion (Ci-n). A word will be recognized whenever the combination of perceptual and frequency based information in its favour exceeds its recognition criterion, that is, whenever F + P > C. During the course of recognition the criterion can be either increased or decreased as a result of the checking process. The lower a word's recognition criterion is, the earlier it will be recognized.

In the example shown in Figure 1, T1 represents a very early stage in perceptual analysis. At T1 the set is relatively large, indicating a considerable degree of uncertainty in the perceptual analysis. By T2 the analysis has progressed to the point where only three words remain in the set. From T3 onwards only a single word remains. Note once more though, that recognition does not take place simply when the set is reduced to a single member. Even when there is only a single word left in the set, recognition must still wait until F + P > C. The checking process can still operate even when the set contains only a single word.

Figure 1. Representation of the information in the candidate set at four points in time (T1-T4) after stimulus presentation.

T1                      T2                      T3                      T4
BUTTER, Fi, Pi1, Ci1    BUTTER, Fi, Pi2, Ci2    BUTTER, Fi, Pi3, Ci3    BUTTER, Fi, Pi4, Ci4
BUTLER, Fj, Pj1, Cj1    BUTLER, Fj, Pj2, Cj2
BATTER, Fk, Pk1, Ck1    BATTER, Fk, Pk2, Ck2
BITTER, Fl, Pl1, Cl1
BETTER, Fm, Pm1, Cm1
PUTTER, Fn, Pn1, Cn1
POTTER, Fo, Po1, Co1
BANTER, Fp, Pp1, Cp1

F = frequency weighting, P = perceptual evidence, C = contextual evidence.

In the absence of context the recognition criterion for each word will remain constant across time, that is, Ci1 = Ci2 = Ci3 = Ci4. Only the amount of perceptual evidence for each word will change with time. In the presence of an appropriate context such as "BREAD", the checking mechanism would cause a reduction in the criterion for a related word like "BUTTER". T1 is intended to represent a very early stage in processing where, although the candidate set has been generated, there has been insufficient time for the checking process to operate. Therefore Ci1 will still be the same as the word's baseline criterion. As time passes, the checking process will have more opportunity to modify the criterion. In fact, the longer the recognition process takes, the more opportunity there will be for checking to modify the criterion. The exact point of recognition will depend on the extent of the criterion shift. If recognition in a neutral context took place at T4 then a small criterion reduction might cause the word to be recognized at T3. However, with a very large reduction the word might be recognized at T2, when the perceptual analysis is still far from complete.
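The dependence of the recognition point on the size of the criterion shift can be illustrated with a small sketch. The evidence values, the frequency weighting and the baseline criterion below are arbitrary assumptions chosen so that recognition in a neutral context falls at T4; the function name is hypothetical, and the criterion at T1 is held at baseline because checking is assumed not to have operated yet.

```python
def recognition_point(perceptual, frequency, baseline, reduction):
    """Return the first time step (1-based) at which F + P > C, or None."""
    for t, p in enumerate(perceptual, start=1):
        # At T1 checking has had no time to operate, so C is still at baseline.
        criterion = baseline if t == 1 else baseline - reduction
        if frequency + p > criterion:
            return t
    return None


# Hypothetical perceptual evidence for 'BUTTER' at T1..T4.
evidence = [1.0, 2.0, 3.0, 4.5]

print(recognition_point(evidence, 1.0, 5.0, 0.0))  # 4: neutral context
print(recognition_point(evidence, 1.0, 5.0, 1.5))  # 3: small reduction
print(recognition_point(evidence, 1.0, 5.0, 2.5))  # 2: large reduction
```

Under these assumptions no reduction, however large, moves recognition to T1, since the criterion cannot be modified before checking has begun.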

The major assumptions underlying the operation of the checking model are outlined below. After a brief comparison between the checking model and the four theories of word recognition described above, the consequences of each of these assumptions will be discussed in detail.

1. Dynamics of checking
There is a dynamic limitation on the operation of the checking mechanism. Checking can only exert its influence on recognition in the small amount of time between access and recognition. Therefore any factor which increases the time between access and recognition will increase the opportunity for the checking process to operate. Similarly, faster checking processes will be more likely to influence recognition than will slow ones.

2. Ambiguity
Lexical and perceptual ambiguity are handled by the same checking mechanism as context effects.

3. Sentence context
The effects of both sentential and single word contexts are mediated by the same mechanism.

4. Representation of context
Because there is neither priming nor any possibility of shifting attention across the lexicon, context effects are unaffected by the state of the lexicon. The lexicon is nothing more than an information store. Context effects are therefore primarily determined (although see 5 below) by the subject's mental representation of the context at the time a word is encountered. For example, sentential context effects will be dependent on the specific representation of the sentence which the subject constructs. If the sentence is processed simply as a string of words then only lexical context effects will be observed.

5. Relation to attentional processes
Neither lexical access nor context effects are mediated by attentional processes. However, Shiffrin and Schneider's (1977) distinction between automatic and conscious processes is assumed to apply to the component processes of the model. Access is assumed to be a highly overlearned and automatic process. The degree of automaticity of checking procedures will depend on their familiarity. For example, if subjects are introduced to a novel but systematic relationship between words and their context, they will need to develop a new checking procedure to take advantage of the relationship. This new procedure will become more automatic as the experiment progresses.

6. Word frequency
High frequency words have a lower recognition criterion than low frequency words.

7. Criterion bias
Context influences recognition by means of changes in recognition criterion only.

8. Perceptual analysis
Words which are perceptually confusable with other words will have a higher recognition criterion than non-confusable words.

Comparison with other models

Although the checking model is radically different in both its structure and its operation from the other models described, it does, nevertheless, share certain features with each of those models. The following section sketches the points of similarity and difference between the checking model and the other models. The implications of the differences will be discussed more fully as the behaviour of the checking model is described in greater detail. Table 1 lists the major points of similarity and difference between the models.

Insofar as the checking model involves operations carried out on a candidate set, it bears a superficial resemblance to the verification model. However, unlike the verification model, there is only one candidate set which is generated on the basis of perceptual information alone. There is no verification process, and no lexical priming.

In common with the cohort model, the initial stages of word recognition are perceptually driven. However, in the checking model there is no top-down flow of contextual information into the lexicon. Context effects are mediated by changes in recognition criterion and not by all or none decisions. Also, the checking model contains no active elements corresponding to the lexical elements in the cohort system. As a result of these differences, the


Table 1. Comparison of the fundamental properties of the checking model with the other models of word recognition under discussion.

Logogen   Search   Verification   Cohort   Checking

Structure
  Serial                           x x
  Interactive                      x x x

Locus of context effect
  Pre-access                       x x x
  Post-access-pre-recognition      x x
  Post-recognition-pre-decision    x

Mechanisms of context effect
  Criterion-bias                   x x
  Search                           x x
  Attention                        x
  Integration                      x

Mechanism of frequency effect
  Criterion-bias                   x x
  Search                           x x
  None                             x

checking model can maintain a claim to be data-driven without sacrificing the ability to recognize mispronounced or misspelled words.

In postulating a post-access account of context effects, the checking model shares an important feature with Forster's integration and error checking models. However, in the checking model all context and frequency effects are due to the post-access mechanism and the model employs neither lexical search nor priming. Furthermore, the integration mechanism is not only post-access but also post-recognition. In the checking model, context effects are pre-recognition.

It might appear that by relying entirely on such a powerful and underspecified process as the checking mechanism for an explanation of context effects the checking model loses out to the search model by virtue of being so much more powerful. However, the search model requires both the unconstrained power of the post-recognition processes and the additional mechanics of the underlying search mechanism. Far from being more complex, the checking model is actually simpler than the search model.


The greatest overlap in shared features and assumptions is between the checking model and the logogen model. Both models assume that:

a. Perceptual analysis is a continuous process. That is, word recognition does not involve a series of discrete matches.

b. Context can only influence recognition by modifying the response criterion. Context can have no influence on perceptual analysis.

c. The word frequency effect is due to response bias. That is, high frequency words have a lower recognition criterion (higher frequency weighting) than low frequency words.

d. The main effect of changes in stimulus quality is to change the rate of perceptual analysis.

But despite these similarities, the change from an account of context effects based on priming to one based on a post-access check results in a model which behaves very differently. This alternative account of context effects means that the model also possesses a number of important features not shared by any of the other models. The post-access nature of the context effect will influence the behaviour of the model in ways which have no direct parallel in other theories. Most importantly, the checking process has only a limited time to operate. The amount of time available for checking determines what the checking process will be able to do. Thus the dynamic characteristics of the checking process are the most influential factors in determining the model's predictions.

Dynamics of the checking process

The process of checking a word against the context is assumed to be time consuming. Furthermore, it has only a short space of time in which to operate. Only checking which can be completed before a word is identified on the basis of perceptual information alone can have an effect on a word's recognition criterion. Of course, checking could continue after recognition, but the results of this checking will not alter the speed with which a word can be named or identified. However, it could, for instance, alter the interpretation of a lexical ambiguity.

The fact that checking takes time means that only a limited amount of checking can take place during word recognition. More importantly, the amount of checking will be different for different words. It will also differ for the same words under different perceptual conditions. Any factor which acts to increase the amount of time between generation of the candidate set and the recognition point will increase the opportunity for the checking process to operate. If the checking process has more time, then there will be a greater potential for context to influence recognition. So, for example, given that low frequency words take longer to recognize than high frequency words, low frequency words will tend to be more influenced by context. Similarly, perceptually degraded words will also be more subject to the effects of context. Degradation will allow more time for checking and therefore increase the likelihood that context will affect recognition.
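A minimal numerical sketch may help to show why a fixed checking time yields larger context effects for slower words. It rests on assumptions that go beyond the text: perceptual evidence is taken to grow linearly at a rate set by stimulus quality, checking is taken to finish at a single fixed time after the candidate set is generated, and all parameter values and function names are hypothetical.

```python
def recognition_time(frequency, rate, criterion=10.0,
                     check_time=None, reduction=2.0):
    """Time at which F + P(t) reaches the criterion, with P(t) = rate * t.

    check_time=None means no context. Otherwise the criterion drops by
    `reduction` once checking finishes at check_time.
    """
    baseline = (criterion - frequency) / rate
    if check_time is None or check_time >= baseline:
        return baseline  # checking absent, or too slow to affect recognition
    lowered = (criterion - reduction - frequency) / rate
    return max(check_time, lowered)


def context_benefit(frequency, rate, check_time=8.0):
    """Speed-up produced by a related context under the assumptions above."""
    return (recognition_time(frequency, rate)
            - recognition_time(frequency, rate, check_time=check_time))


HIGH_FREQ, LOW_FREQ = 3.0, 1.0   # assumed frequency weightings
CLEAR, DEGRADED = 1.0, 0.5       # assumed rates of perceptual analysis

print(context_benefit(HIGH_FREQ, CLEAR))     # 0.0: recognized before checking ends
print(context_benefit(LOW_FREQ, CLEAR))      # 1.0
print(context_benefit(HIGH_FREQ, DEGRADED))  # 4.0
```

With these values a clearly presented high frequency word is recognized before checking has finished and gains nothing from context, whereas lowering either the frequency weighting or the rate of analysis leaves time for the criterion shift to take effect.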

The dynamic constraints on the checking process have the effect of making the checking model exhibit "compensatory processing" of the form described by Stanovich (1980) and Stanovich and West (1983a). Stanovich and West have reported a number of studies in which factors having a detrimental effect on the speed of word recognition have been shown to increase the influence of context. They describe this as compensatory processing. The harder word recognition becomes, the more the recognition system compensates by increasing the influence of context. This pattern of results is exactly what the checking model predicts. However, there is no sense in which the recognition system is actively compensating when it encounters difficult words. Compensatory processing follows as a direct consequence of the post-access nature of the contextual processes employed in the checking model, and the extra time available for checking 'harder' words.

The amount of time available for checking will also determine which forms of context will influence recognition. Not all contextual relationships will take the same time to check. Establishing the existence of some relationships will be almost trivial. For example, it seems reasonable to assume that simple associative relationships like BLACK-WHITE will be stored explicitly in memory. Checking that BLACK and WHITE are associatively related probably involves little more than memory look-up. Such a process should take relatively little time and should therefore be able to facilitate the identification even of words which can be recognized very quickly. On the other hand, some relationships may be so complex that a great deal of computation is required in order to check a word's plausibility. In extreme cases, the relationship may be so difficult to compute that even if contextual constraints are very strong, checking may not be completed in time to affect recognition.

Consider how context might influence identification of the word "eight" in sentence 2.

2. The result of dividing two-hundred and sixteen by twenty-seven is eight.

The context here is very highly predictive. However, anyone with normal prowess at mental arithmetic is unlikely to be able to benefit from the context so as to identify "eight" faster than a less predictable word such as "seven" or "nine". Clearly, predictability and plausibility are not in themselves sufficient to determine whether a given context will influence recognition. Whether or not context affects recognition depends to a large extent on how quickly the checking process can be carried out, and how much time is available for it to operate. A relationship which might not have time to affect a clearly presented high frequency word could well have a considerable influence on a dimly presented word of low frequency. Later it will be shown how these kinds of consideration play a very important role in determining the checking model's predictions about the interaction of context, stimulus quality and word frequency.
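The point that predictability alone is not sufficient can be put as a simple comparison between the time a check takes and the time available for it. The durations below are invented for illustration only and are not estimates from any experiment; the names `CHECK_MS` and `usable_contexts` are likewise hypothetical.

```python
# Hypothetical checking durations (ms) for three kinds of relationship.
CHECK_MS = {
    "associative look-up (BLACK-WHITE)": 40,
    "sentence plausibility": 150,
    "mental arithmetic (216 / 27)": 3000,
}


def usable_contexts(window_ms, check_ms=CHECK_MS):
    """Relationships whose check finishes within the access-to-recognition window."""
    return [name for name, ms in check_ms.items() if ms < window_ms]


# Assumed windows: a clear high frequency word vs. a dim low frequency word.
print(usable_contexts(100))  # only the associative look-up
print(usable_contexts(300))  # look-up and sentence plausibility
```

On these assumed figures the arithmetic context never influences recognition, however predictive it is, because its check cannot be completed within either window.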

The checking model and ambiguity

The checking model operates by using contextual information to resolve perceptual ambiguity in the analysis of the input. However, the same processes which are used to resolve perceptual ambiguity will also serve to resolve lexical ambiguity. The most obvious way to determine which reading of a lexically ambiguous word is intended is to evaluate each reading and to decide which fits best with the context. This is exactly what must be done in order to find out whether a perceptually ambiguous word corresponds to 'BUTTER' or 'BUTLER'. The only respect in which the two situations differ is that in one case the two readings are associated with a single word, whereas in the other they are associated with different words. The checking model therefore contains a mechanism which will resolve lexical ambiguity as well as allow context to influence recognition. This is an important attribute of the model because it helps to provide an independent justification for the checking mechanism.

Although the ability to use contextual information to speed word recognition is of great advantage to a comprehension system, it is by no means essential. A system which could not utilize context would simply proceed at a slower rate. However, a system which could not resolve lexical (or perceptual) ambiguity would be at a major disadvantage. Ambiguity pervades the language to such an extent that an inability to cope with it would make comprehension impossible. The obvious way for a system to deal with ambiguity has already been suggested; all readings of the word should be evaluated to determine which is most appropriate in the context. However, if one has such a system, one already has all the basic properties which the checking model requires for context to influence word recognition. In models of word recognition employing a priming mechanism, priming exists solely to explain how recognition can be influenced by context. Priming serves no other purpose. But in the checking model context effects are produced by a mechanism whose primary function is to deal with ambiguity. The fact that context influences word recognition can be considered to be a spin-off from the more vital process of resolving ambiguity.

Sentential context and the checking model

Forster suggested that effects of sentential context were unlikely because of the difficulty of selectively priming a suitable subset of lexical entries. However, this criticism does not apply to the checking model at all. The validity of Forster's argument rests on the assumption that the only way in which context can influence word identification is by means of priming or by post-recognition processes. In the checking model there is no priming; individual lexical entries are simply information stores. Context effects come about entirely through operations carried out on the perceptually derived candidate set. This completely avoids the necessity of priming or predicting contextually probable words. Indeed, Forster himself briefly considered such a possibility in his 1981 paper. According to the checking model, if a word is found to be probable in its context its recognition criterion will be reduced and this will apply equally to sentential context and to associative context. The main difference between associative context and sentential context will be that associative relations will typically be represented directly in memory, whereas the relation between a word and its sentential context will generally have to be computed on line. In the case of sentential context the checking process will effectively consist of evaluating the plausibility of candidate words as continuations of the sentence. As has already been argued, the processes involved in evaluating the plausibility of candidate words are processes which are essential for efficient comprehension; they are necessary to resolve ambiguity.

Inhibition

Unlike priming models, the checking model accounts for the inhibitory effects of context by exactly the same mechanism which explains facilitation. The checking model modifies the recognition criterion for each word according to how well the word fits with the context. This means that as well as decreasing the criterion for very probable words, the system will also increase the criterion for improbable words. If a member of the candidate set is found to be implausible in its context its criterion will be increased. Therefore more than the usual amount of perceptual information will be required for it to be recognized. There is no need to supplement the model with attentional processes simply to explain inhibition.


Note that the inhibition of improbable words is yet another feature of the model which follows from adopting a mechanism whose main function is to resolve lexical ambiguity. Consider the problem of determining the appropriate reading of 'chest' in sentence 3.

3. The men lifted the chest.

Out of context, 'chest' has a dominant reading corresponding to a part of the body. However, this would be a rather implausible reading in the context of sentence 3. In this context, the appropriate reading is not arrived at because it is very probable (it is certainly not very predictable), but because the alternative reading is highly improbable. In this case, resolving the ambiguity depends on inhibiting the inappropriate reading and not on facilitating the appropriate reading. Inhibition of inappropriate readings of ambiguous words is a process which will normally act so as to speed comprehension. Therefore, somewhat paradoxically, inhibition will typically be associated with a benefit rather than a cost. However, when a word has no plausible reading at all, the effect of the checking process will be to inhibit its recognition. This explains why inhibition should sometimes be present in the absence of any facilitation (Fischler & Bloom, 1979). In terms of the two-process theory of attention, observations of cost without benefit make little sense. Inhibition is simply the cost of focusing attention on probable words in the lexicon. However, in the checking model, inhibition is not a side effect of facilitation, but is itself an important factor in ensuring fluent comprehension.

Superficially the fact that facilitation and inhibition are both attributed to the same basic checking mechanism might seem to suggest that facilitatory and inhibitory effects should be symmetrical. However, this is not the case. Sometimes context will provide information which increases the probability of a small set of words but does very little to affect the probability of other words. On other occasions context will increase the probability of a fairly large set of words only slightly, but will make other words very improbable indeed. An example of the first kind of context would be the associative context employed in many lexical decision experiments. In an experiment in which 50% of the 'prime'-word pairs are related, the context provided by priming words will increase the probability of a small set of words; those which are strong associates of the prime. However, on the remaining 50% of the trials, almost any word in the language could appear. There are not really any words which are highly improbable in this context. Of course, in the checking model the magnitude of the context effect is not determined directly by a word's probability, but rather by the ease with which the checking mechanism can determine its probability. What makes a large inhibitory effect improbable in a simple priming experiment is the difficulty of determining that an unrelated word is improbable. The only way to discover that a word is unrelated would be to perform an exhaustive search through memory to check that the word is not associatively related to the context. Under these circumstances there should therefore be a facilitatory effect of context, but only a small inhibitory effect of context relative to some neutral baseline.
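The asymmetry between confirming and disconfirming an associative relation can be sketched as follows. The cost model is a deliberately crude assumption made for illustration: a stored association is found in one look-up, whereas unrelatedness is only established if every stored link can be examined within the available budget. The toy associative store, the `budget` parameter and the function name are hypothetical.

```python
def associative_check(prime, target, associates, budget):
    """Return +1 (related), -1 (shown to be unrelated) or 0 (undetermined)."""
    # Direct associates are assumed to be stored explicitly: a single look-up.
    if target in associates.get(prime, ()):
        return 1
    # Establishing unrelatedness is assumed to need an exhaustive search.
    total_links = sum(len(links) for links in associates.values())
    return -1 if total_links <= budget else 0


# Hypothetical associative memory.
associates = {
    "BREAD": {"BUTTER", "JAM"},
    "BLACK": {"WHITE"},
    "DOCTOR": {"NURSE"},
}

print(associative_check("BREAD", "BUTTER", associates, budget=1))  # 1
print(associative_check("BREAD", "NURSE", associates, budget=1))   # 0
```

With a realistic memory the exhaustive search would rarely finish in time, so the criterion for unrelated words would usually be left where it was, giving facilitation with little inhibition.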

However, the situation will be very different for words presented in sentential context. As Fischler and Bloom (1980) found, sentential context can produce inhibition even in the absence of any facilitatory effects. A word which is semantically or syntactically anomalous in its context is not simply unrelated to the context (in fact it may well be associatively related to words in the context), it violates linguistic or pragmatic constraints. If the violation can be detected, then the criterion for that candidate will be increased and that word will take longer to recognize. In normal discourse semantic or syntactic anomaly is highly unlikely. Therefore if a member of the candidate set is found to be anomalous it is very unlikely to be the word which is being recognized (unless the word is ambiguous and has another reading which is not anomalous). Increasing the recognition criterion for implausible members of the candidate set will therefore normally help to speed comprehension by ensuring that the system does not commit itself to identifying an implausible word without strong perceptual evidence in its favour. Inhibition is therefore more likely in sentence context than with single word associative contexts both because it is easier to identify candidates as implausible, and because inhibition of improbable candidates will be a highly practised process designed to facilitate normal comprehension.

The checking model's account of inhibition differs in one important respect from that of Posner and Snyder's (1975) two-process theory. Whereas the two-process theory predicts that the attentional component of priming and inhibition should be slower acting than the inhibitionless spreading activation, there is no such constraint on the build-up of inhibition within the checking model. Under conditions where it is easy to determine whether a word is implausible in its context, inhibitory effects should be as fast acting as facilitatory effects. However, some studies have provided what appears to be very impressive support for the two-process theory. In a lexical decision task, Neely (1977) found significant effects of both facilitation and inhibition at SOAs of 400 ms and above, but only facilitatory effects at an SOA of 250 ms. This asymmetry in the development of facilitation and inhibition is exactly what the two-process theory predicts. However, in a study by Antos (1979) the pattern of results was quite different. Antos used a lexical decision task in which category members primed category names. For example, 'APPLE' might prime 'FRUIT'. Antos found a significant degree of inhibition even at an SOA of only 200 ms. At this SOA there was no significant effect of facilitation. Such a rapid build-up of inhibition seems quite incompatible with the predictions of the attentional model. Why should Neely have found no evidence of inhibition at short SOAs in his experiment? One critical difference between the two experiments may have been the degree of speed-accuracy trade-off. Antos points out that there are quite large differences in the error rates between conditions in Neely's study. The error rates for the most comparable short SOA conditions where some degree of inhibition might have been expected (No Shift Unrelated) were 4.3% greater than the error rates for the corresponding neutral condition. If this error rate difference reflects a degree of speed-accuracy trade-off, then the large error rates could be masking some underlying inhibition. At short SOAs Neely's experiment may not have produced any cost in reaction time, but there may well have been a cost in errors.

The Antos and Neely experiments also differ in terms of the prime-target relationships which each employed. In Neely's experiment there was a wide variation in the nature of the relationship; in the conditions most comparable to the Antos study, category names primed category members. In other conditions category names primed members of different categories. Even on validly primed trials, subjects in Neely's experiment could not determine which specific words were probable and which were improbable with the same reliability as the subjects in the Antos study. Therefore in Neely's experiment there was less opportunity for the checking process to detect that unrelated words were improbable and increase their recognition criterion. Thus inhibition would be expected to be smaller than in the Antos study.

Shifting attention

Explanations of context effects in terms of attentional mechanisms have assumed that attention can be selectively deployed within the lexicon. In the checking model, attention can have no direct influence on the lexicon itself, but processes under conscious control can influence the pattern of facilitation and inhibition by determining the form of the subject's internal representation of the context. If subjects can alter the effective representation of the context, then they will be able to control the effect which context has on recognition. For instance, in Neely's experiment, category names could indicate that the decision word would probably be a member of some other category; 'body' could prime the name of part of a building such as 'door'. Neely referred to this as a 'Shift' condition, implying that subjects had to shift attention from one category to another. Neely assumed that the shift would be a shift of attention in the lexicon. In terms of the checking model what changes in the 'Shift' condition is not the focus of attention on the lexicon, but the effective representation of the context. What the subject must do is translate 'body' to 'part of a building'. The only 'shift' required is a shift in the representation of the context; there is no need to shift the focus of attention in the lexicon. Once the name of the expected category has become the effective context then it will produce context effects similar to those of a category name on members of its own category. Note that this conscious processing of the context will take time. Therefore the facilitation and inhibition will take longer to become effective than that following from an automatically elicited representation of the context. The checking model therefore makes predictions very similar to those of the two-process theory, but in the checking model the critical factor which determines the nature of the context effect is the form of the representation of the context when a word is presented, not the time elapsing between a word and the context. The 'Shift' condition in Neely's experiment probably requires subjects to perform a rather unusual form of conscious operation to alter the context. However, in the course of normal reading, subjects will adopt a variety of different strategies which lead them to concentrate more or less heavily on various attributes of the text. Any effects of context produced by the text will depend as much on the representation of the text which the subject's goals lead him or her to construct as on any representation which is formed as an automatic consequence of the reading process.

Word frequency

One of the major phenomena which any detailed model of word recognition must account for is the word frequency effect. The checking model proposes that word frequency is primarily an effect of criterion bias. In common with the logogen model it is assumed that less perceptual evidence is required to identify a high frequency word than a low frequency word. In order to keep track of frequency information during recognition, words in the candidate set are tagged with an index of their frequency (see Figure 1). Because the checking model accounts for the word frequency effect by means of a criterion bias mechanism, it has no problems in explaining the frequency blocking effects observed by Glanzer and Ehrenreich and by Gordon. These results present a major difficulty for the single list search and verification models. However, the checking model can account for the blocking effects in the manner suggested by Gordon. As described earlier, a criterion bias mechanism can explain these data by assuming that a lower criterion is adopted for the pure high frequency lists than the pure low frequency lists, and an intermediate criterion for the mixed high and low frequency lists.

Insofar as context and frequency effects are both mediated by a criterion bias mechanism, it would appear that the checking model should make the same predictions as the logogen model with regard to the relation between context, frequency and stimulus quality. In the logogen model both context and frequency act independently to reduce the number of features required for a logogen to exceed its response threshold. The effects of context and frequency should therefore be additive. Stimulus quality is assumed to influence the logogen system by altering the rate at which featural information can be extracted from the input (Becker & Killion, 1977; Sanford, Garrod, & Boyle, 1977; although see McClelland, 1979 for an alternative suggestion). In this way, a reduction in stimulus quality which halved the rate of feature extraction would double the size of a given effect of frequency or context. The logogen model therefore predicts that the effects of context and frequency should interact with stimulus quality.

However, if for the moment we consider only studies using single word contexts, we find that only one of these predictions generally holds true. Context interacts with stimulus quality (Becker & Killion, 1977; Meyer, Schvaneveldt, & Ruddy, 1975) but context and frequency interact (Becker, 1979). The case of frequency and stimulus quality is more complex. Stanners, Jastrzembski and Westbrook (1975) and Becker and Killion found frequency and stimulus quality to be additive while Norris (1984a) has reported that these two factors can be made to interact. One of the major problems for the logogen model is Becker and Killion's finding that, given comparable effects of context and frequency, the context effect interacts with stimulus quality whereas the frequency effect appears additive. The logogen model predicts that both context and frequency should behave in the same way with respect to manipulations in stimulus quality. However, these results actually fit well with the predictions of the checking model. Although the checking model is a criterion bias system, the behaviour of the system is also critically dependent on the time available for the checking process to modify recognition criteria (see section on "Dynamics of the checking process"). As a result, the checking model makes rather different predictions from the logogen model.

As in the logogen model the effect of context in the checking model will be to reduce the criterion for recognizing a word on the basis of perceptual information. If this reduction in criterion were independent of stimulus quality, the checking model ought to make exactly the same predictions as the logogen model. However, there is a second factor which determines the size of the context effect in the checking model. The size of the context effect will also depend on the amount of time between the start of the checking operation and the point at which the criterion is exceeded and the word is recognized. Within certain limits, the greater the time available for the checking process the greater will be the effect of the context. More specifically, the greater the time the greater the potential for context to influence recognition. Therefore if a word is degraded and the perceptual component of the recognition process is slowed down, the size of the context effect will increase simply because there is more time for the checking process to operate. In the checking model context and stimulus quality interact for two quite different reasons. First, slowing down the rate of feature extraction will increase the benefit which follows from a given reduction in criterion. This is exactly the same principle which predicts the context by stimulus quality interaction in the logogen model. Second, slowing down the rate of feature extraction will also increase the time available for the checking process, and this will tend to increase the size of the criterion reduction produced by a given context. This mechanism is specific to the checking model. What is more, this second mechanism only applies to context effects. The relation between frequency and stimulus quality should be exactly as predicted by the logogen model because the contextual checking process does not contribute to the frequency effect. This leads to the prediction that both context and frequency should interact with stimulus quality but, more importantly, the context by stimulus quality interaction should be larger than the frequency by stimulus quality interaction. Incidentally, the prediction that factors increasing the time available for checking increase the size of the context effect also correctly predicts an interaction between context and frequency which is not predicted by the logogen model. The higher criterion for low frequency words will lead to longer recognition times and therefore a greater effect of context.
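The two sources of the context by stimulus quality interaction can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes that evidence accumulates linearly at a fixed rate, that a fixed criterion reduction stands in for the logogen-style effect, and that the checking process lowers the criterion linearly over time up to a ceiling. All function names and parameter values are hypothetical.

```python
def recognition_time(criterion, rate):
    """Time for evidence growing at `rate` units per ms to reach `criterion`."""
    return criterion / rate


def fixed_reduction_effect(base, reduction, rate):
    """Benefit (ms) of a criterion reduction that does not depend on time.

    Stands in for a logogen-style context or frequency effect.
    """
    return recognition_time(base, rate) - recognition_time(base - reduction, rate)


def checking_effect(base, bias_rate, rate, max_reduction):
    """Benefit (ms) when checking lowers the criterion by `bias_rate` per ms.

    Assumed dynamics: criterion(t) = base - min(max_reduction, bias_rate * t).
    """
    # Crossing time if the ceiling on the reduction is never reached.
    t = base / (rate + bias_rate)
    if bias_rate * t > max_reduction:
        # Ceiling reached first: the criterion stays at base - max_reduction.
        t = (base - max_reduction) / rate
    return recognition_time(base, rate) - t


if __name__ == "__main__":
    clear, degraded = 1.0, 0.5  # degradation halves the extraction rate
    print(fixed_reduction_effect(100, 10, degraded) / fixed_reduction_effect(100, 10, clear))
    print(checking_effect(100, 0.1, degraded, 40) / checking_effect(100, 0.1, clear, 40))
```

Under these assumptions, halving the extraction rate exactly doubles the fixed-reduction effect, whereas the checking effect more than doubles because the criterion has longer to fall. Raising `base`, as for a low frequency word, also enlarges the checking effect, which corresponds to the context by frequency interaction described above.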

The prediction that the context by stimulus quality interaction should be larger than the frequency by stimulus quality interaction makes the predictions of the checking model compatible with the results of the study by Becker and Killion. Becker and Killion demonstrated that, for a context effect and a frequency effect of comparable size, the context effect interacts with stimulus quality but the frequency effect does not. Such a result is obviously incompatible with the logogen model's prediction that these interactions should be equal in size. However, although the checking model predicts that frequency should interact with stimulus quality, the frequency by stimulus quality interaction should be smaller than the context by stimulus quality interaction. Therefore the model does correctly predict the relative size of the two interactions. Although Becker and Killion's experiment was powerful enough to detect a frequency by stimulus quality interaction which was as large as their context by stimulus quality interaction, there is no way of knowing whether it would detect a smaller interaction. The experiment may well not have had the power to detect a very much smaller interaction between frequency and stimulus quality. In a lexical decision experiment using far larger effects of frequency and degradation than Becker and Killion, Norris demonstrated that frequency and degradation do interact. Such an interaction provides further support for the checking model, but it also provides clear evidence against current formulations of Becker's verification model and Forster's search model, both of which predict additivity. Norris has suggested one way in which the verification model could be salvaged. In cases of very severe degradation stimulus quality might influence the speed of the verification cycle as well as the generation of the candidate set. Although such a move increases the power of the verification model it does at least acknowledge the possibility that different forms of degradation may have qualitatively different effects.

The preceding discussion dealt only with the relation of word frequency to context produced by single word primes. However, if we examine the results of studies using incomplete sentences as context, the findings are slightly different. Studies by Schuberth and Eimas (1977) and Schuberth et al. (1981) have found additive effects of context and frequency rather than the interaction found by Becker. As the size of the context and frequency effects in these experiments is similar, this result is rather problematical. It seems to force us to the view that there is a major qualitative difference between single word context and sentential context.

However, Forster's suggestion that some context effects in the lexical decision task may be due to decision processes rather than to recognition processes may provide an explanation for this inconsistency in the data. There is a possibility that what appears at first to be an effect of context on word recognition in these experiments is actually an effect of the overall plausibility of the whole sentence influencing the decision mechanism. The argument here is the same as that presented by Forster (1979) to account for performance on sentence matching tasks, and to account for data from lexical decision experiments (Forster, 1981b). Stanovich and West (1983b) have also presented a version of this argument to explain discrepancies between the results of their own experiments using lexical decision and naming tasks. The standard interpretation of lexical decision experiments on sentence context effects assumes that subjects are responding simply on the basis of decisions about the lexical status of the final word in the sentence. However, this is not the only way to make a response in these experiments. As with the sentence matching tasks studied by Forster, subjects can also make a correct 'Yes' response if the word completes a plausible sentence. Clearly a non-word can never form a plausible sentence, so this strategy will never lead to an incorrect response. The contextually probable words in the sentences will tend to produce sentences which are more plausible than those produced by less probable words and, as Forster (1979) has shown in both sentence matching and classification tasks, more plausible sentences are processed faster than implausible sentences. If subjects do make their responses on the basis of decisions at the sentence level rather than the lexical level, they will therefore make the fastest responses to the sentences ending with the most probable words. Obviously deciding that a sentence is plausible is contingent on the perception of its constituent words, but as Foss and Swinney (1973) have pointed out in an analysis of phoneme and word monitoring tasks, the order in which information becomes available for response execution may not reflect the order of perception. So although deciding that a sentence is plausible is contingent on an analysis of its individual words, a subject may nevertheless be able to respond on the basis of a decision at the sentence level before a response can be made at the lexical level.

Although this analysis provides an account of some context effects in terms of a mechanism which operates only after recognition, it is quite different from Forster's (1981b) post-access integration mechanism. The integration mechanism acts by blocking the output of the lexicon while a word is being integrated with the context. The integration mechanism should operate in all tasks where subjects are dealing with sentence contexts. However, the account of plausibility effects is a race model. It rests entirely on the fact that there are two alternative sources of information on which subjects could base their responses: lexical information, and information about plausibility. A sentence level judgement about plausibility can only help to facilitate the response if it becomes available before a decision can be made at the lexical level. In a different task, such as naming latency, subjects would be unable to base their response simply on plausibility judgements. To know that a sentence is plausible says nothing about the identity of the word to be named. In a naming task, responses must be based on lexical information. The overall plausibility of the sentence can have no effect on the response, beyond its effect on recognition itself. Any context effects occurring after recognition should be unable to influence word naming.
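The race account can be stated very compactly. The following is a minimal sketch, assuming that each source of information becomes available at a single fixed time on a given trial; the function names and the millisecond values are illustrative only and do not come from any of the experiments discussed.

```python
def lexical_decision_rt(lexical_ms, plausibility_ms):
    """A 'Yes' response can be triggered by whichever source finishes first."""
    return min(lexical_ms, plausibility_ms)


def naming_rt(lexical_ms, plausibility_ms):
    """Naming needs the identity of the word, so only the lexical route counts."""
    return lexical_ms


if __name__ == "__main__":
    # Hypothetical trial: the plausibility judgement is available 50 ms
    # before the lexical-level decision.
    print(lexical_decision_rt(600, 550), naming_rt(600, 550))
```

On this sketch a plausibility judgement speeds the lexical decision only on trials where it wins the race, and it never changes the naming latency.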

If all, or even some, of the context effect in these lexical decision experiments is attributable to factors influencing processing at the sentence level rather than the lexical level, then the implication of these experiments for models of word recognition becomes quite different. Both the checking model and Becker's verification model predict that context and word frequency should interact, as found in Becker's experiment. However, neither model would predict any interaction between the overall plausibility of a sentence and the frequency of its final word. A post-recognition effect of sentence plausibility can have no influence on the word recognition process itself. If the apparent additivity of frequency and context in these experiments is due in part to an effect of plausibility rather than context, then there is no longer any conflict between the data from sentential and non-sentential context. However, it is important to remember that these criticisms do not apply to tasks such as naming. Therefore if frequency and context were found to be additive in a naming task with sentence contexts, but interactive with single word contexts, this would necessitate a revision of the present account of word frequency.

Criterion bias mechanisms and context

Like the logogen model, the checking model relies entirely on criterion bias mechanisms to account for both context and word frequency. However, is criterion bias a sufficient mechanism to explain these phenomena, or is it necessary to incorporate some more complex mechanism into the model? Both Antos (1979) and O'Connor and Forster (1981) have obtained priming effects in experiments using non-words which differed only slightly from the primed words. In the Antos study non-words differed by only a single letter from the primed words and in the O'Connor and Forster study non-words were formed by transposing adjacent letters in words used in the experiment. In both of these studies it was claimed that the fact that priming was obtained with such similar words and non-words is evidence against criterion bias models of priming. The argument runs as follows: if a word must be distinguished from a very similar non-word, then this will require almost all of the available featural information to be extracted from the stimulus in order to respond accurately. Therefore it will be impossible to speed recognition by reducing the response criterion, since if the criterion is reduced beyond the level required to perform the lexical decision accurately, the error rate will rise dramatically.

However, the validity of this line of argument rests largely on the assumption that the 'small' differences between the words and non-words used in these experiments actually lead to only small differences in the access codes which they generate. It is quite possible that even when stimuli differ by only a single letter, there is a sufficiently large difference in access code to allow the system to reduce the recognition criterion for a word by quite a substantial amount in the presence of appropriate context, without the criterion being reduced so low as to lead to a very high error rate.

In lexical decision experiments the degree of speed-accuracy trade-off will be influenced by a number of factors. One of the most important of these will be the level of the subject's baseline criterion. For example, consider what will happen when the rate of increase in accuracy of responding is an inverse function of processing time. This corresponds to the case where information is being extracted from the stimulus very rapidly at first and the rate of information extraction decreases over time. When subjects are responding very quickly and very inaccurately (i.e., they are adopting a low criterion) a small change in response time will be associated with a very large change in accuracy. However, with slower and more accurate responding (a high criterion), the same change in response time will only be associated with a very small change in accuracy. Of course, if subjects are overly conservative they might adopt a criterion even higher than that required to ensure 100% accuracy. In this case it will be possible to reduce the criterion and speed recognition without any increase in error rate whatsoever. Note also that even when a subject in a lexical decision experiment is adopting a sufficiently high criterion to ensure accurate identification there is still a possibility that mistakes may be made in response execution. So the fact that subjects make errors in any given experiment does not necessarily imply that they are using a low recognition criterion.
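The dependence of the trade-off on the baseline criterion can be made concrete with a simple saturating accuracy function. The sketch below assumes an exponential approach to perfect accuracy from chance (0.5); this functional form, the rate constant `k` and the response times are illustrative assumptions rather than values estimated from any experiment.

```python
import math


def accuracy(t_ms, k=0.01):
    """Assumed accuracy after t_ms of processing: rises from 0.5 towards 1.0,
    quickly at first and then ever more slowly."""
    return 1.0 - 0.5 * math.exp(-k * t_ms)


def accuracy_cost(baseline_ms, speedup_ms, k=0.01):
    """Loss in accuracy from responding `speedup_ms` earlier than baseline."""
    return accuracy(baseline_ms, k) - accuracy(baseline_ms - speedup_ms, k)


if __name__ == "__main__":
    # The same 50 ms speed-up applied at a low and at a high baseline criterion.
    print(accuracy_cost(150, 50))  # fast, inaccurate responding
    print(accuracy_cost(600, 50))  # slow, accurate responding
```

With these assumed values the same 50 ms saving costs far more accuracy at the low baseline than at the high one, which is the pattern the argument relies on.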

The preceding considerations provide an explanation of the results of a pair of experiments which Schvaneveldt and McDonald (1981) have presented as evidence against an account of context effects purely in terms of criterion bias. Schvaneveldt and McDonald had subjects perform a standard lexical decision priming experiment in which they recorded both speed and accuracy of responding. A second group of subjects performed a version of the experiment in which there was only a brief tachistoscopic presentation of the decision word, and the subjects' task was simply to decide whether the stimulus was a word. Subjects in this second experiment were under no time pressure. In the standard reaction-time version of the experiments, Schvaneveldt and McDonald found that although there was a significant effect of priming, error rates to both related and unrelated decision words were almost identical at about 5%. There was no indication of any speed-accuracy trade-off, and this was considered to be incompatible with a criterion bias model. However, in the tachistoscopic report experiment there were 12% fewer errors to related than to unrelated words. Schvaneveldt and McDonald interpret this as evidence that two quite different types of semantic priming are taking place in the two experiments: one which is mediated by criterion bias and one which is not. However, it has already been pointed out that a criterion bias system will often be able to use context to speed responding without any very large change in error rate. If the rate of information extraction decreases over time this will be particularly so when the subject's baseline criterion is very high. A criterion which is set high could be lowered without any drastic increase in error rate. However, if the criterion is initially set so as to produce a large error rate, any further shifts in criterion would tend to lead to further changes in error rate. From the rather low error rates observed in Schvaneveldt and McDonald's reaction time experiment one would infer that the criterion was set fairly high. However, in the tachistoscopic experiment the presentation conditions were deliberately manipulated so as to produce an overall error rate of about 25%. Under these conditions subjects must be adopting a very much lower criterion. It would appear that rather than reflecting the operation of two qualitatively different priming mechanisms, these results are simply a consequence of observing the operation of the same system at two very different baseline criteria.

Criterion bias and signal detectability

Evidence which has a far more direct bearing on the problem of differentiating between alternative models of word recognition is provided by another of the experiments in the study by Antos. In a speed-accuracy trade-off study of context effects in a lexical decision task, Antos found that context influenced both criterion bias (beta) and d'. As the checking model is a criterion bias system it might appear that it should predict that the context should lead to a change in beta but no change in d'. However, the predictions from signal detection theory are not as straightforward as they might first appear. Although the effects of context in the checking model are mediated by changes in recognition criterion, these changes operate in a manner which leads to changes in d' as well as beta. This is due to the fact that the priming word in a lexical decision experiment will almost inevitably lead to a reduction in stimulus uncertainty which will allow even a criterion bias system to effect an increase in d' on validly primed trials.

When interpreting the results of a signal detection theory analysis it is important to remember that the analysis simply gives a description of the behaviour of the whole system under observation. It says nothing about the nature of the component processes in that system. Discovering that context alters d' does not rule out the possibility that the change in d' is mediated by a criterion bias mechanism.

To make this rather counterintuitive claim clear, consider what might happen in a lexical decision experiment where the non-words are selected at random. That is, assume that, unlike the Antos and the O'Connor and Forster experiments, the non-words are not deliberately selected so as to be similar to primed words. When the subject has only partly analysed a word and has identified the letters 'BUT ER', the candidate set could contain the words 'BUTTER' and 'BUTLER'. At this point there is still some uncertainty in the analysis and the subject will not be able to make a totally confident judgement as to whether the stimulus is a word or not. There is a possibility that the stimulus could be a non-word such as 'BUTFER'. To be sure of making the correct response the subject will have to wait for a more detailed perceptual analysis. However, when the decision word is preceded by the prime 'BREAD', the subject will lower his response criterion for 'BUTTER'. If the criterion is lowered sufficiently, the subject will respond 'Yes'. That is, responses will be faster and more accurate for primed than for unprimed words. Whether this leads to a change in beta or d' depends on whether this shift in criterion is accompanied by a proportional increase in the error rate for non-words. If the non-words in the experiment are rarely very similar to the primed words, the subject will be able to make large reductions in recognition criterion without running the risk of making errors. If typical prime-word and prime-non-word trials in the experiment are BREAD-BUTTER and BREAD-HOFLOT, then the candidate set generated by the non-words will be highly unlikely to contain a word related to the prime. Context will therefore lead to faster responses to primed words but very little increase in error rate to non-words. In other words, context will lead to a large increase in d'. If the prime-non-word trials were very similar to the prime-word trials, for example, BREAD-BUTFER, then reducing the response criterion would still lead to faster responses to primed words. However, there would be a far greater chance of incorrectly classifying similar non-words as words if the criterion was lowered too far. This increase in error rate associated with the reduction in criterion means that context will lead to a change in beta and a far smaller change in d' than when the word-non-word discrimination is easier.
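The contrast between the two kinds of non-word can be illustrated with the standard equal-variance Gaussian signal detection formulas, d' = z(H) - z(FA) and beta = exp((z(FA)^2 - z(H)^2) / 2), where H is the hit rate for words and FA the false alarm rate for non-words. The hit and false alarm rates below are invented for illustration; they are not data from Antos or from any other study.

```python
import math
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate, fa_rate):
    """Sensitivity under the equal-variance Gaussian model."""
    return _z(hit_rate) - _z(fa_rate)


def beta(hit_rate, fa_rate):
    """Likelihood ratio at the criterion under the same model."""
    zh, zf = _z(hit_rate), _z(fa_rate)
    return math.exp((zf ** 2 - zh ** 2) / 2)


if __name__ == "__main__":
    # Hypothetical (hit rate, false alarm rate) pairs.
    conditions = {
        "unprimed": (0.80, 0.05),
        "primed, dissimilar non-words (BREAD-HOFLOT)": (0.95, 0.06),
        "primed, similar non-words (BREAD-BUTFER)": (0.95, 0.20),
    }
    for label, (h, fa) in conditions.items():
        print(f"{label}: d'={d_prime(h, fa):.2f} beta={beta(h, fa):.2f}")
```

In this made-up example the same rise in hits yields a clear gain in d' when false alarms barely move, but leaves d' essentially unchanged and shifts beta when false alarms rise with the hits.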

So long as the context effects are associated with an increase in the non-word error rate it might seem that context should alter beta and not d'. But lexical decision provides a rather imperfect approximation to the ideal signal detection task. If we wish to generate non-words which are as similar as possible to the primed words in an experiment, the best we can do is to choose non-words that differ from the primed word by only a single letter. However, we have no guarantee that these non-words will really be perceptually more similar to the primed words than to some other word. For example, BUTFER might generate an access code more similar to BUTLER than to BUTTER. If that were true we would have more headroom for reducing the criterion for BUTTER, since it would have to be reduced to at least the level of the criterion for BUTLER before BUTFER would be wrongly classified. Indeed, perhaps BUTTER might not appear in the candidate set at all. If that were the case then BREAD-BUTFER would be no different from BREAD-HOFLOT. Until we know more about what makes words perceptually similar we will be unable to use a signal detection theory approach to rule out even the simplest form of criterion bias model possible.

One additional point should be noted in connection with signal detection analyses of lexical decision. Norris (1984b) has presented data which suggest that lexical decision often involves a spelling check. This check uses the word for which there is most perceptual and contextual evidence as its reference. If the stimulus is spelled in the same way as the reference then it is classified as a word. If the check fails, the stimulus is classified as a non-word. If context effects are mediated by a criterion bias mechanism then primed words will be more likely to be classified correctly because of the greater probability of selecting the correct word as a reference for the spelling check. However, if on BREAD-BUTFER trials BUTTER is selected as the reference, the non-word will fail the check and the correct response will still be made. The effect of the spelling check will therefore be to allow a criterion bias mechanism to operate without the large increase in non-word errors which might otherwise be expected. The lexical decision task may therefore exaggerate the effects of context on d'.
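The logic of the spelling check can be sketched in a few lines. This is an illustrative reading of the description above, not an implementation of Norris (1984b): the evidence values are hypothetical, and combining perceptual and contextual evidence into a single number per candidate is an assumption made for simplicity.

```python
def lexical_decision(stimulus, evidence):
    """Classify `stimulus` using a spelling check against a reference word.

    `evidence` maps each candidate word to its combined perceptual and
    contextual evidence (hypothetical values). The best-supported candidate
    is taken as the reference for the check.
    """
    reference = max(evidence, key=evidence.get)
    return "word" if stimulus == reference else "non-word"


if __name__ == "__main__":
    primed = {"BUTTER": 0.9, "BUTLER": 0.6}    # after the prime BREAD
    unprimed = {"BUTTER": 0.5, "BUTLER": 0.6}  # ambiguous perceptual evidence
    print(lexical_decision("BUTTER", primed))
    print(lexical_decision("BUTFER", primed))
    print(lexical_decision("BUTTER", unprimed))
```

In this sketch priming makes it more likely that BUTTER is chosen as the reference, so the primed word is accepted, while BUTFER still fails the check and is correctly rejected.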

According to the checking model the effect of context on beta will be greater the more similar are the words and non-words. However, it is very unlikely that words and non-words in a lexical decision task could possibly be made so similar that the context effects produced by a criterion bias system will only influence beta and not d'. Words and non-words are about as similar as they can be if they differ by only a single letter. Therefore, according to the checking model, context should influence both d' and beta in lexical decision experiments. However, in both Forster's search model and Becker's verification model context has its effect by speeding up the access process. In the search model context operates by bypassing the search through the peripheral access files, and in the verification model the contextually produced candidate set can be used to identify a word before the featurally produced set. Neither of these models has any mechanism for altering the response criterion. Therefore neither model should predict any effect of context on beta. Context should only influence d'. Since Antos found that context influenced both d' and beta, both of these models should be rejected in favour of a criterion bias model such as the checking model or logogen model.

Perceptual analysis in the checking model

For the post-access checking process required by the checking model to operate effectively, it needs to be supplied with a list of words whose perceptual specifications are approximately consistent with the perceptual analysis of the input. It is actually very important that the list should not be restricted to words whose specifications are entirely consistent with the analysis, otherwise the system would not be able to recognize words which were slightly mispronounced or misspelled. The candidate set needs to contain a set of words whose perceptual specifications are closely centered around that of the analysis of the input. Treisman (1978) has described a model of word recognition which comes very close to fitting the requirements for the early perceptual stages of the checking model. Treisman has suggested that complex stimuli such as words should be considered to be represented in terms of a multidimensional perceptual space. In Treisman's model, perceptual analysis can be thought of as delimiting a subvolume of perceptual space which will contain the stimulus word, and possibly other words if the stimulus is not clearly identifiable. The size of the subvolume is effectively an index of the residual uncertainty in the perceptual analysis. When input information is limited, the subvolume will be large and will therefore tend to contain more words than a small subvolume. Treisman assumes that there will be equal perceptual evidence for all words in the perceptual subvolume. To equate this subvolume with the candidate set in the checking model would therefore result in the set containing only words whose specifications were entirely consistent with the perceptual analysis. However, as already mentioned in the discussion of the cohort model, this would leave the model unable to recognize slightly mispronounced or misspelled words, even if there were considerable contextual evidence in their favour. The candidate set must therefore also contain words which are some small distance outside the perceptual subvolume. A further requirement of the model is that members of the candidate set should be tagged with a measure of the perceptual evidence in their favour. As the size of the subvolume itself is a measure of the certainty of the analysis, this would provide a suitable index of perceptual evidence for words within the subvolume. Words outside the subvolume clearly have far less evidence in their favour than those within the subvolume. The more distant these words are from the periphery of the subvolume, the less evidence there is in their favour. In fact, the distance of a word from the subvolume actually provides an index of how inconsistent a word is with the perceptual analysis.
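The construction of the candidate set can be sketched as follows. Several details are assumptions made purely for illustration: the subvolume is treated as a sphere with a centre and radius, the two-dimensional coordinates are invented, and the particular evidence formula is only one of many that decrease with subvolume size and with distance beyond its periphery.

```python
from math import dist


def candidate_set(lexicon, centre, radius, margin):
    """Return {word: perceptual evidence} for words in or near the subvolume.

    `lexicon` maps words to points in a (hypothetical) perceptual space.
    Words up to `margin` beyond the subvolume's periphery are admitted so
    that slightly misspelled or mispronounced words can still be candidates.
    """
    inside_evidence = 1.0 / (1.0 + radius)  # smaller subvolume, more certainty
    candidates = {}
    for word, point in lexicon.items():
        # Distance beyond the periphery; zero for words inside the subvolume.
        overshoot = max(0.0, dist(point, centre) - radius)
        if overshoot <= margin:
            # Evidence falls off the further the word lies outside.
            candidates[word] = inside_evidence / (1.0 + overshoot)
    return candidates


# Invented coordinates for three words.
lexicon = {"BUTTER": (0.0, 0.0), "BETTER": (1.0, 0.0), "BATTLE": (5.0, 5.0)}
tagged = candidate_set(lexicon, centre=(0.2, 0.0), radius=0.5, margin=0.5)
```

Here BUTTER falls inside the subvolume, BETTER lies just outside it and is admitted with less evidence, and BATTLE is too distant to enter the candidate set at all.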

Other than when a word is mispronounced or misspelled the only occasion when the perceptual subvolume will not contain a word is when a subject is trying to identify a non-word. In the absence of any context, a subject in a lexical decision experiment will be able to respond 'No' whenever the subvolume no longer contains any words. That is, whenever the candidate set no longer contains any words consistent with the perceptual analysis, the subject can be sure that, unless there is an error in the analysis, the stimulus is a non-word. In the logogen model there is no clear mechanism for identifying non-words. It has been suggested that non-word responses in a lexical decision task could be made by default if no logogen had fired after a certain amount of time (Coltheart, Davelaar, Jonasson, & Besner, 1976). However, this would lead to the prediction that all non-words should take equally long to identify. Coltheart et al. have demonstrated that certain kinds of non-words are classified faster than others in a lexical decision task. Non-words which are very similar to words take longer to classify than non-words which bear little similarity to any real words. In the present model it is assumed that perceptual analysis of a word involves identifying the point in perceptual space corresponding to the word's perceptual specifications. However, non-words can also be represented by a point in perceptual space. Non-words which are similar to a real word will be close to that word in perceptual space. Dissimilar non-words will be quite distant from real words. This means that although the correct response to a dissimilar non-word can be made when the perceptual analysis has delimited quite a large subvolume, classification of a similar non-word must wait until the subvolume is very small. Treisman's suggestion that word recognition should be considered to involve analysing the perceptual specifications of the input in terms of coordinates in a multidimensional perceptual space clearly fits the requirements of the checking model well. Most importantly, structuring the perceptual analysis in the checking model in this manner provides an explanation of the perception of non-words as well as words.
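The difference between similar and dissimilar non-words can be illustrated with a sketch in which the subvolume shrinks in discrete steps and a 'No' response is made as soon as it contains no words. The spherical subvolume, the halving schedule and the coordinates are all assumptions of the illustration rather than claims about the actual course of perceptual analysis.

```python
from math import dist


def steps_to_reject(lexicon, stimulus_point, start_radius=8.0,
                    shrink=0.5, max_steps=50):
    """Return the step at which 'No' can be given, or None if never.

    At each step the subvolume (a sphere around the stimulus, by assumption)
    is checked for words; if none remain, the stimulus must be a non-word.
    """
    radius = start_radius
    for step in range(1, max_steps + 1):
        contains_word = any(dist(point, stimulus_point) <= radius
                            for point in lexicon.values())
        if not contains_word:
            return step
        radius *= shrink  # further analysis reduces residual uncertainty
    return None  # a word stayed inside the subvolume throughout


# Invented coordinates for two words and two non-words.
lexicon = {"BUTTER": (0.0, 0.0), "BETTER": (1.0, 0.0)}
similar_nonword = steps_to_reject(lexicon, (0.2, 0.0))
dissimilar_nonword = steps_to_reject(lexicon, (6.0, 6.0))
real_word = steps_to_reject(lexicon, (0.0, 0.0))
```

In the sketch the dissimilar non-word is rejected while the subvolume is still large, the similar non-word only after many further reductions, and a real word never produces a 'No' response.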

Conclusions

In the most general terms, the model of word recognition described here bears a superficial resemblance to the logogen model. Both models are criterion bias systems in which perceptual information can be traded off against contextual information. However, unlike the logogen model, the checking model does not require a dedicated feature counter for each word in the lexicon. All that the checking model requires is a system which can identify a set of words, each of which is roughly consistent with the perceptual analysis of the input, and tag each word with an index of the goodness of fit between the analysis and the word's perceptual specification in the lexicon. Quite clearly the system does need some means of keeping track of the perceptual and contextual evidence in favour of each candidate, but the candidate set should be small enough for this to be done in a temporary store. There is no need to keep track of this information for all words in the language in the way that the logogen model does. This results in a great simplification of the early stages of word recognition. The lexicon itself is simply an information store from which lexical information can be accessed once the perceptual specifications of a word have been determined; it contains no feature counters, or active devices such as the lexical elements in the cohort model. One consequence of adopting this approach is that there are no longer any problems associated with increases in vocabulary. Acquiring a new word is simply a question of adding more information to the store. There is no need either to create a new logogen or to assume that we are endowed at birth with all the logogens we might ever require. Also, there are no longer difficulties associated with the recognition of non-words. Non-words will be recognized by the same perceptual mechanism as required for words.

The most significant feature of the checking model which distinguishes it from the logogen model is that it is a bottom-up system. In the logogen model high level contextual information must be fed into the logogen system to produce priming. A number of authors have made strong claims that spoken word recognition (Marslen-Wilson, 1975) and visual word recognition are highly interactive processes. It seems that the mere fact that context does influence word recognition is taken as evidence for an interaction between processes. However, the principles employed in the checking model provide a clear demonstration of how context can influence word recognition within a model with a completely bottom-up flow of information. This is achieved by having the lexicon continuously produce multiple analyses of the input which are output to subsequent stages. If the lexicon waited until it could produce an unambiguous analysis of the input then a given word would always take the same time to identify regardless of the context. By producing an output which is ambiguous the lexicon provides subsequent stages with the opportunity to accept an incomplete analysis when it is found to be very probable in the context. This allows higher level processes to influence the point at which sufficient evidence has accumulated for a word to be recognized. But note that there is no single process which is solely responsible for word recognition. Although, in the absence of context, unique identification of words will involve only perceptual analysis and lexical access, word recognition will also involve higher level processes when there are contextual constraints. However, no higher level process will have any influence on either perceptual analysis or lexical access. These stages are completely autonomous. So although the checking model utilizes both perceptual and contextual information in determining the identity of a word, perceptual and contextual processing operations are completely independent.
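The bottom-up flow of information can be sketched as two stages, the first of which receives no contextual input at all. The function names, the evidence values, the criterion of 1.0 and the size of the contextual adjustment are hypothetical and serve only to make the organization concrete.

```python
def lexical_analyses(evidence_by_step):
    """Bottom-up stage: yield successive (step, candidate evidence) outputs.

    The stage takes no context argument, so later stages cannot alter it.
    """
    for step, candidates in enumerate(evidence_by_step, start=1):
        yield step, dict(candidates)


def recognise(analyses, context_fit, criterion=1.0):
    """Later stage: accept a candidate once its evidence meets its criterion.

    A good contextual fit lowers the criterion for that candidate only.
    """
    for step, candidates in analyses:
        margins = {word: evidence - (criterion - context_fit.get(word, 0.0))
                   for word, evidence in candidates.items()}
        best = max(margins, key=margins.get)
        if margins[best] >= 0:
            return best, step
    return None, None


# Assumed perceptual evidence at three successive points in the analysis.
steps = [{"BUTTER": 0.5, "BETTER": 0.5},
         {"BUTTER": 0.75, "BETTER": 0.5},
         {"BUTTER": 1.0, "BETTER": 0.25}]

no_context = recognise(lexical_analyses(steps), context_fit={})
with_context = recognise(lexical_analyses(steps), context_fit={"BUTTER": 0.25})
```

The output of the first stage is identical in the two runs; only the point at which the later stage accepts the still ambiguous analysis differs, so the word is recognized one step earlier when the context favours it.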

By virtue of its organization as a data-driven system, the checking model overcomes the problem faced by other models of word recognition of priming or predicting a set of words which have a high probability of occurrence in a particular context. This poses a particular problem for other models in attempting to give an account of the effects of sentential context. By restricting itself to the determination of relationships between context and words for which there is some perceptual evidence, the checking model avoids expending computational effort in priming a set of contextually probable words. Although a small proportion of words in any discourse may be so constrained by context that they can readily be predicted with a minimum of effort, the checking model affords an economical means of utilising context to facilitate the recognition of words even where the context does not permit any specific predictions to be made with a high degree of accuracy (although such weak context might only be effective when stimulus degradation provides extra time for the checking process to operate).

The second important feature of the checking model is that it is a criterion bias system. Recent criticisms of criterion bias models were shown to be based on a number of false assumptions. It was shown that criterion bias models can use contextual information so as to reduce stimulus uncertainty. This means that under normal circumstances, criterion bias models will predict that context leads to a change in d' as well as a change in response criterion. In contrast, models such as Forster's search model and Becker's verification model incorrectly predict that context should only influence d'.

As the checking model operates by determining the relation between lexical candidates and the context, it is able to account for the effect of both sentential and associative context by the same mechanism. The difference between the two forms of context lies largely with the fact that the evaluation of associative relations between a word and related words will typically only involve determining the strength of an existing association between the two in memory. In the case of sentential context the relation will not generally be specified directly in memory, and some computation may be required in order to evaluate the relation.

A further important characteristic of the checking model is that the effectiveness of context in facilitating recognition depends not only on the degree to which context constrains the identity of a word, but also on the speed with which the relation between context and the word can be determined because, if the checking process is to influence word identification, it must be completed before the word can be uniquely identified on the basis of perceptual information alone. This property of the model leads to at least one prediction which is the opposite of what would be expected from the logogen model. In the other models of word recognition, the lexical entries for contextually related words have in some sense been primed or activated immediately the context has been processed. In the present model some initial analysis must take place in order to derive a set of candidate entries which may be checked against the context. As a consequence, there should be some early stage in processing where a set of candidates has been produced but insufficient time has been available in which to check them against the context. Thus, if subjects were required to perform, say, a lexical decision task based on the information available at this point, their decision should be based solely on the perceptual analysis and should therefore be unaffected by context. The decision would, of course, tend to be very inaccurate as it would be based on a very superficial analysis of the input. As subjects waited longer before responding their responses should become more accurate and the increased time available for the checking operation should increase the effects of context. Other models of contextual priming should predict that effects of context should be apparent from the earliest stage in processing, since the events responsible for priming occur as an immediate consequence of processing the context. In the logogen model the context would increment the counts of words related to the context before the decision word had been presented. If subjects had to make a very fast response, the main factor determining the activation level of the logogens at an early point in processing would be the contextual priming. Context should therefore have a stronger rather than a weaker effect on speeded word recognition. However, in the speed-accuracy trade-off experiment by Antos it was found that, as the checking model predicts, the context effect decreased with faster responding.
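The contrast between the two accounts of the time course can be made concrete with a deliberately simplified sketch. Linear growth of perceptual evidence, the size of the contextual bias and the time needed to complete the check are all assumed values, and the priming function is a generic stand-in rather than an implementation of the logogen model.

```python
def checking_margin(t, related, rate=1.0, criterion=1.0,
                    bias=0.25, check_time=0.5):
    """Evidence minus criterion at time t under the checking account."""
    # The check must be complete before context can lower the criterion.
    checked = related and t >= check_time
    return rate * t - (criterion - (bias if checked else 0.0))


def priming_margin(t, related, rate=1.0, criterion=1.0, bias=0.25):
    """Same quantity under a priming account: context acts before the target."""
    return rate * t + (bias if related else 0.0) - criterion


def context_effect(margin_fn, t):
    """Advantage of a related over an unrelated context at time t."""
    return margin_fn(t, related=True) - margin_fn(t, related=False)


early, late = 0.25, 0.75  # hypothetical fast and slow response deadlines
checking_early = context_effect(checking_margin, early)
checking_late = context_effect(checking_margin, late)
priming_early = context_effect(priming_margin, early)
```

Under these assumptions the checking account shows no context effect at the early deadline and a full effect at the late one, whereas the priming account shows an effect from the outset.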

Further support for the checking model comes from work on the relation between word frequency, context and stimulus quality. Superficially, the account of the word frequency effect given by the checking model is very similar to that of the logogen model. It is assumed that there is a lower response criterion for high frequency words than for low. However, despite the fact that the checking model treats both frequency and context in essentially the same manner as the logogen model, the details of the models give rise to two quite different sets of predictions as to the relation between context, frequency and stimulus quality. Whereas the logogen model predicts that context and frequency should be additive in their effects, the checking model correctly predicts an interaction between these factors. As low frequency words take longer to identify, more time is available for the operation of the checking processes. This means that there is a greater potential for context to influence recognition of low frequency words. A further difference between the two models concerns the interactions of context and frequency with stimulus quality. Whereas the interactions between context and stimulus quality and between frequency and stimulus quality should be comparable according to the logogen model, the checking model correctly predicts that the interaction of context and stimulus quality should be greater than the interaction of stimulus quality with frequency. Both Becker's verification model and Forster's search model wrongly predict that frequency and degradation should be additive in their effects.

However, the ability of current instantiations of various models to handle all of the existing data is perhaps not the only criterion against which we should evaluate the merits of competing theories. To take just one example; no current theory explains the unusual pattern of facilitation observed by Koriat (1981). Koriat found that backward associations produced facilitation in the early part of an experiment, while only forward associations produced facilitation in later stages. This result presents a problem for all current theories, but can hardly be considered to justify abandoning all and starting anew. Each theory embodies useful and interesting theoretical properties, many of which will be employed in any new theory to emerge. The strength of the current theory is in its novel combination of theoretical properties, in its rejection of lexical priming and top-down processing, and in the claim that a particular account of ambiguity resolution can not only account for the data on context effects, but can also explain why the system should behave in the way it does.

References

Antos, S.J. (1979). Processing facilitation in a lexical decision task. Journal of Experimental Psychology: Human Perception and Performance, 5, 527-545.

Becker, C.A. (1976). Semantic context and word frequency effects in visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 2, 556-566.

Becker, C.A. (1979). Semantic context and word frequency effects in visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 5, 252-259.

Becker, C.A. (1980). Semantic context effects in visual word recognition: An analysis of semantic strategies. Memory and Cognition, 8, 493-511.

Becker, C.A., & Killion, T.H. (1977). Interaction of visual and cognitive effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 3, 389-401.

Coltheart, M., Davelaar, E., Jonasson, J.T., & Besner, D. (1976). Access to the internal lexicon. In S. Dornic (Ed.), Attention and Performance VI. Hillsdale, NJ: Erlbaum.

Fischler, I., & Bloom, P.A. (1979). Automatic and attentional processes in the effects of sentence contexts on word recognition. Journal of Verbal Learning and Verbal Memory, 18, 1-20.

Fischler, I., & Bloom, P.A. (1980). Rapid processing of the meaning of sentences. Memory and Cognition, 8, 216-225.

Forster, K.I. (1976). Accessing the mental lexicon. In R.J. Wales & E.C.T. Walker (Eds.), New approaches to language mechanisms. North-Holland: Amsterdam.

Forster, K.I. (1979). Levels of processing and the structure of the language processor. In W.E. Cooper & E.C.T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merril Garrett. Cambridge, MA: MIT Press.

Forster, K.I. (1981a). Frequency blocking and lexical access: One mental lexicon or two? Journal of Verbal Learning and Verbal Behaviour, 20, 190-203.

Forster, K.I. (1981b). Priming and the effects of sentence and lexical contexts on naming time: Evidence for autonomous lexical processing. Quarterly Journal of Experimental Psychology, 33A, 465-496.

Forster, K.I., & Chambers, S.M. (1973). Lexical access and naming time. Journal of Verbal Learning and Verbal Behaviour, 12, 627-635.


Foss, D.J. (1982). A discourse on semantic priming. Cognitive Psychology, 14, 590-607.

Foss, D.J., & Blank, M.A. (1980). Identifying the speech code. Cognitive Psychology, 12, 1-31.

Foss, D.J., & Gernsbacher, M.A. (1983). Cracking the dual code: Toward a unitary model of phoneme identification. Journal of Verbal Learning and Verbal Behaviour, 22, 609-632.

Foss, D.J., & Ross, J.R. (1978). Great expectation: Context effects during sentence processing. Paper presented at the 12th International Congress of Psychology, July 1978.

Foss, D.J., & Swinney, D.A. (1973). On the psychological reality of the phoneme: Perception, identification and consciousness. Journal of Verbal Learning and Verbal Memory, 12, 246-257.

Frederikson, J.R., & Kroll, J.F. (1976). Spelling and sound: Approaches to the internal lexicon. Journal of Experimental Psychology: Human Perception and Performance, 2, 361-379.

Glanzer, M., & Ehrenreich, S.L. (1979). Structure and search of the mental lexicon. Journal of Verbal Learning and Verbal Behaviour, 18, 381-398.

Gordon, B. (1983). Lexical access and lexical decision: Mechanisms of frequency sensitivity. Journal of Verbal Learning and Verbal Behaviour, 21, 24-44.

Koriat, A. (1981). Semantic facilitation in lexical decision as a function of prime-target association. Memory and Cognition, 9, 587-598.

Marslen-Wilson, W.D. (1975). Sentence perception as an interactive parallel process. Science, 189, 226-228.

Marslen-Wilson, W.D., & Tyler, L.K. (1980). The temporal structure of spoken language understanding: The perception of words in sentences. Cognition, 8, 1-71.

Marslen-Wilson, W.D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognition, 10, 29-63.

McClelland, J.L. (1979). On the time relations of mental processes: An examination of systems of processes in cascade. Psychological Review, 86, 287-330.

Meyer, D.E., & Schvaneveldt, R.W. (1971). Facilitation in recognising pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227-334.

Meyer, D.E., Schvaneveldt, R.W., & Ruddy, M.G. (1975). Loci of contextual effects on word recognition. In P.M.A. Rabbitt & S. Dornic (Eds.), Attention and Performance V. Academic Press: London.

Morton, J. (1969). Interaction of information in word recognition. Psychological Review, 76, 165-178.

Morton, J. (1979). Word recognition. In J. Morton and J.C. Marshall (Eds.), Psycholinguistics, Series 2: Structures and processes. London: Paul Elek.

Morton, J., & Long, J. (1976). Effect of word transition probability on phoneme identification. Journal of Verbal Learning and Verbal Memory, 15, 43-51.

Neely, J.H. (1976). Semantic priming and retrieval from lexical memory: Evidence for facilitatory and inhibitory processes. Memory and Cognition, 4, 648-654.

Neely, J.H. (1977). Semantic priming and retrieval from lexical memory: Roles of inhibitionless spreading activation and limited capacity attention. Journal of Experimental Psychology, 106, 226-254.

Norris, D.G. (1980). Serial and interactive models of comprehension. Unpublished D.Phil. Thesis, University of Sussex.

Norris, D.G. (1982). Autonomous processes in comprehension: A reply to Marslen-Wilson and Tyler. Cognition, 11, 97-101.

Norris, D.G. (1984a). The effects of frequency, repetition and stimulus quality in visual word recognition. Quarterly Journal of Experimental Psychology, 36A, 507-518.

Norris, D.G. (1984b). The mispriming effect: Evidence of an orthographic check in the lexical decision task. Memory and Cognition, 12, (I), 470-476.

O'Connor, R.E., & Forster, K.I. (1981). Criterion bias and search sequence bias in word recognition. Memory and Cognition, 9, 78-92.

Posner, M.I., & Snyder, C.R.R. (1975). Facilitation and inhibition in the processing of signals. In P.M.A. Rabbitt & S. Dornic (Eds.), Attention and Performance V. Academic Press: London.

Richardson, J.T.E. (1976). The effects of stimulus attributes upon latency of word recognition. British Journal of Psychology, 67, 315-325.


Rubenstein, H., Garfield, L., & Millikan, J.A. (1970). Homographic entries in the internal lexicon. Journal of Verbal Learning and Verbal Behaviour, 9, 487-494.

Rumelhart, D.E. (1978). Toward an interactive model of reading. In S. Dornic (Ed.), Attention and Performance VI. Hillsdale, NJ: Erlbaum.

Sanford, A.J., Garrod, S., & Boyle, J.M. (1977). An independence of mechanism in the origins of reading and classification-related semantic distance effects. Memory and Cognition, 5, 214-220.

Scarborough, D.H., Cortese, C., & Scarborough, H.S. (1977). Frequency and repetition effects in lexical memory. Journal of Experimental Psychology: Human Perception and Performance, 3, 1-17.

Schuberth, R.E., Spoehr, K.T., & Lane, D.M. (1981). Effects of stimulus and contextual information on the lexical decision process. Memory and Cognition, 9, 68-77.

Schuberth, R.E., & Eimas, P.D. (1977). Effects of context on the classification of words and non-words. Journal of Experimental Psychology: Human Perception and Performance, 3, 27-36.

Schvaneveldt, R.W., & McDonald, J.E. (1981). Semantic context and the encoding of words: Evidence for two modes of stimulus analysis. Journal of Experimental Psychology: Human Perception and Performance, 7, 673-687.

Shiffrin, R.M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 85, 127-190.

Solomon, R.L., & Howes, D.H. (1951). Word frequency, personal values and visual duration thresholds. Psychological Review, 58, 256-270.

Stanners, R.F., Jastrzembski, J.E., & Westbrook, A. (1975). Frequency and visual quality in a word-nonword classification task. Journal of Verbal Memory, 14, 259-264.

Stanovich, K.E. (1980). Toward an interactive-compensatory model of individual differences in the development of fluent reading. Reading Research Quarterly, 16, 32-71.

Stanovich, K.E., & West, R.F. (1979). Mechanisms of sentence context effects in reading: Automatic activation and conscious attention. Memory and Cognition, 7, 77-85.

Stanovich, K.E., & West, R.F. (1981). The effect of sentence context on ongoing word recognition: Tests of a two process theory. Journal of Experimental Psychology: Human Perception and Performance, 7, 658-672.

Stanovich, K.E., & West, R.F. (1983a). The generalizability of context effects on word recognition: A reconsideration of the roles of parafoveal priming and sentence context. Memory and Cognition, 11, 49-58.

Stanovich, K.E., & West, R.F. (1983b). On priming by a sentence context. Journal of Experimental Psychology: General, 112, 1-36.

Treisman, M. (1978). A theory of the identification of complex stimuli with an application to word recognition. Psychological Review, 85, (6), 525-570.

Tulving, E., & Gold, C. (1963). Stimulus information and contextual information as determinants of tachistoscopic recognition of words. Journal of Experimental Psychology, 66, 319-327.

West, R.F., & Stanovich, K.E. (1982). Source of inhibition in experiments on the effect of sentence context on word recognition. Journal of Experimental Psychology: Learning, Memory and Cognition, 8, 385-399.

Résumé

The authors present a model of the effects of context and frequency on word recognition. A checking mechanism operating after lexical access makes it possible to explain priming and inhibition effects without appeal to lexical facilitation mechanisms or to attentional mechanisms. This mechanism, which also plays an essential role in the resolution of lexical and syntactic ambiguities, has the effect of modifying the recognition criteria for the lexical entries supplied by perceptual analysis. The properties of the model are analysed, and the authors show that it adequately explains the influence of context on the speed and accuracy of word recognition, as measured in reaction time tasks.
