The Pros and Cons of Masked Priming

Kenneth I. Forster

University of Arizona

To be published in a special edition of the Journal of Psycholinguistic Research.


     Masked priming paradigms offer the promise of tapping automatic, strategy-free lexical processing, as evidenced by the lack of expectancy disconfirmation effects, and proportionality effects in semantic priming experiments. But several recent findings suggest the effects may be prelexical. These findings concern nonword priming effects in lexical decision and naming, the effects of mixed case presentation on nonword priming, and the dependence of priming on the nature of the distractors in lexical decision, suggesting possible strategy effects. The theory underlying each of these effects is discussed, and alternative explanations are developed that do not preclude a lexical basis for masked priming effects.

      In some quarters, there is concern that one of the standard procedures for studying lexical processing may be coming unstuck. This procedure is, or ought to be, very simple. It involves comparing measures of processing speed for two random samples of words drawn from populations that differ on some potentially relevant treatment variable, such as the number of meanings, or the number of neighbors. What threatens this procedure is the widely recognized difficulty of matching the two samples on all other variables that may possibly affect processing speed (e.g., see Cutler, 1981), with the result that there are many examples in the literature where the debate is mainly concentrated on possible confounds rather than the nature of lexical processing itself (e.g., Andrews, 1992; Balota & Chumbley, 1984; Carroll & White, 1973; Gernsbacher, 1984; Grainger, 1992; Monsell, Doyle, & Haggard, 1989). This state of affairs would not be so serious if most investigators were able to obtain the same results most of the time. We would then at least know what the facts were, although there may be disagreement about how the facts should be interpreted. But it is becoming increasingly clear that we do not always have a strong grasp of the facts. Replication failures with different sets of items are almost the norm. To some extent, statistical procedures can provide assistance. For example, requiring a significant effect in an item analysis helps protect against random matching errors, but this is useless against systematic or partial confounds.

    Priming techniques offer a powerful alternative procedure, since we can then hold the target materials constant, and instead vary the type of prime. Of course, this changes the type of question that can be asked. Instead of asking whether words of type X are accessed faster than words of type Y, we have to ask whether primes of type X are more effective than primes of type Y with targets of type Z. Confounds still exist, because it is still necessary to decide what properties of the prime were responsible for any variations in priming, but at least these are not confounds with the major dependent variable. However, the problem still remains of deciding whether the changes in processing time have anything to do with the lexical processor itself. As an example, take the case of long-term repetition priming. There is some doubt about whether repetition priming effects in lexical decision are the result of repeated access of the same lexical entry, or whether they instead reflect the influence of a memory trace of the original priming episode on the decision process itself (e.g., Feustel, Shiffrin & Salasoo, 1983; Forster & Davis, 1984). Indeed, many investigators in the implicit memory area assume an episodic interpretation as a matter of course (e.g., Jacoby, 1983).

     In recent years, interest has developed in a new technique in the area of visual word recognition, the masked priming paradigm (e.g., Forster & Davis, 1984; Grainger & Segui, 1990; Lukatela & Turvey, 1996; Perea & Gotor, 1997; Rajaram & Neely, 1992; Sereno, 1991). In this case, the prime is presented visually with a very short prime-target interval (50-60 ms), and is both forward and backward masked, the result being that for most subjects, the prime is not visible. However, this does not appear to diminish the prime's effectiveness. Indeed, quite the reverse, since masked primes can be more effective than visible primes (Forster & Veres, in press; Humphreys, Evett, Quinlan, & Besner, 1987). The reason that this procedure has promise is that it offers a method of studying processing effects that might be free of extra-linguistic influences. The hope is that we can study the operation of the lexical processor isolated from inputs from other parts of the brain, most notably the frontal lobes. If subjects do not realize that there is a prime, they are unlikely to consciously take it into account when making a decision about the target stimulus. Nor are they likely to be influenced by an episodic trace of the prime, since it seems unlikely that any such memory trace would be formed without awareness of the prime. Even if such a thing were possible, it seems doubtful whether such a trace could be established, and then utilised effectively within such a short period of time (60 ms). Another reason for thinking that masked priming might be a useful tool is that some effects only become apparent when the prime is masked. For example, if the prime is orthographically related to the target (e.g., hideous-HIDEOUT), there is no benefit at all to the target if the prime is clearly visible (Colombo, 1986; Humphreys, Evett, Quinlan, & Besner, 1987; Martin & Jensen, 1988). However, when masked, such primes can produce quite robust facilitatory effects (Forster, Davis, Schoknecht & Carter, 1987; Forster & Veres, in press).

    The measure of this paradigm's effectiveness will be the extent to which different investigators are able to obtain the same effects, despite changes in items and subjects, not to mention languages. However, there are a number of experimental findings coming to light suggesting that our hopes for masked priming might be unfounded, either because the findings suggest that masked priming is insensitive to lexical operations, or because it is clearly subject to strategic influences. The purpose of this paper is to consider some of these findings, and what they might mean.

     But first, it might be useful to examine some evidence that masking the prime does indeed eliminate unwanted "extra-lexical" effects. This evidence concerns an expectancy effect in priming. In this experiment, the prime is an incomplete word, such as colos, and the target is the completion, COLOSSAL. The task is lexical decision on the target word. Under normal presentation conditions (i.e., the prime is visible), the incomplete word appears to set up a strong expectancy for the completion, which would normally enhance the priming effect. The interesting question is whether the same thing happens when the prime is masked.

Masked Primes and the Expectancy Effect.

     As shown by Posner and Snyder (1975), if the prime leads to a strong expectancy for the target, it leads to a facilitation of expected targets, and an inhibition of unexpected targets (see also Neely, 1991, for discussion). In this experiment, we contrast two types of primes. The first is an incomplete prime, which consists of the first five letters of an eight-letter word, and the target is the completed word, e.g., colos-COLOSSAL, playm-PLAYMATE. As these examples illustrate, the target is highly predictable from the prime. The second type of prime is in fact the same word as the target, although in lower case, e.g., playmate-PLAYMATE. This is termed a complete prime. Based on intuitive experience, and comments from subjects, the incomplete prime appears to elicit a strong tendency to produce the completion, and a corresponding expectancy for that completion. The complete prime, on the other hand, does not appear to elicit any special expectation. If this is correct, then according to the Posner-Snyder theory, a stronger priming effect should be observed with incomplete primes than with complete primes, due to the inhibitory effect in the incomplete condition when the target is a completely unrelated word. Such an effect would, according to the Posner-Snyder analysis, depend on strategic, conscious processes. Accordingly, if masking the prime eliminates such effects, then a different pattern should be observed with masked primes.

     The design of the items used in the experiment is shown in Table 1.


Table 1. Design of the Expectancy experiment. Primes were either visible (1000 ms SOA), or masked (60 ms SOA).

                          Word Target              Nonword Target
Condition              Prime       Target        Prime       Target

Complete Prime         industry    INDUSTRY
Control                sapphire    INDUSTRY      abdicate    SHENTACE

Incomplete Prime       indus       INDUSTRY
Control                sapph       INDUSTRY      uphea       SHENTACE

    Two presentation conditions were used. In the first, the prime was presented in lower case letters for 500 ms, followed by a 500 ms presentation of a warning signal (a pair of angle brackets), which was then followed by the target item in upper case letters, also presented for 500 ms. Thus a typical sequence was as follows: colos (500 ms), < > (500 ms), < COLOSSAL > (500 ms). The task was to decide whether the target item (the item in upper case letters) was a word or not. The subject was aware of the prime, and had ample time in which to deploy a strategy to handle the target. In the second condition, the prime was presented in a format similar to that used in previous masked priming experiments (Forster & Davis, 1984; Forster et al., 1987). The prime was preceded by a forward masking stimulus consisting of a row of 8 hash-marks (duration 500 ms). This was immediately followed by the prime (duration 60 ms), which was in turn immediately followed by the target (duration 500 ms). All stimuli were centered on the viewing screen. Under these conditions, subjects can usually tell that something preceded the target word, but they are quite unable to identify what it was (Forster et al., 1987).
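
     The two presentation sequences can be written out as event lists, which makes the prime-target SOA in each condition explicit. This is only an illustrative sketch: the names VISIBLE_TRIAL, MASKED_TRIAL and soa_ms are hypothetical, and the durations are simply those given in the description above.

```python
# Each trial is a list of (event, duration_ms) pairs in presentation order.
# Durations come from the procedure described in the text; names are illustrative.
VISIBLE_TRIAL = [("prime", 500), ("warning", 500), ("target", 500)]
MASKED_TRIAL = [("forward_mask", 500), ("prime", 60), ("target", 500)]


def soa_ms(trial):
    """Return the prime-target stimulus onset asynchrony in ms."""
    onsets = {}
    clock = 0
    for event, duration in trial:
        onsets[event] = clock  # each event starts when the previous one ends
        clock += duration
    return onsets["target"] - onsets["prime"]


print(soa_ms(VISIBLE_TRIAL))  # 1000
print(soa_ms(MASKED_TRIAL))   # 60
```

     The computed values agree with the SOAs listed for the visible and masked conditions in the table caption.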

     A total of 128 eight-letter words were selected at random. Half of these were used as targets in the non-masked experiment, and half were used in the masked experiment. These item sets were further subdivided, so that half were used as targets in the complete condition, half in the incomplete condition, making a total of 16 items per condition. Within each experiment, two sets of items were prepared. A given word target that appeared in either the complete or the incomplete condition in the first set appeared in the second set in the appropriate control condition. Thus targets are counterbalanced across the prime-control condition, but not across the complete-incomplete condition. In the complete condition, the prime was the same word as the target. In the control for this condition, the prime was a randomly chosen 8-letter word. The same was true in the incomplete condition and its control, except that only the first five letters of the prime were presented. In addition, a further set of 128 legal nonwords were constructed that bore no special relationship to words. Because the primes were clearly visible in the non-masked condition, the primes for nonword targets could not be complete or incomplete nonwords (otherwise the subject could determine the correct response from the prime). Instead, they were either complete or incomplete words. However, these primes were quite unrelated in form to the nonword targets, e.g., abdicate-SHENTACE, uphea-FLONTINE.
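
     The item counts and the counterbalancing scheme can be checked with a short sketch. It rests on one reading of the design, namely that "16 items per condition" refers to each prime or control cell within a single list; the function build_lists and the placeholder item names are hypothetical.

```python
def build_lists(targets):
    """Build two counterbalanced lists from one set of targets.

    In list 1 the first half of the targets receives related primes and the
    second half receives control primes; list 2 reverses the assignment.
    """
    half = len(targets) // 2
    first, second = targets[:half], targets[half:]
    list1 = [(t, "prime") for t in first] + [(t, "control") for t in second]
    list2 = [(t, "control") for t in first] + [(t, "prime") for t in second]
    return list1, list2


# Counts implied by the text: 128 words -> 2 experiments -> 2 prime types.
words_per_experiment = 128 // 2
words_per_prime_type = words_per_experiment // 2
targets = [f"WORD{i:02d}" for i in range(words_per_prime_type)]  # placeholders
list1, list2 = build_lists(targets)
per_cell = sum(1 for _, condition in list1 if condition == "prime")
print(words_per_experiment, words_per_prime_type, per_cell)  # 64 32 16
```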

     A total of six highly practised subjects was used, with each subject being tested in both the masked and non-masked experiments.

    After discarding the RTs for error responses, and applying cutoffs two standard deviation units above and below the mean RT for each individual subject, the mean lexical decision times and error rates in each condition were as shown in Table 2.
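
     The trimming procedure can be sketched as follows. Two details are assumptions rather than facts from the text: outlying RTs are dropped here (they could equally have been replaced by the cutoff value), and the mean and standard deviation are computed over all of a subject's correct responses. The function name trim_rts and the data are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(rts, correct, n_sd=2.0):
    """Return one subject's correct-response RTs lying within mean +/- n_sd SDs."""
    kept = [rt for rt, ok in zip(rts, correct) if ok]  # discard error trials
    if len(kept) < 2:
        return kept  # too few observations to estimate a standard deviation
    m, sd = mean(kept), stdev(kept)
    low, high = m - n_sd * sd, m + n_sd * sd
    # Assumption: RTs outside the cutoffs are dropped, not replaced.
    return [rt for rt in kept if low <= rt <= high]


# Made-up RTs (ms) for one subject; the 1500 ms trial is a deliberate outlier
# and the 400 ms trial is marked as an error response.
rts = [410, 395, 430, 405, 1500, 420, 400, 415, 390]
correct = [True, True, True, True, True, True, False, True, True]
print(trim_rts(rts, correct))
```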


Table 2. Mean lexical decision times (in msec) and percent error rates (in parentheses) for word targets preceded by complete (identity) primes or incomplete primes (first 5 letters) as a function of masking of the prime.

Condition      Example               Non-masked Prime     Masked Prime

Complete       colossal-COLOSSAL     371   (2.1)          404   (1.6)
Control        sapphire-COLOSSAL     460  (20.3)          488  (17.7)
Priming                               89  (18.2)           84  (16.1)

Incomplete     colos-COLOSSAL        390   (0.5)          444   (5.2)
Control        sapph-COLOSSAL        518  (34.9)          485  (12.5)
Priming                              127  (34.4)           41   (7.0)

    The first feature to note is that in the non-masked condition, there is indeed an enhancement of the priming effect with incomplete primes (127 ms) compared with complete primes (89 ms). The overall priming effect for decision times was highly significant, minF'(1,13) = 72.1, p <.01, and there was a significant interaction with the completeness of the prime, minF'(1,14) = 3.52, p <.10. Exactly the same was true of the error rates. When the prime was masked, however, the interaction went in the reverse direction, with incomplete primes producing a smaller effect (41 ms) than complete primes (84 ms). Again, the overall priming effect for decision times was significant, minF'(1,33) = 66.19, p <.01, as was the interaction with completeness of the prime, minF'(1,39) = 6.46, p <.05. Once again, the same was true for errors.
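
     The priming effects and the direction of the interaction can be recomputed from the cell means in Table 2. The sketch below is purely arithmetic, and the names MEAN_RT and priming are illustrative. Because the cell means are rounded, a recomputed effect can differ by a millisecond from the reported one (the non-masked incomplete effect comes out as 128 ms here, against the reported 127 ms).

```python
# Cell means (ms) from Table 2, stored as (related prime, control prime).
MEAN_RT = {
    ("non-masked", "complete"): (371, 460),
    ("non-masked", "incomplete"): (390, 518),
    ("masked", "complete"): (404, 488),
    ("masked", "incomplete"): (444, 485),
}


def priming(presentation, prime_type):
    """Priming effect in ms: control RT minus related-prime RT."""
    related, control = MEAN_RT[(presentation, prime_type)]
    return control - related


for presentation in ("non-masked", "masked"):
    complete = priming(presentation, "complete")
    incomplete = priming(presentation, "incomplete")
    # A positive difference means incomplete primes were the more effective.
    print(presentation, complete, incomplete, incomplete - complete)
```

     The sign of the last column reverses between the two presentation conditions, which is the crossover described in the text.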

     From a pure priming viewpoint, the most remarkable aspect of these results is that in the non-masked situation, incomplete primes were more effective than complete primes. Normally, identity priming produces the strongest effects, and since incomplete primes only partially match the target, one would expect to find weaker priming with incomplete primes, or at least no difference. But if strong expectancies were elicited by the incomplete primes, then the priming effect would be enhanced by the inhibitory component produced by a disconfirmed expectancy. Post-experimental debriefing confirmed that in the non-masked incomplete prime condition, subjects felt that they were biased to respond "No" when the target was not the completion of the prime. This analysis is supported by the very high error rate in this condition (34.9%), and by the fact that performance in this condition was much slower than in any of the other control conditions.

     However, this effect appears to be absent in the masked situation. Not only is there a smaller priming effect for incomplete primes than for complete primes (as would be expected), but performance in the control condition is 33 msec faster than in the non-masked condition, and is also much more accurate. So, for incomplete primes, masking reduces the effectiveness of a related prime, but eliminates much of the disruptive effect of an unrelated prime. The most obvious interpretation of this change is that the expectancy effect created by a visible incomplete prime is completely absent when the subjects are unaware of the prime. Note that it is not the case that masking generally reduces the priming effect, since in the complete prime condition, masked and non-masked primes produced comparable effects.

     Strictly speaking, we are not entitled to assume that awareness of the prime is the critical factor. It might instead be the case that the change in performance is due simply to the fact there was insufficient time for the expectancy to be formed. From the point of view of studying form-priming, it is not critical to decide which of these accounts is correct. Either the lack of awareness of the prime, or the short SOA, has the effect of eliminating an unwanted source of interference. If one wished to make the stronger claim that awareness of the prime was the crucial factor, then it would be necessary to design an experiment in which a longer SOA was used, but the prime was still masked by a stimulus which intervened between the prime and the target. Such an experiment has not yet been carried out.

     Another point to note is that responses in the masked condition tend to be slower than in the non-masked condition (except for the incomplete control condition, where the disconfirmed expectancies are involved). This suggests that the rapidly presented prime has an inhibitory effect on recognition of the target, whether it is related to the target or not. This could be a purely visual effect (e.g., the prime might interfere with the visual processing of the target, as in forward masking), or it could be a higher-level effect, such as the fact that lexical access for the target must begin while access of the prime is still in progress. In any event, it is not surprising that a very rapid sequence of stimuli leads to slower response times.

     A remarkable feature of this experiment is the very substantial effect of the prime on error rates in both masked and non-masked conditions. In the non-masked incomplete condition, the reduction in the error rate in the related condition can be seen as a consequence of the response bias, but in the masked conditions, some other factor must be operative. It should be pointed out that masked priming effects on decision times are not automatically accompanied by corresponding reductions in error rates.

     Based on these results, it seems reasonable to conclude that the masked procedure reveals an underlying form-priming effect, without the added effects generated by conscious expectancies. Another interesting example of this type of effect concerns the proportionality effect in semantic priming. As shown by Tweedy, Lapinski and Schvaneveldt (1977), the size of the priming effect increases as a function of the proportion of related items. The more related items there are, the greater the semantic priming effect. This strongly implies some kind of strategic adaptation to the conditions of the experiment, and is not consistent with the notion of semantic priming as an automatic effect. However, this strategic effect does not apparently occur with masked primes (Perea, Rosa, & Algarabel, 1997). The size of the masked semantic priming effect, although small, is not enhanced in any way by increasing the proportion of related items. Of course, subjects in these experiments were quite unaware of the existence of the prime, and obviously were therefore also unaware of how often a related item occurred, so perhaps it is not surprising that this manipulation did not have any effect. But that is actually the point. By masking the prime, the opportunity for higher processes to influence the response to the target word was eliminated. Indeed, this might explain why the masked effect is so small compared with the effect obtained with visible primes. This strongly suggests that the priming effect has several components, and that the dominant components require conscious awareness of the prime.

    These examples serve to support the claim that masking the prime eliminates some effects, but not others. Whether the eliminated effects are "unwanted" or not depends on one's viewpoint, but whatever position is adopted on this issue, it must be admitted that a comparison of masked and non-masked priming effects is likely to be highly instructive as far as the architecture of the cognitive system is concerned. However, the value of this comparison would diminish if it turned out that the reason for the differences in effects is that masked priming takes place at a shallower processing level than non-masked priming. It is to this issue that we turn next.

Is masked priming prelexical or lexical?

     What would it mean to say that masked priming is a prelexical effect? It would mean that the prime only influences processing that takes place prior to lexical interpretation. It implies that the prime affects only orthographic processing, and has no direct effect on the lexical representation of the target word. That is, priming is restricted to the recognition of the sublexical constituents of a word (e.g., letters, bodies, or syllables). If this were true, it would explain a number of findings. For example, it would explain why masked priming effects are independent of the frequency of the target word (Forster & Davis, 1984). In addition, it provides a simple explanation for the finding mentioned above, namely that hideous primes HIDEOUT only with a very short SOA. It could be that it is only under these conditions that the orthographic processing of the prime can influence the orthographic processing of the target.

    However, it is reasonably certain that this cannot be the whole story. For one thing we have the existence of semantic priming effects with masked primes (Perea & Gotor, 1997; Sereno, 1991). Although the effects are small, there seems to be no doubt that they exist. A stronger indicator of lexical involvement is the existence of masked priming effects between noncognate translation equivalent words in two languages with completely different scripts, such as Hebrew and English (Gollan, Forster & Frost, 1997), or Chinese and English (Jiang, 1997). There is simply no way that primes and targets in these languages could share any sublexical features, either orthographic or phonological. However, obtaining priming under these conditions merely establishes that masked priming can take place at a lexical level, but this does not mean that it always does. What is needed is some kind of diagnostic test that detects the presence (or absence) of sublexical effects in any particular experiment.

     One test for sublexical effects is that they should also occur for nonword targets. So, if the prime is identical to the target (although in a different case), then the sublexical constituents of the target should be recognized faster no matter whether the target is a word or a nonword. In general, the preponderance of evidence suggests that priming effects for nonword targets are very small and difficult to detect. In a quick survey of 40 experiments, I found only three effects that were significant at the 5% level or better (two would be expected by chance). However, in 31 of those experiments, the effect was positive, with a mean value of 8.7 ms, which was significantly different from zero. This suggests that there is a real effect, but it is small in comparison to the effects obtained for words (50-80 ms). It might be suggested that this difference in size of effect could be used as a diagnostic test. If a particular effect is significantly greater than the effect observed for nonwords, then it cannot be explained in sublexical terms. However, a quite reasonable objection to this procedure is that the method of reaching a "No" decision in a lexical decision experiment may obscure the true effect. For example, one argument is that the prime also increases the perceived familiarity of the nonword target, which induces a bias to respond "Yes" although the correct response is "No", and this counteracts the sublexical facilitation effect. Another argument is that if a "No" decision is made only when a deadline is reached, then sublexical priming would go undetected unless the deadline was adjusted to take into account the nature of the prime. That is, in order for a priming effect to be detected, the deadline would have to be shortened if the prime was related to the target. Since the deadline is usually thought of as being controlled by some central decision-making system (relatively close to consciousness), and not by the lexical processor itself, it seems unlikely that the central system would have access to any information about the prime.
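
     The survey figures mentioned above lend themselves to a quick check. The sketch below computes the number of significant results expected by chance at the 5% level, together with an exact one-sided sign test on the 31 positive effects out of 40. The sign test is a substitute chosen here for illustration, and is not the test on the mean effect referred to in the text. It assumes that the experiments are independent, and that a null effect is equally likely to come out positive or negative.

```python
from math import comb

N_EXPERIMENTS = 40
N_POSITIVE = 31
ALPHA = 0.05

# Number of "significant" results expected if there were no true effect.
expected_by_chance = N_EXPERIMENTS * ALPHA


def sign_test_upper_tail(k, n):
    """Exact P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


p_value = sign_test_upper_tail(N_POSITIVE, N_EXPERIMENTS)
print(expected_by_chance)
print(f"{p_value:.5f}")
```

     Under these assumptions the probability of 31 or more positive effects by chance is well below .001, which is consistent with the claim that a small but real effect exists.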

     So demonstrating that a given priming effect is greater for words than for nonwords does not necessarily guarantee that there is a significant lexical component to the effect. Actually, the reasoning here is not entirely straightforward. For example, Sereno (1991) reported equally strong repetition priming effects for word and nonword targets and concluded that the effects were purely sublexical. However, in subsequent experiments she went on to obtain semantic and syntactic priming effects, which must surely be lexical in nature. So here there is a problem. How could there be both a lexical and a sublexical component to priming, and yet nonword targets show the same amount of priming as word targets? This must mean that priming effects are not additive. Adding a lexical effect to an already existing sublexical effect does not increase the amount of priming. This conclusion would have widespread implications, especially for activation models, and hence a conservative approach seems sensible. This involves considering the possibility that Sereno's result is an anomaly.

     There is another aspect of Sereno's result that is puzzling. What happened to the mechanisms that are assumed normally to reduce priming for nonwords, such as the familiarity bias, or the deadline mechanism? Why weren't they operative in this experiment? Obviously, one cannot argue that these mechanisms only operate when there is no priming for nonwords, since this would be entirely circular. This might be another reason to be skeptical about the results. However, Sereno is not the only investigator to obtain such an effect. For example, Perea and Rosa (1997) have reported the same result. What is common to both investigations is that the nonwords were designed to closely resemble a word, and differed from that word by only one letter. Clearly, there is something wrong with the sublexical account if it turns out that priming effects are obtained for nonwords only when they closely resemble words.

    In fact, it has been argued that the entry-opening model of masked priming originally outlined in Forster and Davis (1984) actually predicts priming effects for such nonwords, and that these effects are really lexical in nature (Forster, 1992). This model is illustrated in Figure 1. It assumes a table look-up procedure, in which a fast search process flags lexical entries that closely match the input. As each entry is flagged, a pointer to it is placed on a queue of candidates. In order to evaluate these candidates, each candidate entry must first be "opened" so that information within the entry can be extracted. This opening procedure is analogous to opening a file in a disk operating system, and is assumed to take an appreciable time, but once an entry has been opened, it remains in the open state temporarily. This process begins as soon as an entry is placed on the candidate queue, and operates in parallel, so that the time taken to open an entry is independent of the number of entries ahead of it in the queue. The final process is an evaluation routine that checks to see whether the candidate has the right orthographic properties (i.e., a post-access orthographic check). If a candidate entry survives this test, it is kept open, so that the contents can be made available to other processing systems. However, if it fails this test, the entry is closed down.

Figure 1. The Entry-Opening model of masked priming (based on Forster & Davis, 1984).

   Another way of looking at this model is to consider it as a two-stage filtering operation. The initial filter (the fast search) is fairly coarse, but reasonably rapid. The second filter (the evaluation process) is very fine, and therefore takes considerable time. Applying just the second filter to the entire lexicon would be very slow and demand considerable resources. Instead, it is more efficient to use the coarse filter first to eliminate the weak candidates, and then use the fine filter on what remains.
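
     The two-stage filtering idea can be sketched in a few lines. This is a toy illustration and not an implementation of the model: the match criterion used by the coarse filter (same length, at most one letter different) is an assumption, as are the function names and the five-word lexicon.

```python
def mismatches(a, b):
    """Number of letter positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def fast_search(stimulus, lexicon, max_mismatch=1):
    """Coarse filter: flag entries that closely match the input.

    Assumed criterion: same length and at most max_mismatch letters different.
    """
    return [word for word in lexicon
            if len(word) == len(stimulus)
            and mismatches(word, stimulus) <= max_mismatch]


def evaluate(stimulus, candidates):
    """Fine filter: keep only candidates whose spelling matches exactly."""
    return [word for word in candidates if word == stimulus]


LEXICON = ["fabulous", "hideous", "hideout", "industry", "colossal"]  # toy lexicon

for stimulus in ("hideout", "fagulous", "verolous"):
    queue = fast_search(stimulus, LEXICON)  # candidate queue
    print(stimulus, queue, evaluate(stimulus, queue))
```

     The fine filter is applied only to the handful of candidates that survive the coarse filter, never to the whole lexicon.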

     Priming comes about in this model when the entry for the target word is opened by the prime. This occurs whenever the entry for the target word is a candidate for the analysis of the prime. By the time the target word is actually presented, its entry is already in the process of being opened, and hence the evaluation of this entry begins sooner. Thus priming is seen as a savings effect. However, when the prime and target are non-identical, a further requirement for priming is that the evaluation of the prime should not reach completion, otherwise the entry for the target word would be closed down (since it fails to match the prime), and no savings would occur. The critical assumption of the model is that the masking procedure applied to the prime blocks this final stage of closing down the entries for the unsuccessful candidates. This leads to the prediction that form-priming (i.e., non-identical prime and target) should occur only when the prime is masked, and this appears to be the case when the prime is a word (Forster & Veres, in press).
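
     The savings account can be expressed as a small timing sketch. All numerical values and names are hypothetical (in particular the 100 ms opening time), and the candidate criterion is again assumed to be "same length, at most one letter different". The sketch returns how much of the opening time for the target's entry has already elapsed when the target appears.

```python
OPEN_MS = 100  # hypothetical time needed to open an entry (not from the text)


def head_start_ms(prime, target, masked, soa_ms=60, open_ms=OPEN_MS):
    """Opening time already completed for the target's entry at target onset."""
    prime, target = prime.lower(), target.lower()
    differing = sum(a != b for a, b in zip(prime, target))
    # Assumed match criterion: same length, at most one letter different.
    if len(prime) != len(target) or differing > 1:
        return 0  # the target's entry was never a candidate for the prime
    if differing == 1 and not masked:
        return 0  # evaluation of the prime completed and closed the entry
    return min(soa_ms, open_ms)  # opening began at prime onset


print(head_start_ms("hideous", "HIDEOUT", masked=True))   # 60
print(head_start_ms("hideous", "HIDEOUT", masked=False))  # 0
print(head_start_ms("hideout", "HIDEOUT", masked=True))   # 60
```

     On this sketch a form-related prime yields a head start only when it is masked, whereas an identity prime yields one in either case.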

     Now consider a nonword that is one letter different from a long word that has no neighbors, e.g., fagulous. It is immediately obvious what word this is based on, and for this reason, it is termed a "close" nonword (Forster & Veres, in press). This property has something to do with the length of the underlying word, since with shorter items, the underlying word is not always so clear (e.g., jota, hulan, prinon). Now, it seems that a close nonword such as fagulous could scarcely be rejected in a lexical decision experiment without the entry for fabulous being opened, and a detailed comparison of its orthographic properties being made with the input. This would, of course, lead to a longer rejection time compared with a nonword such as verolous, which does not closely resemble a word. However, if fagulous had been primed by an identity prime, the entry for fabulous would already be open when the target nonword was presented, and hence the contents of the entry would be available sooner. This ought to lead to a faster rejection of the nonword as a satisfactory match, and this in turn should lead to faster rejection times in a lexical decision experiment.

    Such an effect has in fact been obtained (Forster, 1992). Nonword rejection times in a lexical decision experiment were faster for close nonwords such as fagulous when preceded by an identity prime, and also when primed by the actual underlying word (i.e., fabulous). However a prime such as pagulous had no effect. This shows that the effect is not simply due to orthographic overlap between the prime and the target. This prime is also one-letter-different from the target, but is two-letters-different from the underlying word, and hence is less likely to open the entry for that word.
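
     The letter-difference relations among these items are easy to verify mechanically. The helper below (a hypothetical name) counts the positions at which two equal-length strings differ.

```python
def letters_different(a, b):
    """Count the letter positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(letters_different("fagulous", "fabulous"))  # 1: close nonword vs. word
print(letters_different("pagulous", "fagulous"))  # 1: prime vs. target
print(letters_different("pagulous", "fabulous"))  # 2: prime vs. underlying word
print(letters_different("verolous", "fabulous"))  # 4: not a close nonword
```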

     So here is a case where the entry-opening model definitely predicts a priming effect for nonwords. However, there is a catch. The decision would have to be triggered by the post-access checking mechanism in order for priming to be observed, i.e., the decision must be made as soon as the mismatch is detected. This means that the decision is not being controlled by a deadline mechanism. This implication raises still more questions. For example, four-letter nonwords are almost certain to have a word neighbor (unless deliberately designed to be hermits), and therefore we should expect to find identity priming whenever short nonwords are used (e.g., nace-NACE). But this appears not to be the case (e.g., Forster & Davis, 1984). Why should this be? The convenient answer used to be that nonword decisions are timed by a deadline, but that answer will not work now. A more plausible answer is that form-priming effects are absent when the target has many neighbors (Forster et al., 1987; Forster & Taft, 1994). Thus, there is no priming for the pair nace-FACE, which suggests that nace does not open the entry for FACE. So if this is the case, then the prime nace probably does not open the entry for any words at all, and hence there is no way that priming for the target NACE could be obtained.

     This explanation predicts that priming for nonwords could only be obtained if the nonword is one-letter-different from a word with very few neighbors, which is just what fagulous is. However, there is an unsatisfactory aspect to this explanation. Consider the N effect, namely that decision times for nonwords (but not words) increase as a function of the number of neighbors (Coltheart, Davelaar, Jonasson, & Besner, 1977; Forster & Shen, 1996). The most natural way for the entry-opening model to explain the N effect is to propose that the decision is delayed until all candidates have been eliminated. If a nonword closely resembles N different words, then a decision will require the entries for all N words to be opened, and a detailed comparison to be made with each. If these comparisons are carried out serially, then the decision will be delayed by an amount directly proportional to the number of neighbors. However, if we assume that a four-letter nonword is unlikely to open the entries for any words (because they will usually be high-density words), then there ought not to be any N effect for these types of nonwords. However this is clearly incorrect.
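
     The serial-comparison account of the N effect can be sketched as a rejection time that grows linearly with the number of neighbors. The neighbor definition is the usual one (words formed by changing exactly one letter); the lexicon, the base time and the per-candidate cost are hypothetical values chosen only for illustration.

```python
import string


def neighbors(stimulus, lexicon):
    """Words in the lexicon formed by changing exactly one letter of the stimulus."""
    words = set(lexicon)
    found = set()
    for i in range(len(stimulus)):
        for letter in string.ascii_lowercase:
            candidate = stimulus[:i] + letter + stimulus[i + 1:]
            if candidate != stimulus and candidate in words:
                found.add(candidate)
    return sorted(found)


def rejection_time(stimulus, lexicon, base_ms=500, per_candidate_ms=20):
    """Hypothetical "No" latency: each neighbor adds one serial comparison."""
    return base_ms + per_candidate_ms * len(neighbors(stimulus, lexicon))


TOY_LEXICON = ["face", "lace", "race", "pace", "nice", "name", "fabulous"]
print(neighbors("nace", TOY_LEXICON))
print(rejection_time("nace", TOY_LEXICON), rejection_time("fagulous", TOY_LEXICON))
```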

    There is a potential solution to this problem that deserves to be explored further. This solution retains the assumption that the entries for high-density words are not opened by a form-prime, at least not initially. However, if the stimulus is a nonword, then no entry will be found that exactly matches the input stimulus. Although this is a common occurrence in a lexical decision experiment, it is rare in normal reading, where it would usually signify an error of some sort. Accordingly, the lexical system may adapt by relaxing the match criterion slightly, and repeating the look-up procedure, this time generating a number of candidates. Eliminating the candidates generated by this second pass through the lexicon then produces an N effect in a single-word lexical decision experiment. But in a priming experiment, there might be no priming because there has not been sufficient time for the second pass to begin opening entries.
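    The two-pass proposal can be made concrete with a toy sketch in Python. Everything in it is an assumption introduced purely for illustration: the timing constants (PASS_MS, CHECK_MS), the function names, and the simplification that a high-density nonword opens no entries on the first pass while a low-density nonword opens all of its neighbors. It is not a fitted model, only a way of checking that the account can yield an N effect without priming for dense nonwords.

```python
# Toy sketch of the two-pass look-up; all constants are hypothetical.
PASS_MS = 200   # assumed duration of one pass through the lexicon
CHECK_MS = 20   # assumed cost of checking one opened entry


def first_pass(neighbors, high_density):
    """Strict match criterion: entries of high-density words stay closed."""
    return [] if high_density else list(neighbors)


def second_pass(neighbors):
    """Relaxed criterion: every neighbor becomes a candidate."""
    return list(neighbors)


def unprimed_no_latency(neighbors, high_density):
    """Latency of a "No" decision when all candidates are checked serially."""
    opened = first_pass(neighbors, high_density)
    passes = 1
    if not opened:
        # Nothing matched, so the criterion is relaxed and look-up repeated.
        opened = second_pass(neighbors)
        passes = 2
    return passes * PASS_MS + CHECK_MS * len(opened)


def prime_opens_entry(neighbors, high_density):
    """A masked prime is assumed to have time for the first pass only."""
    return len(first_pass(neighbors, high_density)) > 0


# Dense nonword (nace): latency grows with N, but the prime opens nothing.
print(unprimed_no_latency(["face", "lace", "pace"], True))
print(prime_opens_entry(["face", "lace", "pace"], True))
# Sparse nonword (fagulous): its single neighbor is opened by the prime.
print(prime_opens_entry(["fabulous"], False))
```

    In this sketch the latency for a dense nonword increases with the number of neighbors because they are all checked after the second pass, yet the prime never opens an entry for it, which is the dissociation the account requires.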

    Thus we can explain why priming is restricted to low-density nonwords such as fagulous, and at the same time preserve the entry-opening account of the N effect. It is interesting to note that this account also explains why neighborhood density might be relevant to priming of nonwords, as suggested by Masson and Isaak (in press). It seems reasonable to conclude, then, that the existence of priming effects for nonwords can be explained with the same lexical mechanism (entry-opening) that explains priming for words, and does not necessarily imply a sublexical component to priming. However, this conclusion is restricted to nonwords that are constructed by changing one letter of a low-density word. If it were shown that priming can also be obtained for other types of nonwords, then a stronger case for a sublexical component could be made.

     One small problem still remains. As we argued earlier, if subjects waited for a deadline to be reached before responding to a nonword, then no priming for any nonword should be detected. Yet such effects do exist. This means either giving up the deadline model altogether, or postulating that the deadline is adjusted according to the relatedness of the prime and target. Neither alternative is particularly attractive, but the first is perhaps slightly less unattractive. This is partly because it is already clear that the deadline model must be modified to account for N effects. Coltheart et al. (1977) modified the model so that the deadline was extended in proportion to the total amount of activation, which is directly proportional to the number of neighbors. The entry-opening version of this proposal discussed earlier was to suggest that a "No" decision had to be delayed until the last candidate had been eliminated, which amounts to much the same thing. In both models, the more neighbors there are, the longer the "No" decision is delayed. So, if the "No" decision is triggered as soon as the last candidate has been eliminated, then a priming effect for nonwords should be observed if that candidate entry has already been opened by the prime. So this modification allows for both N effects and priming for nonwords.

     But what if no close-matching candidates have been identified? In this case, the default deadline is applied, and this should be long enough to allow for two passes through the lexicon (i.e., the deadline allows enough time for candidates to be detected on the second pass). If no candidates have been identified after two passes, then the deadline is invoked, and therefore no priming effects can be detected. Of course, if there are no candidates, then we would not expect any priming anyway.

     But what if the default deadline is shortened, as in a pure high-frequency word list (Glanzer & Ehrenreich, 1979)? When all the words are very high frequency words, nonword decisions become a lot faster, presumably because a shorter deadline is adopted. If the deadline is so short that only very high-frequency candidates are detected, then presumably, most of the N effect for nonwords will disappear, as will the priming effect for nonwords. So, in such a list, this analysis predicts that the nonword target fagulous should not show priming, nor should it take any longer to reject than pagulous, because the deadline will have expired long before the entry for the low-frequency word fabulous has been identified as a candidate.

    Clearly, we need to know more about deadlines and how they are set. For example, what kind of events signal the onset of a new stimulus? Is the deadline timed from the beginning of the target, or is it timed from the beginning of the prime? The answer may depend on whether the target is perceived as a separate event from the prime. If it is not, then the deadline will be timed from the onset of the prime. But if the target is recognized as a new stimulus, then the deadline will be timed from the onset of the target. This presents an interesting possibility. If the (abstract) orthographic similarity of the prime and target plays a role in this determination, that is, similar primes and targets are more likely to be treated as a single event, then the deadline would be timed from the onset of the prime in the case of related pairs, and from the onset of the target in the case of unrelated pairs. This would produce an apparent priming effect for nonwords, since the deadline would expire earlier for related pairs than for unrelated pairs. So here is another possible explanation of nonword priming effects.
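    The arithmetic behind this last possibility is simple enough to sketch in Python. The two constants (SOA_MS, DEADLINE_MS) and the function name are hypothetical, chosen only for illustration, and the sketch assumes that a related pair is always treated as a single event and an unrelated pair never is.

```python
# Toy sketch of deadline timing; both constants are hypothetical.
SOA_MS = 50         # assumed prime-target onset asynchrony
DEADLINE_MS = 600   # assumed default deadline for a "No" response


def no_latency_from_target_onset(related):
    """Latency of a deadline-driven "No", measured from target onset."""
    # Related pairs are treated as one event, so the clock starts at prime
    # onset, which is SOA_MS before the target appears.
    clock_start = -SOA_MS if related else 0
    return clock_start + DEADLINE_MS


apparent_priming = (no_latency_from_target_onset(False)
                    - no_latency_from_target_onset(True))
print(apparent_priming)
```

    Under these assumptions the apparent priming effect for nonwords is exactly the prime-target SOA, so this account, if it is on the right track, would seem to tie the size of the nonword effect to prime duration.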

    Time will tell whether these intricate arguments lead to greater understanding, or an endless proliferation of additional mechanisms. In the meantime, it is at least encouraging to observe how the model forces more and more inferences as the implications of the data are explored. This is the real work that a theory is supposed to do, and hopefully, it will ultimately lead to a simplification.

     Before leaving this topic, it should be acknowledged that there are some nonword priming effects that cannot be explained as lexical effects. For example, in a study of lexical acquisition, reliable priming effects were observed in a lexical decision task for nonwords that were deliberately designed not to resemble words (Forster, 1985, Experiment 1). Typical examples of these nonwords were as follows: bonance, tovit, lopel, mectey, leath. Ironically, in the same experiment, no priming at all was observed for a group of nonwords that were designed to be wordlike, e.g., bellowbag, hangment, pensify, boormate. What is to be said about such a result? Unless one can explain why one set of nonwords shows priming and the other does not, there is nothing that can be said. It is perhaps of interest to note that a replication using the same materials failed to reach significance (Forster, 1985, Experiment 2), although a substantial trend was apparent. This raises the possibility of a Type I error. One should never forget that at least 5% of nonword priming effects will prove to be significant, even if there is no basis for the effect.

Repetition priming for nonwords in the naming task.

     Masson and Isaak (in press) argued that it might be possible to obtain stronger and more consistent nonword priming effects if the task did not involve decision-making. They then showed that in a naming task, where decision processes are not involved, there is a clear nonword priming effect. Such an effect could only be due to sublexical processes, so it seems.

     This argument is very reasonable, but tangential to the main argument. Showing that naming involves a sublexical priming component does not automatically demonstrate that the same component is involved in other tasks, such as lexical decision. In point of fact, there is strong evidence that naming does involve a special form of priming that is apparently nonlexical (Forster & Davis, 1991). This effect produces what is termed the onset effect, which consists of a small advantage in the naming of a word that is preceded by a masked prime that shares only the onset (e.g., belly-BREAK). The nonlexical nature of this effect is demonstrated by the fact that the onset effect is also obtained for nonword targets. However, there is no onset effect in lexical decision, either for words or nonwords.

     Masson and Isaak point out that the onset effect cannot be responsible for their result, since both the related and unrelated primes always shared the same onset with the target. However, if one considers the mechanism that might be responsible for the onset effect, it is readily apparent that shared segments other than the onset could also lead to facilitation. To explain the onset effect, Forster and Davis (1991) argued that the subject essentially attempts to pronounce the prime as well as the target. When the prime is presented, the subject begins to assemble the necessary articulatory commands to pronounce the prime. When the prime is replaced by the target, the assembly process continues, using the target as input. The onset effect is then attributed to a Stroop-like competition between incompatible naming responses, one for the prime and the other for the target, which is reduced when the prime and target share the same onset. Now, this effect would certainly be eliminated by matching onsets. However, there could be another effect altogether, in which the work done planning the pronunciation of the prime is simply applied to the task of pronouncing the target, provided that it is appropriate (i.e., the prime and target have the same pronunciation). This would lead to considerable savings, with the amount depending on how much articulatory planning for the prime had been completed. This account proposes that the priming is essentially articulatory in nature, having nothing to do with word recognition processes themselves. In addition, it predicts that priming should also occur if the prime is a homophone or a pseudohomophone of the target, which, of course, is the case (e.g., Lukatela & Turvey, 1994; Ferrand & Grainger, 1992), although this is usually interpreted as evidence for phonological involvement in visual word recognition.

     Hence it can be argued that the existence of priming for nonword targets in naming is ambiguous at this stage. It might indicate the presence of sublexical priming, and equally, it might reflect an articulatory priming process. To settle this issue, we need some other task to perform on nonwords that does not involve deadlines, or articulation of the target. One possibility is semantic categorization. For example, if the category was "living thing", we might compare performance on non-exemplars such as the word TABLE and the nonword FLINK. Since we know that masked priming effects still occur when the correct response is "No" (i.e., TABLE would be classified faster when primed by table rather than house), there seems to be no reason why we should not be able to detect a similar effect for nonword targets such as FLINK, if sublexical effects exist.

Priming for nonword targets in lexical decision with mixed case presentation.

     Bodner and Masson (1997) have reported reliable priming for nonword targets in lexical decision when the target is presented in mixed case (e.g., cUsTaRd). In this experiment, the priming effect for nonwords was extraordinarily large (93 ms), and was greater than the effect observed for words (75 ms). What could produce this effect? Bodner and Masson argue that the unusual format prevents subjects from basing their lexical decisions on the visual familiarity of the stimulus. Since everything looks unfamiliar, familiarity is no guide. This would mean that a related prime would no longer induce a bias to respond "Yes", since familiarity is being ignored, and therefore there would no longer be any decision conflict for nonwords. According to Bodner and Masson, this allows the priming for nonwords to emerge, since it was the conflict produced by the familiarity bias that blocked priming with a normal format.

     This explanation makes a number of assumptions: it assumes that a masked repetition prime increases familiarity, that lexical decisions are based in part on familiarity, that mixed case eliminates this familiarity effect, and that priming can be blocked by decision conflict. The major difficulty for this argument is the lack of direct evidence in support of the last two of these assumptions. Since priming here is supposed to be prelexical, it is hard to see why a decision conflict would eliminate this effect. Also, since the mixed case words look very unfamiliar, it is puzzling that there is no decision conflict for the words. That is, the lack of familiarity should induce a bias towards a "No" decision, when the correct decision is "Yes". If such a conflict existed, then Bodner and Masson would be forced to predict no priming for words in a mixed case presentation, which is not the case. They avoid this by assuming that mixed case forces the subject to ignore familiarity as a cue altogether. However, this implies that if a word in pure case were presented, it would not be responded to any faster than the mixed case words. This seems unlikely.

     To be fair to Bodner and Masson, it must be acknowledged that the mixed case effect is so perplexing that it is hardly surprising that the argument required to explain it is somewhat tenuous. The following alternative explanation is also somewhat speculative. Like Bodner and Masson's account, it assumes that mixed case alters the way in which decisions are reached, and places more emphasis on letter-level processing, but it does not assume that mixed case presentation allows a previously hidden effect to be revealed. Rather, it assumes that mixed case presentation introduces a new effect. On this view, the major effect of mixed case is to delay decisions until each of the letters has been explicitly identified. Under normal circumstances, explicit letter identification is not required, and information is passed directly from the letter recognition system to the lexical access system (i.e., implicit recognition). But when mixed case targets are involved, more detailed orthographic processing of the individual letters might be required. This additional process of explicit letter identification might be where a priming effect for nonwords creeps in, since we know that masked priming effects do occur at the letter level when explicit letter identification is required. For example, in a character-noncharacter decision experiment, Jacobs and Grainger (1991) observed repetition priming effects for single letter targets. Also, Davis (1997) has found masked priming effects when the task was to decide whether a given letter is contained in a target string. For example, if the task was to detect the presence of the letter C, then decisions were faster when the prime also contained the same letter in the same position, e.g., excuse-DOCTOR, even though this was the only similarity between them. Most importantly, this was equally true when both prime and target were nonwords.

    Under normal conditions, priming for letter detectors seems rather pointless, simply because the same limited set of detectors are being used over and over again, and hence most would be in the open, or active state most of the time. However, this may not be so when explicit detection or naming of letters is required. A task that involves explicit letter recognition may involve additional representations from those in normal reading. Presumably, letters have entries just as words do, which contain information about the letter, such as whether it is a consonant or a vowel, how its name is pronounced, how it appears in print, etc. Hence priming could occur if the contents of the letter entries have to be extracted. In normal reading, these details are not required, but if the task was to name the letter, or to classify it in some way, then obviously the entry for that letter would have to be opened.

    An important issue here is the exact role played by explicit letter identification in a mixed case experiment. It could be that explicit letter identification is required before lexical access can begin. However, it is typically assumed that letter information is rapidly converted into an abstract orthographic code, and therefore it would be irrelevant whether mixed case was used or not. Further, it has been found that mixed-case primes are just as effective as pure-case primes in a masked priming experiment (Forster & Guess, 1996). That is, the mixed-case prime dEnTiSt primes DENTIST just as well as the pure-case prime dentist. This result strongly suggests that mixed-case words are accessed at normal speeds, which would not be possible if access was delayed until explicit identification of all letters had been completed. The alternative is to suggest that mixed case has its effect after access, when the contents of the lexical entry are checked against the input, as proposed by Besner (1983). On this analysis, the abstract letter code contained in the lexical entry is converted into some kind of template that corresponds to what the input stimulus should look like. Obviously, this involves knowledge of the font that is being used, its size, and the case of the letter. With pure-case stimuli, this conversion can be carried out at a whole-word level, with all letters being converted into the appropriate case in parallel. But with mixed-case stimuli, this conversion must be carried out letter-by-letter, since the case of the input letter varies unpredictably. This procedure is much slower, and this accounts for the slower processing with mixed-case stimuli. Extrapolating from this analysis-by-synthesis model slightly, it could be suggested that determining which is the appropriate case to use may involve explicit identification of each letter in the input string. This involves extracting information from letter entries, and if these entries are already open, then that information can be extracted more rapidly. This leads to a more rapid synthesis of the checking template, and consequently a priming effect is obtained.

     The weakness in this argument is that a nonword should not require a post-access check to be carried out, since no matching entry would have been found. It might be countered that if the nonword is a close match to an actual word, then a check would be required to eliminate that word as a candidate. There is some support for this notion, since Kinoshita (1987) has reported that the extent to which decision times for nonwords are slowed by mixed case presentation depends on their similarity to words. However, it will be recalled that our earlier explanation of nonword priming in lexical decision (see the discussion of the fagulous case) was restricted to nonwords that are constructed by changing one letter of a long, low-density word. In the Bodner and Masson mixed-case experiment, there is no suggestion that the nonwords were like this. The nonwords were all 4-6 letters in length, and the two examples given (BREEM and FEAP) are not neighbors of a single low-density word. If they were, they would have shown priming in a pure-case experiment, and this was not the case. Instead, the nonwords were probably high-density, and our assumption earlier was that the entries of the close matches would not be opened on the first pass.

    So the priming cannot be explained in terms of candidate elimination. A simpler solution is to adopt the argument advanced by Forster and Guess (1996), who also needed to explain why mixed case affected nonword decisions. Their argument was simply that a "No" decision could not be made until the internal, abstract letter code used for access is checked against the actual input stimulus. This check is necessary, since a possible reason for not finding a matching entry might simply be that the wrong access code was used, and that possibility needs to be checked. Priming then occurs during the construction of the required template since the letter entries will all be in the open state.

     This account of the mixed case effect is fairly speculative, but it should be easy to test. One obvious prediction is that the priming effect should depend on the length of the nonword. If this account is correct, then we have an explanation of the circumstances under which letter-level priming effects might emerge. If it is incorrect, then other alternatives will need to be considered, such as the degradation argument put forward by Bodner and Masson (1997). However, until more is understood about how this effect occurs, it would be premature to draw any conclusions about the nature of masked priming based purely on this finding.
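     The length prediction follows directly from the letter-by-letter assumption, as the following toy sketch in Python shows. The three cost constants and the function names are hypothetical, and the sketch assumes, purely for illustration, that an identity prime opens every letter entry needed for the target and saves a fixed amount of time per letter.

```python
# Toy sketch of template construction; all constants are hypothetical.
PARALLEL_MS = 20   # assumed cost of whole-word case conversion
CLOSED_MS = 15     # assumed per-letter cost when the letter entry is closed
OPEN_MS = 10       # assumed per-letter cost when the prime has opened it


def is_mixed_case(target):
    """True when the string is neither all upper case nor all lower case."""
    return not (target.isupper() or target.islower())


def template_time(target, primed):
    """Time to synthesize the checking template for the target."""
    if not is_mixed_case(target):
        # Pure case: all letters are converted in parallel.
        return PARALLEL_MS
    # Mixed case: letters are converted one at a time.
    per_letter = OPEN_MS if primed else CLOSED_MS
    return per_letter * len(target)


def mixed_case_priming(target):
    """Unprimed minus primed template time."""
    return template_time(target, False) - template_time(target, True)


for item in ("fEaP", "bReEm", "BREEM"):
    print(item, mixed_case_priming(item))
```

     In this sketch the saving grows by a constant amount with each additional letter for mixed-case targets and is absent for pure-case targets. A test of the account would therefore look for an interaction between priming and nonword length that is confined to mixed-case presentation.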

Priming for words in lexical decision with illegal distractors.

     The previous examples have been cases where the interpretation of nonword priming effects is at issue. In addition, there are also cases where a priming effect for words is obtained when it seems there ought not to be any such effect. For example, it is generally accepted that if the distractors in a lexical decision experiment consist of illegal letter strings, then lexical processing would not be required at all, and decisions could be made purely on the basis of orthographic legality. This ought to eliminate priming for word targets if priming is a lexical phenomenon, but not if it is a purely orthographic (sublexical) phenomenon.

     The initial test of this hypothesis provided results encouraging for a lexical view (Forster, 1992). This experiment compared the size of the masked priming effect in a lexical decision experiment with legal nonword distractors (e.g., limmer, bresh), and illegal distractors (e.g., pldwnxk, lfrksw). Subjects were tested in a two-phase design. For half the subjects, legal distractors were used in phase 1, and illegal distractors in phase 2. The reverse arrangement was used for the remaining subjects. Half of the words were high-frequency words (70 - 240 occurrences per million), and half were low frequency (6 - 9 occurrences per million). The purpose of including a frequency contrast was to provide a marker of lexical processing. As expected, the overall decision times dropped sharply from 552 ms to 470 ms when illegal distractors were used, indicating that the difficulty of the decision had decreased. In addition, the masked priming effect dropped from 62 ms with legal distractors to 18 ms with illegal distractors (a significant effect). By itself, this result is ambiguous. The reduction in the size of the priming effect is consistent with a lexical view, but the fact that a significant priming effect remained, even though lexical processing was not required, is more consistent with a prelexical view.

     Fortunately, the frequency contrast helps to eliminate this ambiguity. The size of the frequency effect also decreased, dropping from 75 ms with legal distractors to 21 ms with illegal distractors. However, the latter effect was also significant, indicating that decisions were still based on lexical properties to some extent even when the distractors were illegal. What this suggests is that some words were accessed so quickly that decisions about lexical status could be made faster on a lexical basis than on an orthographic basis. If this is the case, then a lexical theory of priming would predict that they should show a priming effect.

     Strong support for this account was provided by an examination of the priming effects for high and low frequency words separately. With illegal distractors, high frequency words showed a significant priming effect of 29 ms, but for low frequency words, the priming effect was only 10 ms, which was not significant. The inference is that low frequency words take so long to access that it is faster to base the decision on orthographic properties, and therefore there is no priming. So the fact that no significant priming effect was obtained under these conditions can be interpreted as strong evidence against a prelexical site for masked priming. If decisions were being made solely on an orthographic basis, then the detection of orthographic properties would assume greater significance than normal, and hence one would expect an enhancement of prelexical effects, not a reduction.
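     The inference can be expressed as a simple race between a lexical route and an orthographic route, sketched below in Python. The constants and the lexical access times passed to the functions are hypothetical and are not the observed means. The sketch also assumes that priming speeds only the lexical route, which is the lexical view under discussion rather than an established fact.

```python
# Toy sketch of a race between two bases for a "Yes" decision.
# All values are hypothetical.
ORTHO_MS = 470       # assumed time for a legality-based decision
PRIME_GAIN_MS = 30   # assumed saving when the entry is already open


def decision_time(lexical_ms, primed):
    """The decision is made by whichever route finishes first."""
    lexical = lexical_ms - (PRIME_GAIN_MS if primed else 0)
    return min(lexical, ORTHO_MS)


def race_priming(lexical_ms):
    """Unprimed minus primed decision time."""
    return decision_time(lexical_ms, False) - decision_time(lexical_ms, True)


print(race_priming(450))   # fast access: the lexical route wins both races
print(race_priming(480))   # intermediate: the lexical route wins only if primed
print(race_priming(520))   # slow access: the orthographic route always wins
```

     With these values the full saving appears when access is fast, a reduced saving appears in the intermediate case, and none at all when access is slow. That is the qualitative pattern reported for high and low frequency words with illegal distractors.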

    Bodner and Masson (1997, Exp. 4) carried out a similar experiment, with a similar outcome, but reached rather different conclusions. They reported a significant masked priming effect of 22 ms for word targets with illegal nonword distractors, a result which they interpret as consistent with a prelexical account. However, Bodner and Masson specifically chose very high frequency words (100-400 occurrences per million) in order to make the task as easy as possible. So in this respect, the two experiments give exactly the same outcome: high frequency words show a small masked priming effect in lexical decision even with illegal distractors. But the logic of this experiment was slightly different. Bodner and Masson's purpose in using illegal distractors was to make the decision as easy as possible, so that subjects would rely more heavily on familiarity as a basis for their decision. Under these conditions, it was argued that one might be able to detect an inhibitory masked priming effect for nonword targets due to the increase in familiarity produced by an identity prime. In fact, such an effect was obtained, although the effect was extremely small (9 ms).

     Although the comparable experiment reported by Forster (1992) was not designed specifically to examine the nonwords, a reanalysis of the data from that experiment showed no difference in decision times for the illegal nonwords (461 ms for the primed items vs 463 ms for the control items). However, there was a significant difference in error rates (7.6% in the primed condition vs 5.4% in the control condition), which lends support to the findings of Bodner and Masson. But a similar effect occurred for legal distractors as well (18.2% vs 14.4%), suggesting that this effect had nothing to do with the legality of the distractors, or the difficulty of the decision.

     This reverse priming effect for nonwords is not so easily explained away. However, even if it is accepted that an identity prime could increase the familiarity of a nonword target, the effect is obviously not very large, which might suggest that familiarity exerts only a very small influence, even under the most favorable conditions. Of course, accepting the idea that familiarity played a role in the nonword decisions does not necessarily entail that the same effect occurs for word decisions. In the absence of any clear signal from the lexical processor, the decision system may well take into account additional sources of evidence. Thus, a word might be recognized as a word purely on the basis of the output from the lexical processor, regardless of its familiarity, but if the lexical output is weak or absent, then familiarity might well begin to play a role.

     The familiarity argument used by Bodner and Masson is not without its problems. For example, making the task easier by using illegal distractors was supposed to enhance the effect of familiarity, but priming was substantially reduced for words in both experiments. This would suggest that familiarity does not play a role in priming for words. Moreover, this approach would lead one to expect a stronger impact of familiarity for low-frequency words, where it might be argued that the lexical signal is relatively weak. But quite the reverse is the case. In the Forster (1992) study, no priming at all was observed for low-frequency words.

     Given these arguments, it seems reasonable to conclude that the evidence from the illegal distractor manipulation is generally consistent with a lexical source for priming for words. However, it must be conceded that additional factors may play a role in the case of nonwords.

Form-priming in lexical decision is subject to strategic factors.

     We now turn to another recently uncovered phenomenon, which suggests that masked form-priming effects might be influenced by strategic factors, and therefore threatens the assumption that masked priming effects are purely automatic. The background to this issue concerns the role of competition between word units in priming. If form-priming effects are modeled in terms of persisting activation, then an interactive activation model with a relatively high setting of the word-to-word inhibitory parameter (e.g., 0.25) predicts that nonwords should be superior to words as form-primes (Forster & Veres, in press). That is, there should be stronger priming for a pair such as bunction-FUNCTION than for a pair such as junction-FUNCTION. This prime lexicality effect is due to the fact that a word prime directly competes with the target, whereas a nonword prime does so only indirectly. This interpretation is supported by the fact that if the inhibitory parameter is substantially reduced (e.g., 0.05), this is no longer the case, and the two types of primes are equivalent.

     This issue of the effect of prime lexicality was examined in a lexical decision task by Forster and Veres (in press). This study was based on earlier work by Veres (1986) and Humphreys et al. (1987), who had shown that form-priming was obtained only when the prime was masked. That is, there was no priming for pairs such as junction-FUNCTION when there was a 500 ms SOA, but there was with a 50 ms SOA. Veres (1986) showed that this also depended on the lexical status of the prime. If the prime was a nonword, e.g., bunction, then priming was obtained regardless of masking, which rules out any low-level interpretation of the difference between masked and nonmasked priming. Forster and Veres (in press) pointed out that this difference between the effectiveness of word and nonword primes was relevant to models of word recognition based on competition between lexical units, and attempted to replicate these results. They confirmed the previous findings with nonmasked primes, but added a further complication for masked primes. The surprising conclusion from their experiments was that the effect depended on the type of nonword distractors used. With distractors that closely resembled a particular word (e.g., UMBROLLA), masked priming resembled nonmasked priming in that priming was obtained with nonword primes, but not with word primes. But with slightly more distant nonword distractors (e.g., AMBROLLA), masked priming with word primes was restored, and there was no longer any difference between word and nonword primes.

    It seems fairly clear from these experiments that for many individuals, close distractors alter the way in which lexical decisions are made. The most obvious indication is that decision times are about 50 ms slower under these conditions. It also seems fairly clear that this alteration somehow blocks form-priming, but only when the prime is itself a word. Forster and Veres (in press) offer several possible explanations of this effect, but in each case, there is an assumption that a strategic adaptation to the conditions of the experiment has altered the processing, and hence reduced the size of the priming effect. From this it would seem to follow that masked priming is not an automatic effect.

    But in reality, this conclusion does not follow at all, since the process that underlies priming must be distinguished from the consequences of this process. Strategies can affect the consequences of priming, but may have no effect on the process underlying priming. To be specific, it can still be argued that the prime opens the entry for the target regardless of the type of distractors, but because of alterations in the way the decision about the target is made, the effects of this priming action are lost. Moreover, we should not be surprised that masked priming effects depend on strategic adaptation to the conditions of the experiment. For example, consider what happens when the distractors in a lexical decision experiment are illegal letter strings (see discussion above). For low frequency words, the subject strategically adapts to these conditions by basing the decision on orthographic legality instead of lexicality, and as a consequence, the masked priming effect for these words is eliminated. So this is another case where priming effects are modified by a strategic adaptation, but from this it certainly does not follow that priming is not automatic. Indeed, priming in the form of lexical activation or entry-opening may still take place, but because of the altered task conditions, this priming no longer has any detectable effect.

     The explanation of the prime lexicality effect can be seen in the same terms. As argued by Forster and Veres (in press), the prime opens the entry for the target regardless of the lexical status of the prime, and regardless of the nature of the distractors. However, with close distractors, subjects are forced to adopt special measures to maintain an acceptable error rate, and one way to minimize errors is to reanalyze the target from scratch whenever it seems likely that a processing error has occurred. This would eliminate priming, since the prime no longer exists. But why should this reanalysis occur when the prime is a word, and not when it is a nonword? Forster and Veres assumed this was because an error signal was generated when the prime was a word, since two orthographically distinct entries were opened for what was apparently the same stimulus. This state of affairs is anomalous, and must indicate that an error has occurred. But if the prime is a nonword, then only one perfect match will be located, namely that for the target, and hence no error signal is created. Thus the effect can be explained by assuming that the closeness of the distractors affects the operation of a module that controls the lexical module (by ordering a restart), not the lexical module itself.

     Of course, this is not the only interpretation. One could assume that the distractors have a direct effect on the lexical module itself, e.g., by assuming that the closeness of the distractors raises the criterion for a match, with the result that a close match no longer opens the entry for the target. The problem here is that such a criterion shift would occur for all priming stimuli, whether words or nonwords. It is difficult to see why this would occur only when the prime is a word.
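
     The contrast between these two accounts can be made concrete with a toy sketch in Python. This is not a model proposed by Forster and Veres. It merely encodes the verbal logic of the two preceding paragraphs as Boolean rules, and all names in it (Trial, priming_survives_restart, priming_survives_criterion_shift, prediction_table) are hypothetical. It also assumes, as a simplification, that the error signal is acted on only when close distractors are present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    prime_is_word: bool      # lexical status of the masked prime
    close_distractors: bool  # nonword distractors closely resemble words


def priming_survives_restart(trial: Trial) -> bool:
    """Restart account: the prime always opens the target's entry, and a
    control module orders a reanalysis only when an error signal occurs."""
    # A word prime also has an entry of its own, so two distinct entries
    # are open; a nonword prime leaves only the target's entry open.
    entries_open = 2 if trial.prime_is_word else 1
    error_signal = entries_open > 1
    # Assumption: the signal triggers a restart only when close distractors
    # force the subject to guard against errors.
    restart = trial.close_distractors and error_signal
    return not restart


def priming_survives_criterion_shift(trial: Trial) -> bool:
    """Criterion-shift account: close distractors raise the criterion for a
    match, so a close match no longer opens the target's entry at all."""
    return not trial.close_distractors


def prediction_table(model):
    """Map (prime_is_word, close_distractors) to the model's prediction."""
    return {
        (prime_is_word, close): model(Trial(prime_is_word, close))
        for prime_is_word in (True, False)
        for close in (True, False)
    }


if __name__ == "__main__":
    for name, model in [
        ("restart", priming_survives_restart),
        ("criterion shift", priming_survives_criterion_shift),
    ]:
        print(name, prediction_table(model))
```

     Under these assumptions, the restart rule removes priming only for word primes with close distractors, whereas the criterion-shift rule removes it for word and nonword primes alike, which is the difficulty noted above.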

    Whatever the right analysis might be, it seems likely that it will be couched in terms of a suppression effect. That is, the close distractors suppress an existing priming effect for word primes. This kind of strategic effect is less disturbing than the expectancy effect discussed at the beginning of this paper. In that case, it seemed that expectancies created an effect that had extra-lexical origins (i.e. bias introduced by disconfirmation). An analogous interpretation of the distractor effect would have to be that the distractors induced a strategy that created an effect for word primes, where none existed before. It is not immediately obvious how to construct such an argument.

Is Masked Priming Ever Sublexical?

     The position adopted here is that tasks that require the identification of lexical properties of the target will exhibit lexical priming effects. Tasks that require the identification of other types of properties will exhibit other types of priming. So a task in which letter-level properties of the target must be detected should show sublexical priming effects. One such example is the letter detection task used by Davis (1997). If the task is to detect the presence of a given letter in the target, then responses are faster if the prime also contains that letter in the same position, even when this is the only similarity between the prime and target. This effect is indifferent to the lexical status of the target, and hence qualifies as a sublexical effect. However, as argued earlier, this effect only occurs when explicit letter identification is required.

     Another task that requires explicit letter identification is the tachistoscopic identification task (Evett & Humphreys, 1981), and here it also appears that priming effects are sublexical. In this procedure, both the prime and target are briefly presented, and these two stimuli are preceded and succeeded by masking stimuli. The subject's task is to identify the letters in the target, and with suitably brief exposures, subjects are unaware that two stimuli were exposed. As shown in a number of experiments, primes that are orthographically similar to the target facilitate recognition of the target. For several reasons, it seems likely that this effect is also sublexical. First, priming effects can be obtained with this task for nonword targets (Forster, 1993). Further, Humphreys, Evett and Quinlan (1990) found priming with this procedure when the prime shared as few as one of four letters, or two of five letters, with the target. This degree of similarity between a stimulus and an internal lexical representation does not seem sufficiently great to warrant lexical activation. Humphreys et al. conclude that the effects seem better explained in terms of activating the representations that mediate word recognition, i.e., letters.

     With brief exposure of both prime and target, an additional process needs to be taken into account, namely temporal integration. In this case, a purely visual (i.e. iconic) representation of the prime may be fused with that of the target, creating a superimposed image of the two stimuli. Obviously, identification of the target will be delayed if it is not legible when fused with the prime. Davis and Forster (1994) showed that such an effect must occur in the tachistoscopic identification task by manipulating the legibility of the target when the prime was superimposed on it. When the legibility of the target was low, performance in the tachistoscopic identification task was poor, even though the prime was presented prior to the target. However, such an effect was absent in a lexical decision task. This can be attributed to the fact that in a lexical decision task, the target is normally displayed for an extended period (e.g., 500 ms). This is shown by the fact that when the target was briefly presented (40 ms, the same as the prime), legibility effects were apparent in the lexical decision task as well. What this suggests is that temporal integration effects are absent when the target is presented for an extended period. This corresponds well with results of visual integration experiments using Di Lollo's missing-dot paradigm (Di Lollo, 1980; Dixon & Di Lollo, 1994). In this paradigm, visual integration of two stimuli is far easier to achieve when both are brief. When either is presented for a long interval, integration is impossible.


     The aim of this paper has been to show that the masked priming technique can still be taken as an indicator of completely automatic processes occurring deep within the lexical processor, despite disconcerting evidence to the contrary. It has been shown that the existence of priming effects for nonwords in a lexical decision task can be understood in lexical terms, if the nonwords are close neighbors of actual words. The source of priming here is the more rapid elimination of inappropriate candidates. Priming for nonwords in the naming task does not appear to be explicable in similar terms, since there is no indication that this effect is obtained only with nonwords that are similar to words. But given the evidence that subjects attempt to pronounce the prime as well as the target, it is entirely possible that priming could occur at a purely articulatory level. This makes priming effects with the naming task difficult to interpret, and it would be sensible to explore alternative procedures, such as inserting a constant, nonmasked stimulus between the prime and the target to eliminate this possibility.

     The emergence of nonword priming in a mixed case lexical decision experiment is far more difficult to explain. We have offered one account, namely, that mixed case requires explicit letter identification, which creates the possibility of letter-level priming. This interpretation does not deny Bodner and Masson's claim that the priming here is sublexical. This point is conceded. However, Bodner and Masson would like to argue that mixed case conditions permit the observer to see the true underlying nature of priming that is obscured when pure case stimuli are used. The argument here is exactly the reverse. We argue that the use of mixed case changes the nature of the task in a fundamental way, so as to introduce effects that are not otherwise present.

     A rather narrow interpretation of the argument presented here seems to be that masked priming effects of a lexical nature can only be studied in a lexical decision task with pure-case stimuli, long exposure of the target, and legal distractors that do not closely resemble a particular word. Put like this, the conclusion seems faintly ridiculous. Could a phenomenon be of interest if it was only observable under such a narrow range of conditions? Of course not. The task ahead is to develop a range of tasks that will reveal different aspects of priming, both lexical and nonlexical, so that arguments can be based on a combination of tasks that support more powerful inferences than those based on single-task designs. We have already seen that masked priming effects can be observed in several tasks other than lexical decision and naming -- namely, semantic categorization and speeded episodic recognition (Forster, 1985), stem completion and fragment completion (Forster, Booker, Schacter & Davis, 1990), letter detection and same-different matching (Davis, 1997), and picture naming (Ferrand, Grainger, & Segui, 1994). By combining different tasks with different types of stimuli (e.g., cross-language priming with different scripts), we can begin to develop a methodology powerful enough to selectively study different facets of the process of word recognition without contamination from other processes. Current indications are that it would be impossible to explain all the effects observed with masked priming by any single mechanism, whether lexical or sublexical. It would therefore be unwise to discard such a promising tool because under some conditions it shows sensitivity to undesirable variables.


Andrews, S. (1992). Frequency and neighborhood effects on lexical access: Lexical similarity or orthographic redundancy? Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 234-254.

Balota, D.A. & Chumbley, J.I. (1984). Are lexical decisions a good measure of lexical access? The role of word frequency in the neglected decision stage. Journal of Experimental Psychology: Human Perception and Performance, 10, 340-357.

Besner, D. (1983). Basic decoding components in reading: Two dissociable feature extraction processes. Canadian Journal of Psychology, 37, 429-438.

Bodner, G.E., & Masson, M.E.J. (1997). Masked repetition priming of words and nonwords: Evidence for a nonlexical basis for priming. Journal of Memory & Language, 37, 268-293.

Carroll, J.B., & White, M.N. (1973). Word frequency and age of acquisition as determiners of picture naming latencies. Quarterly Journal of Experimental Psychology, 24, 85-95.

Colombo, L. (1986). Activation and inhibition with orthographically similar words. Journal of Experimental Psychology: Human Perception and Performance, 12, 226-234.

Coltheart, M., Davelaar, E., Jonasson, J.T., & Besner, D. (1977). Access to the internal lexicon. In S. Dornic (Ed.), Attention and Performance VI. (pp. 535-555). London: Academic Press.

Cutler, A. (1981). Making up materials is a confounded nuisance: or Will we be able to run any psycholinguistic experiments at all in 1990? Cognition, 10, 65-70.

Dalrymple-Alford, E.C. (1972). Sound similarity and color-word interference in the Stroop task. Psychonomic Science, 28, 209-210.

Davis, C. (1990). Masked priming effects in visual word recognition. Unpublished doctoral dissertation, Monash University, Melbourne, Australia.

Davis, C. (1997). Letter-level effects in masked priming. Paper in preparation.

Davis, C., & Forster, K.I. (1994). Masked orthographic priming: the effect of prime-target legibility. Quarterly Journal of Experimental Psychology, 47A, 673-697.

Di Lollo, V. (1980). Temporal integration in visual memory. Journal of Experimental Psychology: General, 109, 75-97.

Dixon, P., & Di Lollo, V. (1994). Beyond visible persistence: An alternative account of temporal integration and segregation in visual processing. Cognitive Psychology, 26, 33-63.

Evett, L.J., & Humphreys, G.W. (1981). The use of abstract graphemic information in lexical access. Quarterly Journal of Experimental Psychology, 33, 325-350.

Ferrand, L. & Grainger, J. (1992). Phonology and orthography in visual word recognition: Evidence from masked nonword priming. Quarterly Journal of Experimental Psychology, 45, 353-372.

Ferrand, L., Grainger, J., & Segui, J. (1994). A study of masked form priming in picture and word naming. Memory & Cognition, 22, 431-441.

Feustel, T.C., Shiffrin, R.M., & Salasoo, A. (1983). Episodic and lexical contributions to the repetition effect in word identification. Journal of Experimental Psychology: General, 112, 309-346.

Forster, K.I. (1985). Lexical acquisition and the modular lexicon. Language and Cognitive Processes, 1, 87-108.

Forster, K.I. (1992). Lexical effects in masked form-priming. Paper presented at the 33rd Annual Meeting of the Psychonomic Society, St. Louis.

Forster, K.I. (1993). Form priming and temporal integration in word recognition. In G. Altmann & R. Shillcock (eds.), Cognitive models of speech processing: The Second Sperlonga Meeting. (pp. 467-495). Hove: Erlbaum.

Forster, K.I. (1994). Visual interaction effects in masked priming. Paper presented at the 35th Annual Meeting of the Psychonomic Society, St. Louis.

Forster, K.I., Booker, J., Schacter, D.L., & Davis, C. (1990). Masked repetition priming: Lexical activation or novel memory trace? Bulletin of the Psychonomic Society, 28, 341-345.

Forster, K.I., & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680-698.

Forster, K.I., & Davis, C. (1991). The density constraint on form-priming in the naming task: Interference effects from a masked prime. Journal of Memory and Language, 30, 1-25.

Forster, K.I., Davis, C., Schoknecht, C., & Carter, R. (1987). Masked priming with graphemically related forms: Repetition or partial activation? Quarterly Journal of Experimental Psychology, 39, 211-251.

Forster, K.I., & Guess, K. (1996). Effects of prime duration and visual degradation in masked priming. Paper presented at the 37th Annual Meeting of the Psychonomic Society, Chicago.

Forster, K.I., & Shen, D. (1996). No enemies in the neighborhood: Absence of inhibitory neighborhood effects in lexical decision and semantic categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 696-713.

Forster, K.I., & Taft, M. (1994). Bodies, antibodies, and neighborhood density effects in masked form-priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 844-863.

Frost, R., Forster, K.I., & Deutsch, A. (1997). What can we learn from the morphology of Hebrew? A masked priming investigation of morphological representation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 829-856.

Gernsbacher, M.A. (1984). Resolving 20 years of inconsistent interactions between lexical familiarity and orthography, concreteness, and polysemy. Journal of Experimental Psychology: General, 113,

Glanzer, M., & Ehrenreich, S. L. (1979). Structure and search of the internal lexicon. Journal of Verbal Learning and Verbal Behavior, 18, 381-398.

Gollan, T.H., Forster, K.I., & Frost, R. (1997). Translation priming with different scripts: Masked priming with cognates and noncognates in Hebrew-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1122-1139.

Grainger, J. (1992). Orthographic neighborhoods and visual word recognition. In R. Frost & L. Katz (Eds.), Orthography, Phonology, Morphology, and Meaning (pp. 131-147). Amsterdam: Elsevier Science.

Grainger, J., & Segui, J. (1990). Neighborhood frequency effects in visual word recognition: A comparison of lexical decision and masked identification latencies. Perception & Psychophysics, 47, 191-198.

Humphreys, G.W., Evett, L.J., & Quinlan, P.T. (1990). Orthographic processing in visual word identification. Cognitive Psychology, 22, 517-560.

Humphreys, G.W., Evett, L.J., Quinlan, P.T., & Besner, D. (1987). Orthographic priming: Qualitative differences between priming from identified and unidentified primes. In M.Coltheart (Ed.), Attention and Performance XII. (pp. 201-219) Hillsdale,

Jacobs, A.J. & Grainger, J. (1991). Automatic letter priming in an alphabetic decision task. Perception & Psychophysics, 49, 43-52.

Jacoby, L.L. (1983). Perceptual enhancement: Persistent effects of an experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 21-38.

Jiang, N. (1997). Testing alternative explanations for asymmetrical cross-language priming. Paper submitted for publication, University of Arizona.

Kahneman, D., Treisman, A., & Gibbs, B.J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175-219.

Kinoshita, S. (1987). Case alternation effect: Two types of word recognition? Quarterly Journal of Experimental Psychology, 39A, 701-720.

Lukatela, G., & Turvey, M. (1996). Inhibition of naming by rhyming primes. Perception & Psychophysics, 58, 823-835.

Lukatela, G., & Turvey, M.T. (1994). Visual lexical access is initially phonological: 2. Evidence from phonological priming by homophones and pseudohomophones. Journal of Experimental Psychology: General, 123, 331-353.

Martin, R.C., & Jensen, C.R. (1988). Phonological priming in the lexical decision task: A failure to replicate. Memory & Cognition, 16, 505-521.

Masson, M.E.J., & Isaak, M.I. (in press). Masked priming of words and nonwords in a naming task: Further evidence for a nonlexical basis of priming. Journal of Memory and Language.

Monsell, S., Doyle, M.C., & Haggard, P.N. (1989). The effects of frequency upon visual word recognition: Where are they? Journal of Experimental Psychology: General, 118, 43-71.

Neely, J.H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. Humphreys (Eds.), Basic Processes in Reading: Visual Word Recognition. Hillsdale, N.J.: Erlbaum.

Perea, M. & Gotor, A. (1997). Associative and semantic priming effects occur at very short SOAs in lexical decision and naming. Cognition, 67, 223-240.

Perea, M., & Rosa, E. (1997). Repetition and orthographic priming interact with neighborhood density at brief stimulus-onset asynchronies. Unpublished paper, Universitat de València.

Perea, M., Rosa, E., & Algarabel, S. (1997). Relatedness proportion effects influence associative priming at very brief SOAs in lexical decision tasks but not in naming tasks. Unpublished paper, Universitat de València.

Posner, M.I., & Snyder, C.R.R. (1975). Facilitation and inhibition in the processing of signals. In P.M.A. Rabbitt & S.Dornic (Eds.), Attention & Performance V. New York: Academic Press.

Rajaram, S., & Neely, J.H. (1992). Dissociative masked repetition priming and word frequency effects in lexical decision and episodic recognition tasks. Journal of Memory and Language, 31, 152-182.

Sereno, J.A. (1991). Graphemic, associative, and syntactic priming effects at brief stimulus onset asynchrony in lexical decision and naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 459-477.

Tweedy, J.R., Lapinski, R.H., & Schvaneveldt, R.W. (1977). Semantic-context effects on word recognition: Influence of varying the proportion of items presented in an appropriate context. Memory & Cognition, 5, 84-89.

Veres, C. (1986). Factors affecting word selection in a masked prime paradigm. Unpublished honours thesis, Monash University, Melbourne, Australia.


This research was supported in part by National Multipurpose Research and Training Grant DC01409 from the National Institute on Deafness and Other Communication Disorders to the National Center for Neurogenic Communications Disorders at the University of Arizona, and by a grant from the McDonnell-Pew Foundation Cognitive Neurosciences Program. I am indebted to Chris Davis, who read several drafts of this paper, and to the Language Group at Macquarie University who provided the impetus for this paper, which is based on a talk given at Macquarie University in June, 1997.


1. These experiments were taken from the following publications: Davis (1990), Forster & Davis (1984), Forster et al. (1987), Forster (1985), Forster & Davis (1991), Forster & Taft (1994), Forster & Veres (in press). Only one of these papers was specifically concerned with the question of nonword priming.

2. A copy of this paper is available at the following Website:

3. An interesting point to note here is that this interpretation is critically dependent on the entry-opening model. If we adopt an activation view instead, then we should expect the prime to have an inhibitory effect, since activation of the neighboring word by the target nonword would combine with the activation produced by the prime, producing a strong bias in favor of a "Yes" response.

4. It is interesting to note that one of the pieces of evidence used by Bodner and Masson (1997) to support a nonlexical interpretation of masked priming is the fact that priming for nonwords in a lexical decision task is obtained when the nonwords are pseudohomophones (e.g., werse, reer). Their intention was merely to make the task more difficult, but given the preceding discussion, it should be obvious what kind of response would be offered here.

5. The same phenomenon occurs in the Stroop task. There is no interference if the color name and the interfering word have the same onset (Dalrymple-Alford, 1972).

6. Greg Savage (personal communication) has obtained very similar identity priming effects for nonword targets, controlling for the onset effect.

7. If the 10 ms priming effect observed for low frequency words proves to be a real effect, then it must be conceded that there is a sublexical source for priming, although it is much weaker than the lexical source.

8. A survey of previously published findings for legal nonwords showed a similar trend in only about 60% of studies. The most remarkable example of an identity prime increasing the error rate for nonwords was reported in Forster and Veres (in press, Exp. 1). In this experiment, the prime was not masked, and was very close to a word (e.g., UNIVORSE).

9. This was pointed out by Max Coltheart.

10. There is another purely visual effect that needs to be considered when interpreting priming effects. On more than one occasion, we have noticed that identity priming appears to be reduced when the prime and target are presented in different-sized fonts, and the target is substantially larger than the prime (Forster, 1994; Forster & Guess, 1996). Humphreys et al. (1987) originally suggested that form-priming might depend on whether the prime and target are perceived as a single event, and Kahneman, Treisman and Gibbs (1992) have recently reported that repetition priming depends on the prime and target being taken as different instances of the same object. This suggests that when prime and target differ in size, they are more likely to be treated initially as two quite different objects, and this reduces the priming effect, even though they consist of the same letters. This effect serves to illustrate that a theory of masked priming cannot be divorced from a broader theory of object perception.