neural network dynamics of cortical inhibition: metacontrast masking

INFORMATION SCIENCES: AN INTERNATIONAL JOURNAL

ELSEVIER Journal of Information Sciences 107 (1998) 287-296

Neural network dynamics of cortical inhibition: Metacontrast masking

Gregory Francis ¹

Purdue University, Department of Psychological Sciences, 1364 Psychological Sciences Building, West Lafayette, IN 47907-1364, USA

Received 1 July 1995; accepted 10 June 1997

Communicated by Jeffrey P. Sutton

Abstract

The dynamic properties of a neural network model of boundary segmentation, called the Boundary Contour System, explain characteristics of metacontrast masking. Computer simulations of the model, with a single set of parameters, demonstrate that it accounts for the key findings of metacontrast masking: strongest masking occurs at positive stimulus onset asynchronies (SOA); masking is weak for negative SOAs; masking effects weaken with spatial separation. The properties of metacontrast masking arise from interactions between positive feedback and lateral inhibition in cortical neural circuits. The model links properties of metacontrast masking with aspects of visual persistence, spatial vision, and cells in visual cortex. © 1998 Elsevier Science Inc. All rights reserved.

Keywords: Vision; Masking; Neural networks

1. Introduction

How does a visual system (whether biological or machine) respond to the rapidly changing inputs generated by a visual environment? Studies of machine vision largely ignore this issue by preprocessing visual inputs. Such preprocessing often involves converting the continuous visual input into a sequence of frames. The onset and offset of each frame is identified and the vision system is often reset between frames to allow unbiased processing of each image. While this approach may be successful at recognition and identification tasks, it is severely limited by the need for some other system (the researcher) to identify the onset and offset of each frame. A more powerful visual system would respond in real time to a continuous flow of visual inputs.

Tel.: +1 765 494-6934; fax: +1 765 496-1264; e-mail: [email protected]; www: http://www.psych.purdue.edu.

0020-0255/98/$19.00 © 1998 Elsevier Science Inc. All rights reserved. PII: S0020-0255(97)10052-4

Biological visual systems are able to handle continuous visual inputs, and suggest mechanisms for computer vision. Psychophysical studies have explored the properties of visual perception and reveal that the mechanisms of perception are quite different from those suggested by computer vision. This paper considers the implications of a particular class of psychophysical findings called metacontrast masking. The results of these studies provide important information about how continuous visual information is processed in biological visual pathways. This paper also identifies a neural network model of visual pathways that explains the key properties of metacontrast masking. This network links the properties of dynamic visual perception to properties of spatial vision.

Psychophysicists use masking displays to investigate the time course of processing. When a target is briefly flashed, the subject's visual system starts to process information about the target. If the display does not include the mask, the target is easily perceived, but if a masking stimulus is flashed near the target, it may interfere with the percept being generated by the target. By varying the onsets, offsets, and spatial properties of the target and the mask, such displays can identify what parts of the target have been processed when the mask arrives.

Fig. 1 schematizes a visual display often used for these types of studies. The target is a vertical bar and the mask is a pair of flanking bars. The subject's task may be to identify some property of the target (e.g., brightness, shape, or clarity). In such a display the target is often perceptually weaker (e.g., dimmer) and in some cases subjects fail to perceive the target at all. Of most significance when considering the dynamic responses of the visual system, the strongest effects occur when the mask follows the target by a non-zero blank interval. When the target and mask are presented simultaneously, very little masking occurs.

These results challenge many theories of visual perception. Like theories of computer vision, many models of biological systems assume that processing occurs as a sequence of stages. In the context of these models, the masking effects suggest that the mask somehow bypasses, or is processed more quickly by, several stages in order to catch up with the target and interfere with processing. Such an interpretation is problematic and raises questions about the validity of these models.

To anticipate the main results, the model outlined below hypothesizes that accurate percepts of brightness (and other stimulus properties) require persisting signals that code the location and orientation of edges. To compare simulated results to psychophysical data, the simulations assume that the brightness percept is related to the total duration of boundary signals generated by stimulus edges. Through computer simulations, this paper demonstrates that the model accounts for three well-known properties of these types of masking displays:

• Metacontrast: the strongest effect of masking occurs at an intermediate, positive stimulus onset asynchrony (SOA).
• Paracontrast: when the mask precedes the target there is little masking.
• Distance: the strength of metacontrast masking weakens as the target and mask elements are moved further apart.

Psychophysical results demonstrating these properties are shown in Fig. 3(a), and discussed in detail below.

2. BCS architecture

Grossberg and Mingolla [15,16] introduced the Boundary Contour System (BCS) to model how the visual system detects, completes, and regularizes the boundary segmentations of retinal images. The BCS (schematized at a few pixel locations in Fig. 2) detects boundaries through a series of local stages, one global stage, and feedback. The first local stage (schematized as an unoriented annulus in Fig. 2) acts like on-center, off-surround cells at the retinal or LGN level. These signals feed into oriented cells with different contrast polarity and different orientations (schematized as black/white ellipses in Fig. 2) that act like simple cells in area V1. The next stage of the model combines signals from simple cells of the same orientation, but different contrast sensitivity to create model complex cells (as in area V1). These complex cells (schematized as white ellipses in Fig. 2) are sensitive to orientation but insensitive to the direction of luminance contrast. The remaining cells in the network inherit these properties.

Fig. 1. Masking studies compare the perception of a target display when a masking stimulus is spatially and temporally nearby.

[Fig. 2 diagram. Stage labels: unoriented (LGN cells); contrast polarity (simple cells); no contrast polarity (complex cells); first competitive stage (end-stopped cells); second competitive stage (hypercomplex cells); cooperative stage (bipole cells); spatial sharpening (excitatory and inhibitory feedback).]

Fig. 2. Schematic diagram of the Boundary Contour System. Cooperative (solid lines) and competitive (dashed lines with circle terminators) interactions embedded in a feedback network process visual segmentations.
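The pooling of opposite-polarity simple cells into a polarity-insensitive complex cell can be illustrated with a minimal one-dimensional sketch. This is not the BCS implementation: the function names, the neighbor-difference edge detector, and the half-wave rectification are assumptions chosen only for illustration.

```python
import numpy as np

def simple_cell_responses(luminance):
    """Half-wave rectified responses of two opposite-polarity edge detectors.

    Hypothetical 1-D stand-in for oriented simple cells: the signed
    difference between neighboring pixels plays the role of local contrast.
    """
    diff = np.diff(luminance)
    dark_to_light = np.maximum(diff, 0.0)   # responds to luminance increments
    light_to_dark = np.maximum(-diff, 0.0)  # responds to luminance decrements
    return dark_to_light, light_to_dark

def complex_cell_response(luminance):
    """Sum both polarities: edge position is kept, contrast direction is lost."""
    dark_to_light, light_to_dark = simple_cell_responses(luminance)
    return dark_to_light + light_to_dark

bar = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])  # bright bar on a dark background
print(complex_cell_response(bar))        # responds at both edges of the bar
print(complex_cell_response(1.0 - bar))  # dark bar on bright background: same profile
```

In this sketch a bright bar and its contrast-reversed version produce the same complex cell profile, which is the sense in which the model complex cells are insensitive to the direction of luminance contrast.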

The next stage (first competitive stage) of the model involves feedforward excitation to a level of hypercomplex cells of the same orientation and position, and involves inhibition to hypercomplex cells of the same orientation and nearby positions. (The boxes indicate habituating transmitter gates, as described below.) One role of this spatial competition (schematized as the dashed pathways branching out from the vertical complex cell in Fig. 2) is to spatially sharpen the neural responses to an oriented luminance edge. Another role is to initiate the process, called end cutting, whereby boundaries are formed that abut line ends at orientations perpendicular or oblique to the orientation of the line [10,16]. The end cutting mechanism is analogous to the neurophysiological process of endstopping, whereby hypercomplex cell receptive fields are derived from interacting complex cell output signals [20,22,23]. Orban et al. [23] show that the inhibition responsible for creating endstopped cells is spatially dependent and orientation specific, which are also properties of the first competitive stage.
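The spatial sharpening role of the first competitive stage can be sketched as on-center excitation minus inhibition from nearest neighbors of the same orientation. The function name, the nearest-neighbor kernel, and the inhibition weight of 0.5 are assumptions for illustration; the BCS equations themselves are not reproduced here.

```python
import numpy as np

def first_competitive_stage(complex_activity, inhibition=0.5):
    """Excite the same position, inhibit nearby positions, then rectify.

    complex_activity: 1-D array of same-orientation complex cell outputs.
    inhibition: hypothetical weight of each neighbor's inhibitory signal.
    """
    kernel = np.array([inhibition, 0.0, inhibition])  # neighbors only, no self-inhibition
    surround = np.convolve(complex_activity, kernel, mode="same")
    return np.maximum(complex_activity - surround, 0.0)

blurred_edge = np.array([0.0, 0.2, 1.0, 0.2, 0.0])  # spatially smeared edge response
print(first_competitive_stage(blurred_edge))        # flanks suppressed, peak retained
```

With these values the weak flanking responses are driven to zero while the central response survives, so the edge is localized more sharply than in the input.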

The subsequent stage (second competitive stage) involves excitation to a level of hypercomplex cells of the same orientation and position, and involves inhibition to hypercomplex cells of the opposite orientation and same position. This inhibition (schematized as the crossing dashed pathways in Fig. 2) acts to sharpen the orientational responses to a luminance edge. The second competitive stage is hypothesized to exist in either area V1 or V2 of visual cortex.

The next stage of the model contains bipole cells, which receive excitatory and inhibitory inputs from large regions of visual space. Fig. 2 schematizes a horizontal bipole that receives excitation from horizontal hypercomplex cells and inhibition from vertical hypercomplex cells that fall within its receptive field. The existence of bipole cells was predicted in [3] and [9] shortly before their discovery in area V2 of visual cortex [24]. Like the cells reported to exist in area V2, this type of model cell requires substantial excitatory input on each half of its bow-tie shaped receptive field. Bipole cell outputs feed back down to the cells of the second competitive stage, where they excite cells of the same orientation and position and inhibit cells of the same orientation and nearby positions (spatial sharpening). These interactions sharpen the spatial responses of bipole cells.
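The requirement that a bipole cell receive substantial excitation on each half of its receptive field can be sketched as a logical AND over two flanks. This is a simplified illustration, not the model's bipole equation: the function name, the `reach` and `threshold` parameters, and the omission of inhibition from the perpendicular orientation are all assumptions.

```python
import numpy as np

def bipole_response(activity, center, reach=3, threshold=1.0):
    """Respond only when BOTH flanks of `center` carry enough activity.

    activity: 1-D array of same-orientation boundary signals along a contour.
    reach, threshold: hypothetical receptive field half-width and firing criterion.
    """
    left = activity[max(center - reach, 0):center].sum()
    right = activity[center + 1:center + 1 + reach].sum()
    if left >= threshold and right >= threshold:
        return float(left + right)
    return 0.0

contour = np.array([0, 0, 1, 1, 1, 1, 1, 0, 0], dtype=float)
print(bipole_response(contour, center=4))  # 4.0: middle of the contour, both flanks active
print(bipole_response(contour, center=2))  # 0.0: contour end, left flank is empty
```

The second call shows why, in this caricature, a cell at the end of a contour gets no bipole feedback even though one of its flanks is fully active.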

The BCS architecture and its network interactions account for spatial properties of visual perception. Each stage processes information necessary to locate oriented boundaries. Other systems use the boundaries across many locations for object recognition [16], filling in of brightness, color, and depth perception [14,15,18], and motion detection [6,12]. In each case, boundary detection is a critical step toward processing visual images. The feedback loop plays a critical role in completing boundaries. An assumption of the current simulations is that displays that produce longer duration boundary signals will be brighter (more time for filling in).


3. BCS dynamics

Each cell in the BCS has its own local dynamics involving activation by inputs and passive decay (of the order of simulated milliseconds). However, the excitatory feedback loops dominate the temporal aspects of the BCS. When inputs feed into the BCS they trigger reverberatory circuits that are not easily stopped. Indeed, simulations in [8] demonstrate that, if left unchecked, these reverberations can last for hundreds of simulated milliseconds.

At stimulus offset, cell activities within the feedback loop can continue to resonate due solely to the activity already in the loop. However, the spatial structure of the bipole cells' receptive fields limits the persistence of a segmentation. A bipole cell requires excitatory inputs on both sides of its receptive field; thus, when the visual inputs disappear, boundaries in the middle of the segmentation receive strong bipole feedback, but boundaries near the end receive no feedback. At stimulus offset the cell activities coding boundaries at the end of a segmentation passively decay away. This exposes a new cell as the contour end, which stops receiving bipole feedback and passively decays away as well. This erosion process continues from the contour ends to the middle of the contour. Moreover, as more boundary signals drop out of the feedback loop, the loop contains less activity thereby weakening the feedback signals. As a result, erosion speeds up over time, until finally the feedback loop no longer contains enough activity to support itself and the resonance collapses.
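The erosion of a segmentation from its ends toward its middle can be illustrated with a discrete-time toy simulation. It is a sketch under stated assumptions, not the BCS dynamics: a cell counts as supported when both immediate neighbors are still active, unsupported cells decay by a fixed factor per step, and the names and parameter values are hypothetical. The toy does not capture the speed-up of erosion as total loop activity falls.

```python
import numpy as np

def erosion_times(n_cells=5, decay=0.5, alive=0.1, max_steps=200):
    """Return the step at which each boundary cell drops out after stimulus offset.

    A cell keeps its activity only while both neighbors are above `alive`
    (a stand-in for two-sided bipole feedback); otherwise it decays passively.
    """
    act = np.ones(n_cells)
    died_at = np.full(n_cells, -1)
    for step in range(1, max_steps + 1):
        padded = np.concatenate(([0.0], act, [0.0]))       # nothing beyond the contour ends
        supported = (padded[:-2] > alive) & (padded[2:] > alive)
        act = np.where(supported, act, act * decay)         # unsupported cells decay
        newly_dead = (act <= alive) & (died_at < 0)
        died_at[newly_dead] = step
        if (died_at >= 0).all():
            break
    return died_at

print(erosion_times(5).tolist())  # [4, 8, 12, 8, 4]: ends drop out first, middle last
```

Each dropout exposes a new contour end, so the dropout times increase from the ends to the middle of the contour.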

Erosion occurs slowly relative to the duration of a brief stimulus, and if unchecked could lead to undesirably long boundary persistence after stimulus offset and thus image smearing in response to image motion. The problem for the BCS is to accelerate boundary erosion in response to rapidly changing imagery. More generally, the BCS needs to use resonant feedback to maintain a segmentation of unmoving scenic objects, even as it actively resets the segmentations corresponding to rapidly changing scenic objects to control image smearing in a form-sensitive way. Remarkably, the same BCS mechanisms that create resonant boundaries are also used to reset them. Francis et al. [8] identified two mechanisms embedded in the BCS design that reset segmentations. One mechanism, called a gated dipole, produces reset signals at stimulus offset that actively inhibit the persisting segmentation. Habituation of chemical transmitters (boxes in Fig. 2) shifts the balance of activity among competing pathways in the second competitive stage. This shift creates rebounds of activity in nonstimulated pathways at the offset of a stimulus. Within the BCS, these rebounds inhibit the persisting segmentation. Details of the rebound properties are in [8]; for the present analysis a second mechanism, lateral inhibition, is more significant. The following simulations describe its role.
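The rebound produced by habituating transmitter gates can be sketched with two opponent channels that share a tonic input. This is an illustrative caricature rather than the equations of [8]: the transmitter law dz/dt = recover(1 - z) - deplete * S * z, the Euler integration, and all parameter values and names are assumptions.

```python
def gated_dipole(stim_steps=1000, total_steps=1200, dt=0.01,
                 tonic=1.0, phasic=1.0, recover=0.1, deplete=1.0):
    """Return the net (ON minus OFF) gated signal at each time step.

    The ON channel receives tonic + phasic input while the stimulus is on;
    the OFF channel receives only the tonic input. Each signal is multiplied
    by a transmitter level that depletes with use and recovers slowly.
    """
    z_on = z_off = recover / (recover + deplete * tonic)  # resting transmitter level
    net = []
    for step in range(total_steps):
        s_on = tonic + (phasic if step < stim_steps else 0.0)
        s_off = tonic
        net.append(s_on * z_on - s_off * z_off)
        z_on += dt * (recover * (1.0 - z_on) - deplete * s_on * z_on)
        z_off += dt * (recover * (1.0 - z_off) - deplete * s_off * z_off)
    return net

net = gated_dipole()
print(net[500] > 0)    # during the stimulus the ON channel wins
print(net[1000] < 0)   # at offset the depleted ON gate lets the OFF channel rebound
```

In the sketch the sign flip at stimulus offset is the rebound: the stimulated pathway's transmitter is depleted, so the equal tonic input now favors the nonstimulated pathway until the transmitter recovers.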


4. Simulations


All simulations use the same equations and parameters. Moreover, the simulations are consistent with earlier simulations of persistence [4,8], apparent motion [6], cortical afterimages [7], and temporal integration [5].

4.1. Metacontrast

The primary property of metacontrast masking is that the strongest masking effects occur at positive SOAs. The range of SOAs for maximal masking tends to be around 50-100 ms, but it varies across subjects and stimuli. Fig. 3(a) shows the effects of metacontrast masking for one subject in [19]. In this study, subjects observed a target (vertical line) with a mask (flanking vertical lines) at varying SOAs (Δt) and varying edge-to-edge distances. Subjects judged the target's brightness by setting a filter to a standard to produce equivalent brightness percepts; a stronger filter indicates stronger masking. Each curve in Fig. 3(a) shows the masking function for a specific edge-to-edge separation. Within each curve, the strongest masking effect occurs for an SOA of about 90 ms.

The model shows similar nonlinear effects of boundary duration against SOA. While in both short and long SOA simulations the inhibitory signals sent from the mask to the target are the same, activities in the feedback loop are less sensitive to that inhibition for a short SOA because the resonance in the loop is strong. After a long SOA the resonance has weakened and the remaining activities are more sensitive to the inhibition. As a result, increasing the SOA over a limited range, while causing the inhibition to arrive later, decreases the duration of boundary signals, which is the main metacontrast effect. Fig. 3(b) shows simulation results measuring the duration of a target's boundary signals as a function of SOA and spatial separation from the masks. The results are qualitatively similar to the data in Fig. 3(a) reported by [19]. (Note that the y-axis runs in reverse because shorter boundary durations correspond to less time for filling in of brightness and dimmer percepts.)
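The argument above, that the same inhibition removes more of a weakened resonance than of a strong one, can be turned into a small worked example. The sketch is a deliberately crude caricature and not the simulation behind Fig. 3(b): it assumes a resonance that erodes linearly after target offset and an inhibitory loss proportional to (1 - resonance strength); the function name and all parameter values are hypothetical.

```python
def boundary_duration(soa, stim=10.0, persist=100.0, h=1.0):
    """Toy duration (ms) of the target's boundary signals for a given SOA (ms).

    stim: target duration; persist: unmasked persistence after target offset;
    h: strength of the mask's lateral inhibition (same at every SOA).
    """
    unmasked = stim + persist
    if soa <= stim or soa >= unmasked:
        return unmasked                    # resonance at full strength, or already gone
    r = 1.0 - (soa - stim) / persist       # resonance strength when inhibition arrives
    r_after = r - h * (1.0 - r)            # weaker resonance loses more to the same inhibition
    if r_after <= 0.0:
        return soa                         # resonance collapses when the mask arrives
    return soa + r_after * persist         # remaining erosion time

for soa in (0, 35, 60, 85, 120):
    print(soa, boundary_duration(soa))
```

Under these assumptions the duration is shortest at an intermediate SOA (60 ms with the default values) and returns to the unmasked value at both short and long SOAs, giving the U-shaped masking function in qualitative form only.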

4.2. Paracontrast

Another property in Fig. 3(a) is that masking effects are weak for negative SOAs, when the mask onset precedes the target onset. Such masking is called paracontrast masking and indicates a temporal asymmetry in masking effects.

Fig. 3(b) shows that the model has only weak paracontrast masking. This characteristic exists within the model because the feedforward inhibitory interactions are linked to the presence of the masking stimuli. When the mask precedes the target, the inhibition builds in strength and decays before generation of the target boundaries. As a result, paracontrast masking is weak. The paracontrast masking effects observed in Fig. 3(b) are due to decaying inhibitory interactions from the mask to the target.

[Fig. 3 plots. Panel A: psychophysical masking functions, one curve per target-mask separation; x-axis Δt (ms), -80 to 200. Panel B: simulated boundary signal duration, one curve per separation (0.60°, 0.75°, 0.90°, 1.05°); x-axis Δt (ms), -50 to 150; y-axis 120 to 135, reversed.]

Fig. 3. Masking is strongest for intermediate SOAs (Δt). Paracontrast masking is weak. Masking falls off with distance. (a) Psychophysical data from [19] with permission. (b) Masking effects on duration of signals in the model. Shorter durations correspond to stronger masking. Note that the y-axis runs in reverse.
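The paracontrast account, in which inhibition is tied to the mask's presence and has largely decayed by the time the target arrives, can be sketched with the closed-form response of a leaky integrator. The function name, the 10 ms mask duration, and the 20 ms time constant are assumptions for illustration and are not parameters of the model.

```python
import math

def residual_inhibition(lead_ms, mask_ms=10.0, tau_ms=20.0):
    """Inhibition remaining at target onset when the mask onset leads by lead_ms.

    The inhibition charges toward 1 while the mask is on and decays
    exponentially (time constant tau_ms) once the mask is off.
    """
    if lead_ms <= 0.0:
        return 0.0                                   # mask has not started yet
    on_time = min(lead_ms, mask_ms)                  # time the mask drove the inhibition
    built = 1.0 - math.exp(-on_time / tau_ms)        # level reached during the mask
    gap = max(lead_ms - mask_ms, 0.0)                # blank interval before target onset
    return built * math.exp(-gap / tau_ms)

for lead in (10, 20, 50, 100):
    print(lead, round(residual_inhibition(lead), 4))
```

In this sketch the residual inhibition shrinks rapidly as the mask leads the target by longer intervals, which is one simple way to see why only weak paracontrast masking remains.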


4.3. Distance


Finally, the data in Fig. 3(a) from [19] show that increasing the edge-to-edge separation of the target and mask produces weaker metacontrast masking. Masking effects in this study disappear by three degrees of separation.

In the model, as in the data, the effect of metacontrast masking weakens with spatial separation. In the model the strength of lateral inhibition in the first competitive stage weakens with distance. Weaker lateral inhibition results in weaker masking. Fig. 3(b) shows that the effects are qualitatively the same as the data shown in Fig. 3(a).
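The dependence of masking on separation follows from lateral inhibition that weakens with distance. A minimal sketch of such a falloff is given below; the Gaussian shape, the one-degree width, and the function name are assumptions, since the text states only that the inhibition weakens with distance.

```python
import math

def inhibition_strength(separation_deg, peak=1.0, sigma_deg=1.0):
    """Hypothetical Gaussian falloff of lateral inhibition with edge-to-edge separation."""
    return peak * math.exp(-separation_deg ** 2 / (2.0 * sigma_deg ** 2))

for sep in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(sep, round(inhibition_strength(sep), 3))
```

With any monotonically decreasing profile of this kind, larger target-mask separations deliver less inhibition to the target's boundary signals and therefore weaker masking.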

5. Conclusions

The current simulation results use the same model mechanisms as [4-8]. The BCS model unifies diverse psychophysical data on dynamic vision. Beyond the dynamic properties of vision discussed here, the BCS mechanisms have also been shown to be consistent with a large set of spatial characteristics. The dynamic emergent properties used to explain metacontrast masking are consistent with, and depend upon, the BCS's roles in boundary completion [15], texture segregation [16], shape-from-shading [17], brightness perception [18], 3-D vision [11,13,14] and motion processing [6,12], among others.

The theory explains not only how metacontrast masking occurs, but why. The mechanisms in the model that produce metacontrast effects were first proposed to explain properties of spatial perception. Thus, the theory makes strong links between dynamic and spatial vision, links that other theories of metacontrast masking have not made [1,2,21]. Moreover, the theory relates the mechanisms underlying metacontrast masking to known properties of visual cortex, thereby allowing neurophysiological tests of the theory.

References

[1] B. Breitmeyer, Visual Masking: An Integrative Approach, Oxford University Press, New York, 1984.

[2] B. Breitmeyer, L. Ganz, Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing, Psychological Review 83 (1976) 1-36.

[3] M. Cohen, S. Grossberg, Neural dynamics of brightness perception: Features, boundaries, diffusion, and resonance, Perception & Psychophysics 36 (1984) 428-456.

[4] G. Francis, Cortical dynamics of lateral inhibition: Visual persistence and ISI, Perception & Psychophysics 58 (1996) 1103-1109.

[5] G. Francis, Cortical dynamics of visual persistence and temporal integration, Perception & Psychophysics 58 (1996) 1203-1212.

[6] G. Francis, S. Grossberg, Cortical dynamics of form and motion integration: Persistence, apparent motion, and illusory contours, Vision Research 36 (1996) 149-174.


[7] G. Francis, S. Grossberg, Cortical dynamics of boundary segmentation and reset: Persistence of residual traces and orientational after effects, Perception 25 (1996) 543-567.

[8] G. Francis, S. Grossberg, E. Mingolla, Cortical dynamics of feature binding and reset: Control of visual persistence, Vision Research 34 (1994) 1089-1104.

[9] S. Grossberg, Outline of a theory of brightness, color, and form perception, in: E. Degreef, J. van Buggenhaut (Eds.), Trends in Mathematical Psychology, Elsevier, Amsterdam, 1984, pp. 59-86.

[10] S. Grossberg, Cortical dynamics of three-dimensional form, color, and brightness perception I: Monocular theory, Perception & Psychophysics 41 (1987) 97-116.

[11] S. Grossberg, Cortical dynamics of three-dimensional form, color, and brightness perception II: Binocular theory, Perception & Psychophysics 41 (1987) 117-158.

[12] S. Grossberg, Why do parallel cortical systems exist for the perception of static form and moving form?, Perception & Psychophysics 49 (1991) 117-141.

[13] S. Grossberg, A solution of the figure-ground problem for biological vision, Neural Networks 6 (1993) 463-484.

[14] S. Grossberg, 3-D vision and figure-ground separation by visual cortex, Perception & Psychophysics 55 (1994) 48-120.

[15] S. Grossberg, E. Mingolla, Neural dynamics of form perception: Boundary completion, illusory figures, and neon color spreading, Psychological Review 92 (1985) 173-211.

[16] S. Grossberg, E. Mingolla, Neural dynamics of perceptual grouping: Textures, boundaries, and emergent segmentations, Perception & Psychophysics 38 (1985) 141-171.

[17] S. Grossberg, E. Mingolla, Neural dynamics of surface perception: Boundary webs, illuminants, and shape-from-shading, Computer Vision, Graphics, & Image Processing 37 (1987) 116-165.

[18] S. Grossberg, D. Todorović, Neural dynamics of 1-D and 2-D brightness perception: A unified model of classical and recent phenomena, Perception & Psychophysics 43 (1988) 241-277.

[19] R. Growney, N. Weisstein, S. Cox, Metacontrast as a function of spatial separation with narrow line targets and masks, Vision Research 17 (1977) 1201-1205.

[20] D. Hubel, T. Wiesel, Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat, Journal of Neurophysiology 28 (1965) 229-289.

[21] H. Öğmen, A neural theory of retino-cortical dynamics, Neural Networks 6 (1993) 245-273.

[22] G. Orban, H. Kato, P. Bishop, End-zone region in receptive fields of hypercomplex and other striate neurons in the cat, Journal of Neurophysiology 42 (1979) 818-832.

[23] G. Orban, H. Kato, P. Bishop, Dimensions and properties of end-zone inhibitory areas in receptive fields of hypercomplex cells in cat striate cortex, Journal of Neurophysiology 42 (1979) 833-849.

[24] R. von der Heydt, E. Peterhans, G. Baumgartner, Illusory contours and cortical neuron responses, Science 224 (1984) 1260-1262.
