
Is perception continuous with cognition?

The Cognitive Impenetrability of Vision

Read Seeing & Visualizing Chapter 2 or the BBS article on my web site: ruccs.rutgers.edu/faculty/pylyshyn.html

The accepted answer goes along with intellectual (and political) fashions

The zeitgeist in the second half of the 20th century was one of populist values which emphasized egalitarianism and the limitless possibility of the human mind. In keeping with this spirit, many scholars mistakenly repudiated innateness and, emphasizing plasticity, embraced learning as the defining character of human nature.

It was in this era that the belief arose that everything we perceived or thought was dependent on our cultural or linguistic context. Hence the popularity of Whorf and Sapir, of Bruner’s New Look in perception, as well as deconstruction movements in European (and, of course, Californian) thought.

The cultural ethos gave strong support to the view that the mind is highly plastic and at the mercy of the environment

Through the last half of the 1900s both the general public and the social science community assumed that perception and cognition were continuous – that you could not distinguish between the two.

Bruner’s New Look in Perception: Perception modeled on Science – hypothesis formation, verification & modification

Bruner, J. S. (1957). On perceptual readiness. Psychological Review, 64, 123-152.

Effect on Philosophy of Science: Feyerabend, Hanson, Kuhn (These philosophers assumed that there is no “innocent eye” so all observations are “theory-laden”. Quine’s “The myth of the given”)

What are some reasons for thinking that vision is cognitively penetrable (and not modular)?

● Expectancy and the perception of patterns
   Perception in noise of words vs nonwords, sentences, statistical properties of sequences
   Assimilation of perception to the norms (Postman)
   Explanation in terms of framing hypotheses (Potter)
   Perceptual learning and expertise (bird watchers, wine tasters…)
   Apparent motion and “problem solving” in vision (Rock)
   The effect of hints and foreknowledge on closure (fragmented) figures or stereograms
● Neuroscience evidence for top-down effects
● The experience in computer vision (Heterarchy – Shirai)
● Why should we doubt the continuity thesis?
   Illusions…
   Methodological concerns (signal detection theory)

Some reasons to assume that vision is cognitively penetrable

Informal
   Magic tricks
   The proofreader’s problem

Speech perception
   Phonetic monitoring
   Cross modal effect on hearing (listening while reading)
   Phoneme restoration effect

Most people believe that when we read we make predictions about the next word, which is why we often substitute the wrong word for an unexpected word. It is also why the predictability index (the Cloze Score) is a useful measure of readability, widely adopted in the past by newspapers and language teachers.

If you think that mental imagery involves (uses) visual processes, it is important to know which aspects, stages, or visual functions it uses. If it involves reasoning and inferences based on what it sees then it’s not an interesting thesis, since mental imagery is assumed to be a modality of thought. In that case the thesis turns out to be the claim that imagery shares cortical resources with thought, which nobody doubted. If, on the other hand, visual perception is used to interpret mental images then this is a strong thesis, since it suggests that images themselves are (or can be) seen.

Is a mental image something that is seen (with the mind’s eye)?

Seeing Mental Images

• The central tenet of the Picture Theory of mental imagery is that mental images constitute the same kind of spatial representations as are found on, say, the retina or primary visual cortex.

• If, on the other hand, a mental image is an already-interpreted form of representation, this would support the thesis I am putting forward: that images are only phenomenologically picture-like while in their function (i.e., in their causal role) they are conceptual and therefore symbolic descriptions.

• This not only makes a big difference in one’s view about the mind but also has repercussions for the treatment of reports of conscious experience in scientific theories.

Is vision like science itself?

• Does vision involve hypothesis formation and testing, as was once believed to be the method used in science?
   Potter and Bruner’s hypothesis testing experiment

• If the answer is YES then the question whether mental imagery uses vision becomes circular (or empty) since clearly vision serves thought by providing new information.

• So the question whether there are general top-down (hypothesis-proposing or hypothesis-tendering) effects becomes central.

Familiarity and the reconstruction of partially hidden patterns

Familiarity and the reconstruction of partially hidden sounds

Signal detection theory helps to isolate stages in information processing

VERNALIT INTERVAL TRLAVNEI

Meaning and perception of phonemes

The ‘phoneme restoration effect’

1. The pitcher’s thoughts about the dangerous batt▒ made him nervous

2. The soldier’s thoughts about the dangerous batt▒ made him nervous

How to get someone to see fragmented figures?

Autostereogram or Magic Eye® figures

Apparent motion is a function of early vision, yet is subject to various “intelligent” interpretations

In the “Ternus Configuration” short time delays result in “single element motion” (the middle object persists as the “same object” so it does not appear to move)

Long time delays result in “group motion” because the middle object does not persist but is perceived as a new object each time it reappears

But long delays, when the disappearance appears to be due to occlusion by an opaque surface, maintain objecthood, and therefore behave like short delays

Apparent “problem solving” in vision (Rock)

[Figure: displays at Time 1 and Time 2 for Condition A and Condition B]

Does the circle appear to move or are there two circles being covered?

Another example from Irvin Rock

Because the rectangle is seen as transparent, the circle is seen to move in apparent motion.

Here the rectangle is seen as opaque so it ‘explains’ the circle’s visible-invisible cycle. Therefore the circle is not seen to move.

► Neuroscience evidence for top-down effects

Centrifugal neural pathways

There are almost as many outward (efferent) nerve fibers as inward fibers

There is evidence of top-down control of sensors and top-down effects on percepts (e.g., filling-in effect for blind spot and other scotomas)

Early attentional gating of a cat’s auditory signals

Hernandez-Péon, R., Scherrer, R. H., & Jouvet, M. (1956). Modification of electrical activity in the cochlear nucleus during "attention" in unanesthetized cats. Science, 123, 331-332.

► The experience in computer vision

Early experience in computational perception (and AI) suggested that knowledge-based perception leads to better performance

Minsky & Papert’s “Heterarchy, not hierarchy”, or the knowledge-based approach to perception; Shirai’s success in building an edge detector and other model-based vision systems.

Riseman & Hanson (1987): “It appears that human vision is fundamentally organized to exploit the use of contextual knowledge and expectations in the organization of spatial primitives… Thus the inclusion of knowledge-driven processes at some level in the image interpretation task, where there is still a great degree of ambiguity in the organization of the visual primitives, appears inevitable” (p. 286).

Common Blackboard Communication Area

Acoustic/optical expert

Phonetic/edge-finding expert

Phone-pattern/contour expert

Word/2D shape expert

Syntax/3D shape expert

Semantic/object-recognition expert

The Blackboard Architecture used in many AI applications is highly non-modular because all parts can communicate with one another.

e.g. ‘Hearsay’ speech recognition system
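The non-modular character of a blackboard can be made concrete with a toy sketch. This is only an illustration under stated assumptions: the expert functions, the dictionary keys, and the control loop are all hypothetical and are not taken from Hearsay or any real system. The point it shows is that a high-level expert (semantics) can read context posted anywhere on the shared area and so settle what a lower level left open.

```python
def run_blackboard(blackboard, experts):
    """Let every expert inspect the shared blackboard until none can add more."""
    changed = True
    while changed:
        changed = False
        for expert in experts:
            if expert(blackboard):  # an expert returns True if it posted something
                changed = True
    return blackboard


def phonetic_expert(bb):
    # Hypothetical low-level expert: posts phones, leaving the noisy one open.
    if "signal" in bb and "phones" not in bb:
        bb["phones"] = ["b", "a", "t", "?"]
        return True
    return False


def word_expert(bb):
    # Hypothetical mid-level expert: posts every word compatible with the phones.
    if "phones" in bb and "word_candidates" not in bb:
        bb["word_candidates"] = ["batter", "battle"]
        return True
    return False


def semantic_expert(bb):
    # Hypothetical high-level expert: uses context to pick among candidates.
    if "word_candidates" in bb and "context" in bb and "word" not in bb:
        bb["word"] = "batter" if bb["context"] == "baseball" else "battle"
        return True
    return False


# Expert order is deliberately "top-down first"; the loop makes order irrelevant.
experts = [semantic_expert, word_expert, phonetic_expert]
pitcher = run_blackboard({"signal": "...", "context": "baseball"}, experts)
soldier = run_blackboard({"signal": "...", "context": "war"}, experts)
```

Because every expert sees the whole blackboard, nothing in this design prevents semantic context from reaching the earliest stage, which is exactly the property the modular view denies for early vision.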

Pandemonium

An early architecture, similar to the blackboard architecture, was proposed by Selfridge in 1959. This idea continues to be at the heart of many psychological models, including ones implemented as neural net (or connectionist) models.

On the other hand…

From a functional perspective it makes sense that the earliest stage of vision should be built to be fast and very often (though not necessarily always) veridical.

The parable of the blind clockmaker and Simon’s “partially decomposable systems”

The influence of David Marr’s Principle of least commitment: Do not do something that you may later wish to undo (e.g., depth first search)

The beginnings of a modular view in computer vision

David Marr (1982):

“The principle of least commitment… requires not doing something that may later have to be undone, and I believe that it applies to all situations in which performance is fluent. It states that algorithms that are constructed according to a hypothesize-and-test strategy should be avoided because there is probably a better method.”

There is a lot of prima facie evidence that vision works independently of what we believe and what we expect.

In order to explain why that seems to be so we need to distinguish between the part of vision that is unique to vision and the part that is shared by all intellectual processes. The unique part is called Early Vision.

What evidence is there that vision is a modular process?

Irvin Rock produced a lot of the evidence that is cited in support of the view that perception involves “Inference” and “Problem Solving” in order to account for the visual input, but he also says:

“The major difference between perception and thought is that perception is based on a rather narrow range of internalized knowledge, as far as inference and problem solving are concerned… Perception must rigidly adhere to the appropriate internalized rules, so that it often seems unintelligent and inflexible in its imperviousness to other forms of knowledge (p 340)”.

From: Rock, I. (1983). The Logic of Perception. Cambridge, Mass.: MIT Press.

Illusions don’t depend on what you believe!

A B C

Do hints speed up closure of fragmented figures?

According to Reynolds (1985) the only thing that makes a difference to ease of closure is knowing that there is a sensible reading of the fragmented figure. Knowing the name of the figure does not help.

Meaning and difficult percepts What helps us see them?

Fragmented figures; autostereograms (“magic eye”); random dot stereograms
   Category hints?
   Description?
   Model of what you should see?
   Knowing where to look/attend?
   Knowing they are ambiguous?
   Having seen them once before?

Saye, A., & Frisby, J. P. (1975). The role of monocularly conspicuous features in facilitating stereopsis from random-dot stereograms. Perception, 4(2), 159-171.

Frisby, J. P., & Clatworthy, J. L. (1975). Learning to see complex random-dot stereograms. Perception, 4(2), 173-178.

Amodal completion is automatic and non-inferential (not rational)

What would the figures look like if the black square occluders were removed? Try to see through them.

Independence of feature detection and object (pattern) perception

Evidence from brain damage: a visual agnosic with spared nonvisual pattern recognition [Humphreys & Riddoch, 1987. To see but not to see: a case study of visual agnosia]
   This patient had severe agnosia and could not visually recognize familiar things (including his wife’s face) or discriminate shapes.
   But he had normal eye movements and sensory abilities (including stereo and motion detection)
   He could see local features and, with enough time and effort, could often infer the identity of the object (just as the New Look suggests)
   He could describe and draw objects from memory and could recognize objects by touch, so his pattern memory was normal
   It “supports the view that the perceptual representation used in this matching process can be ‘driven’ solely by stimulus information, so that it is unaffected by contextual knowledge.” (H&R, p. 104)

Signal Detection Paradigm

Response                Signal present            Signal absent
Yes, I see it           Correct detection         False positive
No, I don’t see it      False negative (miss)     Correct rejection

If correct detection improves without increase in false positives it’s an increase in sensitivity. If it improves but so does the false positive rate it suggests an increase in bias towards acceptance.
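The distinction between sensitivity and bias can be sketched numerically. The following is a minimal illustration using the standard equal-variance Gaussian measures d′ (sensitivity) and c (criterion); the trial counts are hypothetical and the 0.5 correction added to each cell is an assumption of this sketch, included only so that rates of 0 or 1 do not break the z-transform.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one yes/no detection table."""
    # Log-linear correction (an assumption of this sketch): adding 0.5 to each
    # cell keeps the rates away from 0 and 1, where the z-transform is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # bias; lower means more "yes"
    return d_prime, criterion


# Hypothetical counts: 100 signal trials and 100 noise trials per condition.
baseline = sdt_measures(60, 40, 20, 80)
more_sensitive = sdt_measures(80, 20, 20, 80)  # more hits, same false positives
more_liberal = sdt_measures(80, 20, 40, 60)    # more hits and more false positives
```

With these made-up counts, the second condition has a larger d′ than the baseline, whereas the third has the same d′ as the baseline and only a lower criterion: the observer says "yes" more often without seeing any better.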

Examples from the language module

Phoneme restoration effect appears to be a response bias effect

Lexical ambiguity appears to be resolved after a period of time, before which all options are available.

Meaning and perception of phonemes

1. The pitcher’s thoughts about the dangerous batter made him nervous
2. The pitcher’s thoughts about the dangerous battle made him nervous
3. The pitcher’s thoughts about the dangerous batt▒ made him nervous
4. The soldier’s thoughts about the dangerous batter made him nervous
5. The soldier’s thoughts about the dangerous battle made him nervous
6. The soldier’s thoughts about the dangerous batt▒ made him nervous

Signal detection analysis of responses shows that the effect is connected to the response selection stage*

*Samuel, A. G. (1981). Phonemic restoration: Insights from a new methodology. Journal of Experimental Psychology: General, 110(4), 474-494.

Maybe cognition has a post-perceptual selection function?

Swinney study of resolution of lexical ambiguity

Context: None
   Ambiguous: The man was not surprised when he found several bugs▲ in the corner of his room.
   Unambiguous: The man was not surprised when he found several insects▲ in the corner of his room.

Context: Biased-insect
   Ambiguous: The man was not surprised when he found several spiders, roaches, and other bugs▲ in the corner of his room.
   Unambiguous: The man was not surprised when he found several spiders, roaches, and other insects▲ in the corner of his room.

Context: Biased-spying
   Ambiguous: The man was not surprised when he found several microphones, recorders, cameras, and other bugs▲ in the corner of his room.
   Unambiguous: The man was not surprised when he found several microphones, recorders, cameras, and other illegal devices▲ in the corner of his room.

Visual reading task at ▲: Ant (related), Spy (inappropriate), bag (unrelated)

Swinney ambiguous-word priming experiment

No context:
Rumor had it that for years the government building had been plagued with problems. The man was not surprised when he found several bugs▲ in the corner of his room.

Context biased to insects:
Rumor had it that for years the government building had been plagued with problems. The man was not surprised when he found several spiders, roaches, and other bugs▲ in the corner of his room.

Context biased to spying:
Rumor had it that for years the government building had been plagued with problems. The man was not surprised when he found several microphones, recorders, cameras, and other bugs▲ in the corner of his room.

At points marked with ▲ a word was presented visually and subjects had to decide as fast as possible whether it was a word (a lexical decision task, where half of the time it was not a word). Examples of these words are ant, spy or sew. Nonwords were formed from the same letters: tna, ysp, swe.

Swinney ambiguous priming experiment

Swinney found that both senses of the ambiguous word (e.g., “bug”) primed the decision task, so both spy and ant were primed relative to the neutral word (sew).

The priming effect for the inappropriate sense of the word disappeared after about 0.7 to 1.0 seconds. After this only words related to the appropriate sense were primed.

Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning & Verbal Behavior, 18(6), 645-659.

Visual Detection of Anomalous objects

Biederman

Alteration of perception with practice: The case of expert perceivers

Visual expertise often arises from learning what to attend to (pre-visual) as well as which patterns are diagnostic (post-visual).

Shiffrar & Biederman study of expert chicken sexers
Perception & recall of board positions by chess masters
Studies of athletes’ perception

Other fluent perceptions once thought to require access to knowledge

Noninvertibility of 3D-to-2D mapping

There is an important difference between constraints on visual interpretation being built into the architecture through evolution and the use of knowledge about the likelihood of particular scene contents in the visual interpretation of a particular scene.

David Marr and Natural Constraints
   Natural constraints involve only optical-geometrical properties – not physical constraints or statistical properties

Every perspective projection of edges is infinitely ambiguous, yet is almost always perceived univocally

Any set of 2D edges could have arisen from an unlimited number of 3D configurations. What makes the perception unique is the ‘assumption’ that certain configurations are non-accidental – i.e., they would not change with a small change in perspective.

Label propagation as an illustration of a natural constraint

Four types of edge labels (convex, concave, 2 boundaries)

These are the only valid junction labels for a world of polyhedra

They are the only physically possible vertices formed by the 4 edge labels

[Figure: valid labelings for the Arrow, T, Fork, and L junction types]

Three distinct types of trihedral junctions are recognized as part of the labeling scheme (referred to as Y, T and Arrow junction), making a total of 192 trihedral junction labels and 16 dihedral junction labels. But most of these label combinations cannot occur in the physical world. … The complete Waltz set includes over 50 line labels. These can generate over 300 million logically possible junctions, of which only 1790 are physically possible (see Waltz, 1975).


Illustration of how to assign possible labels to a figure using the label-consistency constraint

Start with junction A, followed by B, C, and D. The Arrows placed at A limit the choices for L’s at B, which in turn limit the choices for Arrows at C. At C automatic neighbor reexamination has an effect, eliminating all but one label at B and A. Finally, the C boundary label limits the Fork choices at D to the one shown.
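The elimination procedure described above can be sketched as a small constraint-propagation loop. This is an illustration under loud assumptions: the junction names, edge names, and candidate labelings below are a hypothetical toy catalogue, not the Huffman-Clowes or Waltz junction sets, and the sketch treats a shared edge as consistent when both junctions give it the same label. It enforces only pairwise (neighbor-to-neighbor) consistency, so in general it may leave more than one labeling standing.

```python
def waltz_filter(candidates):
    """Prune junction labelings that no neighboring junction can agree with.

    candidates maps a junction name to a list of labelings; each labeling maps
    the edges meeting at that junction to an edge label such as "+" or "-".
    """
    domains = {j: list(opts) for j, opts in candidates.items()}

    def supported(junction, labeling):
        for edge, label in labeling.items():
            for other, opts in domains.items():
                if other == junction:
                    continue
                # A junction touches an edge if its catalogue mentions that edge.
                touches = any(edge in o for o in candidates[other])
                if touches and not any(o.get(edge) == label for o in opts):
                    return False
        return True

    changed = True
    while changed:  # re-examine neighbors until nothing more can be removed
        changed = False
        for junction in domains:
            kept = [lab for lab in domains[junction] if supported(junction, lab)]
            if len(kept) != len(domains[junction]):
                domains[junction] = kept
                changed = True
    return domains


# Hypothetical chain of junctions A-B-C-D joined by edges ab, bc, cd.
toy = {
    "A": [{"ab": "+"}],
    "B": [{"ab": "+", "bc": "-"}, {"ab": "-", "bc": "+"}],
    "C": [{"bc": "-", "cd": "+"}, {"bc": "+", "cd": "-"}],
    "D": [{"cd": "+"}, {"cd": "-"}],
}
result = waltz_filter(toy)
```

In this toy chain the single choice at A removes one option at B, which removes one at C, which removes one at D, so the constraint travels across the figure without any knowledge about what the scene depicts.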

Many natural constraints take the form of label-propagation. By increasing the number of different label types, we increase the possible constraints in interpretation. By adding shadow edges and ‘cracks’ we end up with a unique labeling of figures like this

[Figure: line drawing with a unique labeling, using the edge labels +, -, C (crack) and S (shadow)]

Waltz, D. (1975). Understanding Line Drawings of Scenes with Shadows. In P. H. Winston (Ed.), The Psychology of Computer Vision (pp. 19-91). New York: McGraw-Hill.

The label consistency requirement can also explain why some figures are ambiguous… in this case because it has two globally consistent sets of labels

[Figure: ambiguous line drawing shown with its two consistent labelings, using + and - edge labels]

Off-retinal info different from foveal info

Labels propagate over picture

Anorthoscope: Scan view How many contours are there?

Anorthoscope: Scan view What is the shape?

Perceiving a real figure as an impossible one!

Because the visual system applies Natural Constraints blindly, it can be tricked in reverse – to see an impossible figure knowing that it is real – as this illusory figure constructed by Richard Gregory shows.

Amodal completion seems to defy rational cues; vision has a logic of its own

What would this figure look like if the squares were removed?

Is there very short iconic storage in the preconceptual stage?

● Although the idea of pictorial long-term memory is not supported, there is some provisional evidence that sensory information outlasts the duration of the stimulus. Many people have studied these “sensory buffers” including George Sperling and Michael Posner.

Sperling’s partial report method for showing an iconic memory

Posner’s demonstration of short duration shape information

[Figure: matching conditions, marked Fast, Fast, Slower, Slower]

Some special cases: Natural Constraints or cognitive penetration into early vision?

Solving the correspondence problem
The Constancies (Muller-Lyer, lightness demo)
Direction of light built in
Concavity-convexity of very familiar stimuli – e.g., face recognition

Illusions of size and motion

Cavanagh_SlowGPSpokeS.mov

Color Motion

Here are two sets of rotating spokes. On the left, the relative luminance of the two colours varies slowly over a range that might include equiluminance (more likely on a CRT than a LCD monitor). If it does, the coloured spokes will appear to slow down briefly as the ratio moves through equiluminance, silencing the luminance response. To help judge the slowing, light and dark spokes are shown in the centre, moving at the same rate as the colour spokes. The slowing effect should grow as you watch through several cycles. To observe the slowing, you have to fixate the centre of the bull's-eye. If you look at the spokes, you can of course track them and see their actual speed.

Which is lighter: Square A or Square B?

Apparent Motion – another Natural Constraint (which of these two matches will vision choose?)

Dawson, M., & Pylyshyn, Z. W. (1988). Natural constraints in apparent motion. In Z. W. Pylyshyn (Ed.), Computational Processes in Human Vision: An interdisciplinary perspective (pp. 99-120). Stamford, CT: Ablex Publishing.

Whether these figures are seen as dimples or mounds depends on whether they are viewed in this orientation or upside-down (relative to the head)

Mountain or crater?

The Ames distorting room

Some special cases of modules

Micro-modules in vision (color & motion & contour)

Evidence for central (cognitive) architectural modules is rarer – the best example being the Theory of Mind Module (ToMM) – Leslie, 1994.

Is there a visual-motor “module”? Milner & Goodale; patient DF. Ventral-dorsal pathways.
   Postural, grasping, eye-movement responses are not subject to illusions that influence conscious vision and sometimes vice-versa.

Micro-modules?

Color, contour, motion and shadow; other information-flow constraints

The visual system consists of minimodules that are restricted in their intercommunication

[Figure: diagram relating stimulus attributes (Luminance, Motion, Binocular Disparity, Color, Texture) to Edge Polarity, Shape, and derived attributes (Shading, Occlusion, Surfaces)]

Not all contours are alike: Equiluminous boundaries are not interpreted as shadows

Equiluminous boundaries also don’t yield a depth percept

Where does this leave us?

● There is a large part of vision, called early vision (after Marr), that is impervious to direct cognitive influences
   It can be affected by cognitively-determined selection at one of two loci: prior to the vision module, where attention may select individual objects, and after the vision module, where cognition may select from among possible interpretations

● Vision seems to take into account many general world-properties (including direction of light), but it nonetheless remains “ignorant” and unable to utilize relevant knowledge about a particular scene that the organism has.

● The built-in natural constraints appear to be highly restricted and are mostly confined to geometrical-optical properties and not to physical laws.

The Pulfrich Pendulum illusion shows that a basic physical principle, such as the impenetrability of solid objects, is not part of the built-in constraints on visual interpretation. Here the visual system readily yields a physically impossible interpretation.

Leslie, A. M. (1988). The necessity of illusion: Perception and thought in infancy. In L. Weiskrantz (Ed.), Thought Without Language. Oxford: Oxford Science.

Modularity is a common principle in nature (Simon’s ‘Partially Decomposable Systems’)

The parable of the blind watchmaker (Simon)

Perception may also have submodules. Vision appears to have submodules for color, motion, contour, stereopsis… and there is even evidence for modules such as face recognition

Evidence for central (cognitive) architectural modules is rare – the best example being the Theory of Mind Module (ToMM) (Leslie, 1994).

How does this connect with classical issues in the philosophy of mind?

Early vision seems to include more than has been assumed to be part of the sensorium (sensory transduction) since it involves complex inference-like processes. However, it does not appear to permit contact with information in memory so it does not allow recognition of objects in the visual field as particular known objects.

The question of whether representations at this level are conceptual has not been raised. I believe the answer may be tied to other distinctions that we have not yet discussed – in particular to the distinction between personal and subpersonal representations, and architectural vs intentional or representation-governed processes.

The issue of accessibility to consciousness has not been raised, even though it is central to some views of sensation (or sentience). I will come back to this question later.

Humphreys and Riddoch study of a case of visual agnosia

A remarkable case of classical visual agnosia is described in a book by Glyn Humphreys and Jane Riddoch (Humphreys & Riddoch, 1987). … the patient was unable to recognize familiar objects, including faces of people well-known to him (e.g., his wife), and found it difficult to discriminate among simple shapes, despite the fact that he did not exhibit any intellectual deficit. As is typical in visual agnosias, this patient showed no purely sensory deficits, showed normal eye movement patterns, and appeared to have close to normal stereoscopic depth and motion perception. Despite the severity of his visual impairment, the patient could do many other visual and object-recognition tasks. For example, even though he could not recognize an object in its entirety, he could recognize its features and could describe and even draw the object quite well – either when it was in view or from memory. Because he recognized the component features, he often could figure out what the object was by a process of deliberate problem-solving, much as the continuity theory claims occurs in normal perception, except that for this patient it was a painstakingly slow process. From the fact that he could describe and copy objects from memory, and could recognize objects quite well by touch, it appears that there was no deficit in his memory for shape. These deficits seem to point to a dissociation between the ability to recognize an object (from different sources of information) and the ability to compute an integrated pattern from visual inputs which can serve as the basis for recognition.

As Humphreys & Riddoch (1987, p. 104) put it, this patient’s pattern of deficits “supports the view that ‘perceptual’ and ‘recognition’ processes are separable, because his stored knowledge required for recognition is intact” and that inasmuch as recognition involves a process of somehow matching perceptual information against stored memories, then his case also “supports the view that the perceptual representation used in this matching process can be ‘driven’ solely by stimulus information, so that it is unaffected by contextual knowledge.”

It appears that in this patient the earliest stages in perception – those involving computing contours and simple shape features – are spared. So also is the ability to look up shape information in memory in order to recognize objects. What then is damaged? It appears that an intermediate stage of “integration” of visual features fails to function as it should. The pattern of dissociation shows that the intact capacity to extract features, together with the capacity to recognize objects from shape information, is insufficient for visual recognition so long as the unique visual capacity for integration is absent. But “integration”, according to the New Look (or Helmholtzian) view of perception, comes down to no more than making inferences from the basic shape features – a capacity that appears to be spared.

Independence of detection and category selection bias

Signal Detection Theory shows that expectations based on statistical frequency-of-occurrence are response bias effects, and therefore post-perceptual

Arthur Samuel’s study of phoneme restoration

Visual expertise often arises from learning where to attend (pre-visual) and which patterns to retain because they are diagnostic (post-visual).

Shiffrar & Biederman study of expert chicken sexers
Studies of athletes’ perception (and Chess masters’ memory)
