TRANSCRIPT
Cogs 202 (SP16): Cognitive Science Foundations
Computational Modeling of Cognition
Prof. Angela Yu
Department of Cognitive Science, UCSD
https://thiscourse.com/ucsd/cogs202/sp16
Today
Self-introductions
Logistics
Introduction to cognitive modeling
Self-Introductions
• Name
• Department, year/status
• One scientific fact you find fascinating
• What you want to get out of this class
Grading
• 30% participation
✦ reading (please read assigned papers before class)
✦ in-class discussion, online discussion
• 25% discussion leading
• 20% homework
• 25% final project
• No laptop, tablet, or cell phone in class, unless you need it for presentation (get a notebook)
• State your name before you speak
Course schedule
• 03/28: Introduction
• 04/04: Signal-detection theory & Bayesian modeling (Omar, Reina, Shuai, Mehul, Gautam)
• 04/11: Multisensory integration & exploration vs. exploitation (Kevin, Tricia, Nikhil, Saurabh); HW1 (Mehul, Omar, Shuai, Reina)
• 04/18: Hidden Markov model & sequential effects (Mehul, Shuai, Gabriela, Erin); HW2 (Jarrett, Saurabh, Tricia, Kevin)
• 04/25: Uncertainty and neuromodulation (Siyang, Kevin, Jarrett, Gautam); HW3 (Nikhil, Siyang, Bjornar, Michael)
• 05/02: Reinforcement learning & dopamine (Michael, Jarrett, Bjornar, Omar); HW4 (Gautam, Erin, Gabriela)
• 05/09: Decision-making and parietal cortex (Gabriela, Saurabh, Tricia, Reina, Siyang)
• 05/16: Sensorimotor integration and movement planning (Erin, Nikhil, Michael, Bjornar)
• 05/23: Final project presentations
Discussion Leading
• (30 min: discussion of homework)
• 90 min: presentation (2 parts)
• 30 min: discussion, Q&A
• Aim to finish by 3:30 PM
• Slides draft 1 due by noon the prior Thursday
• Slides draft 2 due by noon Friday
• Slides final version due Monday morning
Q&A
• I don’t want to ask a question because I’ll sound stupid.
Chances are others have the same question; even if they don’t, this is your chance to learn something.
• I feel totally lost and can’t follow the discussion.
Ask a question!
• I don’t learn much from others explaining HW problems.
Do the HW problems on your own beforehand!
• I wanted to learn about X, or more about Y.
This is an intro (not foundations) course, with just enough depth/breadth to give you a taste; ask/look for refs.
• How do I apply modeling to my own research?
Typically it takes a whole PhD to learn this; the final project is an opportunity, but perhaps on someone else’s research topic.
• Why do we read so many papers by Dr. Yu?
Because you need access to data for final projects.
Straw Poll of Final Projects
What is cognitive modeling and why do it?
• Why do we study cognitive science at all?
✦ To understand how the mind works
✴ How we process information and act on it
✴ How we learn and generalize, and create new ideas
✴ How we think, reason, feel, and make decisions
✦ To make predictions of how people & animals behave in new situations
✦ To understand pathology in cognition
✦ To build intelligent artificial systems and agents
“Verbally expressed statements are sometimes flawed by internal inconsistencies, logical contradictions, theoretical weaknesses and gaps. A running computational model, on the other hand, can be considered as a sufficiency proof of the internal coherence and completeness of the ideas it is based upon...” (Fum, Del Missier, & Stocco, 2007)
What is cognitive modeling and why do it?
• It’s possible to study the brain without modeling
• But discovering facts is only the beginning
What is cognitive modeling and why do it?
• Facts ≠ understanding, description ≠ understanding
• Our goal is to make the book shorter!
[Figure: page count of Principles of Neural Science (Kandel, Schwartz, & Jessell) across editions, plotted by year of publication]
What is cognitive modeling and why do it?
• The description is long because the system is complex
✦ “Understanding physics is child’s play compared to understanding child’s play” -- Albert Einstein
• A theory makes it possible to
✦ Explain why we (scientists) observe what we observe
✦ Specify the nature of internal representations
✦ Predict what would happen in a new situation
• A model can be an explicit realization of a theory
✦ Forces explicitness in assumptions, logic, and predictions
✦ Implications often defy expectations
✦ Aids communication between experimentalists and theorists
✦ Facilitates continuity/cumulative progress
Different Modeling Approaches
• Different modeling approaches make different core assumptions, aim at different levels of analysis, and are applied to different aspects of cognition
✦ Bayesian/ideal-observer
✦ connectionist/neural network/dynamical systems
✦ symbolic/rule-based/cognitive architectures
Different Modeling Approaches
David Marr (1945-1980)
• Theory of cerebellum: a large & simple type of memory to support motor learning
• Theory of neocortex: neocortex cells flexibly learn statistical structure from input patterns
• Theory of hippocampus: rapid encoding of memory traces via Hebbian learning
David Marr: Three Levels of Analysis
David Marr - Vision (1982)
• Vision is the process of discovering properties (what, where) of things in the world from images
• Vision = information processing task + rich internal representation
• A complete understanding of vision (and any information processing task) requires multiple levels of analysis, e.g. the plain man, brain scientists, psychologists, modelers
• Machine + information-processing task (ex 1: airline-reservation system: computers + aircraft, geography, time zones, finance, politics, diets... ex 2: a smart phone)
General Introduction
David Marr - Vision (1982)
• Psychophysics: helpful for subdividing perception into sub-units (“modules”, “channels”), & elucidating nature of representation (e.g. mental rotation)
• Neurophysiology: functional significance of single neurons (e.g. Hubel & Wiesel) thought by some (e.g. Barlow’s “neuron doctrine”) to provide a “complete enough description for functional understanding” of the brain
• Marr contributed a theory of the cerebellar cortex (memory device for learning motor sequences)
• But something went wrong: psychophysics (PP) and neurophysiology (NP) only describe the behavior of subjects/cells; they do not explain that behavior
Chapter 1.1 Background
David Marr - Vision (1982)
• The best way to test your understanding is to try to recreate it; the MIT AI Lab (Seymour Papert: Summer Vision Project) tackled vision and found it to be really difficult
• What is missing in our understanding, in addition to that of neurons and computer programs, is the analysis of the problem as an information-processing task:
✦ What is being computed and why?
✦ In what sense is the computation optimal, or guaranteed to function correctly?
Chapter 1.1 Background
David Marr - Vision (1982)
Necker Cube
• Sure, you can understand the bistability as a consequence of two competing neural network states
• But you would be remiss not to mention that the 3D interpretation is ambiguous: two competing explanations
Model Taxonomy: Levels of Analysis
David Marr (1982): Brain = Information Processor
• Computational level: the goals of the computation; why things work the way they do
• Algorithmic level: the representation of input and output, and how one is transformed into the other
• Implementational level: how the system is physically realized in hardware (architecture, dynamics)
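To make the three levels concrete, here is a toy illustration (my own, not from the slides): one computational-level specification, "return the input in ascending order," realized by two different algorithms, with the implementational level (interpreter and hardware) deliberately invisible in the code.

```python
# Toy illustration of Marr's levels (not from the slides):
# - Computational level: WHAT is computed and why; here, "produce the
#   input sequence in ascending order."
# - Algorithmic level: HOW; two different procedures/representations
#   (insertion sort vs. merge sort) compute the same function.
# - Implementational level: the physical substrate (the Python
#   interpreter running on silicon), which this code never mentions.

def insertion_sort(xs):
    """One algorithm for the computational goal 'sort ascending'."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

def merge_sort(xs):
    """A different algorithm for the very same computational goal."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

data = [3, 1, 4, 1, 5, 9, 2, 6]
# Same computational-level specification, different algorithmic-level stories:
assert insertion_sort(data) == merge_sort(data) == sorted(data)
```

The point of the example is that behavioral equivalence at the computational level leaves the algorithmic level underdetermined, which is exactly why Marr insists the levels must be analyzed separately.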
NEGOTIATING SPEED-ACCURACY TRADEOFF IN SEQUENTIAL IDENTIFICATION UNDER A STOCHASTIC DEADLINE
the deadline $\Delta$, or by the successful registry of the subject’s decision, whichever occurs earlier; “$\wedge$” denotes the minimum of the two arguments on either side. Then by the strong law of large numbers the long-run average reward per unit time equals $ER/ET$ with probability one. Therefore, the maximum reward rate problem is equivalent to solving the stochastic optimization problem
$$V := \sup_{(\tau,\mu)} \frac{E\!\left[1_{\{\tau+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu=j,\,M=j\}}\right]}{E\left[(\tau+T_0)\wedge\Delta\right]},$$
for which we will show that an optimal solution always exists and describe how to calculate the supremum and an admissible decision rule $(\tau,\mu)$ which attains the supremum.
An important theoretical question is whether and how Bayes-risk minimization and reward-rate maximization are related to each other. In this work, we demonstrate that reward rate maximization for this class of problems is formally equivalent to solving the family $(W(c))_{c>0}$ of Bayes-risk minimization problems,
$$W(c) := \inf_{(\tau,\mu)} E\!\left[c\left((\tau+T_0)\wedge\Delta\right) + 1_{\{\tau+T_0<\Delta\}} \sum_{i\neq j} r_j 1_{\{\mu=i,\,M=j\}} + 1_{\{\tau+T_0\geq\Delta\}} \sum_{j=1}^{m} r_j 1_{\{M=j\}}\right],$$
indexed by the unit sampling (observation or time) cost $c > 0$, thus rendering the reward-rate maximization problem amenable to a large array of existing analytical and computational tools in stochastic control theory. In particular, we show that the maximum reward rate $V$ is the unique unit sampling cost $c > 0$ which makes the minimum Bayes risk $W(c)$ equal to the maximal expected reward $\sum_{j=1}^{m} r_j P(M=j)$ under the prior distribution. Moreover,
$$c \geq V \quad\text{if and only if}\quad \inf_{(\tau,\mu)} E\!\left[c\left((\tau+T_0)\wedge\Delta\right) - 1_{\{\tau+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu=j,\,M=j\}}\right] \geq 0;$$
namely, the maximum reward rate $V$ is the unique unit sampling cost $c$ for which the expected total observation cost $E[c((\tau^*+T_0)\wedge\Delta)]$ and the expected terminal reward $E[1_{\{\tau^*+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu^*=j,\,M=j\}}]$ break even under any optimal decision rule $(\tau^*,\mu^*)$.
In Section 2, we characterize the Bayes-risk minimization solution to the multi-hypothesis sequential identification problems $W(c)$, $c > 0$ under a stochastic deadline. This treatment extends our previous work on Bayes risk minimization in sequential testing of multiple hypotheses [4] and of binary hypotheses under a stochastic deadline [10], in which there are penalties associated with breaching a stochastic deadline in addition to typical observation and misidentification costs. In Section 3, we characterize the formal relationship between reward-rate maximization and Bayes-risk minimization, and leverage it to obtain a numerical procedure for optimizing reward rate.
Significantly, we will show that the optimal policy for reward rate maximization depends on the initial belief state, unlike for Bayes risk minimization; this is because the former identifies with a different setting of the latter depending on the initial state. This dependence on initial belief state shows explicitly that the reward-rate-maximizing policy cannot satisfy any iterative, Markovian form of Bellman’s dynamic programming equation [1]. Finally, in Section 4, we demonstrate how the procedure can be applied to solve a numerical example involving binary hypotheses.
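The break-even characterization above is an instance of a general fact about maximizing a ratio of expectations: $\sup R/T$ over policies equals the unique $c$ at which $\inf_{\text{policies}}(cT - R) = 0$. Below is a minimal numerical sketch of that fact using a made-up finite set of candidate policies, each summarized by hypothetical (expected reward, expected duration) pairs; this is an illustration of the equivalence, not the paper's actual computational procedure.

```python
# Toy demonstration (made-up numbers): for a finite set of candidate decision
# rules, each summarized by expected terminal reward R and expected duration T,
# the maximum reward rate V = max R/T equals the unit sampling cost c at which
# the best rule breaks even, i.e. min over rules of (c*T - R) == 0.

# Hypothetical (R, T) pairs for illustration only.
policies = [(0.4, 1.0), (0.7, 2.0), (0.9, 3.5), (0.95, 5.0)]

def best_net_cost(c):
    """min over rules of expected sampling cost c*T minus expected reward R."""
    return min(c * T - R for (R, T) in policies)

# Direct computation of the maximum reward rate.
V = max(R / T for (R, T) in policies)

# best_net_cost is increasing in c (a minimum of increasing linear functions),
# negative below the break-even point and non-negative above it, so we can
# locate the break-even c by bisection.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if best_net_cost(mid) < 0:
        lo = mid
    else:
        hi = mid

# The break-even sampling cost coincides with the maximum reward rate.
assert abs(0.5 * (lo + hi) - V) < 1e-9
```

The same root-finding idea underlies the numerical procedure described in Section 3, except that there each evaluation of the "net cost" requires solving a full Bayes-risk minimization problem $W(c)$ rather than a minimum over a finite list.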
David Marr - Vision (1982)
• Level of description: e.g. thermodynamics can’t be explained at the level of particles, but at the level of temperature, pressure, density...; and there are theoretical links between levels
• Representation: each representation makes some things easy, others hard (e.g. binary vs. decimal vs. Roman)
• The levels are loosely related: many choices at each level, many issues at each level independent of the other levels
• Much confusion re: level in relating PP & NP (Necker cube)
• Neuroanatomy/physiology ⇒ implementation; psychophysics ⇒ algorithm/representation; computational theory ⇒ computation
Chapter 1.2 Understanding Complex I-P Systems
David Marr - Vision (1982)
• J. J. Gibson: closest to a computational theory of perception, but did not understand information-processing, and under-estimated the complexity of the problem
• Contribution: “How does one obtain constant perceptions in everyday life on the basis of continually changing sensations?”
• Hypotheses: brain tries to detect higher-order “invariants” -- stimulus energy, ratios, proportions, etc -- not info-processing
• “Fatal shortcomings”: failure to realize (1) detection of invariants is an information-processing problem, and (2) the difficulty of this detection problem
Chapter 1.2 Understanding Complex I-P Systems
David Marr - Vision (1982)
• Deriving an invariant shape description probably requires a sequence of representations:
✦ Image (intensity)
✦ Primal sketch (2-D image, geometrical distr/organization)
✦ 2 1/2-D sketch (orientation, depth, contours, in viewer-centered frame of reference)
✦ 3-D representation (shapes and spatial organization in object-centered frame of reference)
Chapter 1.3 A Representational Framework for Vision
Hermann von Helmholtz (1821-1894)
• Physics
✦ conservation of energy
✦ fluid dynamics
✦ thermodynamics
• Neuroscience
✦ nerve physiology
✦ visual perception (depth, color, motion)
✦ auditory perception
✦ perception as unconscious inference
Helmholtz - The Facts of Perception (1878)
• A theory of knowledge is fundamental to all science: “What is true in our sense perception and thought? In what way do our ideas correspond to reality?”
✦ Scientists and philosophers are interested in the same divide between the material world and our inner mental processes, in order to study one versus the other
✦ Nativism vs empiricism (nature vs. nurture)
✦ Locke: bodily/mental structure ⇒ perception; Kant: transcendental forms of intuition & thought constrain the interpretation of sensory experiences and formation of ideas
Helmholtz - The Facts of Perception (1878)
• Our senses can be divided into distinct modalities (“circle of quality”), such as taste and sight
• The nature of sensory experience depends on which sensory nerve, not on the physical stimulus (e.g. blow to the eye)
• Different percepts within a modality correspond to different nerves of the same modality (e.g. tones)
• Our sensations are signs (noisy data), not copies, of external objects/events, and the two need not be similar, but there should be lawful regularity (consistency)
• Signs + lawful (statistical) regularity + learning + inductive inference ⇒ intuition about the world (outer) and self (inner)
• e.g. intuition for space: self motion ⇒ consistent changes in sensory experiences ⇒ spatial properties (outer intuition); those experiences unchanged by self motion are non-spatial (inner intuition); 3D sufficient for explaining the world (a surface encloses a 3D space)
• e.g. common features ⇒ a class of objects; persistent features ⇒ changes in time
• Law of causality ⇒ inductive inference ⇒ comprehensibility
Helmholtz - The Facts of Perception (1878)
Perception as Unconscious Inference
A Blind Spot
• Our brain makes a best guess (inductive inference) at what’s out there based on the data it gets
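Helmholtz’s unconscious inference is naturally formalized as Bayes’ rule: the “best guess” is the posterior over world states given a noisy sensory sign. A minimal sketch, where the states, signs, and probabilities are all made up for illustration:

```python
# Toy formalization (numbers are invented) of perception as unconscious
# inference: the brain's "best guess" is the posterior over world states
# given a noisy sensory sign, via Bayes' rule:
#   P(state | sign) is proportional to P(sign | state) * P(state)

prior = {"cat": 0.7, "raccoon": 0.3}           # learned regularities of the world
likelihood = {                                  # P(observed sign | true state)
    "cat":     {"rustle": 0.2, "meow": 0.8},
    "raccoon": {"rustle": 0.9, "meow": 0.1},
}

def posterior(sign):
    """Unconscious inference: invert the noisy sensory mapping with Bayes' rule."""
    unnorm = {s: prior[s] * likelihood[s][sign] for s in prior}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# An ambiguous sign ("rustle") is interpreted by weighing prior experience
# against how well each hypothesis explains the data.
post = posterior("rustle")
best_guess = max(post, key=post.get)
# best_guess == "raccoon": the likelihood (0.9 vs 0.2) outweighs the prior.
```

This mirrors Helmholtz’s claim that sensations are signs, not copies: the percept is whichever interpretation best reconciles the sign with lawful regularities learned from experience.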
Held & Hein (1963) Kitten Carousel
The actively moving kitten (A) succeeded and the passively carried kitten (P) failed at: visually-guided paw placement, avoidance of a visual cliff, blinking to an approaching object
Helmholtz - The Facts of Perception (1878)
• “‘Perceptions occur as if the things of the material world referred to in the realistic hypothesis actually did exist.’ We cannot eliminate the ‘as if’ construction completely, however, for we cannot consider the realistic interpretation to be more than an exceedingly useful and practical hypothesis. We cannot assert that it is necessarily true, for opposed to it there is always the possibility of other irrefutable idealistic hypotheses.”
• “Every reduction of some phenomenon to underlying substances and forces indicates that something unchangeable and final has been found. We are never justified, of course, in making an unconditional assertion of such a reduction. Such a claim is not permissible because of the incompleteness of our knowledge and because of the nature of the inductive inferences upon which our perception of reality depends.”
Helmholtz - The Facts of Perception (1878)
“We are particles of dust on the surface of our planet, which is itself scarcely a grain of sand in the infinite space of the universe. We are the youngest species among the living things of the earth, hardly out of the cradle according to the time reckoning of geology, still in the learning stage, hardly half-grown, said to be mature only through mutual agreement. Nevertheless, because of the mighty stimulus of the law of causality.... We truly have reason to be proud that it has been given to us to understand, slowly and through hard work, the incomprehensibly great scheme of things.”