TRANSCRIPT
Cogs 202 (SP16): Cognitive Science Foundations
Computational Modeling of Cognition
Prof. Angela Yu
Department of Cognitive Science, UCSD
https://thiscourse.com/ucsd/cogs202/sp16
Today
Self-introductions
Logistics
Introduction to cognitive modeling
Self-Introductions
• Name
• Department, year/status
• One scientific fact you find fascinating
• What you want to get out of this class
Grading
• 30% participation
✦ reading (please read assigned papers before class)
✦ in-class discussion, online discussion
• 25% discussion leading
• 20% homework
• 25% final project
• No laptop, tablet, or cell phone in class, unless you need it for presentation (get a notebook)
• State your name before you speak
Course schedule
• 03/28: Introduction
• 04/04: Signal-detection theory & Bayesian modeling (Omar, Reina, Shuai, Mehul, Gautam)
• 04/11: Multisensory integration & exploration vs. exploitation (Kevin, Tricia, Nikhil, Saurabh); HW1 (Mehul, Omar, Shuai, Reina)
• 04/18: Hidden Markov model & sequential effects (Mehul, Shuai, Gabriela, Erin); HW2 (Jarrett, Saurabh, Tricia, Kevin)
• 04/25: Uncertainty and neuromodulation (Siyang, Kevin, Jarrett, Gautam); HW3 (Nikhil, Siyang, Bjornar, Michael)
• 05/02: Reinforcement learning & dopamine (Michael, Jarrett, Bjornar, Omar); HW4 (Gautam, Erin, Gabriela)
• 05/09: Decision-making and parietal cortex (Gabriela, Saurabh, Tricia, Reina, Siyang)
• 05/16: Sensorimotor integration and movement planning (Erin, Nikhil, Michael, Bjornar)
• 05/23: Final project presentations
Discussion Leading
• (30 min: discussion of homework)
• 90 min: presentation (2 parts)
• 30 min: discussion, Q&A
• Aim to finish by 3:30 PM
• Slides draft 1 due by noon the prior Thursday
• Slides draft 2 due by noon Friday
• Slides final version due Monday morning
Q&A
• I don’t want to ask a question because I’ll sound stupid.
Chances are others have the same question; even if they don’t, this is your chance to learn something.
• I feel totally lost and can’t follow the discussion.
Ask a question!
• I don’t learn much from others explaining HW problems.
Do the HW problems on your own beforehand!
• I wanted to learn about X, or more about Y.
This is an intro (not foundations) course, with just enough depth/breadth to give you a taste; ask/look for refs.
• How do I apply modeling to my own research?
Typically it takes a whole PhD to learn this; the final project is an opportunity, but perhaps on someone else’s research topic.
• Why do we read so many papers by Dr. Yu?
Because you need access to data for final projects.
Straw Poll of Final Projects
What is cognitive modeling and why do it?
• Why do we study cognitive science at all?
✦ To understand how the mind works
✴ How we process information and act on it
✴ How we learn and generalize, and create new ideas
✴ How we think, reason, feel, and make decisions
✦ To make predictions of how people & animals behave in new situations
✦ To understand pathology in cognition
✦ To build intelligent artificial systems and agents
“Verbally expressed statements are sometimes flawed by internal inconsistencies, logical contradictions, theoretical weaknesses and gaps. A running computational model, on the other hand, can be considered as a sufficiency proof of the internal coherence and completeness of the ideas it is based upon...” (Fum, Del Missier, & Stocco, 2007)
What is cognitive modeling and why do it?
• It’s possible to study the brain without modeling
• But discovering facts is only the beginning
What is cognitive modeling and why do it?
• Facts ≠ understanding, description ≠ understanding
• Our goal is to make the book shorter!
[Figure: page count of Principles of Neural Science (Kandel, Schwartz, & Jessell) across editions, plotted by year of publication]
What is cognitive modeling and why do it?
• The description is long because the system is complex
✦ “Understanding physics is child’s play compared to understanding child’s play” -- Albert Einstein
• A theory makes it possible to
✦ Explain why we (scientists) observe what we observe
✦ Specify the nature of internal representations
✦ Predict what would happen in a new situation
• A model can be an explicit realization of a theory
✦ Forces explicitness in assumptions, logic, and predictions
✦ Implications often defy expectations
✦ Aids communication between experimentalists and theorists
✦ Facilitates continuity/cumulative progress
Different Modeling Approaches
• Different modeling approaches make different core assumptions, aim at different levels of analysis, and are applied to different aspects of cognition
✦ Bayesian/ideal-observer
✦ connectionist/neural network/dynamical systems
✦ symbolic/rule-based/cognitive architectures
Different Modeling Approaches
David Marr (1945-1980)
• Theory of cerebellum: a large & simple type of memory to support motor learning
• Theory of neocortex: neocortex cells flexibly learn statistical structure from input patterns
• Theory of hippocampus: rapid encoding of memory traces via Hebbian learning
David Marr: Three Levels of Analysis
David Marr - Vision (1982)
• Vision is the process of discovering properties (what, where) of things in the world from images
• Vision = information processing task + rich internal representation
• A complete understanding of vision (and any information processing task) requires multiple levels of analysis, e.g. the plain man, brain scientists, psychologists, modelers
• Machine + information-processing task (ex 1: airline-reservation system: computers + aircraft, geography, time zones, finance, politics, diets... ex 2: a smart phone)
General Introduction
David Marr - Vision (1982)
• Psychophysics: helpful for subdividing perception into sub-units (“modules”, “channels”), & elucidating nature of representation (e.g. mental rotation)
• Neurophysiology: functional significance of single neurons (e.g. Hubel & Wiesel) thought by some (e.g. Barlow’s “neuron doctrine”) to provide a “complete enough description for functional understanding” of the brain
• Marr contributed a theory of the cerebellar cortex (memory device for learning motor sequences)
• But something went wrong: psychophysics (PP) and neurophysiology (NP) only describe the behavior of subjects/cells; they do not explain that behavior
Chapter 1.1 Background
David Marr - Vision (1982)
• The best way to test your understanding is to try to recreate it; the MIT AI Lab (Seymour Papert: Summer Vision Project) tackled vision and found it to be really difficult
• What is missing in our understanding, in addition to that of neurons and computer programs, is the analysis of the problem as an information-processing task:
✦ What is being computed and why?
✦ In what sense is the computation optimal, or guaranteed to function correctly?
Chapter 1.1 Background
David Marr - Vision (1982)
Necker Cube
• Sure, you can understand the bistability as a consequence of two competing neural network states
• But you would be remiss not to mention that the 3D interpretation is ambiguous: two competing explanations
Model Taxonomy: Levels of Analysis
David Marr (1982): Brain = Information Processor
• Computational level: the goals of the computation; why things work the way they do
• Algorithmic level: the representation of input and output, and how one is transformed into the other
• Implementational level: how the system is physically realized in hardware (architecture, dynamics)
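To make the three levels concrete, here is a toy illustration (my own, not from the slides): one computational-level specification, "return the input in ascending order," realized by two different algorithms, with the implementational level (interpreter and hardware) deliberately invisible in the code.

```python
# Toy illustration of Marr's levels (not from the slides):
# - Computational level: WHAT is computed and why; here, "produce the
#   input sequence in ascending order."
# - Algorithmic level: HOW; two different procedures/representations
#   (insertion sort vs. merge sort) compute the same function.
# - Implementational level: the physical substrate (the Python
#   interpreter running on silicon), which this code never mentions.

def insertion_sort(xs):
    """One algorithm for the computational goal 'sort ascending'."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

def merge_sort(xs):
    """A different algorithm for the very same computational goal."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

data = [3, 1, 4, 1, 5, 9, 2, 6]
# Same computational-level specification, different algorithmic-level stories:
assert insertion_sort(data) == merge_sort(data) == sorted(data)
```

The point of the example is that behavioral equivalence at the computational level leaves the algorithmic level underdetermined, which is exactly why Marr insists the levels must be analyzed separately.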
NEGOTIATING SPEED-ACCURACY TRADEOFF IN SEQUENTIAL IDENTIFICATION UNDER A STOCHASTIC DEADLINE
the deadline $\Delta$, or by the successful registry of the subject’s decision, whichever occurs earlier; “$\wedge$” denotes the minimum of the two arguments on either side. Then by the strong law of large numbers the long-run average reward per unit time equals $ER/ET$ with probability one. Therefore, the maximum reward rate problem is equivalent to solving the stochastic optimization problem
$$V := \sup_{(\tau,\mu)} \frac{E\!\left[1_{\{\tau+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu=j,\,M=j\}}\right]}{E\left[(\tau+T_0)\wedge\Delta\right]},$$
for which we will show that an optimal solution always exists and describe how to calculate the supremum and an admissible decision rule $(\tau,\mu)$ which attains the supremum.
An important theoretical question is whether and how Bayes-risk minimization and reward-rate maximization are related to each other. In this work, we demonstrate that reward rate maximization for this class of problems is formally equivalent to solving the family $(W(c))_{c>0}$ of Bayes-risk minimization problems,
$$W(c) := \inf_{(\tau,\mu)} E\!\left[c\left((\tau+T_0)\wedge\Delta\right) + 1_{\{\tau+T_0<\Delta\}} \sum_{i\neq j} r_j 1_{\{\mu=i,\,M=j\}} + 1_{\{\tau+T_0\geq\Delta\}} \sum_{j=1}^{m} r_j 1_{\{M=j\}}\right],$$
indexed by the unit sampling (observation or time) cost $c > 0$, thus rendering the reward-rate maximization problem amenable to a large array of existing analytical and computational tools in stochastic control theory. In particular, we show that the maximum reward rate $V$ is the unique unit sampling cost $c > 0$ which makes the minimum Bayes risk $W(c)$ equal to the maximal expected reward $\sum_{j=1}^{m} r_j P(M=j)$ under the prior distribution. Moreover,
$$c \geq V \quad\text{if and only if}\quad \inf_{(\tau,\mu)} E\!\left[c\left((\tau+T_0)\wedge\Delta\right) - 1_{\{\tau+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu=j,\,M=j\}}\right] \geq 0;$$
namely, the maximum reward rate $V$ is the unique unit sampling cost $c$ for which the expected total observation cost $E[c((\tau^*+T_0)\wedge\Delta)]$ and the expected terminal reward $E[1_{\{\tau^*+T_0<\Delta\}} \sum_{j=1}^{m} r_j 1_{\{\mu^*=j,\,M=j\}}]$ break even under any optimal decision rule $(\tau^*,\mu^*)$.
In Section 2, we characterize the Bayes-risk minimization solution to the multi-hypothesis sequential identification problems $W(c)$, $c > 0$ under a stochastic deadline. This treatment extends our previous work on Bayes risk minimization in sequential testing of multiple hypotheses [4] and of binary hypotheses under a stochastic deadline [10], in which there are penalties associated with breaching a stochastic deadline in addition to typical observation and misidentification costs. In Section 3, we characterize the formal relationship between reward-rate maximization and Bayes-risk minimization, and leverage it to obtain a numerical procedure for optimizing reward rate.
Significantly, we will show that the optimal policy for reward rate maximization depends on the initial belief state, unlike for Bayes risk minimization; this is because the former identifies with a different setting of the latter depending on the initial state. This dependence on initial belief state shows explicitly that the reward-rate-maximizing policy cannot satisfy any iterative, Markovian form of Bellman’s dynamic programming equation [1]. Finally, in Section 4, we demonstrate how the procedure can be applied to solve a numerical example involving binary hypotheses.
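The break-even characterization above is an instance of a general fact about maximizing a ratio of expectations: $\sup R/T$ over policies equals the unique $c$ at which $\inf_{\text{policies}}(cT - R) = 0$. Below is a minimal numerical sketch of that fact using a made-up finite set of candidate policies, each summarized by hypothetical (expected reward, expected duration) pairs; this is an illustration of the equivalence, not the paper's actual computational procedure.

```python
# Toy demonstration (made-up numbers): for a finite set of candidate decision
# rules, each summarized by expected terminal reward R and expected duration T,
# the maximum reward rate V = max R/T equals the unit sampling cost c at which
# the best rule breaks even, i.e. min over rules of (c*T - R) == 0.

# Hypothetical (R, T) pairs for illustration only.
policies = [(0.4, 1.0), (0.7, 2.0), (0.9, 3.5), (0.95, 5.0)]

def best_net_cost(c):
    """min over rules of expected sampling cost c*T minus expected reward R."""
    return min(c * T - R for (R, T) in policies)

# Direct computation of the maximum reward rate.
V = max(R / T for (R, T) in policies)

# best_net_cost is increasing in c (a minimum of increasing linear functions),
# negative below the break-even point and non-negative above it, so we can
# locate the break-even c by bisection.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if best_net_cost(mid) < 0:
        lo = mid
    else:
        hi = mid

# The break-even sampling cost coincides with the maximum reward rate.
assert abs(0.5 * (lo + hi) - V) < 1e-9
```

The same root-finding idea underlies the numerical procedure described in Section 3, except that there each evaluation of the "net cost" requires solving a full Bayes-risk minimization problem $W(c)$ rather than a minimum over a finite list.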
David Marr - Vision (1982)
• Level of description: e.g. thermodynamics can’t be explained at the level of particles, but at the level of temperature, pressure, density...; and there are theoretical links between levels
• Representation: each representation makes some things easy, others hard (e.g. binary vs. decimal vs. Roman)
• The levels are loosely related: many choices at each level, many issues at each level independent of the other levels
• Much confusion re: level in relating PP & NP (Necker cube)
• Neuroanatomy/physiology ⇒ implementation; psychophysics ⇒ algorithm/representation; computational theory ⇒ computation
Chapter 1.2 Understanding Complex I-P Systems
David Marr - Vision (1982)
• J. J. Gibson: closest to a computational theory of perception, but did not understand information-processing, and under-estimated the complexity of the problem
• Contribution: “How does one obtain constant perceptions in everyday life on the basis of continually changing sensations?”
• Hypotheses: brain tries to detect higher-order “invariants” -- stimulus energy, ratios, proportions, etc -- not info-processing
• “Fatal shortcomings”: failure to realize (1) detection of invariants is an information-processing problem, and (2) the difficulty of this detection problem
Chapter 1.2 Understanding Complex I-P Systems
David Marr - Vision (1982)
• Deriving an invariant shape description probably requires a sequence of representations:
✦ Image (intensity)
✦ Primal sketch (2-D image, geometrical distr/organization)
✦ 2 1/2-D sketch (orientation, depth, contours, in viewer-centered frame of reference)
✦ 3-D representation (shapes and spatial organization in object-centered frame of reference)
Chapter 1.3 A Representational Framework for Vision
Hermann von Helmholtz (1821-1894)
• Physics
✦ conservation of energy
✦ fluid dynamics
✦ thermodynamics
• Neuroscience
✦ nerve physiology
✦ visual perception (depth, color, motion)
✦ auditory perception
✦ perception as unconscious inference
Helmholtz - The Facts of Perception (1878)
• A theory of knowledge is fundamental to all science: “What is true in our sense perception and thought? In what way do our ideas correspond to reality?”
✦ Scientists and philosophers are interested in the same divide between the material world and our inner mental processes, in order to study one versus the other
✦ Nativism vs empiricism (nature vs. nurture)
✦ Locke: bodily/mental structure ⇒ perception; Kant: transcendental forms of intuition & thought constrain the interpretation of sensory experiences and formation of ideas
Helmholtz - The Facts of Perception (1878)
• Our senses can be divided into distinct modalities (“circle of quality”), such as taste and sight
• The nature of sensory experience depends on which sensory nerve, not on the physical stimulus (e.g. blow to the eye)
• Different percepts within a modality correspond to different nerves of the same modality (e.g. tones)
• Our sensations are signs (noisy data), not copies, of external objects/events, and the two need not be similar, but there should be lawful regularity (consistency)
• Signs + lawful (statistical) regularity + learning + inductive inference ⇒ intuition about the world (outer) and self (inner)
• e.g. intuition for space: self motion ⇒ consistent changes in sensory experiences ⇒ spatial properties (outer intuition); those experiences unchanged by self motion are non-spatial (inner intuition); 3D sufficient for explaining the world (a surface encloses a 3D space)
• e.g. common features ⇒ a class of objects; persistent features ⇒ changes in time
• Law of causality ⇒ inductive inference ⇒ comprehensibility
Helmholtz - The Facts of Perception (1878)
Perception as Unconscious Inference
A Blind Spot
• Our brain makes a best guess (inductive inference) at what’s out there based on the data it gets
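Helmholtz’s unconscious inference is naturally formalized as Bayes’ rule: the “best guess” is the posterior over world states given a noisy sensory sign. A minimal sketch, where the states, signs, and probabilities are all made up for illustration:

```python
# Toy formalization (numbers are invented) of perception as unconscious
# inference: the brain's "best guess" is the posterior over world states
# given a noisy sensory sign, via Bayes' rule:
#   P(state | sign) is proportional to P(sign | state) * P(state)

prior = {"cat": 0.7, "raccoon": 0.3}           # learned regularities of the world
likelihood = {                                  # P(observed sign | true state)
    "cat":     {"rustle": 0.2, "meow": 0.8},
    "raccoon": {"rustle": 0.9, "meow": 0.1},
}

def posterior(sign):
    """Unconscious inference: invert the noisy sensory mapping with Bayes' rule."""
    unnorm = {s: prior[s] * likelihood[s][sign] for s in prior}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# An ambiguous sign ("rustle") is interpreted by weighing prior experience
# against how well each hypothesis explains the data.
post = posterior("rustle")
best_guess = max(post, key=post.get)
# best_guess == "raccoon": the likelihood (0.9 vs 0.2) outweighs the prior.
```

This mirrors Helmholtz’s claim that sensations are signs, not copies: the percept is whichever interpretation best reconciles the sign with lawful regularities learned from experience.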
Held & Hein (1963) Kitten Carousel
The actively moving kitten (A) succeeded and the passively carried kitten (P) failed at: visually-guided paw placement, avoidance of a visual cliff, blinking to an approaching object
Helmholtz - The Facts of Perception (1878)
• “‘Perceptions occur as if the things of the material world referred to in the realistic hypothesis actually did exist.’ We cannot eliminate the ‘as if’ construction completely, however, for we cannot consider the realistic interpretation to be more than an exceedingly useful and practical hypothesis. We cannot assert that it is necessarily true, for opposed to it there is always the possibility of other irrefutable idealistic hypotheses.”
• “Every reduction of some phenomenon to underlying substances and forces indicates that something unchangeable and final has been found. We are never justified, of course, in making an unconditional assertion of such a reduction. Such a claim is not permissible because of the incompleteness of our knowledge and because of the nature of the inductive inferences upon which our perception of reality depends.”
Helmholtz - The Facts of Perception (1878)
“We are particles of dust on the surface of our planet, which is itself scarcely a grain of sand in the infinite space of the universe. We are the youngest species among the living things of the earth, hardly out of the cradle according to the time reckoning of geology, still in the learning stage, hardly half-grown, said to be mature only through mutual agreement. Nevertheless, because of the mighty stimulus of the law of causality.... We truly have reason to be proud that it has been given to us to understand, slowly and through hard work, the incomprehensibly great scheme of things.”