Categorization
Tom Griffiths
CogSci 131: Models of categorization

Uploaded by brian, 04-Dec-2015


TRANSCRIPT

Page 1: Categorization

Tom Griffiths

CogSci 131 Models of categorization

Page 2: Categorization

Spaces and features

•  Will show up in many contexts in this class
   – similarity
   – semantic representation
   – categorization
   – neural networks

•  How can we use these representations?

Page 3: Categorization

Categorization

Page 4: Categorization

Outline

Prototype and exemplar theories

Break

Testing the theories

Page 5: Categorization

How can we explain typicality?

•  One answer: reject definitions, and adopt a new representation for categories

•  Prototype theory:
   – categories are represented by a prototype
   – other members share a family-resemblance relation to the prototype
   – typicality is a function of similarity to the prototype

Page 6: Categorization

Family resemblance

Prototype

Page 7: Categorization

Family resemblance

(Posner & Keele, 1968)

Prototype

Page 8: Categorization

Posner and Keele (1968)

•  Prototype effect in categorization accuracy
•  Constructed categories by perturbing prototypical dot arrays
•  Ordering of categorization accuracy at test:
   – old exemplars
   – prototypes
   – new exemplars

Page 9: Categorization

Formalizing prototype theories

Representation:
Each category (e.g., A, B) has a corresponding prototype (µ_A, µ_B)

Categorization (for a new stimulus x):
Choose the category that minimizes the distance (or, equivalently, maximizes the similarity) from x to its prototype

(e.g., Reed, 1972)

Page 10: Categorization

Formalizing prototype theories

The prototype is the most frequent or "typical" member.

Spaces:
   Prototype: e.g., the average of the members of the category
   Distance: e.g., Euclidean distance

   d(x, µ_A) = ( Σ_k (x_k − µ_A,k)² )^{1/2}

(Binary) features:
   Prototype: e.g., the binary vector with the most frequent feature values
   Distance: e.g., Hamming distance

   d(x, µ_A) = Σ_k |x_k − µ_A,k|
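The two distance rules can be sketched as a minimal prototype classifier (the function and data names here are illustrative, not from the slides): the prototype is the mean of a category's members, and a new stimulus goes to the category with the nearest prototype.

```python
import numpy as np

# Minimal sketch of a prototype classifier for stimuli in a space.
# Each prototype is the mean of its category's exemplars; a new
# stimulus is assigned to the category with the nearest prototype.

def prototype_classify(x, exemplars_by_category):
    """exemplars_by_category: dict mapping label -> array of shape (n, d)."""
    prototypes = {label: np.mean(ex, axis=0)
                  for label, ex in exemplars_by_category.items()}
    # Euclidean distance from x to each category's prototype
    return min(prototypes, key=lambda label: np.linalg.norm(x - prototypes[label]))

# Hypothetical example: two clusters in a 2-D psychological space
cats = {"A": np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
        "B": np.array([[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])}
print(prototype_classify(np.array([0.5, 0.5]), cats))  # -> A
```

For binary features, replacing the mean with a per-dimension majority vote and the Euclidean norm with a Hamming count gives the feature-based variant.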

Page 11: Categorization

Distances

Euclidean distance

Hamming distance
01100100111
01110100101
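As a quick check, the Hamming distance between the two bit strings on the slide is just the number of mismatched positions:

```python
# Hamming distance = number of positions where the strings differ
a = "01100100111"
b = "01110100101"
hamming = sum(ca != cb for ca, cb in zip(a, b))
print(hamming)  # the strings differ in 2 positions
```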

Page 12: Categorization

Formalizing prototype theories

Category A Category B

Prototypes (category means)

Decision boundary at equal distance (always a straight line for two categories)

Page 13: Categorization

More complex prototypes

•  Various extensions to simple prototype models have been explored…

•  For features, configural cue models
   – compound features, such as "red and small"
   – results in a combinatorial explosion

•  For spaces, prototype models that incorporate information about variance
   – category-specific measures of distance

Page 14: Categorization

More complex prototypes

Category A Category B

Prototypes (category means)

Decision boundary at equal distance (no longer a straight line)

Page 15: Categorization

More complex prototypes

Decision boundary at equal distance (no longer a straight line)

Boundaries are conic sections (parabolas, ellipses, and hyperbolas)

Page 16: Categorization

Predicting prototype effects

•  Prototype effects are built into the model:
   – assume categorization becomes easier as proximity to the prototype increases…
   – …or as distance from the boundary increases

•  But what about the old-exemplar advantage? (Posner & Keele, 1968)

•  Prototype models are not the only way to get prototype effects…

Page 17: Categorization

Exemplar theories

Store every member (“exemplar”) of the family

Page 18: Categorization

Formalizing exemplar theories

Representation:
A set of stored exemplars y_1, y_2, …, y_n, each with its own category label

Categorization (for a new stimulus x):
Choose category A with probability

P(A | x) = β_A Σ_{y∈A} η_xy / ( β_A Σ_{y∈A} η_xy + β_B Σ_{y∈B} η_xy )

(the "Luce-Shepard choice rule")

where η_xy is the similarity of x to y, and β_A is the bias towards A.
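The choice rule is simple to sketch in code; the Gaussian similarity function and the example exemplars below are assumptions for illustration, not from the slides.

```python
import numpy as np

def p_category_A(x, exemplars_A, exemplars_B, eta, beta_A=1.0, beta_B=1.0):
    """Luce-Shepard choice rule: summed (bias-weighted) similarity to A's
    exemplars, normalized by the total over both categories."""
    sim_A = beta_A * sum(eta(x, y) for y in exemplars_A)
    sim_B = beta_B * sum(eta(x, y) for y in exemplars_B)
    return sim_A / (sim_A + sim_B)

# A placeholder similarity: Gaussian in squared Euclidean distance
eta = lambda x, y: np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(5.0, 5.0), (6.0, 5.0)]
p = p_category_A((0.5, 0.2), A, B, eta)  # close to 1: x sits among A's exemplars
```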

Page 19: Categorization

The context model (Medin & Schaffer, 1978)

Defined for stimuli with binary features (color, form, size, number)

1111 = (red, triangle, big, one)
0000 = (green, circle, small, two)

Define similarity as the product of the similarities on each dimension:

η_xy = Π_k η_xy,k

where η_xy,k = 1 if x_k = y_k, and s_k otherwise
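A sketch of the context model's multiplicative similarity; the mismatch parameters s_k are free parameters of the model, and the values used below are made up for illustration.

```python
def context_similarity(x, y, s):
    """Multiply similarity across dimensions: 1 for a match on
    dimension k, s[k] (between 0 and 1) for a mismatch."""
    eta = 1.0
    for xk, yk, sk in zip(x, y, s):
        if xk != yk:
            eta *= sk
    return eta

# The slide's stimuli: 1111 = (red, triangle, big, one),
# 0000 = (green, circle, small, two); s values are hypothetical.
eta = context_similarity("1111", "0000", s=[0.2, 0.5, 0.5, 0.5])
print(eta)  # 0.2 * 0.5 * 0.5 * 0.5 = 0.025
```

A single mismatch on a dimension with small s_k sharply reduces similarity, which is what lets the model weight some dimensions more than others.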

Page 20: Categorization

Prototypes vs. exemplars

•  Exemplar models produce prototype effects
   – if the prototype minimizes the distance to all exemplars in a category, then it has high probability

•  They also predict the old-exemplar advantage
   – being close (or identical) to an old exemplar of the category gives high probability

•  They predict new effects prototype models cannot produce…
   – stimuli close to an old exemplar should have high probability, even far from the prototype

Page 21: Categorization

Break

Up next: Testing the theories

Page 22: Categorization
Page 23: Categorization

Prototypes vs. exemplars

•  Exemplar models produce prototype effects
   – if the prototype minimizes the distance to all exemplars in a category, then it has high probability

•  They also predict the old-exemplar advantage
   – being close (or identical) to an old exemplar of the category gives high probability

•  They predict new effects prototype models cannot produce…
   – stimuli close to an old exemplar should have high probability, even far from the prototype

Page 24: Categorization

The 5-4 category structure (Medin & Schaffer, 1978)

[Figure: the nine training stimuli, five in Category A and four in Category B, shown with each category's prototype and the distance d(x, µ) from each stimulus to its prototype.]
Page 25: Categorization

The 5-4 category structure (Medin & Schaffer, 1978)

[Figure: the same 5-4 stimuli, prototypes, and distances d(x, µ), with two test stimuli, "4" and "7", highlighted.]

Prototype model: P(A|4) > P(A|7)
Exemplar model: P(A|4) < P(A|7)

Page 26: Categorization

The generalized context model (Nosofsky, 1986)

Defined for stimuli in psychological space

η_xy = exp{ −c d(x, y)^p }

where c is the "specificity"; p = 1 gives an exponential, p = 2 a Gaussian

P(A | x) = β_A Σ_{y∈A} η_xy / ( β_A Σ_{y∈A} η_xy + β_B Σ_{y∈B} η_xy )
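The GCM's similarity function is easy to sketch; here d is taken to be Euclidean distance, with c and p as on the slide (the example points are arbitrary).

```python
import numpy as np

def gcm_similarity(x, y, c=1.0, p=2):
    """eta_xy = exp(-c * d(x, y)^p); p = 1 exponential, p = 2 Gaussian."""
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    return np.exp(-c * d ** p)

print(gcm_similarity((0, 0), (0, 0)))         # identical stimuli: similarity 1
print(gcm_similarity((0, 0), (3, 4), c=0.1))  # distance 5, so exp(-0.1 * 25)
```

Plugged into the Luce-Shepard rule above, this similarity gives the category probabilities the model predicts for any point in the space.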

Page 27: Categorization

The generalized context model

[Figure: Category A and Category B exemplars, with the decision boundary determined by the exemplars and regions labeled 90% A, 50% A, and 10% A.]

Page 28: Categorization

Prototypes vs. exemplars

Exemplar models can capture complex boundaries

Page 29: Categorization

Prototypes vs. exemplars

Exemplar models can capture complex boundaries

Page 30: Categorization

Bells and whistles: distance metrics

The "weighted Minkowski r-metric":

d(x, y) = ( Σ_k w_k |x_k − y_k|^r )^{1/r}

where r determines the metric (Euclidean or city-block) and w_k is the weight of dimension k (reflects attention)
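The weighted metric can be written as a short function (the points and weights below are illustrative):

```python
import numpy as np

def minkowski(x, y, w, r=2.0):
    """Weighted Minkowski r-metric: r = 2 is (weighted) Euclidean,
    r = 1 is city-block; w[k] is the attention weight on dimension k."""
    x, y, w = (np.asarray(v, float) for v in (x, y, w))
    return np.sum(w * np.abs(x - y) ** r) ** (1.0 / r)

print(minkowski((0, 0), (3, 4), w=(1, 1), r=2))  # 5.0 (Euclidean)
print(minkowski((0, 0), (3, 4), w=(1, 1), r=1))  # 7.0 (city-block)
```

Raising a weight w_k stretches the space along dimension k, so mismatches on attended dimensions count for more.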

Page 31: Categorization

Using different metrics

r = 2: Euclidean distance, for "integral" dimensions (e.g., saturation & brightness)
r = 1: city-block distance, for "separable" dimensions (e.g., size & shape)

Page 32: Categorization

Dimensional attention

Allows rescaling of dimensions to aid in categorization
(similar to capturing the variance of a category)

Page 33: Categorization

Evaluating models

•  Both prototype and exemplar models have lots of free parameters
   – prototype locations
   – response biases, attention weights, r, p, c

•  Testing the models typically involves finding the best-fitting values of the parameters
   – generic optimization methods (like gradient descent) are usually used to do this

Page 34: Categorization

Some questions…

•  Both prototype and exemplar models seem reasonable… are they "rational"?
   – are they solutions to the computational problem?

•  Should we use prototypes, or exemplars?

•  How can we define other models that handle more complex categorization problems?

•  Is this all that categories are?

Page 35: Categorization

Next week

•  Tuesday: Linear algebra
   – a way of computing with spaces

•  Thursday: Semantic networks
   – using some linear algebra!
   – Google and the mind…