title slide

Stop-Consonant Perception in 7.5-month-olds:

Evidence for gradient categories

Bob McMurray & Richard N. Aslin

Department of Brain and Cognitive Sciences, University of Rochester


With thanks to Julie Markant & Robbie Jacobs

Understanding spoken language requires that children learn a complex mapping…

Learning Language

What is the form of this mapping?

How do the demands of learning affect this representation?

[Figure: mapping from Acoustic input to the Lexicon to Meaning, with an example parse tree (S, NP, VP).]

Language Understanding

Speech perception and word recognition require mapping…

Learning Speech

What representations mediate acoustics and lexical or sublexical units?

How does learning affect this representation?

[Figure: Acoustic → Sublexical Units (/b/, /la/, /a/, /l/, /p/, /ip/) → Lexicon → Syntax, semantics, pragmatics…]

Speech Recognition

…continuous, variable perceptual input to something discrete, categorical.

1) Acoustic mappings: Categorical and gradient perception in adults and infants.

2) Infant speech categories are graded representations of continuous detail.

3) Statistical learning models and sparse representations.

4) Conclusions and future directions.

Overview


What is the nature of the mapping between continuous perception and discrete categories?

How are these representations sensitive (or not) to within-category detail?

Categorization & Categorical Perception

Representation of Speech Detail

Empirical approach:
• Use continuously variable stimuli.
• Explore response using:
  Discrimination, Identification (adults)
  Habituation (infants)

Categorical Perception 1

B

P

Subphonemic within-category variation in VOT is discarded in favor of a discrete symbol (phoneme).

• Sharp labeling of tokens on a continuum.

[Figure: identification (% /pa/, 0 to 100) and discrimination as functions of VOT along the B-P continuum.]

• Discrimination poor within a phonetic category.

Categorical Perception

Categorical Perception 2

Many tasks have demonstrated within-category sensitivity in adults...

Discrimination Task Variations: Pisoni & Tash (1974); Pisoni & Lazarus (1974); Carney, Widin & Viemeister (1977)

Training: Samuel (1977); Pisoni, Aslin, Perey & Hennessy (1982)

Goodness Ratings: Miller (1997); Massaro & Cohen (1983)

BUT…

And lexical activation shows systematic sensitivity to subphonemic detail (McMurray, Tanenhaus & Aslin, 2002).

Infant Categorical Perception 1

Infants have shown a different pattern.

For 30 years, virtually all attempts to address this question have yielded categorical discrimination.

Categorical Perception in Infants

Exception: Miller & Eimas (1996).
• Only at extreme VOTs.
• Only when habituated to a non-prototypical token.

GWB

Infant Categorical Perception 3

[Figure: sucking rate (interest) across a sequence of B tokens followed by P tokens.]

Nonetheless, infants possess abilities that would require within-category sensitivity.

• Infants can use allophonic differences at word boundaries for segmentation (Jusczyk, Hohne & Bauman, 1999; Hohne & Jusczyk, 1994).

• Infants can learn phonetic categories from distributional statistics (Maye, Werker & Gerken, 2002).

Distributional Learning 2

Speech production causes clustering along contrastive phonetic dimensions.

Distributional Learning

E.g. Voicing / Voice Onset Time
B: VOT ~ 0
P: VOT ~ 40

Result: Bimodal distribution

Within a category, VOT follows a Gaussian distribution.
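As an illustrative sketch of this clustering, the Python below draws VOT tokens from two Gaussians centred near 0 ms and 40 ms and prints a coarse histogram. The 8 ms standard deviation, the equal mixing of the two categories, and the helper names sample_vot and histogram are assumptions made for the example, not values from the study.

```python
import random

def sample_vot(n, seed=0):
    """Draw n (label, VOT in ms) pairs from an equal mixture of two Gaussians."""
    rng = random.Random(seed)
    # Means follow the slide (B ~ 0 ms, P ~ 40 ms); the SD of 8 ms is an assumption.
    categories = {"b": (0.0, 8.0), "p": (40.0, 8.0)}
    samples = []
    for _ in range(n):
        label = rng.choice(["b", "p"])
        mean, sd = categories[label]
        samples.append((label, rng.gauss(mean, sd)))
    return samples

def histogram(values, lo=-30, hi=70, width=10):
    """Count values in fixed-width bins; returns {bin_start: count}."""
    counts = {start: 0 for start in range(lo, hi, width)}
    for v in values:
        if lo <= v < hi:
            start = lo + int((v - lo) // width) * width
            counts[start] += 1
    return counts

tokens = sample_vot(2000)
hist = histogram([v for _, v in tokens])
for start in sorted(hist):
    # One '#' per 20 tokens gives a rough text plot of the two modes.
    print(f"{start:4d} ms  {'#' * (hist[start] // 20)}")
```

The printed bars should show two humps with a trough between them, which is the bimodal shape a distributional learner would have to pick up on.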

[Figure: frequency of tokens as a function of VOT, with modes near 0 ms and 40 ms.]

Distributional Learning 1

Distributional Learning

To statistically learn speech categories, infants must:

• Track frequencies of tokens at each value along a stimulus dimension. This requires the ability to track specific VOTs.

• Extract categories from the distribution.

[Figure: +voice and -voice categories, with a question mark.]

Question 1

Prior examinations of speech-categories used:

• Habituation: discrimination, not ID. Possible selective adaptation. Possible attenuation of sensitivity.

• Synthetic speech: not ideal for infants.

• Single exemplar/continuum: not necessarily a category representation.

Experiment 1: Reassess this issue with improved methods.

HTPP 1

Head-Turn Preference Procedure (Jusczyk & Aslin, 1995)

Infants exposed to a chunk of language:

• Words in running speech.

• Stream of continuous speech (à la statistical learning paradigm).

• Word list.

Head-Turn Preference Procedure

After exposure, memory for exposed items (or abstractions) is assessed by comparing listening time to consistent items with inconsistent items.


Test trials start with all lights off.


Center Light blinks.


Brings infant’s attention to center.


One of the side-lights blinks.

When the infant looks at the side-light, he hears a word

Beach… Beach… Beach…


…as long as he keeps looking.


Experiment 1 Methods

Experiment 1

7.5-month-old infants were exposed to either four b-words or four p-words.

80 repetitions total.

Form a category of the exposed class of words.

Peach / Beach, Pail / Bail, Pear / Bear, Palm / Bomb

Measure listening time on…

• Original words: Bear, Pear
• Competitors: Pear, Bear
• VOT closer to boundary: Bear*, Pear*

Experiment 1 Stimuli

B* and P* were judged /b/ or /p/ at least 90% consistently by adult listeners.

B*: 97%; P*: 96%

Stimuli constructed by cross-splicing naturally produced tokens of each end point.

B: M = 3.6 ms VOT; P: M = 40.7 ms VOT

B*: M = 11.9 ms VOT; P*: M = 30.2 ms VOT

Experiment 1 Familiarity vs. Novelty

Novelty/Familiarity preference varies across infants and experiments.

Exposed to   Novelty   Familiarity
B            36        16
P            21        12

Within each group, will we see evidence for gradiency?

Familiarity vs. Novelty

We’re only interested in the middle stimuli (b*, p*).

Infants were classified as novelty or familiarity preferring by performance on the endpoints.
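One way such a classification could be computed is sketched below. The criterion (longer listening to the competitor endpoint counts as a novelty preference), the function name classify_preference, and the listening times are all hypothetical; the exact rule used in the experiment is not stated here.

```python
def classify_preference(target_ms, competitor_ms):
    """Label an infant by which endpoint stimulus drew longer listening.

    target_ms: mean listening time (ms) to the exposed endpoint words.
    competitor_ms: mean listening time (ms) to the unexposed endpoint words.
    """
    if competitor_ms > target_ms:
        return "novelty"
    if competitor_ms < target_ms:
        return "familiarity"
    return "undetermined"

# Hypothetical listening times, for illustration only.
print(classify_preference(target_ms=6000, competitor_ms=8000))
print(classify_preference(target_ms=8000, competitor_ms=6000))
```

Because only the endpoints enter the classification, the middle stimuli (b*, p*) remain free to vary and can then be compared against the endpoints within each group.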

Categorical

Experiment 1 Fam. vs. Nov. 2

Gradiency

What about in between?

After being exposed to bear… beach… bail… bomb…

Infants who show a novelty effect will look longer for pear than bear.

[Figure: Gradient. Schematic listening time for Bear, Bear*, and Pear.]

Experiment 1 Results

Experiment 1 Results Nov

Novelty infants (B: 36, P: 21)

[Figure: listening time (ms, 4000 to 10000) for Target, Target*, and Competitor, by exposure group (B, P).]

Target vs. Target*: p < .001
Competitor vs. Target*: p = .017

Experiment 1 Results Fam

Familiarity infants (B: 16, P: 12)

[Figure: listening time (ms, 4000 to 10000) for Target, Target*, and Competitor, by exposure group (B, P).]

Target vs. Target*: p = .003
Competitor vs. Target*: p = .012

Experiment 1 Results Planned P

Planned Comparisons

Infants exposed to /p/

Novelty (N = 21)
[Figure: listening time (ms) for P, P*, and B; marked comparisons: .024*, .009**.]

Familiarity (N = 12)
[Figure: listening time (ms) for P, P*, and B; marked comparisons: .018*, .028*.]

Experiment 1 Results Planned B

Infants exposed to /b/

Novelty (N = 36)
[Figure: listening time (ms) for B, B*, and P; marked comparisons: <.001** and >.1 (a second copy of the figure shows >.2).]

Familiarity (N = 16)
[Figure: listening time (ms) for B, B*, and P; marked comparisons: .06, .15.]

Experiment 1 Conclusions

Contrary to all previous work, 7.5-month-old infants show gradient sensitivity to subphonemic detail.

• Clear effect for /p/.
• Effect attenuated for /b/.

Experiment 1 Conclusions 2

Reduced effect for /b/… But:

[Figure: three schematic listening-time patterns for Bear, Bear*, and Pear, labeled "Null Effect?", "Expected Result?", and "Actual result."]

• Bear* Pear

• Category boundary lies between Bear and Bear* (between 3 ms and 11 ms).

• Will we see evidence for within-category sensitivity with a different range?

Experiment 2

Same design as Experiment 1.

VOTs shifted away from hypothesized boundary (7 ms).

Train:
• Bomb, Bear, Beach, Bale: -9.7 ms

Test:
• Bomb, Bear, Beach, Bale: -9.7 ms
• Bomb*, Bear*, Beach*, Bale*: 3.6 ms
• Palm, Pear, Peach, Pail: 40.7 ms

Experiment 2 Results Fam

Familiarity infants (34 infants)

[Figure: listening time (ms, 4000 to 9000) for B-, B, and P; marked comparisons: p = .05*, p = .01**.]

Experiment 2 Results Nov

Novelty infants (25 infants)

[Figure: listening time (ms, 4000 to 9000) for B-, B, and P; marked comparisons: p = .02*, p = .002**.]

Experiment 2 Conclusions

• Within-category sensitivity in /b/ as well as /p/.

[Figure: Adult Categories. Category mapping strength as a function of VOT for /b/ and /p/, with the adult boundary marked.]

• Shifted category boundary in /b/: not consistent with adult boundary (or prior infant work). Why?


/b/ results consistent with (at least) two mappings.

1) Shifted boundary

[Figure: category mapping strength as a function of VOT for /b/ and /p/, with the adult boundary marked.]

• Inconsistent with prior literature.

• Why would infants have this boundary?


2) Sparse Categories

[Figure: category mapping strength as a function of VOT for /b/ and /p/, with a region of unmapped space and the adult boundary marked.]

HTPP is a one-alternative task. It asks "B or not-B?", not "B or P?".

Sparse categories may in fact be a by-product of efficient statistical learning.

Model Intro

Distributional learning model

[Figure: category mapping strength as a function of VOT for /b/ and /p/, with a region of unmapped space and the adult boundary marked.]

Computational Model

1) Model the distribution of tokens as a mixture of Gaussian distributions over a phonetic dimension (e.g. VOT).

2) After receiving an input, the Gaussian with the highest posterior probability is the “category”.

3) Each Gaussian has three parameters:
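Steps 1 and 2 can be sketched in Python as follows, assuming each Gaussian is described by a mean, a standard deviation, and a mixing weight. That parameterisation, the Gaussian class, and the numeric values are illustrative assumptions, not the settings of the model presented here.

```python
import math
from dataclasses import dataclass

@dataclass
class Gaussian:
    mean: float    # centre of the category on the VOT axis (ms); assumed parameter
    sd: float      # spread of the category (ms); assumed parameter
    weight: float  # mixing weight, i.e. prior probability; assumed parameter

def likelihood(g, x):
    """Weighted Gaussian density of input x under component g."""
    z = (x - g.mean) / g.sd
    return g.weight * math.exp(-0.5 * z * z) / (g.sd * math.sqrt(2 * math.pi))

def posteriors(mixture, x):
    """Posterior probability of each component given input x."""
    scores = [likelihood(g, x) for g in mixture]
    total = sum(scores)
    return [s / total for s in scores]

def categorize(mixture, x):
    """Index of the component with the highest posterior probability."""
    post = posteriors(mixture, x)
    return max(range(len(post)), key=post.__getitem__)

# Illustrative values only: means near 0 and 40 ms, SD of 8 ms, equal weights.
mixture = [Gaussian(0.0, 8.0, 0.5), Gaussian(40.0, 8.0, 0.5)]
for vot in (3.6, 11.9, 30.2, 40.7):
    print(vot, categorize(mixture, vot), [round(p, 3) for p in posteriors(mixture, vot)])
```

The winning category is discrete, but the posterior itself changes gradually with VOT, so a representation of this kind can carry graded within-category information.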

Model Intro 2

Statistical Category Learning

1) Start with a set of randomly selected Gaussians.

2) After each input, adjust each parameter to find best description of the input.

3) Start with more Gaussians than necessary: the model doesn’t innately know how many categories there are. -> for unneeded categories.
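The learning steps above can be sketched as a simple online procedure. The update rule below (each Gaussian moves in proportion to its posterior responsibility for the input) is a generic stochastic scheme chosen for illustration, and the learning rate, starting spread, number of Gaussians, and training data are all assumptions; it is not necessarily the rule used in the model described here.

```python
import math
import random

def responsibilities(means, sds, weights, x):
    """Posterior of each Gaussian given x, computed in the log domain for stability."""
    logs = [math.log(max(w, 1e-300)) - math.log(s) - 0.5 * ((x - m) / s) ** 2
            for m, s, w in zip(means, sds, weights)]
    top = max(logs)
    exps = [math.exp(v - top) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]

def train(inputs, n_gaussians=5, start_sd=3.0, lr=0.01, min_sd=0.5, seed=1):
    """Start from random Gaussians and nudge every parameter after each input."""
    rng = random.Random(seed)
    means = [rng.uniform(-20.0, 60.0) for _ in range(n_gaussians)]
    sds = [start_sd] * n_gaussians
    weights = [1.0 / n_gaussians] * n_gaussians
    for x in inputs:
        resp = responsibilities(means, sds, weights, x)
        for k, r in enumerate(resp):
            err = x - means[k]
            means[k] += lr * r * err
            # Grow the spread if the input lies further out than expected, else shrink it.
            sds[k] = max(min_sd, sds[k] + lr * r * (err * err - sds[k] ** 2) / sds[k])
            # Weights track average responsibility; rarely used Gaussians fade.
            weights[k] += lr * (r - weights[k])
    return means, sds, weights

data_rng = random.Random(0)
inputs = [data_rng.gauss(data_rng.choice([0.0, 40.0]), 8.0) for _ in range(5000)]
means, sds, weights = train(inputs)
for m, s, w in sorted(zip(means, sds, weights)):
    print(f"mean={m:6.1f}  sd={s:5.1f}  weight={w:.3f}")
```

Starting with five Gaussians for two underlying categories mirrors step 3: the learner is not told how many categories exist, and Gaussians that rarely win an input see their weights decay.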

Model Intro 3

Model Overgen

Overgeneralization
• large
• costly: lose phonetic distinctions…

Model Undergen

Undergeneralization
• small
• not as costly: maintain distinctiveness.

Model err on side of caution

To increase likelihood of successful learning:
• err on the side of caution.
• start with small

[Figure: P(Success) (0 to 1) as a function of starting value (0 to 60), for the 2 Category Model and the 3 Category Model.]

Model Sparseness

Sparseness coefficient: % of space not mapped to any category.
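A rough sketch of how such a coefficient could be computed is given below. It assumes that a point on the VOT axis counts as mapped when at least one category's Gaussian density reaches a threshold; the threshold, the VOT range, the grid step, and the example categories are hypothetical choices, not the definition used in the model.

```python
import math

def density(x, mean, sd):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def sparseness(categories, lo=-30.0, hi=70.0, step=0.5, threshold=0.005):
    """Fraction of grid points on [lo, hi] where no category reaches threshold."""
    n_points = int(round((hi - lo) / step)) + 1
    unmapped = 0
    for i in range(n_points):
        x = lo + i * step
        if max(density(x, m, s) for m, s in categories) < threshold:
            unmapped += 1
    return unmapped / n_points

# Hypothetical (mean, sd) pairs: narrow versus wide categories at 0 and 40 ms.
narrow = [(0.0, 2.0), (40.0, 2.0)]
wide = [(0.0, 12.0), (40.0, 12.0)]
print(round(sparseness(narrow), 2), round(sparseness(wide), 2))
```

Under these assumptions narrow categories leave most of the axis unmapped, while wide categories cover nearly all of it.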

[Figure: Avg. sparsity coefficient (0 to 0.4) over training epochs (0 to 12000) for starting values of .5-1; inset over VOT labeled "Unmapped space" and "Small".]

Model Sparseness 2

[Figure: Avg. sparsity coefficient (0 to 0.4) over training epochs (0 to 12000) for starting values of .5-1 and 20-40.]

Model Sparseness 3

[Figure: Avg. sparsity coefficient (0 to 0.4) over training epochs (0 to 12000) for starting values of .5-1, 3-11, 12-17, and 20-40.]

Model Conclusions

Small starting ’s lead to sparse category structure during infancy—much of phonetic space is unmapped.

Occasionally model leaves sparse regions at the end of learning.

1) Competition/Choice framework:
• Additional competition or selection mechanisms during processing allow categorization on the basis of incomplete information.

Model Conclusions

To avoid overgeneralization, it is better to start with small estimates for

Model Conclusions 2

• Similar properties in terms of starting and the resulting sparseness.

2) Non-parametric models
• Competitive Hebbian Learning (Rumelhart & Zipser, 1986).
• Not constrained by a particular equation: can fill space better.
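For a flavour of this family of models, the sketch below implements a bare winner-take-all competitive learning rule on a one-dimensional VOT axis. It is a simplified illustration, not the Rumelhart & Zipser architecture itself: the prototype starting positions, learning rate, and training data are assumptions.

```python
import random

def competitive_learning(inputs, prototypes, lr=0.05):
    """Winner-take-all learning: only the nearest prototype moves toward each input."""
    protos = list(prototypes)
    for x in inputs:
        winner = min(range(len(protos)), key=lambda k: abs(x - protos[k]))
        protos[winner] += lr * (x - protos[winner])
    return protos

rng = random.Random(0)
# Hypothetical training data: VOTs clustered near 0 ms and 40 ms (SD 8 ms).
inputs = [rng.gauss(rng.choice([0.0, 40.0]), 8.0) for _ in range(5000)]
start = [-20.0, 0.0, 20.0, 40.0, 60.0]
learned = competitive_learning(inputs, start)
print([round(p, 1) for p in learned])
```

No Gaussian shape is imposed here: each prototype simply settles where the inputs it wins are concentrated.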

Conclusions 3

Final Conclusions

Infants show graded response to within-category detail.

/b/-results suggest regions of unmapped phonetic space.

Statistical approach provides support for sparseness.
• Given current learning theories, sparseness results from optimal starting parameters.

Empirical test will require a two-alternative task.
• AEM: train infants to make eye-movements in response to stimulus identity.

Future Work


• Infants make anticipatory eye-movements along predicted trajectory, in response to stimulus identity.

• Two alternatives allow us to distinguish between category boundary and unmapped space.

Last Word

Early speech categories emerge from an interplay of

• Exquisite sensitivity to graded detail in the signal.

• Long-term sensitivity to statistics of the signal.

• Early biases to optimize the learning problem.

[Figure: VOT axis from -60 to 80.]

The last word
