

Neural correlates of continuous and categorical sinewave speech perception: An FMRI study

Rutvik Desai, Einat Liebenthal, Eric Waldron, and Jeffrey R. Binder

Department of Neurology, Medical College of Wisconsin, Milwaukee, WI

Task                                                          Trials

Outside the scanner
  P identification, with feedback, on anchor points             40
  P identification, without feedback, on continuum              70
  N identification, with feedback, on anchor points             40
  N identification, without feedback, on continuum              70
  P ABX, with feedback, on continuum                            18
  N ABX, with feedback, on continuum                            18

Inside the scanner
  P-N-P-N ABX (4 runs)                                         404

Inform subjects about the stimuli
  P identification, with feedback, on anchor points             30
  P identification, without feedback, on continuum              70
  N identification, with feedback, on anchor points             30
  N identification, without feedback, on continuum              70
  P-N-P-N ABX (4 runs)                                         404

Introduction

• Numerous studies implicate the left superior temporal gyrus and sulcus (LSTG/S) in speech perception, though the precise function of these areas remains unclear.

• Are these areas sensitive to the acoustic properties of speech, or to other properties of speech sounds such as the categorical nature of their perception?

• Here, we examine changes in brain activation related to a perceptual shift from continuous to categorical analysis of sinewave analogs of English CV syllables, following exposure to their phonetic properties. Naïve listeners typically perceive sinewave speech analogs as nonspeech sounds but can learn to attend to their phonetic properties (1).

Methods

Subjects: 28 right-handed subjects.

Stimuli: Sinewave analogs of an 8-token /ba/ to /da/ continuum (P). The continuum was created by varying the second formant (F2) continuously from /ba/ to /da/. Three pairs (ba2-ba4, ba4-ba6, and ba6-ba8) were chosen for the ABX task (see below). The ba4-ba6 pair straddles the /ba/-/da/ category boundary, while the other two pairs are within-category.

• A corresponding continuum of nonphonetic sinewave analogs (N) was created by spectrally rotating the first formant (F1) of the syllables.
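As a rough illustration of how an 8-step sinewave continuum of this kind could be generated, the sketch below replaces each formant with a single sinusoid and varies only the F2 onset frequency across tokens. All numeric values (sample rate, duration, transition length, formant frequencies) are illustrative assumptions, not the parameters used for the poster's stimuli. Only two tones are synthesized for brevity, and the spectrally rotated N continuum is not modeled.

```python
import math

# All values below are assumptions chosen for illustration only.
SAMPLE_RATE = 16000          # samples per second
DURATION = 0.25              # token duration in seconds
TRANSITION = 0.05            # formant transition duration in seconds
F1_ONSET, F1_STEADY = 400.0, 700.0
F2_STEADY = 1200.0
F2_ONSET_BA, F2_ONSET_DA = 900.0, 1800.0   # continuum endpoints


def f2_onsets(n_tokens=8):
    """Equally spaced F2 onset frequencies from the /ba/ end to the /da/ end."""
    step = (F2_ONSET_DA - F2_ONSET_BA) / (n_tokens - 1)
    return [F2_ONSET_BA + i * step for i in range(n_tokens)]


def track(onset, steady, t):
    """Frequency at time t: linear glide from onset to steady state, then flat."""
    if t >= TRANSITION:
        return steady
    return onset + (steady - onset) * t / TRANSITION


def synthesize(f2_onset):
    """Sum of two sinusoids that follow the F1 and F2 frequency tracks."""
    n_samples = int(SAMPLE_RATE * DURATION)
    phase1 = phase2 = 0.0
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        # Accumulate phase so the frequency can change smoothly over time.
        phase1 += 2 * math.pi * track(F1_ONSET, F1_STEADY, t) / SAMPLE_RATE
        phase2 += 2 * math.pi * track(f2_onset, F2_STEADY, t) / SAMPLE_RATE
        samples.append(0.5 * math.sin(phase1) + 0.5 * math.sin(phase2))
    return samples


continuum = [synthesize(f) for f in f2_onsets()]
```

Accumulating phase sample by sample, rather than computing sin(2πft) directly, avoids discontinuities while the frequency is gliding.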

Task: Familiarization and testing with stimuli outside the scanner, using identification (see table).

• Scanner task: 2-alternative ABX forced-choice discrimination (is X identical to the first or second token in the preceding AB pair?).

• In the “pre” session, subjects are naïve about the nature of the stimuli. They are informed about the phonetic nature of the stimuli after the first scan session and perform identical tasks on identical stimuli in the “post” session.

• Four conditions: preP, preN, postP, postN.

• 80 trials per condition (40 across category, 40 within category).

• Two runs per condition; 8 baseline (silence) trials per run.
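The ABX design described above can be sketched as follows. The pair definitions and the 40/40 split come from the poster. The random ordering of A and B, the equal split of within-category trials between the two within pairs, and all function names are assumptions made for illustration.

```python
import random

# Token pairs used for discrimination; ba4-ba6 crosses the category boundary.
PAIRS = [(2, 4), (4, 6), (6, 8)]
ACROSS = {(4, 6)}


def pair_type(a, b):
    """Label a pair as 'across' or 'within' regardless of presentation order."""
    return "across" if tuple(sorted((a, b))) in ACROSS else "within"


def make_trial(pair, rng):
    """Build one ABX trial: random A/B order, X matches the 1st or 2nd token."""
    a, b = pair if rng.random() < 0.5 else pair[::-1]
    answer = rng.choice((1, 2))
    x = a if answer == 1 else b
    return {"A": a, "B": b, "X": x, "answer": answer, "type": pair_type(a, b)}


def make_condition(rng):
    """80 trials: 40 across, 40 within (assumed 20 per within pair)."""
    trials = [make_trial((4, 6), rng) for _ in range(40)]
    trials += [make_trial((2, 4), rng) for _ in range(20)]
    trials += [make_trial((6, 8), rng) for _ in range(20)]
    rng.shuffle(trials)
    return trials


def accuracy_by_type(trials, responses):
    """Proportion correct, computed separately for across and within trials."""
    correct = {"across": 0, "within": 0}
    total = {"across": 0, "within": 0}
    for trial, response in zip(trials, responses):
        total[trial["type"]] += 1
        correct[trial["type"]] += int(response == trial["answer"])
    return {k: correct[k] / total[k] for k in total if total[k]}


rng = random.Random(0)
trials = make_condition(rng)
perfect = accuracy_by_type(trials, [t["answer"] for t in trials])
```

Keeping across- and within-category accuracy separate is what allows the behavioral index in the ROI analysis to be computed per subject.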

Imaging Parameters

• Image acquisition: 1.5T GE Signa scanner.

• Functional images: gradient-echo echo-planar images, clustered acquisition at TR = 8 s, TE = 40 ms, flip angle = 90°, acquisition time = 2100 ms; 22 axial slices, 3.75 x 3.75 x 4 mm³.

• Structural images: 3-D spoiled gradient-echo sequence; whole-brain sagittal slices, 0.9 x 0.9 x 1.2 mm³.

• Image analysis: spatial co-registration and multiple linear regression (AFNI 3dDeconvolve) with reference functions representing the conditions.

• Functional group maps were thresholded at voxel-wise p < 0.02, corrected to map-wise p < 0.05.
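A small worked calculation shows what the functional acquisition parameters imply. With clustered acquisition, the scanner collects a volume in a short burst and is quiet for the rest of each TR; in auditory studies this quiet interval is typically where the stimuli are presented, although the poster does not state the stimulus timing. The slab thickness below assumes contiguous slices with no gap, which is also not stated.

```python
# Values taken from the functional imaging parameters.
TR_MS = 8000                     # repetition time in milliseconds
ACQUISITION_MS = 2100            # time spent acquiring one volume
VOXEL_MM = (3.75, 3.75, 4.0)     # voxel dimensions in millimetres
N_SLICES = 22

# Scanner-quiet interval within each TR.
silent_gap_ms = TR_MS - ACQUISITION_MS

# Volume of a single functional voxel.
voxel_volume_mm3 = VOXEL_MM[0] * VOXEL_MM[1] * VOXEL_MM[2]

# Coverage along the slice axis, assuming no gap between slices.
slab_thickness_mm = N_SLICES * VOXEL_MM[2]

print(f"silent gap per TR: {silent_gap_ms} ms")
print(f"voxel volume: {voxel_volume_mm3} mm^3")
print(f"slab thickness: {slab_thickness_mm} mm")
```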

Behavioral Results

FMRI Results

postP > preP

L STS, caudate, anterior and posterior cingulate

R anterior and posterior cingulate, supramarginal gyrus

postP > postN

L STS, middle frontal gyrus, angular gyrus

R middle frontal gyrus

postN > preN

L middle and inferior frontal gyrus, anterior and posterior cingulate, caudate, inferior parietal lobule and angular gyrus, precuneus, putamen, thalamus, lingual gyrus, STG and planum temporale

R middle frontal gyrus, anterior and posterior cingulate, caudate, lingual gyrus, precentral gyrus, STG, precuneus, thalamus

preP > preN

None

ROI Analysis

• A spherical ROI with a diameter of 12 mm was placed at the center of mass of the left STS activation in the postP > preP contrast (blue; Talairach coordinates -52, -36, 1).

• Four other spherical ROIs were placed 15 mm anterior, posterior, dorsal, and ventral to this ROI.

• An index of categorical perception was computed from each subject's discrimination accuracy on within-category and across-category pairs:

index = postAcross/postWithin – preAcross/preWithin

• A high value of this index indicates an increase in the categorical nature of the perception of stimuli from pre (naïve) to post (informed) conditions.

• The correlation between the average activation and the behavioral index was computed for each of the ROIs.

• Activation in the center-of-mass (blue) ROI was significantly correlated (p < 0.05, two-tailed) with the behavioral index of categorical perception. Activation in none of the other ROIs was significantly correlated with the index.
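The index and the ROI correlation described above can be sketched in a few lines. The index formula is the one given on the poster. The poster does not name the correlation statistic, so Pearson's r with a t-test on n - 2 degrees of freedom is an assumption here, and the subject values in the example are invented purely to show the calling pattern.

```python
import math


def categorical_index(pre_across, pre_within, post_across, post_within):
    """index = postAcross/postWithin - preAcross/preWithin (accuracies as proportions)."""
    return post_across / post_within - pre_across / pre_within


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical accuracies for four subjects:
# (pre_across, pre_within, post_across, post_within)
accuracies = [
    (0.80, 0.75, 0.90, 0.70),
    (0.78, 0.76, 0.85, 0.74),
    (0.82, 0.80, 0.95, 0.72),
    (0.79, 0.77, 0.83, 0.78),
]
# Hypothetical mean ROI activation per subject (arbitrary units).
roi_activation = [0.42, 0.20, 0.55, 0.10]

indices = [categorical_index(*a) for a in accuracies]
r = pearson_r(indices, roi_activation)
t = t_statistic(r, len(indices))
```

For the poster's sample of 28 subjects, the test has 26 degrees of freedom, so a two-tailed p < 0.05 corresponds to |t| above roughly 2.056, or |r| above roughly 0.374.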

Discussion and Conclusions

• Identical stimuli were heard once as non-speech sounds (pre- condition) and once as speech (post- condition).

• The behavioral identification data indicate a shift from continuous to categorical perception from the pre- to the post- session.

• The discrimination accuracy of the syllables was better across category even in the pre- condition. However, all subjects reported not identifying the sounds as speech in that condition. Thus, this effect may be due to the acoustic characteristics of the sounds, rather than to their categorization in the pre- condition.

• A region in the left STS was activated when the syllables were perceived categorically as opposed to continuously, in line with previous results (2).

• A similar region was activated in the postP > postN contrast.

• This region was not activated in the postN > preN nonphonetic control contrast. A deactivation of dorsal primary auditory regions was observed, possibly due to habituation.

• The activation in the left STS was correlated with a behavioral index of categorical perception.

• The results suggest that this region in the L STS is not sensitive to the fine-grained acoustic properties of CV syllables, but responds to the categorical nature of their perception.

References

(1) Remez et al., 1981 Science 212:947-950; Grunke & Pisoni, 1982 Percep Psych 31:210-218.

(2) Liebenthal et al., 2003 JOCN Supp. 103-104.

Work supported by R01- DC 006287-01 (E. Liebenthal)

[Figure: Phonetic Discrimination (n=28). Accuracy (50% to 100%) for across- and within-category pairs, Pre vs. Post.]

[Figure: NonPhonetic Discrimination (n=28). Accuracy (50% to 100%) for across- and within-category pairs, Pre vs. Post.]

[Figure: Phonetic Identification (n=28). Perceived as /ba/ (0% to 100%) for tokens ba2 to ba8, Pre vs. Post.]

[Figure: NonPhonetic Identification (n=28). Perceived as token1 (0% to 100%) for tokens NP2 to NP8, Pre vs. Post.]

Download from www.neuro.mcw.edu/~rhdesai/