
SPEECH

PSYC 330: PERCEPTION

SOME BASICS

• Methods of Manipulation
  • PHONATION (air pushed across vocal cords)
    • Airflow
    • Mass and "tuning" of cords
    • Harmonics

• ARTICULATION (changes in vocal tract – "ah"; "ee")
  • Vocal tract (everything above the larynx) acts as a resonator

Change in shape → change in resonance characteristics (the shape increases/decreases energy at different frequencies)

Filter function: peaks in the output spectrum are called FORMANTS; lowest frequency = F1, then F2, and so forth (see the sketch below)
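
A rough way to hear the source-filter idea: the Python sketch below pushes an impulse-train "glottal source" (energy at f0 and its harmonics) through a few second-order resonators standing in for vocal-tract formants. The sample rate, f0, and formant frequencies are illustrative assumptions, not values from the lecture.

```python
# Minimal source-filter sketch of vowel production (hypothetical values):
# a periodic glottal source rich in harmonics is shaped by vocal-tract
# resonances (formants), modeled here as second-order IIR resonators.
import numpy as np
from scipy.signal import lfilter

fs = 16000                      # sample rate (Hz)
f0 = 120                        # fundamental frequency of phonation (Hz)
dur = 0.5                       # seconds
t = np.arange(int(fs * dur)) / fs

# Source: an impulse train at f0 approximates the glottal pulses,
# giving energy at f0 and all of its harmonics.
source = np.zeros_like(t)
source[::int(fs / f0)] = 1.0

# Filter: each formant is a two-pole resonator at frequency f with bandwidth bw.
def formant_filter(signal, f, bw, fs):
    r = np.exp(-np.pi * bw / fs)              # pole radius from bandwidth
    theta = 2 * np.pi * f / fs                # pole angle from center frequency
    a = [1.0, -2 * r * np.cos(theta), r * r]  # denominator coefficients
    return lfilter([1.0], a, signal)

# Rough formant frequencies for an "ah"-like vowel (illustrative, not measured).
vowel = source
for f, bw in [(700, 80), (1200, 90), (2600, 120)]:
    vowel = formant_filter(vowel, f, bw, fs)

# Changing the formant frequencies (i.e., reshaping the "vocal tract" filter)
# while keeping the same source yields a different vowel.
```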

ARTICULATION AND SPEECH

• Vowels
  • TONGUE
    • Up or down
    • Front or back
  • LIPS
    • Degree of rounding
  • Examples: EE, AH, OO

• Consonants (a small feature-lookup sketch follows this list)
  • PLACE of articulation
    • Bilabial (lips): b, p, m
    • Alveolar (ridge behind the teeth): d, t, n
    • Velar (soft palate): g, k, ng
  • MANNER of articulation
    • Stops: b, d, g, p, t, k
    • Fricatives: s, z, f, v, th, sh
    • Laterals and glides: l, r, w, y
    • Affricates: ch, j
    • Nasals: n, m, ng
  • VOICING
    • Voiced: b, m, z, l, r
    • Not voiced: p, s, ch
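
The place/manner/voicing scheme above can be encoded as a simple lookup table. The Python sketch below is just an illustrative data structure covering some of the consonants listed on the slide.

```python
# Consonant classification along the three articulatory dimensions above.
# The feature values are standard phonetic descriptions for these consonants;
# the dictionary itself is only an illustrative sketch.
CONSONANTS = {
    "b":  ("bilabial", "stop",      "voiced"),
    "p":  ("bilabial", "stop",      "voiceless"),
    "m":  ("bilabial", "nasal",     "voiced"),
    "d":  ("alveolar", "stop",      "voiced"),
    "t":  ("alveolar", "stop",      "voiceless"),
    "n":  ("alveolar", "nasal",     "voiced"),
    "g":  ("velar",    "stop",      "voiced"),
    "k":  ("velar",    "stop",      "voiceless"),
    "ng": ("velar",    "nasal",     "voiced"),
    "s":  ("alveolar", "fricative", "voiceless"),
    "z":  ("alveolar", "fricative", "voiced"),
}

def describe(consonant: str) -> str:
    """Return 'place manner, voicing' for a consonant symbol."""
    place, manner, voicing = CONSONANTS[consonant]
    return f"{consonant}: {place} {manner}, {voicing}"

print(describe("b"))   # b: bilabial stop, voiced
print(describe("k"))   # k: velar stop, voiceless
```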

ADDITIONAL COMPLICATIONS

• Co-articulation
  • Articulation of one speech sound overlaps with the next, because we talk so fast
  • We adjust the production of sounds based upon the sounds preceding and following them
  • Context effect (say "moody" and "eedoom" – the /d/ is produced differently in each)

• Lack of physical invariance in the stimulus – doesn't bother speech perception in practice, but it is a big problem for theory (and for AI)

• Categorical perception
  1. Sharp labeling (one OR the other)
  2. Inability (or difficulty) discriminating within categories
  3. Discrimination performance predicted by labeling (see the sketch below)
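
The slides don't spell out how labeling predicts discrimination, but the standard formalization (the Haskins "covert labeling" model) fits in a few lines of Python; the labeling probabilities in the example are made up.

```python
# Under the covert-labeling model, listeners label A, B, and X internally and
# respond from the labels, guessing when A and B get the same label. This
# yields a closed-form prediction for ABX accuracy from labeling alone.
def predicted_abx_accuracy(p_a: float, p_b: float) -> float:
    """Predicted proportion correct in ABX, where p_a and p_b are the
    probabilities that stimuli A and B are labeled as one category."""
    return 0.5 + 0.5 * (p_a - p_b) ** 2

# Within-category pair: both stimuli almost always get the same label -> near chance.
print(predicted_abx_accuracy(0.95, 0.90))  # ~0.50
# Across-category pair: labels differ sharply -> well above chance.
print(predicted_abx_accuracy(0.95, 0.10))  # ~0.86
```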

• McGurk effect – mismatched auditory and visual input
  • Visual "gah" + auditory "bah" → perception "dah"

• Speech Segmentation
  • Saffran et al. (1996)
  • 8-month-old infants trained with a 2-minute stream of an artificial language
  • After this brief exposure, they were already picking out the "words"
  • DV = listening time to presented "words" and "non-words"
  • Concluded that infants pick up the covariance of sound combinations (statistical likelihood) – see the sketch below
    • pri-tee (baby, good, far, nice)
    • bay-bee (girl, boy, good)
    • tee-bay (?)
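
The statistic behind "statistical likelihood" in Saffran et al. is the transitional probability between adjacent syllables. A minimal Python sketch, using an invented toy stream rather than their actual stimuli:

```python
# Transitional probability TP(x -> y) = count(xy) / count(x) over a syllable
# stream. Within-word transitions come out high; transitions that cross a
# word boundary come out lower, which is the cue for segmentation.
from collections import Counter

stream = "pri tee bay bee pri tee goo dee bay bee pri tee bay bee goo dee".split()

pair_counts = Counter(zip(stream, stream[1:]))
syll_counts = Counter(stream[:-1])

def transitional_probability(x: str, y: str) -> float:
    """P(next syllable is y | current syllable is x)."""
    return pair_counts[(x, y)] / syll_counts[x]

print(transitional_probability("pri", "tee"))  # high: "tee" always follows "pri"
print(transitional_probability("tee", "bay"))  # lower: crosses a word boundary
```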

LANGUAGE DEVELOPMENT

• Can you say 7,777 in Swedish?
  • Not just pronunciation, but hearing too
• Preference methodology
  • In utero (heart-rate changes)
  • Familiarity effect (own language, own mother, own stories)

• 6 months
  • Preference for own-language vowels

• 12 months
  • Preference for own-language consonants

• Huppi & Dubois (2013): brain scans on premature babies (up to 3 months early – brain not fully developed)
  • Found that they discriminated between male and female voices
  • Found that they discriminated between "ga" and "da"
  • Used the same regions of the brain as adults do to make these discriminations

SPECIAL TOPIC: EMOTION PERCEPTION IN LANGUAGE

Laukka (2005): Categorical Perception of Vocal Emotion Expression

Stimulus development: an actress says "It is now 11 o'clock" with tones reflecting anger, fear, happiness, and sadness

Physically "morph" the sounds from one emotion to the next → continuous variation

METHOD

• Undergraduates presented with a sequential discrimination task with two tones (ABX): is X equal to A or to B?
• All combinations of the morphed sounds differing by 20% were compared (see the sketch below)
• Also asked to judge the emotion (in addition to discriminating it)
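
The slide gives the 20% step size but not the rest of the design, so the Python sketch below only illustrates how adjacent morph pairs split into within- and across-category comparisons; the morph levels and the 50% category boundary are hypothetical assumptions.

```python
# Toy organization of the discrimination pairs along one emotion-to-emotion
# morph continuum. The 20% step is from the slide; the boundary is assumed.
morph_levels = [0, 20, 40, 60, 80, 100]   # % morphed from emotion A to emotion B
BOUNDARY = 50                              # assumed category boundary

pairs = list(zip(morph_levels, morph_levels[1:]))   # adjacent pairs, 20% apart

for a, b in pairs:
    kind = "across-category" if (a < BOUNDARY) != (b < BOUNDARY) else "within-category"
    print(f"{a}% vs {b}%: {kind}")

# Categorical perception predicts better ABX discrimination for the single
# across-category pair (40% vs 60%) than for the within-category pairs.
```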

IDENTIFICATION ACCURACY

REACTION TIMES: 50% VS. OTHER

DISCRIMINATION: ACROSS-CATEGORY VS. WITHIN-CATEGORY

BRAIN STRUCTURES IN SPEECH COMPREHENSION

• Old school: Broca's and Wernicke's areas → lateralization effects

• A1 (primary auditory cortex); belt and parabelt → anterior temporal lobe
• HOW?
  • Rosen et al. (2011)
  • How (on what basis) does language (as opposed to other complex sounds) become lateralized?
  • IV = intelligible vs. unintelligible sentences of equal auditory complexity (created by manipulating frequency and amplitude changes in the auditory signal)
  • DV = brain scan data (PET)
  • Results: intelligible sentences processed in the left temporal lobe; equally complex but unintelligible sentences processed bilaterally

ATTENTION AND SPEECH

"Mechanisms Underlying Selective Neuronal Tracking of Attended Speech at a 'Cocktail Party” Golubic et al, 2013• The Cocktail Party Phenomenon – how do we do it?• Direct electrical recording from the brains of epilepsy

patients• Presented naturalistic stimuli re: “cocktail party”

• Findings: brain regions in and near primary auditory cortex respond to both the attended and the unattended speaker; processing in subsequent pathways is selective

• We use bottom-up processing (temporal/amplitude patterns) to “tune” the selection – selectivity “unfolds” (becomes more prominent) across a sentence

Figure: green dots = brain regions that responded to both speakers (attended and ignored); red dots = brain regions that responded selectively (only to the attended speaker). (A toy sketch of envelope tracking follows.)
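
As a toy illustration of the bottom-up amplitude cue mentioned above (not the authors' actual analysis), the Python sketch below asks which of two synthetic "speakers" a simulated neural response tracks better, by correlating the response with each speaker's amplitude envelope. All signals and parameters are invented.

```python
# Envelope tracking: a simulated "neural response" is compared with the slow
# amplitude envelopes of an attended and an ignored speaker.
import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(0)
fs = 1000                                   # Hz
t = np.arange(0, 5, 1 / fs)

def toy_speech(seed: int) -> np.ndarray:
    """Noise carrier with a slow random amplitude envelope (speech-like timing)."""
    r = np.random.default_rng(seed)
    env = np.convolve(r.random(t.size), np.ones(300) / 300, mode="same")  # smooth ~0.3 s
    return env * r.standard_normal(t.size)

attended, ignored = toy_speech(1), toy_speech(2)

def envelope(x: np.ndarray) -> np.ndarray:
    """Amplitude envelope via the analytic signal."""
    return np.abs(hilbert(x))

# Simulated neural response: follows the attended speaker's envelope, plus noise.
neural = envelope(attended) + 0.5 * rng.standard_normal(t.size)

for name, sig in [("attended", attended), ("ignored", ignored)]:
    r = np.corrcoef(neural, envelope(sig))[0, 1]
    print(f"correlation with {name} speaker's envelope: {r:.2f}")
```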