Some Speech Basics: Phonetic Transcription, Context-dependent Variation, and Intonation
Jennifer J. Venditti, Postdoctoral Research Associate
Columbia Computer Science
12 September 2002
1. Phonetic Transcription
Spelling vs. Sounds

same spelling = different sounds
  o     comb, tomb, bomb
  oo    blood, food, good
  c     court, center, cheese
  s     reason, surreal, shy

same sound = different spellings
  [i]   sea, see, scene, receive, thief
  [s]   cereal, same, miss
  [u]   true, few, choose, lieu, do
  [ay]  prime, buy, rhyme, lie

combination of letters = single sound
  ch    child, beach
  th    that, bathe
  oo    good, foot
  gh    laugh

single letter = combination of sounds
  x     exit, Texas
  u     use, music

‘silent’ letters
  k     knife, know
  p     psycho, pterodactyl
  e     moose, bone
  gh    through
Figures 4.1 and 4.2: Jurafsky & Martin (2000), pages 94-95.
On-line pronunciation dictionaries

Dictionary    Phone set derived from   Number of wordforms   English variety
LDC PRONLEX   ARPAbet                  90,694                American
CMUdict       ARPAbet                  100,000               American
CELEX         IPA                      160,595               British
Source: Jurafsky & Martin (2000), page 121.
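A pronunciation dictionary is, at heart, a mapping from wordforms to phone strings. The sketch below uses a tiny hand-transcribed lexicon (an illustrative assumption, not an excerpt from PRONLEX, CMUdict, or CELEX) to show how the same spelling can map to different ARPAbet vowels. The names `TOY_LEXICON` and `vowels_of` are hypothetical.

```python
# Toy lexicon, hand-transcribed in ARPAbet for illustration only.
# It is NOT loaded from any real dictionary file.
TOY_LEXICON = {
    "comb": ["k", "ow", "m"],
    "tomb": ["t", "uw", "m"],
    "bomb": ["b", "aa", "m"],
    "blood": ["b", "l", "ah", "d"],
    "food": ["f", "uw", "d"],
    "good": ["g", "uh", "d"],
}

# ARPAbet vowel symbols used in these slides.
ARPABET_VOWELS = {
    "iy", "ih", "eh", "ae", "aa", "ao", "uw", "uh", "ah",
    "ax", "ix", "ux", "ey", "ow", "aw", "oy", "ay",
}


def vowels_of(word):
    """Return the vowel phones of a word in the toy lexicon, in order."""
    return [phone for phone in TOY_LEXICON[word] if phone in ARPABET_VOWELS]


# Same spelling "o", three different vowels.
for word in ("comb", "tomb", "bomb"):
    print(word, vowels_of(word))
```

A real dictionary would also carry stress marks and alternative pronunciations, which this sketch leaves out.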
Places of articulation
http://www.chass.utoronto.ca/~danhall/phonetics/sammy.html
labial
dental
alveolar
post-alveolar/palatal
velar
uvular
pharyngeal
laryngeal/glottal
Vocal fold vibration
[UCLA Phonetics Lab demo]
Articulatory parameters for English consonants (in ARPAbet)

Columns: PLACE OF ARTICULATION. Rows: MANNER OF ARTICULATION.

MANNER    bilabial  labio-dental  inter-dental  alveolar  palatal  velar  glottal
stop      p b                                   t d                k g    q
fric.               f v           th dh         s z       sh zh           h
affric.                                                   ch jh
nasal     m                                     n                  ng
approx.   w                                     l/r       y
flap                                            dx

VOICING: in each pair, the first symbol is voiceless and the second is voiced.
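The consonant chart can be read as a small feature table: each phone has a place, a manner, and a voicing value. The following is a minimal sketch of that idea. The place assignments follow the standard chart layout, which is an assumption about the flattened slide, and the names `CONSONANTS` and `natural_class` are hypothetical.

```python
# phone -> (place, manner, voiced); values follow the standard ARPAbet chart.
CONSONANTS = {
    "p": ("bilabial", "stop", False), "b": ("bilabial", "stop", True),
    "t": ("alveolar", "stop", False), "d": ("alveolar", "stop", True),
    "k": ("velar", "stop", False), "g": ("velar", "stop", True),
    "q": ("glottal", "stop", False),
    "f": ("labio-dental", "fricative", False),
    "v": ("labio-dental", "fricative", True),
    "th": ("inter-dental", "fricative", False),
    "dh": ("inter-dental", "fricative", True),
    "s": ("alveolar", "fricative", False), "z": ("alveolar", "fricative", True),
    "sh": ("palatal", "fricative", False), "zh": ("palatal", "fricative", True),
    "h": ("glottal", "fricative", False),
    "ch": ("palatal", "affricate", False), "jh": ("palatal", "affricate", True),
    "m": ("bilabial", "nasal", True), "n": ("alveolar", "nasal", True),
    "ng": ("velar", "nasal", True),
    "w": ("bilabial", "approximant", True),
    "l": ("alveolar", "approximant", True), "r": ("alveolar", "approximant", True),
    "y": ("palatal", "approximant", True),
    "dx": ("alveolar", "flap", True),
}


def natural_class(place=None, manner=None, voiced=None):
    """Return the sorted phones matching every feature that is specified."""
    return sorted(
        phone
        for phone, (p, m, v) in CONSONANTS.items()
        if (place is None or p == place)
        and (manner is None or m == manner)
        and (voiced is None or v == voiced)
    )


print(natural_class(manner="nasal"))
print(natural_class(place="velar", manner="stop"))
```

Grouping phones by shared features in this way is what lets a single rule refer to a whole class of sounds rather than listing them one by one.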
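American English vowel space
[Vowel chart; horizontal axis FRONT to BACK, vertical axis HIGH to LOW]
Vowels plotted: iy, ih, eh, ae, aa, ao, uw, uh, ah, ax, ix, ux
Diphthongs plotted: ey, ow, aw, oy, ay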
[iy] vs. [uw]
(From a lecture given by Rochelle Newman)
[ae] vs. [aa]
(From a lecture given by Rochelle Newman)
Acoustic landmarks
“Patricia and Patsy and Sally”
[Figure: the utterance annotated with phone labels, including [p], [t], [sh], [s], [l], [n], [ix], [ih], [ax], [ae], [iy]]
Articulators in action
“Why did Ken set the soggy net on top of his deck?”
(Sample from the Queen’s University / ATR Labs X-ray Film Database)
Exercise (1)
1. Write your name in:
   (a) IPA.
   (b) ARPAbet (if possible).
2. Choose one of the following triplets and transcribe each word in both IPA and ARPAbet.
   cone, tomb, bottom
   blood, fool, hook
   court, race, cheese
   reason, surreal, cash
   thing, these, other
   laugh, through, ghoul
Figures 4.1 and 4.2: Jurafsky & Martin (2000), pages 94-95.
IPA consonants
(Distributed by the International Phonetic Association.)
IPA vowels
(Distributed by the International Phonetic Association.)
2. Context-dependent phonetic variation
Context-dependent variation
What we would consider a single ‘sound’ can be pronounced differently depending on the phonetic context. For example, the phoneme /t/:
Figure 4.8: Jurafsky & Martin (2000), page 104.
Another regular alternation
  I can ask     [ay k ae n ae s k]
  I can see     [ay k ae n s iy]
  I can bake    [ay k ae m b ey k]
  I can play    [ay k ae m p l ey]
  I can go      [ay k ae ng g ow]
  I can carry   [ay k ae ng k ae r iy]

n → m / __ [+labial stop]
n → ng / __ [+velar stop]
(inopportune [n], insatiable [n], impervious [m], immortal [m], incoherent [ng], ingratitude [ng])
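The alternation above can be sketched as a function over ARPAbet phone lists: /n/ takes on the place of a following labial or velar stop. This is a minimal illustration, assuming that only the oral stops [p b] and [k g] trigger the change; whether nasal stops such as [m] (as in immortal) should also count is left out. The name `assimilate_nasal` is hypothetical.

```python
LABIAL_STOPS = {"p", "b"}  # assumption: oral labial stops only
VELAR_STOPS = {"k", "g"}   # assumption: oral velar stops only


def assimilate_nasal(phones):
    """Apply n -> m before a labial stop and n -> ng before a velar stop."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] != "n":
            continue
        following = out[i + 1]
        if following in LABIAL_STOPS:
            out[i] = "m"
        elif following in VELAR_STOPS:
            out[i] = "ng"
    return out


# Underlying /n/ in "can", followed by different sounds.
for words, phones in [
    ("I can ask", "ay k ae n ae s k"),
    ("I can bake", "ay k ae n b ey k"),
    ("I can go", "ay k ae n g ow"),
]:
    print(words, "->", " ".join(assimilate_nasal(phones.split())))
```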
English plurals
  hiccup  [p]   hiccups     flood   [d]   floods
  sock    [k]   socks       scab    [b]   scabs
  habit   [t]   habits      frog    [g]   frogs
  spoof   [f]   spoofs      comb    [m]   combs
  hearth  [th]  hearths     grave   [v]   graves
                            lathe   [dh]  lathes
  beach   [ch]  beaches     fool    [l]   fools
  dish    [sh]  dishes      sewer   [r]   sewers
  judge   [jh]  judges      pie     [ay]  pies
  race    [s]   races       curfew  [uw]  curfews
  axe     [s]   axes        sofa    [ax]  sofas
  raise   [z]   raises
Phonological rules for Engl. plurals
Assume that the lexical form of plural is /z/.
  Insertion:  ε → ix / [+sibilant] ^ __ z #
  Devoicing:  z → s / [-voice] ^ __ #

Insertion applied before devoicing:

                 bus+PL           cape+PL          hen+PL
  lexical form:  /b ah s +z/      /k ey p +z/      /h eh n +z/
  insertion:     b ah s +ix z     --               --
  devoicing:     --               k ey p s         --
  surface form:  [b ah s ix z]    [k ey p s]       [h eh n z]

Applying devoicing before insertion gives the wrong result for bus:

  lexical form:  /b ah s +z/      /k ey p +z/      /h eh n +z/
  devoicing:     b ah s s         k ey p s         --
  insertion:     --               --               --
  surface form:  *[b ah s s]      [k ey p s]       [h eh n z]
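The two derivations can be replayed in code by treating each rule as a function and applying the rules in a given order. This is a sketch under stated assumptions: the stem and suffix are kept separate to stand in for the morpheme boundary, the sibilant and voiceless sets are the standard ARPAbet classes, and the names `insertion`, `devoicing`, and `pluralize` are hypothetical.

```python
SIBILANTS = {"s", "z", "sh", "zh", "ch", "jh"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h", "q"}


def insertion(stem, suffix):
    """Insert [ix] between a stem-final sibilant and the suffix /z/."""
    if suffix == ["z"] and stem[-1] in SIBILANTS:
        return stem, ["ix", "z"]
    return stem, suffix


def devoicing(stem, suffix):
    """Devoice suffix /z/ to [s] when it directly follows a voiceless sound."""
    if suffix == ["z"] and stem[-1] in VOICELESS:
        return stem, ["s"]
    return stem, suffix


def pluralize(stem, rules=(insertion, devoicing)):
    """Attach lexical /z/ and apply the rules in the order given."""
    suffix = ["z"]
    for rule in rules:
        stem, suffix = rule(stem, suffix)
    return stem + suffix


for word in ("b ah s", "k ey p", "h eh n"):
    good = pluralize(word.split())
    bad = pluralize(word.split(), rules=(devoicing, insertion))
    print(word, "|", " ".join(good), "|", " ".join(bad))
```

With the default order, insertion changes the suffix to [ix z] first, so devoicing no longer finds /z/ directly after the voiceless stem-final sound. With the order reversed, devoicing turns /z/ into [s] first and insertion can no longer apply.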
3. Intonation
Intonation makes the difference
A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...
B1: They fly to Des Moines.
B2: They fly to Des Moines.

A: What types of foods are a good source of vitamins?
B1: Legumes are a good source of vitamins.
B2: Legumes are a good source of vitamins.

A1: I met Mary and Elena’s mother at the mall yesterday.
A2: I met Mary and Elena’s mother at the mall yesterday.
Intonation is about ...
  Pitch
  Melody, or “tune”
  Alignment
  Prominence and focus
  Chunking, or “phrasing”
  ... and more ...
Vocal fold vibration
Physical: Fundamental frequency (F0), the rate of vibration of the vocal folds
Perceptual: Pitch
[Figure: perceived pitch plotted against fundamental frequency]
[UCLA Phonetics Lab demo]
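Because F0 is the rate at which the signal repeats, it can be estimated by finding the lag at which a signal best matches a shifted copy of itself. The sketch below does this with a plain autocorrelation over a synthetic sine tone. It illustrates one common approach and is not necessarily how the pitch tracks in these slides were produced. The function names and the 50 to 400 Hz search range are assumptions chosen for illustration.

```python
import math


def synth_tone(f0_hz, sample_rate, n_samples):
    """Generate a pure sine tone as a stand-in for a voiced speech signal."""
    return [
        math.sin(2 * math.pi * f0_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def estimate_f0(samples, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate F0 as sample_rate divided by the best autocorrelation lag."""
    min_lag = int(sample_rate / f0_max)  # shortest period considered
    max_lag = int(sample_rate / f0_min)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(
            samples[i] * samples[i + lag]
            for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


tone = synth_tone(200.0, 8000, 400)
print(estimate_f0(tone, 8000))
```

Real speech is noisier and only roughly periodic, so practical pitch trackers add windowing, normalization, and voicing decisions on top of this basic idea.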
Pitch range
[from Prosody on the Web tutorial on pitch]
Differences can be due to physical size, gender, social identity, excitement level, linguistic factors, etc.
English Pitch Accents
Certain words in the speech stream can be made structurally and perceptually prominent by the use of pitch accents.
Lenora works for Lucent.
* *
Pitch accents are local pitch movements (e.g. rising, falling) or pitch maxima/minima that accompany these metrically strong syllables.
The intonational “tune” is the melody that is created by sequences of pitch accents over an utterance.
Intonational tunes: What do they mean?
“Lenora works for Lucent.” spoken with four different tunes (accented syllables are marked * on the slide), each suited to a different context:
  [Tell me something about the world ...]
  [... I hope she doesn’t have stock options.]
  [... Really? I wasn’t aware of that.]
  [I’ve told you a million times ...]
[See works by Bolinger, Ladd, Hirschberg ...]
Alignment differences cue “assertion” vs. “suggestion”
A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...
[Figure: two pitch tracks of “they fly to Des Moines”, vertical axis from 50 to 400]
Alignment with different words
A: What types of foods are a good source of vitamins?
B: LEGUMES are a good source of vitamins. (“narrow focus”, one accent)
# Legumes are a good source of VITAMINS.
Legumes are a good source of vitamins. (“broad focus”, two accents)
Placement of focal accent
The rise-fall tune (= “I assert this”) shifts locations.
[Figure: three pitch tracks, vertical axis from 50 to 400]
  LEGUMES are a good source of vitamins
  Legumes are a GOOD source of vitamins
  Legumes are a good source of VITAMINS
Chunking, or “phrasing”
A1: I met Mary and Elena’s mother at the mall yesterday.
A2: I met Mary and Elena’s mother at the mall yesterday.
Phrasing can disambiguate
I met Mary and Elena’s mother at the mall yesterday
[Figure: pitch track, vertical axis from 50 to 400; labeled regions: “Mary & Elena’s mother”, “mall”]
One intonation phrase with relatively flat overall pitch range.

Phrasing can disambiguate
I met Mary and Elena’s mother at the mall yesterday
[Figure: pitch track, vertical axis from 50 to 400; labeled regions: “Mary”, “Elena’s mother”, “mall”]
Separate phrases, with expanded pitch movements.
Lists of numbers, nouns
  twenty.eight.five
  ninety.four.three
  seventy.three.seven
  forty.seven.seven
  seventy.seven.seven

  coffee cake and cream
  chocolate ice cream and cake
  fish fingers and bottles
  cheese sandwiches and milk
  cream buns and chocolate
[from Prosody on the Web tutorial on chunking]
Exercise (2)
1. Sketch out an F0 contour of
   Does Manitowoc have a bowling alley?
   as uttered in the following two contexts:
   (a) “I know Green Bay has a bowling alley, but ...”
   (b) “I know Manitowoc has a theater, but ...”
2. Complete the sentence:
   When Madonna sings the song ...
   Describe the prosodic phrasing of your utterance.
3. How can phrasing help disambiguate the utterance:
   that’s right at the traffic light