![Page 1: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/1.jpg)
Behrman Chapter 5, 6
Place less emphasis on…
• Minor anatomical landmarks and features
• Extrinsic muscles of the larynx
• Blood supply to the larynx
• Central motor control of larynx
• Peripheral Sensory control of larynx
• Stress-Strain Properties of Vocal Folds
![Page 2: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/2.jpg)
Laryngeal Activity in Speech/Song
• Sound source to excite the vocal tract
  – Voice
  – Whisper
• Prosody
  – Fundamental frequency (F0) variation
  – Amplitude variation
• Realization of phonetic goals
  – Voicing
  – Devoicing
  – Glottal frication (/h/, /ɦ/)
  – Glottal stop (/ʔ/)
  – Aspiration
• Para-linguistic and extra-linguistic roles
  – Transmit affect
  – Speaker identity
![Page 3: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/3.jpg)
The vocal fold through life…
• Newborns
  – No layered structure of lamina propria (LP)
  – LP loose and pliable
• Children
  – Vocal ligament appears at 1-4 yrs
  – 3-layered LP is not clear until 15 yrs
• Old age
  – Superficial layer becomes edematous & thicker
  – Thinning of intermediate layer and thickening of deep layer
  – Changes in LP more pronounced in men
  – Muscle atrophy
![Page 4: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/4.jpg)
The Glottal Cycle
![Page 5: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/5.jpg)
Complexity of vocal fold vibration
Vertical phase difference
Longitudinal phase difference
http://video.google.com/videosearch?source=ig&hl=en&rlz=&q=high%20speed%20video%20voice&um=1&ie=UTF-8&sa=N&tab=wv#
![Page 6: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/6.jpg)
Myoelastic Aerodynamic Theory of Phonation
Necessary and Sufficient Conditions
• Vocal Folds are adducted (Adduction)
• Vocal Folds are tensed (Longitudinal Tension)
• Presence of Aerodynamic pressures
![Page 7: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/7.jpg)
2-mass model
[Figure labels: lower part of vocal fold; upper part of vocal fold; mechanical coupling stiffness; TA muscle; coupling between mucosa & muscle]
![Page 8: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/8.jpg)
• VF adducted & tensed → myoelastic pressure (Pme)
• Glottis is closed
• Subglottal air pressure (Psg) ↑
• Psg ~ 8-10 cm H2O, Psg > Pme
• L and R M1 separate
• Transglottal airflow (Utg) = 0
• As M1 separates, M2 follows due to mechanical coupling stiffness
• Psg > Pme: glottis begins to open
• Psg > Patm, therefore Utg > 0
![Page 9: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/9.jpg)
Utg ↑↑ since glottal aperture << tracheal circumference
Utg ↑ Ptg ↓ due to Bernoulli effect
Pressure drop across the glottis
Bernoulli’s Law
$P + \tfrac{1}{2}\rho U^{2} = K$
where
P = air pressure
ρ = air density
U = air velocity
K = constant
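As a worked illustration of Bernoulli's law, the sketch below computes how far the static pressure falls when air speeds up through a narrowing. The air density and the two example velocities are assumptions chosen for illustration; they are not values from the slides.

```python
# Bernoulli's law: P + 0.5 * rho * U**2 = K (constant along the flow).
# If velocity rises from u_slow to u_fast, static pressure must fall by
# 0.5 * rho * (u_fast**2 - u_slow**2).

RHO_AIR = 1.2            # kg/m^3, assumed approximate density of air
PA_PER_CM_H2O = 98.0665  # pascals in 1 cm H2O


def pressure_drop_pa(u_slow, u_fast, rho=RHO_AIR):
    """Fall in static pressure (Pa) when air accelerates from u_slow to u_fast (m/s)."""
    return 0.5 * rho * (u_fast ** 2 - u_slow ** 2)


def pa_to_cm_h2o(p_pa):
    """Convert pascals to cm H2O, the unit used for Psg in these notes."""
    return p_pa / PA_PER_CM_H2O


# Hypothetical velocities: slow flow in the trachea, fast flow in the glottis.
drop = pressure_drop_pa(u_slow=1.0, u_fast=30.0)
print(f"Pressure drop: {drop:.1f} Pa = {pa_to_cm_h2o(drop):.2f} cm H2O")
```

The point of the example is the square term: doubling the velocity through the constriction roughly quadruples the pressure drop.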
![Page 10: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/10.jpg)
Utg ↑ Ptg ↓ due to Bernoulli effect
Plus “other” aerodynamic effects
Ptg < Pme
M1 returns to midline
M2 follows M1 due to mechanical coupling stiffness
Utg = 0
Pattern repeats 100-200 times a second
![Page 11: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/11.jpg)
![Page 12: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/12.jpg)
Limitations of this simple model
![Page 13: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/13.jpg)
The Glottal Cycle
![Page 14: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/14.jpg)
Sound pressure wave
[Figure: sound pressure waveform. x-axis: Time; y-axis: Instantaneous sound pressure]
![Page 15: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/15.jpg)
Phonation is actually quasi-periodic
• Complex Periodic
  – Vocal fold oscillation
• Aperiodic
  – Broad frequency noise embedded in signal
  – Non-periodic vocal fold oscillation
  – Asymmetry of vocal fold oscillation
  – Air turbulence
• Voicing vs. whispering
![Page 16: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/16.jpg)
Glottal Aerodynamics
• Volume Velocity
• Driving Pressure
• Phonation Threshold Pressure
  – Initiate phonation
  – Sustain phonation
• Laryngeal Airway Resistance
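The quantities on this slide are linked: laryngeal airway resistance is conventionally taken as driving (transglottal) pressure divided by volume velocity. A minimal sketch follows; the function name is made up for illustration, the 8 cm H2O figure borrows the lower end of the Psg range quoted earlier, and the flow value is a hypothetical number, not a norm from the slides.

```python
def laryngeal_airway_resistance(driving_pressure_cm_h2o, flow_l_per_s):
    """Resistance in cm H2O per (L/s): driving pressure divided by volume velocity."""
    if flow_l_per_s <= 0:
        raise ValueError("flow must be positive to compute a resistance")
    return driving_pressure_cm_h2o / flow_l_per_s


# Hypothetical example: 8 cm H2O driving pressure, 0.2 L/s average airflow.
r_law = laryngeal_airway_resistance(8.0, 0.2)
print(f"Laryngeal airway resistance: {r_law:.1f} cm H2O/(L/s)")
```

For the same driving pressure, a tighter glottis passes less air, so the computed resistance goes up.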
![Page 17: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/17.jpg)
Measuring Glottal Behavior
• Videolaryngoscopy
  – Stroboscopy
  – High speed video
![Page 18: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/18.jpg)
Photoglottography (PGG)
[Figure: photoglottogram. x-axis: Time; y-axis: Illumination]
![Page 19: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/19.jpg)
Electroglottography (EGG)
• Human tissue = good conductor
• Air = poor conductor
• Electrodes placed on each side of thyroid lamina
• High frequency, low current signal is passed between them
• ↑ VF contact = ↓ impedance
• ↓ VF contact = ↑ impedance
![Page 20: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/20.jpg)
Electroglottogram
![Page 21: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/21.jpg)
Glottal Airflow (volume velocity)
• Instantaneous airflow is measured as it leaves the mouth
• Looks similar to a pressure waveform
• Can be inverse filtered to remove effects of vocal tract
• Resultant is an estimate of the airflow at the glottis
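The idea of inverse filtering can be sketched with a toy example. Here the vocal tract is reduced to a single two-pole resonator whose coefficients are known exactly, which is a strong simplifying assumption: real inverse filtering has to estimate the vocal tract filter from the recording and also deal with several formants. The pulse shape, resonance frequency, bandwidth, and sampling rate are all illustrative choices.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, fs):
    """Feedback coefficients of a two-pole digital resonator (one toy 'formant')."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = -r * r
    return a1, a2


def apply_resonator(x, a1, a2):
    """Forward filter: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    y = []
    for n, xn in enumerate(x):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(xn + a1 * y1 + a2 * y2)
    return y


def inverse_filter(y, a1, a2):
    """Undo the resonator: x[n] = y[n] - a1*y[n-1] - a2*y[n-2]."""
    x = []
    for n, yn in enumerate(y):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        x.append(yn - a1 * y1 - a2 * y2)
    return x


fs = 10000  # Hz, assumed sampling rate
# Toy glottal flow: half-wave rectified sine pulses at 100 Hz.
source = [max(0.0, math.sin(2.0 * math.pi * 100.0 * n / fs)) for n in range(400)]
a1, a2 = resonator_coeffs(500.0, 80.0, fs)
mouth = apply_resonator(source, a1, a2)      # what would be measured at the mouth
recovered = inverse_filter(mouth, a1, a2)    # estimate of the flow at the glottis
max_err = max(abs(s - r) for s, r in zip(source, recovered))
print(f"Largest difference between source and recovered flow: {max_err:.2e}")
```

Because the inverse filter here matches the forward filter exactly, the recovered signal matches the source up to rounding; with a real recording the result is only an estimate, as the slide says.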
![Page 22: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/22.jpg)
Flow Glottogram
![Page 23: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/23.jpg)
Synchronous plots
Sound pressure waveform (at mouth)
Flow glottogram (inverse filtered mask signal)
Photoglottogram
Electroglottogram
![Page 24: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/24.jpg)
F0 Control
• Anatomical factors
  – Males: ↑ VF mass and length = ↓ F0
  – Females: ↓ VF mass and length = ↑ F0
• Subglottal pressure adjustment – show example
  – ↑ Psg = ↑ F0
• Laryngeal and vocal fold adjustments
  – ↑ CT activity = ↑ F0
  – TA activity = ↑ F0 or ↓ F0
• Extralaryngeal adjustments
  – ↑ height of larynx = ↑ F0
![Page 25: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/25.jpg)
Fundamental Frequency (F0)
Average F0
• Speaking fundamental frequency (SFF)
• Correlate of pitch
• Infants
  – ~ 350-500 Hz
• Boys & girls (3-10 yrs)
  – ~ 270-300 Hz
• Young adult females
  – ~ 220 Hz
• Young adult males
  – ~ 120 Hz
• Older females: F0 ↓
• Older males: F0 ↑
F0 variability
• F0 varies due to
  – Syllabic & emphatic stress
  – Syntactic and semantic factors
  – Phonetic factors (in some languages)
• Provides a melody (prosody)
• Measures
  – F0 standard deviation: ~ 2-4 semitones for normal speakers
  – F0 range
![Page 26: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/26.jpg)
Maximum Phonational Frequency Range
• highest possible F0 - lowest possible F0
• Not a speech measure
• measured in Hz, semitones or octaves
• Males ~ 80-700 Hz¹
• Females ~ 135-1000 Hz¹
• 3 octaves often considered normal
¹ Baken (1987)
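The three units mentioned on this slide are related by logarithms: an octave is a doubling of frequency and a semitone is one twelfth of an octave. The sketch below converts the Hz ranges quoted above into octaves and semitones. The function names are made up for illustration.

```python
import math


def range_in_octaves(f_low_hz, f_high_hz):
    """Number of frequency doublings between the lowest and highest F0."""
    return math.log2(f_high_hz / f_low_hz)


def range_in_semitones(f_low_hz, f_high_hz):
    """Same range in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Ranges quoted on the slide (Baken, 1987).
for label, low, high in [("Males", 80.0, 700.0), ("Females", 135.0, 1000.0)]:
    print(f"{label}: {high - low:.0f} Hz, "
          f"{range_in_octaves(low, high):.2f} octaves, "
          f"{range_in_semitones(low, high):.1f} semitones")
```

Both quoted ranges come out close to three octaves, even though the female range is much wider when measured in Hz.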
![Page 27: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/27.jpg)
Fundamental Frequency (F0) Control
• Ways to measure F0
  – Time domain vs. frequency domain
  – Manual vs. automated measurement
  – Specific approaches
    • Peak picking
    • Zero crossing
    • Autocorrelation
    • The cepstrum & cepstral analysis
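Of the approaches listed, zero crossing is the simplest to sketch in the time domain: find where the waveform crosses zero going upward and measure the average spacing between those points. The function name and the test tone are assumptions for illustration. The method only behaves well on simple waveforms; a complex voice signal can cross zero more than once per cycle.

```python
import math


def f0_zero_crossing(signal, fs):
    """Estimate F0 (Hz) from the mean spacing of positive-going zero crossings."""
    crossings = [n for n in range(1, len(signal))
                 if signal[n - 1] < 0 <= signal[n]]
    if len(crossings) < 2:
        return None  # not enough cycles to measure a period
    mean_period_samples = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return fs / mean_period_samples


fs = 8000  # Hz, assumed sampling rate
# Hypothetical test tone: a 125 Hz sine with an arbitrary starting phase.
tone = [math.sin(2.0 * math.pi * 125.0 * n / fs + 0.3) for n in range(800)]
print(f"Estimated F0: {f0_zero_crossing(tone, fs):.1f} Hz")
```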
![Page 28: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/28.jpg)
Autocorrelation
[Figure: a signal compared with shifted copies of itself. Columns: Data, Correlation. Correlation values shown: +1.0, +0.1, -0.82, +0.92]
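The correlation values in this figure come from comparing a signal with a shifted copy of itself. A shift of one full period gives a correlation near +1, and that shift is the period estimate. Below is a minimal sketch of that idea, not a production pitch tracker. The search limits, the tolerance used to prefer the shortest matching shift, and the test signal are all assumptions.

```python
import math


def autocorr_at_lag(x, lag):
    """Normalized correlation between x and a copy of x shifted by `lag` samples."""
    a = x[:len(x) - lag]
    b = x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den else 0.0


def f0_autocorrelation(x, fs, f0_min=60.0, f0_max=500.0, tolerance=0.01):
    """Estimate F0 (Hz) as fs divided by the shortest lag with near-maximal correlation."""
    lag_min = max(1, int(fs / f0_max))
    lag_max = min(len(x) - 2, int(fs / f0_min))
    scores = {lag: autocorr_at_lag(x, lag) for lag in range(lag_min, lag_max + 1)}
    best = max(scores.values())
    # Shifts of 2, 3, ... periods correlate just as well as one period,
    # so take the shortest lag that is within `tolerance` of the best score.
    for lag in sorted(scores):
        if scores[lag] >= best - tolerance:
            return fs / lag
    return None


fs = 4000  # Hz, assumed sampling rate
# Hypothetical complex periodic signal: 200 Hz fundamental plus its second harmonic.
voice = [math.sin(2.0 * math.pi * 200.0 * n / fs)
         + 0.5 * math.sin(2.0 * math.pi * 400.0 * n / fs) for n in range(800)]
print(f"Correlation at zero shift: {autocorr_at_lag(voice, 0):+.2f}")
print(f"Correlation at half a period: {autocorr_at_lag(voice, 10):+.2f}")
print(f"Estimated F0: {f0_autocorrelation(voice, fs):.1f} Hz")
```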
![Page 29: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/29.jpg)
Cepstrum
![Page 30: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/30.jpg)
![Page 31: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/31.jpg)
Amplitude Control
• Subglottal pressure adjustment
  – ↑ Psg = ↑ sound pressure
• Laryngeal and vocal fold adjustments
  – ↑ medial compression = ↑ sound pressure
• Supralaryngeal adjustments
![Page 32: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/32.jpg)
Measuring Amplitude
• Pressure
• Intensity
• Decibel Scale
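The decibel scale expresses pressure or intensity as a logarithmic ratio to a reference value. The sketch below uses the standard references for sound in air (20 µPa for pressure, 10⁻¹² W/m² for intensity). The function names and the example pressure are illustrative assumptions.

```python
import math

P_REF = 20e-6   # Pa, standard reference pressure for dB SPL
I_REF = 1e-12   # W/m^2, standard reference intensity


def pressure_to_db_spl(pressure_pa):
    """Sound pressure level: 20 * log10 of the pressure ratio."""
    return 20.0 * math.log10(pressure_pa / P_REF)


def intensity_to_db(intensity_w_m2):
    """Intensity level: 10 * log10 of the intensity ratio."""
    return 10.0 * math.log10(intensity_w_m2 / I_REF)


# Hypothetical example: an RMS sound pressure of 0.02 Pa.
print(f"0.02 Pa = {pressure_to_db_spl(0.02):.1f} dB SPL")
print(f"Doubling the pressure adds {pressure_to_db_spl(0.04) - pressure_to_db_spl(0.02):.2f} dB")
```

The factor is 20 for pressure and 10 for intensity because intensity is proportional to pressure squared.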
![Page 33: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/33.jpg)
Sound Pressure Level (SPL)
Average SPL
• Correlate of loudness
• Conversation: ~ 65-80 dB SPL
SPL Variability
• SPL used to mark stress
• Contributes to prosody
• Measure
  – Standard deviation for neutral reading material: ~ 10 dB SPL
![Page 34: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/34.jpg)
Dynamic Range
• Amplitude analogue to maximum phonational frequency range
• ~50 – 115 dB SPL
![Page 35: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/35.jpg)
Vocal Quality
• no clear acoustic correlates like pitch and loudness
• However, terms have invaded our vocabulary that suggest distinct categories of voice quality
Common Terms
• Breathy
• Tense/strained
• Rough
• Hoarse
![Page 36: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/36.jpg)
Are there features in the acoustic signal that correlate with these quality descriptors?
![Page 37: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/37.jpg)
Breathiness
Perceptual Description
• Audible air escape in the voice
Physiologic Factors
• Diminished or absent closed phase
• Increased airflow
Potential Acoustic Consequences
• Change in harmonic (periodic) energy
  – Sharper harmonic roll off
• Change in aperiodic energy
  – Increased level of aperiodic energy (i.e. noise), particularly in the high frequencies
![Page 38: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/38.jpg)
Harmonics (signal)-to-noise ratio (SNR/HNR)
• Ratio of harmonic amplitude to noise amplitude
• ↑ HNR
  – Relatively more signal
  – Indicative of normality
• ↓ HNR
  – Relatively more noise
  – Indicative of disorder
• Normative values depend on method of calculation
• “Normal” HNR ~ 15
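The ratio itself is easy to compute once the harmonic and noise parts are known. The sketch below builds a synthetic signal where the two parts are generated separately, which sidesteps the hard step: with a real recording the two parts have to be estimated, and different estimation methods give different numbers. Expressing the ratio in dB, the noise level, and the function names are assumptions for illustration.

```python
import math
import random


def energy(x):
    """Sum of squared sample values."""
    return sum(v * v for v in x)


def hnr_db(harmonic, noise):
    """Harmonics-to-noise ratio in dB from separately known components."""
    return 10.0 * math.log10(energy(harmonic) / energy(noise))


fs = 8000                 # Hz, assumed sampling rate
rng = random.Random(0)    # fixed seed so the example is repeatable
# Synthetic "voice": a 200 Hz periodic part plus low-level random noise.
harmonic = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(fs)]
quiet_noise = [rng.gauss(0.0, 0.05) for _ in range(fs)]
loud_noise = [4.0 * v for v in quiet_noise]

print(f"Low noise:  HNR = {hnr_db(harmonic, quiet_noise):.1f} dB")
print(f"High noise: HNR = {hnr_db(harmonic, loud_noise):.1f} dB")
```

Raising the noise level while keeping the periodic part fixed lowers the HNR, which is the pattern the slide associates with disordered voice.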
![Page 39: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/39.jpg)
[Figure: spectrum with harmonic peaks and noise floor labeled. x-axis: Frequency; y-axis: Amplitude]
![Page 40: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/40.jpg)
From Hillenbrand et al. (1996)
First harmonic amplitude
![Page 41: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/41.jpg)
Prominent Cepstral Peak
![Page 42: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/42.jpg)
Spectral Tilt: Voice Source
![Page 43: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/43.jpg)
Spectral Tilt: Radiated Sound
![Page 44: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/44.jpg)
Peak/average amplitude ratio
![Page 45: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/45.jpg)
From Hillenbrand et al. (1996)
![Page 46: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/46.jpg)
WMU Graduate Students
![Page 47: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/47.jpg)
Tense/Pressed/Effortful/Strained Voice
Perceptual Description
• Sense of effort in production
Physiologic Factors
• Longer closed phase
• Reduced airflow
Potential Acoustic Consequences
• Change in harmonic (periodic) energy
  – Flatter harmonic roll off
![Page 48: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/48.jpg)
[Figure: spectral tilt for pressed vs. breathy voice]
![Page 49: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/49.jpg)
Acoustic Basis of Vocal Effort
[Figure: scatterplot, dependent variable: effort. Axes: perception of effort (100-500) and regression adjusted (PRESS) predicted value (100-500). Predictors: F0 + RMS + Open Quotient]
Tasko, Parker & Hillenbrand (2008)
![Page 50: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/50.jpg)
![Page 51: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/51.jpg)
Roughness
• Perceptual Description
  – Perceived cycle-to-cycle variability in voice
• Physiologic Factors
  – Vocal folds vibrate, but in an irregular way
• Potential Acoustic Consequences
  – Cycle-to-cycle variations in F0 and amplitude
  – Elevated jitter
  – Elevated shimmer
![Page 52: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/52.jpg)
Period/frequency & amplitude variability
• Jitter: variability in the period of each successive cycle of vibration
• Shimmer: variability in the amplitude of each successive cycle of vibration
…
![Page 53: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/53.jpg)
Jitter and Shimmer
Sources of jitter and shimmer
• Small structural asymmetries of vocal folds
• “Material” on the vocal folds (e.g. mucus)
• Biomechanical events, such as raising/lowering the larynx in the neck
• Small variations in tracheal pressures
• “Bodily” events – system noise
Measuring jitter and shimmer
• Variability in measurement approaches
• Variability in how measures are reported
• Jitter
  – Typically reported as % or msec
  – Normal ~ 0.2 - 1%
• Shimmer
  – Can be % or dB
  – Norms not well established
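The reporting units above can be made concrete with one commonly used family of definitions: the average absolute difference between consecutive cycles, either in raw units or as a percentage of the mean. As the slide notes, measurement approaches vary, so this is a sketch of one option and not the definition used by any particular analysis program. The period and amplitude values are made-up illustration data.

```python
import math


def mean_abs_consecutive_diff(values):
    """Average absolute difference between each cycle and the next."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def jitter_ms(periods_ms):
    """Jitter in msec: mean absolute cycle-to-cycle period difference."""
    return mean_abs_consecutive_diff(periods_ms)


def jitter_percent(periods_ms):
    """Jitter in %: the msec value relative to the mean period."""
    mean_period = sum(periods_ms) / len(periods_ms)
    return 100.0 * jitter_ms(periods_ms) / mean_period


def shimmer_percent(amplitudes):
    """Shimmer in %: mean absolute amplitude difference relative to mean amplitude."""
    mean_amp = sum(amplitudes) / len(amplitudes)
    return 100.0 * mean_abs_consecutive_diff(amplitudes) / mean_amp


def shimmer_db(amplitudes):
    """Shimmer in dB: mean absolute dB ratio between consecutive cycle amplitudes."""
    ratios = [abs(20.0 * math.log10(b / a)) for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(ratios) / len(ratios)


# Hypothetical cycle-by-cycle measurements from a sustained vowel.
periods_ms = [10.00, 10.04, 9.97, 10.02, 9.99, 10.03]
peak_amps = [1.00, 0.97, 1.02, 0.99, 1.01, 0.98]
print(f"Jitter:  {jitter_ms(periods_ms):.3f} msec, {jitter_percent(periods_ms):.2f} %")
print(f"Shimmer: {shimmer_percent(peak_amps):.2f} %, {shimmer_db(peak_amps):.2f} dB")
```

A perfectly regular voice would give zero on all four measures; irregular vibration raises them.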
![Page 54: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/54.jpg)
Vocal Register
What is a vocal register?
![Page 55: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/55.jpg)
Vocal Registers
Pulse (Glottal fry)
– 30-80 Hz, mean ~ 60 Hz
– Closed phase very long (90% of cycle)
– May see biphasic pattern of vibration (open, close a bit, open and close completely)
– Low subglottal pressure (2 cm H2O)
– Energy dies out over the course of a cycle, so parts of the cycle have very little energy
– Each individual cycle can be heard
![Page 56: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/56.jpg)
Vocal Registers
Modal
– VF are relatively short and thick
– Reduced VF stiffness
– Large amplitude of vibration
– Possesses a clear closed phase
– The result is a voice that is relatively loud and low in pitch
– Average values cited refer to modal register
![Page 57: Behrman Chapter 5, 6](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812be5550346895d906090/html5/thumbnails/57.jpg)
Vocal Registers
Falsetto
– 500-1100 Hz (275-600 Hz males)
– VF are relatively long and thin
– Increased VF stiffness
– Small amplitude of vibration
– Vibration less complex
– Incomplete closure (no closed phase)
– The result is a voice that is high in pitch