Meena Ramani 04/12/06
EEL 6586 Automatic Speech Processing. Meena Ramani 04/12/06. Topics to be covered. Lecture 1: The incredible sense of hearing 1: Anatomy, Perception of Sound. Lecture 2: The incredible sense of hearing 2: Psychoacoustics, Hearing aids and cochlear implants.
![Page 1: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/1.jpg)
Meena Ramani
04/12/06
EEL 6586 Automatic Speech Processing
![Page 2: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/2.jpg)
Topics to be covered
Lecture 1: The incredible sense of hearing 1
Anatomy
Perception of Sound
Lecture 2: The incredible sense of hearing 2
Psychoacoustics
Hearing aids and cochlear implants
![Page 3: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/3.jpg)
The incredible sense of hearing-2
“Behind these unprepossessing flaps lie structures of such delicacy that they shame the most skillful craftsman”
-Stevens, S.S. [Professor of Psychophysics, Harvard University]
![Page 4: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/4.jpg)
How do we hear?
![Page 5: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/5.jpg)
Threshold of Hearing
![Page 6: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/6.jpg)
Equal loudness curves
![Page 7: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/7.jpg)
The Bass Loss Problem
Rock music
Volume too low: no bass
Volume too high: too much bass
![Page 8: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/8.jpg)
Threshold variation with age
[Figure: Thresholds of hearing for normal & HI listeners. x-axis: Frequency (Hz), log scale from 10^2 to 10^4; y-axis: Threshold of hearing (dB SPL), -10 to 90; curves: Normal hearing, Hearing impaired]
![Page 9: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/9.jpg)
The Audiogram
[Figure: Audiogram. x-axis: Frequency, Hz (0 to 6000); y-axis: Hearing Level (HL), dB (-20 to 100); curves: Left Ear, Right Ear]
![Page 10: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/10.jpg)
The Audiogram (contd.)
Pure tone audiogram
Test frequencies: [250 500 1k 2k 4k 6k] Hz
A threshold below 20 dB HL is normal hearing
[Figure: Audiogram. x-axis: Frequency, Hz (0 to 6000); y-axis: Hearing Level (HL), dB (-20 to 100); curves: Left Ear, Right Ear]
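The 20 dB HL rule above can be sketched in a few lines. This is only an illustration: the threshold values are the sensorineural example used later in this lecture, the pure-tone average (PTA) over 500, 1000 and 2000 Hz is a common convention that the slides do not state, and the function names are hypothetical.

```python
# Audiogram test frequencies (Hz) from the slide.
FREQS_HZ = [250, 500, 1000, 2000, 4000, 6000]


def is_normal(threshold_db_hl, limit_db_hl=20.0):
    """Slide rule: a threshold below 20 dB HL counts as normal hearing."""
    return threshold_db_hl < limit_db_hl


def pure_tone_average(thresholds_db_hl, pta_freqs=(500, 1000, 2000)):
    """Mean threshold over the PTA frequencies (assumed convention)."""
    by_freq = dict(zip(FREQS_HZ, thresholds_db_hl))
    return sum(by_freq[f] for f in pta_freqs) / len(pta_freqs)


# Example thresholds in dB HL, one per test frequency.
example = [10, 20, 30, 60, 80, 90]
normal_flags = [is_normal(t) for t in example]
pta = pure_tone_average(example)

for f, t, ok in zip(FREQS_HZ, example, normal_flags):
    print(f"{f:>5} Hz: {t:>3} dB HL  {'normal' if ok else 'elevated'}")
print(f"PTA = {pta:.1f} dB HL")
```

Only the 250 Hz threshold falls below the 20 dB HL limit here, which matches the sloping high-frequency loss the example is meant to show.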
![Page 11: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/11.jpg)
Loudness Growth Curve
[Figure: LGOB loudness growth curve at 250 Hz. x-axis: Input level (dB SPL), 0 to 100; y-axis: LGOB loudness rating, 0 to 7; curves: Normal hearing, Hearing impaired]
![Page 12: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/12.jpg)
Otoacoustic emissions
• The ear produces some sounds!
– OHC: outer hair cell
• Used to test hearing in infants and to check whether a patient is feigning a loss
![Page 13: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/13.jpg)
Monaural beats
If two tones with a small frequency difference are presented monaurally, a beating pattern can be heard
Demos: 500 & 502 Hz; 500 & 520 Hz
Interaction of the two tones in the same auditory filter
Waveform: 150 Hz + 170 Hz
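Why two nearby tones beat can be checked numerically. The sketch below uses the identity sin(a) + sin(b) = 2 cos((a-b)/2) sin((a+b)/2): the sum is a carrier at the mean frequency whose envelope rises and falls |f1 - f2| times per second. The 8 kHz sampling grid and the function names are assumptions made for illustration.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Number of loudness fluctuations per second for two summed tones."""
    return abs(f1_hz - f2_hz)


def two_tone(f1_hz, f2_hz, t):
    """Direct sum of two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def envelope_form(f1_hz, f2_hz, t):
    """Same signal written as a slow envelope times a fast carrier."""
    carrier = math.sin(math.pi * (f1_hz + f2_hz) * t)
    envelope = 2 * math.cos(math.pi * (f1_hz - f2_hz) * t)
    return envelope * carrier


# Tone pairs taken from the slide.
pairs = [(500, 502), (500, 520), (150, 170)]
beats = [beat_frequency(a, b) for a, b in pairs]

# Compare both forms over one second sampled at 8 kHz (assumed rate).
max_err = max(
    abs(two_tone(150, 170, n / 8000) - envelope_form(150, 170, n / 8000))
    for n in range(8000)
)
print(beats, max_err)
```

The 500 & 502 Hz pair fluctuates twice per second, slow enough to hear as a loudness wobble, while the 20 Hz difference of the other pairs is heard more as roughness.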
![Page 14: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/14.jpg)
Binaural beats
Beating can also be heard when the tones are presented to different ears!
Beating arises from neural interaction
Only perceived if the tones are sufficiently close in frequency
Demo: 500 Hz in the left ear, 520 Hz in the right ear (binaural)
![Page 15: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/15.jpg)
The case of the missing fundamental
Telephone BW: 300-3400 Hz
How do we know the pitch?
Primary Auditory cortex
• Pitch-sensitive neurons [Bendor and Wang, Nature 2005]
• Neuron responds to fundamental and harmonics
• What are the inputs to these neurons?
How do spikes represent periodic, temporal and spectral information?
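One way to see that the pitch survives the telephone band is a short autocorrelation sketch. This is a textbook temporal pitch estimate, not a model of the cortical neurons cited above. The 200 Hz fundamental, the 8 kHz sampling rate, the 100 ms window and the 110 to 400 Hz search range are all assumed values chosen for illustration.

```python
import math

FS = 8000                 # sampling rate in Hz (assumed)
F0 = 200.0                # hypothetical fundamental, below the 300 Hz band edge
BAND = (300.0, 3400.0)    # telephone bandwidth from the slide

# Keep only the harmonics that pass the telephone band; F0 itself is removed.
harmonics = [k * F0 for k in range(1, 40) if BAND[0] <= k * F0 <= BAND[1]]


def sample(n):
    """Band-limited harmonic complex at sample index n."""
    t = n / FS
    return sum(math.cos(2 * math.pi * f * t) for f in harmonics)


WINDOW = 800              # 100 ms analysis window
MIN_LAG = FS // 400       # shortest period searched (400 Hz)
MAX_LAG = FS // 110       # longest period searched (about 110 Hz)
signal = [sample(n) for n in range(WINDOW + MAX_LAG)]


def autocorr(lag):
    """Unnormalised autocorrelation of the signal at the given lag."""
    return sum(signal[n] * signal[n + lag] for n in range(WINDOW))


# The lag with the strongest self-similarity is the perceived period.
best_lag = max(range(MIN_LAG, MAX_LAG + 1), key=autocorr)
estimated_f0 = FS / best_lag
print(harmonics[0], best_lag, estimated_f0)
```

The lowest component in the signal is 400 Hz, yet the waveform still repeats every 5 ms, so the estimate lands on the absent 200 Hz fundamental.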
![Page 16: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/16.jpg)
Auditory-periphery model (Zhang et al. ~2001)
Matlab code available
Feed it a wav file
Spits out PSTH (post stimulus time histogram)
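For readers unfamiliar with the PSTH, the sketch below shows how one is built from spike times. It is not the auditory-periphery model itself, only the generic binning step applied to its kind of output; the spike times, the 10 ms bin width and the function name are made up for illustration.

```python
def psth(trials, duration_s, bin_s):
    """Mean firing rate (spikes/s) per time bin, averaged across trials.

    trials: list of trials, each a list of spike times in seconds
            measured from stimulus onset.
    """
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for spike_times in trials:
        for t in spike_times:
            idx = int(t / bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    # Normalise by trial count and bin width to get a rate.
    return [c / (len(trials) * bin_s) for c in counts]


# Hypothetical spike times from three repetitions of one stimulus.
trials = [
    [0.012, 0.018, 0.041],
    [0.011, 0.043, 0.047],
    [0.015, 0.044],
]
rates = psth(trials, duration_s=0.05, bin_s=0.01)
print(rates)
```

Bins with a high rate mark the times after stimulus onset at which the fibre fires reliably across repetitions.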
![Page 17: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/17.jpg)
Critical bands
Psychoacoustic experiments
Equally loud, close in frequency
• Same IHCs
• Slightly louder
Equally loud, separated in freq.
• Different IHCs
• Twice as loud
![Page 18: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/18.jpg)
Critical Band (cont.)
• Proposed by Fletcher
• How to measure?
– S/N ratio vs noise BW
• CB ~= 1.5 mm spacing on BM
• 24 such band-pass filters
• BW of the filters increases with fc
• Logarithmic relationship – Weber’s law example
• Bark scale
| Center frequency (Hz) | Critical bandwidth (Hz) |
|---|---|
| 100 | 90 |
| 200 | 90 |
| 500 | 110 |
| 1000 | 150 |
| 2000 | 280 |
| 5000 | 700 |
| 10000 | 1200 |
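The growth of bandwidth with center frequency can be approximated in closed form. The sketch below uses the Zwicker and Terhardt analytical approximations for the Bark scale and the critical bandwidth; these formulas are not from the slides, and their output will not match the table values exactly.

```python
import math


def hz_to_bark(f_hz):
    """Critical-band rate in Bark (Zwicker & Terhardt approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def critical_bandwidth_hz(f_hz):
    """Critical bandwidth in Hz at center frequency f_hz (same source)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


# Center frequencies from the table on this slide.
centers = [100, 200, 500, 1000, 2000, 5000, 10000]
table = [(f, hz_to_bark(f), critical_bandwidth_hz(f)) for f in centers]

for f, z, bw in table:
    print(f"{f:>6} Hz  {z:5.2f} Bark  CB = {bw:7.1f} Hz")
```

With this approximation the audible range up to about 15.5 kHz spans roughly 24 Bark, in line with the 24 band-pass filters mentioned above.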
![Page 19: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/19.jpg)
Critical bands for HI
[Figure: 4 kHz tuning curve for normal & HI listeners. x-axis: Desired tone frequency (Hz), log scale from 10^3 to 10^4; y-axis: Desired tone threshold (dB SPL), 0 to 90; curves: Masker, Normal hearing, Hearing impaired]
![Page 20: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/20.jpg)
“You know I can't hear you when the water is running!”
MASKING
![Page 21: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/21.jpg)
Frequency Masking
• Masking occurs when two frequencies lie within a critical band: the higher-amplitude signal masks the lower-amplitude one
• The masker can be broadband noise, narrowband noise, or pure and complex tones
• Low-frequency broadband sounds mask the most
– E.g. truck on road, water flowing
• Masking threshold
– Level (in dB) at which the test tone is just audible in the presence of the noise
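The first bullet can be turned into a crude yes/no check. This is a deliberately simplified sketch of the slide's rule, not a real masking model: it treats "within a critical band" as "less than 1 Bark apart", ignores the spread of masking across bands, and uses the Zwicker and Terhardt Bark approximation, none of which the slides specify. The tone levels are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Critical-band rate in Bark (Zwicker & Terhardt approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def is_masked(test_hz, test_db, masker_hz, masker_db, band_width_bark=1.0):
    """Slide rule: same critical band and the masker has the higher level."""
    same_band = abs(hz_to_bark(test_hz) - hz_to_bark(masker_hz)) < band_width_bark
    return same_band and masker_db > test_db


# Hypothetical tones: a 70 dB masker at 1 kHz against 40 dB test tones.
close_pair = is_masked(1050, 40, 1000, 70)   # neighbours in frequency
far_pair = is_masked(4000, 40, 1000, 70)     # several bands apart
louder_test = is_masked(1050, 80, 1000, 70)  # test tone louder than masker
print(close_pair, far_pair, louder_test)
```

In a measured masking threshold the outcome is graded rather than binary, so this check only captures which tone pairs are candidates for masking.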
![Page 22: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/22.jpg)
Temporal Aspects of Masking
• Simultaneous masking
• Pre-stimulus / backward / premasking
– 1st test tone, 2nd masker
• Post-stimulus / forward / postmasking
– 1st masker, 2nd test tone
![Page 23: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/23.jpg)
Temporal Aspects of Masking (contd.)
Simultaneous masking
– For durations > 200 ms the test tone threshold is constant
– Assume the hearing system integrates over a period of 200 ms
Postmasking
– Decay in effect of masker for 100 ms
– More dominant
Premasking
– Takes place 20 ms before the masker is on!
– Each sensation is not instantaneous; it requires build-up time
• Quick build-up for loud maskers
• Slower build-up for softer maskers
– Less dominant effect
![Page 24: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/24.jpg)
Temporal masking for HI
[Figure: Temporal resolution at 4 kHz for normal & HI listeners. x-axis: Desired-Masker tone separation (ms), 0 to 140; y-axis: Desired tone threshold (dB SPL), 0 to 80; curves: Normal hearing, Hearing impaired]
![Page 25: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/25.jpg)
Meena Ramani
04/14/06
EEL 6586 Automatic Speech Processing
![Page 26: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/26.jpg)
What do the hearing impaired hear?
Normal hearing
Sensorineural hearing loss: mild to severe loss, [10 20 30 60 80 90] dB HL
[Figure: two spectrograms, "Cell phone speech for normal hearing" and "Cell phone speech for SNHL". x-axis: Time (s), 0 to 2; y-axis: Frequency (Hz), 0 to 4000; color scale: -250 to 0]
![Page 27: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/27.jpg)
Facts on Hearing Loss in Adults
• One in every ten (28 million) Americans has hearing loss.
• The vast majority of Americans (95% or 26 million) with hearing loss can have their hearing loss treated with hearing aids.
• Only 6 million use HAs
• Millions of Americans with hearing loss could benefit from hearing aids but avoid them because of the stigma.
![Page 28: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/28.jpg)
Types of Hearing aids
Behind the ear
In the ear
In the canal
Completely in the canal
![Page 29: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/29.jpg)
Anatomy of a Hearing Aid
• Microphone
• Tone hook
• Volume control
• On/off switch
• Battery compartment
![Page 30: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/30.jpg)
Ear Mold Measurements
Hearing Aid Fitting
![Page 31: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/31.jpg)
Acclimatization effect
Auditory cortex brain plasticity
Time for the HI to reuse the HF information: Acclimatization effect
How does this affect HA fitting?
– Multiple fitting sessions
– Initial fitting should be optimum one
![Page 32: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/32.jpg)
So doc, what is the fitting methodology employed by the hearing aid company to compensate for my hearing loss?
Not-so-average Joe
(PhD EE/Speech person)
CONFIDENTIAL?
![Page 33: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/33.jpg)
So, do you want your HA to:
1) Always be comfortably loud
2) Equalize loudness across frequencies
3) Normalize loudness
…?
Which fitting methodology is the best?
![Page 34: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/34.jpg)
Existing HL compensation algorithms
Rationale
– Ad hoc: Half Gain, POGO
– Make speech comfortable: NAL-R
– Loudness normalization: IHAFF, Fig 6
– Loudness equalization: DSL
Hearing aid fitting algorithms
– Threshold-only / Suprathreshold
– NAL-R, POGO, HG, Fig 6, IHAFF, DSL
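The two ad hoc rules are simple enough to sketch. The half-gain rule prescribes a gain of half the hearing loss at each frequency; POGO, as commonly described, is half gain with the low frequencies cut by 10 dB at 250 Hz and 5 dB at 500 Hz. The slides do not give these formulas, so treat them as assumptions; the loss values are the sensorineural example used in this lecture and the function names are hypothetical.

```python
# Audiogram frequencies (Hz) and example loss (dB HL) used in this lecture.
FREQS_HZ = [250, 500, 1000, 2000, 4000, 6000]
LOSS_DB_HL = [10, 20, 30, 60, 80, 90]

# POGO low-frequency corrections in dB (assumed from the usual description).
POGO_CORRECTION_DB = {250: -10.0, 500: -5.0}


def half_gain(loss_db_hl):
    """Half-gain rule: prescribed gain is half the hearing loss."""
    return [0.5 * hl for hl in loss_db_hl]


def pogo(freqs_hz, loss_db_hl):
    """Half gain plus a fixed low-frequency reduction (no clamping applied)."""
    return [
        0.5 * hl + POGO_CORRECTION_DB.get(f, 0.0)
        for f, hl in zip(freqs_hz, loss_db_hl)
    ]


hg_gain = half_gain(LOSS_DB_HL)
pogo_gain = pogo(FREQS_HZ, LOSS_DB_HL)
for f, g1, g2 in zip(FREQS_HZ, hg_gain, pogo_gain):
    print(f"{f:>5} Hz: half gain {g1:5.1f} dB, POGO {g2:5.1f} dB")
```

Both rules use thresholds only, which is why they sit in the threshold-only group; the negative POGO value at 250 Hz for this mild low-frequency loss would normally just mean no amplification there.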
![Page 35: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/35.jpg)
Spectrograms and sound files
Sensorineural hearing loss [10 20 30 60 80 90] dB HL
Speech level = 65 dBA
Panels: Normal hearing; Hearing impaired; HI with Linear gain; HI with DSL gain; HI with RBC gain
Section Two
![Page 36: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/36.jpg)
Performance metrics
Speech Intelligibility (SI): The degree to which speech can be understood
– Objective measures: AI, STI
– Subjective measures: HINT
Speech Quality: “Does the speech match your expectations?”
– Objective measures: PESQ
– Subjective measures: MOS
![Page 37: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/37.jpg)
Performance metrics (contd.)
• Objective speech quality measure
– Perceptual Evaluation of Speech Quality (PESQ)
• Subjective speech quality measure
– Mean Opinion Score (MOS)
• Subjective speech intelligibility measure
– Hearing In Noise Test (HINT)
[Diagram: reference signal and comparison signal in, score out]
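Of the three measures, MOS is the easiest to make concrete: listeners rate each sample on a five-point scale and the ratings are averaged. The sketch below assumes the 1-Bad to 5-Excellent scale that appears on the results charts later in this lecture; the ratings and function names are hypothetical.

```python
# Five-point quality scale (assumed to match the results charts).
LABELS = {1: "Bad", 2: "Poor", 3: "Fair", 4: "Good", 5: "Excellent"}


def mean_opinion_score(ratings):
    """Average of integer listener ratings on the 1 to 5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in LABELS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)


def nearest_label(mos):
    """Verbal category closest to a (possibly fractional) MOS."""
    return LABELS[min(LABELS, key=lambda k: abs(k - mos))]


# Hypothetical ratings from five listeners for one processed sentence.
ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(ratings)
print(mos, nearest_label(mos))
```

PESQ and HINT need a reference signal or an adaptive listening procedure, so they do not reduce to a few lines in the same way.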
![Page 38: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/38.jpg)
Hearing In Noise Test (HINT)
![Page 39: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/39.jpg)
Subjective listening experiments
Audiograms of the HI patients
[Figure: Left ear audiograms of the HI subjects. x-axis: Frequency (Hz), 0 to 8000; y-axis: Threshold of hearing (dB HL), 0 to 120]
Location:
Shands speech & hearing clinic (sound proof booth)
Subjects:
15 HI people (PTA: 40-70 dB HL)
15 normal hearing people
Tools used:
Matlab HINT and MOS GUIs
![Page 40: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/40.jpg)
Subjective HINT and MOS scores for RBC: hearing impaired, cell phone speech
RBC has a 7 dB improvement in SI when compared to DSL
MOS scores reveal that RBC has a quality rating of ‘Good’
[Figure: Ave. MOSs of 15 HI subjects. x-axis: Algorithm (None, HPF, RBC, NALR, POGO, HG, NALRP, DSL); y-axis: 1-Bad, 2-Poor, 3-Fair, 4-Good, 5-Excellent]
[Figure: Ave. HINT scores of 15 HI subjects. x-axis: Algorithm (None, HPF, RBC, NALR, POGO, HG, NALRP, DSL); y-axis: SNR relative to baseline (dB), -20 to 5]
![Page 41: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/41.jpg)
Subjective HINT and MOS scores for RBC: normal hearing, cell phone speech
RBC has a 12 dB improvement in SI when compared to DSL
MOS scores reveal that RBC has a quality rating of ‘Good’
![Page 42: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/42.jpg)
Cochlear Implants
The first fully functional Brain Machine Interface (BMI)
Definition:
A device that electrically stimulates the auditory nerve of patients with severe-to-profound hearing loss to provide them with sound and speech information
![Page 43: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/43.jpg)
Who is a candidate?
• Severe-to-profound sensorineural hearing loss
• Hearing loss did not reach severe-to-profound level until after acquiring oral speech and language skills
• Limited benefit from hearing aids
![Page 44: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/44.jpg)
CI statistics
• Worldwide:
– Over 100,000 multi-channel implants
• At Univ of Florida:
– Implanted first patient in 1985
– Currently follow over 400 cochlear patients
![Page 45: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/45.jpg)
Technical and Safety Issues
• Magnetic Resonance Imaging
• Surgical issues
![Page 46: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/46.jpg)
How does the Cochlea encode frequencies?
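A compact way to describe the place coding asked about here is a frequency-position map. The sketch below uses the Greenwood function with its commonly quoted human constants; the slides do not give this formula, so it is offered only as an assumed illustration of place theory, and the function name is hypothetical.

```python
def greenwood_hz(x_from_apex):
    """Characteristic frequency at a relative position along the basilar membrane.

    x_from_apex: 0.0 at the apex (low frequencies), 1.0 at the base
    (high frequencies). Constants are the usual human values (assumed).
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("position must lie between 0 (apex) and 1 (base)")
    return 165.4 * (10 ** (2.1 * x_from_apex) - 0.88)


# Five equally spaced positions from apex to base.
positions = [i / 4 for i in range(5)]
freqs = [greenwood_hz(x) for x in positions]
for x, f in zip(positions, freqs):
    print(f"x = {x:.2f}: {f:8.1f} Hz")
```

Equal steps along the membrane correspond to roughly equal frequency ratios over most of the range, which is why an implant electrode array inserted from the base reaches the high-frequency region first.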
![Page 47: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/47.jpg)
![Page 48: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/48.jpg)
![Page 49: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/49.jpg)
Example: New Freedom
![Page 50: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/50.jpg)
![Page 51: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/51.jpg)
CI characteristics
1. Electrode design – Number of electrodes, electrode configuration
2. Type of stimulation – Analog or pulsatile
3. Transmission link – Transcutaneous or percutaneous
4. Signal processing – Waveform representation or feature extraction
![Page 52: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/52.jpg)
Signal processing
• Compressed Analog (CA)
• Continuous Interleaved Sampling (CIS)
• Multiple Peak (MPEAK)
• Spectral Maxima Sound Processor (SMSP)
• Spectral Peak (SPEAK)
![Page 53: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/53.jpg)
Compressed Analog (CA) approach
![Page 54: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/54.jpg)
CA activation signals
![Page 55: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/55.jpg)
Continuous Interleaved Sampling (CIS)
![Page 56: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/56.jpg)
CIS activation signals
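The "interleaved" part of CIS means that electrodes are pulsed one after another, so no two channels are stimulated at the same instant. The sketch below shows only that timing idea; envelope extraction and compression are left out, and the channel count, pulse rate, pulse widths and function names are hypothetical.

```python
def cis_pulse_onsets(n_channels, rate_pps, n_cycles):
    """Return (onset_s, channel) pairs with channels staggered in each cycle."""
    period = 1.0 / rate_pps          # time between pulses on one channel
    slot = period / n_channels       # time slice reserved for each channel
    return [
        (cycle * period + ch * slot, ch)
        for cycle in range(n_cycles)
        for ch in range(n_channels)
    ]


def pulses_overlap(onsets, pulse_width_s):
    """True if any pulse starts before the previous one has finished."""
    times = sorted(t for t, _ in onsets)
    return any(b - a < pulse_width_s for a, b in zip(times, times[1:]))


# Hypothetical setup: 4 channels, 800 pulses per second per channel.
schedule = cis_pulse_onsets(n_channels=4, rate_pps=800, n_cycles=3)
ok_width = pulses_overlap(schedule, pulse_width_s=0.0002)   # fits in a slot
too_wide = pulses_overlap(schedule, pulse_width_s=0.0004)   # exceeds a slot
print(schedule[:4], ok_width, too_wide)
```

In a full processor each pulse amplitude would follow the compressed envelope of that channel's band-pass filter output; the schedule above only fixes when each electrode is allowed to fire.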
![Page 57: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/57.jpg)
Multiple Peak (MPEAK)
![Page 58: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/58.jpg)
MPEAK activated electrodes
![Page 59: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/59.jpg)
Spectral Maxima Sound Processor (SMSP)
![Page 60: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/60.jpg)
SMSP activated electrodes
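Spectral-maxima strategies stimulate only the electrodes whose analysis bands carry the most energy in each frame. The sketch below shows that selection step alone. The 16-band, 6-maxima configuration is the one usually quoted for SMSP and is an assumption here, as are the envelope values and the function name.

```python
def select_maxima(band_envelopes, n_maxima):
    """Indices of the n largest band envelopes, returned in band order."""
    ranked = sorted(
        range(len(band_envelopes)),
        key=lambda i: band_envelopes[i],
        reverse=True,
    )
    return sorted(ranked[:n_maxima])


# Hypothetical envelope amplitudes for one frame, one value per band.
envelopes = [0.1, 0.8, 0.3, 0.05, 0.9, 0.2, 0.7, 0.0,
             0.4, 0.6, 0.15, 0.25, 0.02, 0.5, 0.35, 0.01]
active = select_maxima(envelopes, n_maxima=6)
print(active)
```

Only the bands in `active` would drive their electrodes for this frame, which is why the set of activated electrodes changes from frame to frame as the speech spectrum moves.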
![Page 61: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/61.jpg)
Spectral Peak (SPEAK)
![Page 62: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/62.jpg)
SPEAK activated electrodes
![Page 63: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/63.jpg)
Outcomes for Post-lingual Adults
• Wide range of success
• Most score 90-100% on AV sentence materials
• Majority score > 80% on high context
• Performance more varied on single word tests
![Page 64: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/64.jpg)
Auditory Brainstem Implant
• Approved October 20, 2000
• Uses the Nucleus 24 system processors
• Plate array with 21 electrodes
![Page 65: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/65.jpg)
Review-1
Pinna:
– ITDs, IIDs: Horizontal localization
– Reflections: Vertical localization
Ear canal:
– ¼ wave resonance 1-3 kHz
Middle ear:
– Amplification by lever action and by area
– Stapedius reflex
Cochlea:
– IHCs/OHCs: convert mechanical to electrical
– Place theory: frequency analysis
– Missing fundamental
![Page 66: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/66.jpg)
Review-2
Adaptation: AN firing sensitive to changes
Otoacoustic emissions: Produced by movement of OHCs
Beats: Monaural & binaural
Measurement of hearing:
– Audiogram: threshold of hearing
– Threshold variation with age
– Equal loudness curves
Bass loss problem: discrimination against LFs
![Page 67: Meena Ramani 04/12/06](https://reader035.vdocuments.us/reader035/viewer/2022062221/56814457550346895db0f311/html5/thumbnails/67.jpg)
Review-3
Critical bands:
– Used for efficient encoding
– Bark scale
Masking:
– Frequency: LFs mask more
– Temporal: simultaneous, pre and post
Hearing impairment:
– Hearing aids: external to cochlea
– Cochlear implants: inside cochlea