TRANSCRIPT
Week 6 – Psychoacoustics ESE 250 S’13 DeHon Kadric Kod Wilson-Shah 1
ESE250:
Digital Audio Basics
Week 6
February 19, 2013
Human Psychoacoustics
Course Map
Where are we?
• Week 2: Received signal is sampled & quantized
q = PCM[ r ]
• Week 4: Sampled signal is first transformed into the frequency domain
Q = DFT[ q ]
• Week 3: Quantized signal is coded
c = code[ q ]
• Week 5: Signal is oversampled & low-pass filtered
Q = LPF[ DFT(q+n) ]
• Week 6: Transformed signal is analyzed using human psychoacoustic models
• Week 7: Acoustically interesting signal is “perceptually coded”
C = MP3[ Q ]
[Figure: course-map block diagram with stages Oversample, DFT, LPF, Perceptual Coding, Store / Transmit, Decode, Produce; signals r(t), q + n, Q + N, Q, C, p(t); stages labeled Week 3 through Week 6]
[Painter & Spanias. Proc.IEEE, 88(4):451–512, 2000]
The Physical Ear
• External sound waves
Guided by the outer ear into the auditory canal
• Excite the inner ear
Through a mechanical linkage connecting the ear drum to the cochlea
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
The Physical Ear
• Initiates signal processing
Frequency-domain analysis via analog computation
• Video: Cochlea
• What part of the cochlea vibrates for an 800 Hz square wave?
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
The Cognitive Ear
• Modern psychoacoustics benefits greatly from
o decades of neural recording
o contemporary brain imaging technology
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
Power Spectrum Model of Hearing
• Rough picture (main content of today’s lecture):
o Critical Bands: the auditory system contains a finite array of adaptively tunable, overlapping bandpass filters
o Frequency Bins: humans process a signal’s component (against a noisy background) in the one filter with the closest center frequency
o Masking: certain signal components in a given band are “favored” and others are filtered out
• Established through decades of psychoacoustic experiments
B.C.J. Moore. Int.Rev.Neurobiol., 70:49–86, 2005.
Auditory Thresholds
• In the lab, you varied the frequency, amplitude and
phase of signals
• What was the effect of each, if any, on the sound you
heard?
Frequency
Amplitude
Phase
$s(t) = A \sin(2\pi f t + \phi)$
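To make the three knobs from the lab concrete, here is a minimal sketch that samples the tone model s(t) = A sin(2πft + φ). The function name `sine_tone`, the 44.1 kHz sample rate, and the 440 Hz test frequency are illustrative assumptions, not values from the lecture.

```python
import math

def sine_tone(amplitude, freq_hz, phase_rad, duration_s, sample_rate_hz=44100):
    """Return samples of s(t) = A*sin(2*pi*f*t + phi).

    sample_rate_hz defaults to 44100, an assumed (CD-style) rate.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * (n / sample_rate_hz) + phase_rad)
        for n in range(n_samples)
    ]

# Same amplitude and frequency, different phase: the waveform is shifted in
# time, but its peak amplitude is unchanged.
tone_a = sine_tone(1.0, 440.0, 0.0, 0.01)
tone_b = sine_tone(1.0, 440.0, math.pi / 2, 0.01)
```

Changing `amplitude` scales every sample, changing `freq_hz` changes how fast the samples oscillate, and changing `phase_rad` only shifts where in the cycle the tone starts.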
Auditory Thresholds
• Harvey Fletcher (1940)
Played pure tones varying
o frequency, f [ Hz]
o intensity, I [dyn·cm^-2], where 1 dyn·cm^-2 = 10^-5 N·cm^-2 = 0.1 Pa
o phase changes tend to be inaudible
Large listener population
o Young
o Acute
• Recorded extreme thresholds
o faintest audible
o greatest tolerable
(http://www.et.byu.edu/)
Auditory Thresholds
• Results:
pain-free hearing range extends at most over 20 Hz – 20 kHz
with sensitivity ≈ 2·10^-4 · 0.1 Pa = 20 µPa
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940].
The decibel unit
• Define standard pressure: p0 = 0.0002 · 0.1 Pa = 20 µPa
• Threshold of human hearing
• Compute Sound Pressure Level as: L_SPL = 20 log10(p/p0) dB
• L_SPL for p1 = 20 µPa, for p2 = 200 µPa, for p3 = 20 mPa
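The sound pressure level exercise can be checked with a few lines of code. This is a minimal sketch; the function name `spl_db` is illustrative, and the first two pressures are read as micropascals on the assumption that the "µ" symbol was lost from the slide text.

```python
import math

P0_PA = 20e-6  # reference pressure: 20 micropascals (threshold of hearing)

def spl_db(pressure_pa, p0_pa=P0_PA):
    """Sound pressure level in dB: 20*log10(p/p0)."""
    return 20.0 * math.log10(pressure_pa / p0_pa)

# Pressures from the exercise, expressed in pascals.
for label, p in [("p1", 20e-6), ("p2", 200e-6), ("p3", 20e-3)]:
    print(f"{label} = {p:g} Pa -> {spl_db(p):.0f} dB SPL")
```

Each factor of 10 in pressure adds 20 dB, so the three pressures work out to 0, 20 and 60 dB SPL.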
Compare to ambient sea-level pressure: 1 atmosphere ≈ 10^5 Pascal
• Q: why use log-log scale?
• A1: dynamic range
• A2: “loudness” is a power function
The decibel unit – Hearing intensity
(http://www.dspguide.com/ch22/1.htm)
Let’s try to reproduce these results!
(http://www.dspguide.com/ch22/1.htm)
• We will listen to single sine tones starting at a frequency of 10 kHz, all the way up to 20 kHz, so each student can figure out their cut-off frequency
• Suggestions to improve this experiment?
Animal hearing ranges
• Dogs: greater hearing range: 40 Hz to 60 kHz
o Ultrasonic dog whistles
• Mice: large ears in comparison to their bodies
o Hearing range: 1 kHz to 70 kHz
o Can’t hear low-frequency noises
o Communicate with high frequency
o Distress call (40 kHz), alert of predator
[Pictures from Wikipedia]
Why Sinusoids?
• Why not some other harmonic series?
o Fourier’s analysis shows harmonic analysis could be based on an arbitrary smooth periodic fundamental
• Why does the animal receiver use sinusoids?
• Hamiltonian Mechanics
o Simplest physical model of vibrating masses
o Coupled spring-mass-damper mechanics produce sinusoidal harmonics
• Video: Cochlea
[Figure: spring-mass-damper system with mass m, displacement x, damping b, stiffness k]
… all sound is produced by vibrating masses …
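As a small illustration of why a spring-mass-damper produces sinusoids, the sketch below uses the standard result for m·x'' + b·x' + k·x = 0: an underdamped system oscillates sinusoidally at a frequency set by m, b and k, inside a decaying envelope. The function name and the numeric values are hypothetical, chosen only so the undamped frequency lands at 800 Hz.

```python
import math

def damped_oscillation(m, b, k):
    """Return (natural_hz, damped_hz, decay_rate) for m*x'' + b*x' + k*x = 0.

    Raises ValueError if the system is not underdamped (no oscillation).
    """
    omega0 = math.sqrt(k / m)          # undamped angular frequency [rad/s]
    decay = b / (2.0 * m)              # envelope decay rate [1/s]
    if decay >= omega0:
        raise ValueError("not underdamped: the mass does not oscillate")
    omega_d = math.sqrt(omega0 ** 2 - decay ** 2)  # damped angular frequency
    return omega0 / (2 * math.pi), omega_d / (2 * math.pi), decay

# Hypothetical parameters: stiffness picked so that sqrt(k/m) = 2*pi*800 rad/s.
f0, fd, decay = damped_oscillation(m=1.0, b=10.0, k=(2 * math.pi * 800.0) ** 2)
print(f"natural: {f0:.3f} Hz, damped: {fd:.3f} Hz, decay rate: {decay:.1f} 1/s")
```

With light damping the oscillation frequency is only slightly below sqrt(k/m)/(2π), which is why each mass-spring element responds most strongly to one sinusoidal frequency.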
Masking - Spatial
• Masking Paradigms: “Masker” masking “maskee”
Tone Masking Noise
o pure tone of 80 dB SPL at 1 kHz
o just masks “critical band” noise of 56 dB SPL centered at 1 kHz
Masker-to-Maskee ratio
o Constant for fixed relative frequency and varying amplitude
o Changes with varying relative frequency
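The tone-masking-noise example above implies a masker-to-maskee ratio of 80 − 56 = 24 dB. The sketch below applies that ratio to other tone levels, relying on the statement that the ratio stays constant when only the amplitude changes. The function names are illustrative, and the result is only meant for noise in the same critical band as the tone.

```python
def masker_to_maskee_ratio_db(masker_db_spl, maskee_db_spl):
    """Level difference between a masker and the signal it just masks."""
    return masker_db_spl - maskee_db_spl

# The 1 kHz example: an 80 dB SPL tone just masks 56 dB SPL critical-band noise.
RATIO_DB = masker_to_maskee_ratio_db(80.0, 56.0)

def masked_noise_level_db(tone_db_spl, ratio_db=RATIO_DB):
    """Noise level that a tone just masks, assuming the ratio is constant
    for a fixed relative frequency (same critical band)."""
    return tone_db_spl - ratio_db

print(RATIO_DB)                     # ratio for the 1 kHz example
print(masked_noise_level_db(70.0))  # noise level just masked by a 70 dB tone
```

The constant-ratio assumption breaks down once the noise moves away from the tone in frequency, which is the second bullet's point.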
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
[Figure label: 1 “Bark” frequency interval]
Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940].
The first graph shows the masking pattern for a 200 Hz tone
o Mostly masks tones around 200 Hz, but also at harmonics
The second graph shows the same plot for different frequencies, but only the fundamental part
o Notice that the band gets wider for increasing frequencies
… masker at fundamental can somewhat mask maskees at the harmonics …
… but the “spreading curve” is traditionally depicted over the fundamental only
Tone Masking Noise
• Are the following signals masked?
200 Hz tone at 80dB
200 Hz tone at 40dB
300 Hz tone at 40dB
400 Hz tone at 40dB
700 Hz tone at 30dB
Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940].
• Tone Masking Noise (Fig 12)
value above quiet threshold such that a signal at the abscissa frequency can be heard in presence of
o top: 200 Hz tone
o bottom: various frequencies
• Noise Masking Tone (Fig 13)
dots show pure tone magnitude (in dB) required to be audible above noise
o of the magnitude on the middle curve
o centered at that frequency
o with bandwidth at least wider than the bars of Fig 12
Noise Masking Tone
• Are the following signals masked by the noise?
200 Hz at 60 dB
1 kHz at 60 dB
Noise Masking Tone
• Are the following signals masked by the noise?
200 Hz at 60 dB
o Yes!
1 kHz at 60 dB
Noise Masking Tone
• Are the following signals masked by the noise?
200 Hz at 60 dB
o No!
1 kHz at 60 dB
Noise Masking Tone
• Are the following signals masked by the noise?
200 Hz at 60 dB
1 kHz at 60 dB
o No!
Masking - Temporal
• Temporal Masking
Masker effect persists for tenths of a second
Masker effect is “acausal” (it can hide sounds that slightly precede the masker)
o on ~ 2/100 s timescales
Pitch JND
• JND = “just noticeable difference”
change in stimulus that “just” elicits perceptual notice
where “just” means that smaller variations of the stimulus cannot be discerned
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940].
• What can you say about the JND:
Below 1000 Hz?
o roughly constant
o ~ 3 Hz
Above 1000 Hz?
o roughly log-log linear
o Log[JND(f2)] - Log[JND(f1)] ~ n (Log[f2] - Log[f1])
• Suggests that as frequency increases, broader frequency bands are “assigned” to the same length of cochlear tissue
Remember cochlea model
What is n?
e.g. f1 = 2000, f2 = 4000
6 = 10 – 4 ~ n (Log10[2])
⇒ n ~ 20
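The slope estimate above is a two-point fit on log-log axes, and can be reproduced as follows. This is only a sketch of the slide's arithmetic: the readings 10 and 4 are taken as given from the slide (in whatever logarithmic units the plot uses), and the function name is illustrative.

```python
import math

def loglog_slope(f1_hz, f2_hz, log_jnd1, log_jnd2):
    """Slope n in: log_jnd2 - log_jnd1 = n * (log10(f2) - log10(f1))."""
    return (log_jnd2 - log_jnd1) / (math.log10(f2_hz) - math.log10(f1_hz))

# Slide example: f1 = 2000 Hz, f2 = 4000 Hz, log-scale JND readings 4 and 10.
n = loglog_slope(2000.0, 4000.0, 4.0, 10.0)
print(round(n, 1))
```

Since log10(2) is about 0.301, the result is 6 / 0.301, which is where the slide's n of roughly 20 comes from.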
JND experiment
• The following audio files contain a single tone playing for 10 seconds. The sine starts at 200 Hz, then changes to a higher frequency (201, 202, 203, 205, or 210 Hz).
• This change occurs after a number of “noises”: 1, 2, 3, 4, 5, 6, 7, 8 or 9.
• Can you notice when the change happens?
Critical Bands
Decades of empirical study
• reveal that human audio frequency perception
• is quantized into < 30 “critical bands”
• of perceptually near-identical pitch classes
• corresponding to ~equal-length bands of cochlear tissue (neurons)
Critical Bands: Evidence
Tone masking Noise (Fig. a & c)
o noise audibility threshold for small-bandwidth noise remains constant until the tone frequency locus falls away from the critical bandwidth
Noise masking Tone (Fig. b & d)
o same effect, with masker and maskee roles reversed
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
The Bark Scale
• “Bark” units: Uniform JND scale for frequency
Maps frequency intervals into their respective critical band number
[E. Zwicker. J. Acoust. Soc.Am., 33(2):248, February 1961]
The Bark Scale
• Frequency-to-Bark function
First Principles vs. Empirical Modeling
[E. Zwicker. J. Acoust. Soc.Am., 33(2):248, February 1961]
$B(f) = 13 \tan^{-1}(0.00076 f) + 3.5 \tan^{-1}\left((f/7500)^2\right)$
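The empirical frequency-to-Bark function, B(f) = 13·arctan(0.00076·f) + 3.5·arctan((f/7500)²) with f in Hz, is easy to evaluate directly. A minimal sketch follows; the function name `hz_to_bark` and the sample frequencies are illustrative choices.

```python
import math

def hz_to_bark(freq_hz):
    """Empirical frequency-to-Bark mapping, with freq_hz in Hz."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

# Equal steps in Bark cover wider and wider frequency ranges as f grows.
for f in (100, 500, 1000, 4000, 10000, 20000):
    print(f"{f:>6} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Evaluated at 20 kHz the function stays below 25 Bark, consistent with the count of fewer than 30 critical bands across the audible range.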
Compression opportunities
Consider the following recording
Any ways to improve the compression?
Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: zoomed spectrum; x-axis: Frequency, 193 Hz to 208 Hz; y-axis: dB, 0 to 120; annotation: “Masked”]
Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: zoomed spectrum; x-axis: Frequency, 193 Hz to 208 Hz; y-axis: dB, 0 to 120]
JND: Could only represent integer frequency values
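The two compression ideas from these slides (keep only integer frequency values, and drop components hidden by a nearby stronger one) can be combined in a toy sketch. This is not how MP3 or any real codec works; the function name, the 24 dB margin (borrowed from the tone-masking-noise example), the 5 Hz neighbourhood, and the sample spectrum are all illustrative assumptions.

```python
def prune_spectrum(components, margin_db=24.0, band_hz=5.0):
    """Simplify a list of (freq_hz, level_db) spectral components.

    margin_db and band_hz are assumed, illustrative thresholds.
    """
    # Step 1 (JND): round each frequency to an integer number of Hz and keep
    # only the loudest component that lands on each integer.
    merged = {}
    for freq, level in components:
        key = round(freq)
        merged[key] = max(level, merged.get(key, float("-inf")))

    # Step 2 (masking): drop a component if some other component within
    # band_hz is at least margin_db louder.
    kept = []
    for freq, level in merged.items():
        masked = any(
            other != freq
            and abs(other - freq) <= band_hz
            and other_level - level >= margin_db
            for other, other_level in merged.items()
        )
        if not masked:
            kept.append((freq, level))
    return sorted(kept)

# Hypothetical spectrum around 200 Hz.
spectrum = [(200.0, 80.0), (200.4, 60.0), (203.0, 40.0), (205.0, 70.0)]
print(prune_spectrum(spectrum))
```

In this made-up example the 200.4 Hz component merges into 200 Hz, and the 40 dB component at 203 Hz is dropped because the 80 dB tone three hertz away is more than 24 dB louder, leaving two components to store instead of four.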
Next Week
• How can we use what we know about
human perception to compress music?
Frequency hearing range
Masking
o Temporal
o Spatial
o JND
o Barks
Big Ideas
• Sound is a pressure wave that makes the cochlea vibrate with frequencies from ~20 Hz (at the tip) to ~20 kHz (at the base)
• This vibration is sinusoidal (physics)
o This is why sound harmonics are best represented as sinusoidal signals
• Masking
o Temporal – A masker tone can mask another tone that is present either right before or a little after the masker
o Spatial – A single tone can mask an entire frequency band (that contains the tone) if its intensity is high enough
o There are <30 such bands (Bark scale), and they are wider for higher frequencies
Admin
• Lab 5 report due tomorrow
• On Thursday: Lab 6
You will be designing your own experiments
o To measure the range of frequencies you can hear
o To perform spatial masking experiments
ESE250:
Digital Audio Basics
End Week 6 Lecture
Human
Psychoacoustics