
3/23/2002 Copyright James D. Johnston 2003. Permission granted for any educational use.


What can we hear?

James D. Johnston

home.comcast.net/~retired_old_jj

[email protected]


How do our ears work?

And what can we detect in a natural soundfield?


How One Ear Works - Short Form!

We aren’t going to talk about binaural today. (This is the short form of what should occupy a semester’s examination and discussion.)

The ear is usually broken into 3 separate parts: the outer, middle, and inner ears. The outer ear consists of the head, the pinna, and the ear canal. The middle ear consists of the eardrum, the 3 small bones, and the connection to the cochlea. Finally, the inner ear consists of the cochlea, containing the organ of Corti, basilar membrane, tectorial membrane, and the associated fluids and spaces.


The Outer Ear

The outer ear provides frequency-dependent directivity via shadowing, shaping, diffraction, and the like. It is different (by enough to matter) for different individuals, but can be summarized by the “Head Related Transfer Functions” (HRTFs) or “Head Related Impulse Responses” (HRIRs) mentioned in the literature, at least on the average or for a given listener.

The HRTFs or HRIRs are ways of determining the effect on a sound coming from a given direction to a given ear.

The ear canal inserts a resonance about 1 octave wide at about 1 to 4 kHz, depending on the individual.
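As a rough illustration of how an HRIR pair is applied, the sketch below convolves a mono signal with one impulse response per ear. The HRIR values are made up for illustration (they are not measured data and do not come from these slides); real HRIRs are measured per listener and per direction.

```python
# Toy sketch: apply a pair of head-related impulse responses (HRIRs)
# to a mono signal by direct convolution. The HRIR values below are
# hypothetical; real ones come from measurements.

def convolve(signal, impulse_response):
    """Full linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Hypothetical HRIRs for a source off to the listener's left:
# the right ear receives the sound later (leading zeros) and weaker.
HRIR_LEFT = [1.0, 0.3, 0.1]
HRIR_RIGHT = [0.0, 0.0, 0.5, 0.2]

if __name__ == "__main__":
    click = [1.0, 0.0, 0.0, 0.0]
    left, right = render_binaural(click, HRIR_LEFT, HRIR_RIGHT)
    print("left ear: ", left)
    print("right ear:", right)
```

Feeding a single click through the sketch simply reproduces each HRIR at the corresponding ear, which is what "impulse response" means.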


The Middle Ear

The middle ear carries out several functions, the most important of which, for levels and frequencies that are normally (or wisely) experienced, is matching the impedance of the air to the fluid in the cochlea.

There are several other functions related to overload protection and such, which are not particularly germane under comfortable conditions.

The primary effect of the middle ear is to provide a one-zero high-pass function, with a matching pole at approximately 700Hz or so, depending on the individual.
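A minimal sketch of that high-pass response, assuming the single zero sits at DC so that the transfer function is H(s) = s / (s + 2π·700). The zero location and the function name are assumptions for illustration; the slides only give the approximate pole frequency.

```python
import math

POLE_HZ = 700.0  # approximate pole frequency quoted above; varies by individual

def middle_ear_gain_db(freq_hz, pole_hz=POLE_HZ):
    """Magnitude in dB of H(s) = s / (s + 2*pi*pole_hz) at freq_hz > 0.

    Assumes the zero is at DC, so |H| = f / sqrt(f^2 + pole^2).
    """
    magnitude = freq_hz / math.hypot(freq_hz, pole_hz)
    return 20.0 * math.log10(magnitude)

if __name__ == "__main__":
    for f in (50, 100, 350, 700, 1400, 5000):
        print(f"{f:5d} Hz: {middle_ear_gain_db(f):6.1f} dB")
```

Under this assumption the response is about 3dB down at the pole frequency, falls at roughly 20dB per decade below it, and is essentially flat well above it.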


The Inner Ear

A complicated subject at best, the inner ear can be thought of as having two membranes, each a travelling-wave filter, one a high-pass and the other a low-pass filter. Between the two membranes are two sets of hair cells, the inner hair cells and the outer hair cells. The inner hair cells are primarily detectors. They fire when the movements of the two membranes differ. The outer hair cells are primarily a system that controls the exact points of the very steep low-pass and high-pass filters. The outer hair cells can polarize and depolarize, and change both their length and stiffness. This polarization is how they affect the relative tunings of the two membranes.


[Figure: curves plotted against frequency, labeled “Outer Hair Cells Fully Depolarized” and “Outer Hair Cells Fully Polarized”]


An example (not a human subject)


The exact magnitude and shape of those curves are under a great deal of discussion and examination, but it seems clear that the polarization of the outer hair cells creates the compression exhibited in the difference between applied intensity (the external power) and the internal loudness (the actual sensation level experienced by the listener).

There is at least 60dB of compression available. Fortunately, the shape of the resulting curve does not change very much, except at the tails, between the compressed and uncompressed state, leading to a set of filter functions known as the cochlear filters.


Critical Bands and Cochlear Filters

The overall effect of this filter structure is time/frequency analysis of a particular sort, called critical band (Bark scale) or equivalent rectangular bandwidth (ERB) filter functions. Note that this is not a discrete set of filters, but rather a continuum of filters, with bandwidths that vary according to the center frequency.

Roughly speaking, critical bandwidths are about 100Hz up to 700Hz, and 1/3 octave thereafter. ERBs are usually a bit narrower, especially at higher frequencies.
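For a feel of the numbers, the sketch below evaluates two commonly cited closed-form approximations: the Zwicker and Terhardt formula for critical bandwidth and the Glasberg and Moore formula for ERB. Neither formula appears in these slides; they are brought in here as assumptions, and other fits exist.

```python
def critical_bandwidth_hz(freq_hz):
    """Zwicker-Terhardt approximation of critical bandwidth (Hz).

    An assumed, commonly cited fit; not taken from the slides.
    """
    khz = freq_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz * khz) ** 0.69

def erb_hz(freq_hz):
    """Glasberg-Moore approximation of equivalent rectangular bandwidth (Hz).

    An assumed, commonly cited fit; not taken from the slides.
    """
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)

if __name__ == "__main__":
    print("  f (Hz)   critical band (Hz)   ERB (Hz)")
    for f in (100, 500, 1000, 2000, 4000, 8000):
        print(f"{f:8d} {critical_bandwidth_hz(f):20.1f} {erb_hz(f):10.1f}")
```

With these fits, the critical bandwidth comes out close to 100Hz at the low end, and the ERB is the narrower of the two at every frequency in the table.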


A discussion of which is right, and which should be used, is, by itself, well beyond the range of a one-hour seminar.

The basic point that must come out of this discussion is that the sound arriving in an ear will be analyzed in something approximating 100Hz bandwidth filters at low frequencies, and at something like 1/3 octave bandwidths at higher frequencies, and that the system will detect either the signal waveform itself (below 500Hz) or the signal envelope (above 4000 Hz), or a bit of both (in the range between 500Hz and 4000 Hz). Exactly what is detected is likewise, by itself, well beyond a one-hour seminar, and furthermore, a consensus is yet to emerge.


For a given cochlear filter bandwidth, there is a corresponding time width of the main lobe of the filter. For the auditory system, these filter lengths vary by a factor of approximately 40:1, from about 10 milliseconds down to 0.25 millisecond.

This means that at low frequencies, the time resolution available to the ear is quite poor, but that at high frequencies, it is quite accurate, on the order of a dozen or so samples at 48kHz.

Over any time extent longer than this, the ear, because of its compression effects, cannot be considered a linear transducer. This can create problems, such as pre-echo, in filterbanks or even in simple filters under some situations.
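The "dozen or so samples" figure follows directly from the quoted filter lengths. A small sketch of the arithmetic (the function name is illustrative only):

```python
SAMPLE_RATE_HZ = 48_000  # sample rate quoted above

def ms_to_samples(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a duration in milliseconds to a sample count."""
    return duration_ms * sample_rate_hz / 1000.0

if __name__ == "__main__":
    longest = ms_to_samples(10.0)    # low-frequency cochlear filter
    shortest = ms_to_samples(0.25)   # high-frequency cochlear filter
    print(f"10 ms   -> {longest:.0f} samples")
    print(f"0.25 ms -> {shortest:.0f} samples")
    print(f"ratio   -> {longest / shortest:.0f}:1")
```

So at 48kHz the shortest cochlear filter spans 12 samples and the longest spans 480, which is the 40:1 ratio mentioned above.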


[Figure: two panels, labeled “2.25kHz filter” and “750Hz filter”]


[Figure: Schematic Cochlea. Labels: Audio In, FILTERS (HF ... LF), DETECTORS, Auditory Nerve, Feedback, CNS Feedback]


How about the detectors?

• Below 500Hz, the detectors fire on the positive-going edge of the filtered waveform.

• Above 2kHz, the detectors fire synchronously with the ENVELOPE of the filtered waveform.

• Between 500Hz and 2kHz, the detectors function on a mix of the two mechanisms.


What does this mean, in practical terms?

• Below 500Hz, distorting the waveform itself, and moving zero-crossings of the filtered waveform (due to distortion, phase shifts, etc.) will be audible.

• Above 2kHz, the same effects happen on the signal envelope. Again, phase shifts can radically change the signal envelope, as can distortions.

• Between 500Hz and 2kHz, both mechanisms will operate to some extent, with each favored toward its end of the frequency spectrum.
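To see how a phase shift alone can reshape an envelope, consider a carrier with two sidebands. The sketch below keeps all three component magnitudes fixed and changes only the carrier's phase, then measures how far the envelope swings over one modulation period. The 4kHz carrier, 100Hz spacing, and 0.5 modulation index are arbitrary values chosen for illustration, not figures from the slides.

```python
import cmath
import math

def envelope(carrier_phase, t, fc=4000.0, fm=100.0, m=0.5):
    """Envelope at time t of a carrier at fc plus sidebands at fc +/- fm.

    The components always have magnitudes 1, m/2 and m/2; only the
    carrier's phase (in radians) changes. The envelope is the magnitude
    of the complex (analytic) sum of the three components.
    """
    carrier = cmath.exp(1j * (2 * math.pi * fc * t + carrier_phase))
    upper = (m / 2) * cmath.exp(2j * math.pi * (fc + fm) * t)
    lower = (m / 2) * cmath.exp(2j * math.pi * (fc - fm) * t)
    return abs(carrier + upper + lower)

def envelope_swing(carrier_phase, n=1000, fm=100.0):
    """Max minus min of the envelope over one modulation period (1/fm)."""
    values = [envelope(carrier_phase, k / (n * fm), fm=fm) for k in range(n)]
    return max(values) - min(values)

if __name__ == "__main__":
    print(f"carrier phase 0:    swing = {envelope_swing(0.0):.3f}")
    print(f"carrier phase pi/2: swing = {envelope_swing(math.pi / 2):.3f}")
```

With the carrier in phase, the envelope swings between 0.5 and 1.5 (ordinary amplitude modulation). Rotating the carrier by 90 degrees leaves the magnitude spectrum untouched but shrinks the swing to roughly 0.12, so an envelope detector sees a very different signal.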


So?

• At low frequencies, don’t change zero crossings or the signal waveform.

• At high frequencies, don’t change the signal envelope.

• Things like jitter, distortions, and phase shift can cause either of these problems.


What are the hard level limits?

• The atmosphere, due to the discrete nature of air molecules, has a noise level. At the eardrum, it is approximately white noise at a level of 6dB SPL.

• The ear’s lowest detection level is about -6dB SPL, which nearly matches the energy in the critical band near the ear canal resonance due to basic atmospheric noise.
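A back-of-the-envelope check of how those two figures relate, under stated assumptions: that the 6dB SPL noise figure is spread evenly (white) over a 20kHz bandwidth, and that the critical band near the ear canal resonance is about 1/3 octave wide around 4kHz. Both assumptions are illustrative choices made here, not values given in the slides.

```python
import math

NOISE_SPL_DB = 6.0             # broadband atmospheric noise level quoted above
TOTAL_BANDWIDTH_HZ = 20_000.0  # ASSUMED bandwidth that the 6 dB figure spans
BAND_CENTER_HZ = 4_000.0       # ASSUMED band center near the ear canal resonance

def third_octave_bandwidth_hz(center_hz):
    """Width of a 1/3-octave band centered (geometrically) on center_hz."""
    return center_hz * (2 ** (1 / 6) - 2 ** (-1 / 6))

def band_level_db(total_level_db, band_hz, total_hz):
    """Level of white noise inside band_hz, given its level over total_hz."""
    return total_level_db + 10.0 * math.log10(band_hz / total_hz)

if __name__ == "__main__":
    band = third_octave_bandwidth_hz(BAND_CENTER_HZ)
    level = band_level_db(NOISE_SPL_DB, band, TOTAL_BANDWIDTH_HZ)
    print(f"band width: {band:.0f} Hz")
    print(f"noise in band: {level:.1f} dB SPL")
```

Under these assumptions the noise in that one band comes out a little below -7dB SPL, within a dB or two of the -6dB SPL detection level.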


Fletcher’s loudness plot goes here.

(From Fletcher)


What about the loud end of things?

• Anything above 120dB SPL is bad for the auditory system.

• Anything above 140dB SPL is in a regime where the atmosphere is very nonlinear. Some signals (percussion, natural sounds, shuttle takeoffs) may reach these levels.

• About 70-80dB of instantaneous dynamic range across frequency within a 20 millisecond period is approximately the largest spectral tilt that is audible.


What does extreme loudness mean?

• 194dB SPL in a sine wave represents a sine wave that goes from zero to two atmospheres. This cannot be physically realized.

• Above that level, the proper term is “shock wave”, as the air is propagating in a very nonlinear fashion.

• 32 bits of uniform PCM dynamic range takes us from the noise level of the atmosphere (6dB SPL) to 198dB SPL, or 4dB above 1 atmosphere. This level is usually experienced in catastrophic military situations.
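The figures above can be reproduced with the usual SPL definition, 20·log10(p / 20 µPa). The sketch assumes one standard atmosphere (101325 Pa) and plugs the sine's peak pressure, rather than its RMS value, into the formula; that is the convention under which the round 194dB figure appears, and the RMS value of the same sine would be about 3dB lower.

```python
import math

P_REF_PA = 20e-6            # standard SPL reference pressure, 20 micropascals
ATMOSPHERE_PA = 101_325.0   # one standard atmosphere (assumed value)

def spl_db(pressure_pa):
    """Sound pressure level in dB re 20 uPa for the given pressure."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)

def pcm_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step for uniform PCM, in dB."""
    return 20.0 * math.log10(2.0 ** bits)

if __name__ == "__main__":
    print(f"peak of 1 atmosphere: {spl_db(ATMOSPHERE_PA):.1f} dB SPL")
    span = pcm_dynamic_range_db(32)
    print(f"32-bit PCM span:      {span:.1f} dB")
    print(f"6 dB SPL + span:      {6.0 + span:.1f} dB SPL")
```

The 32-bit span is just under 193dB (about 6dB per bit), which places the top of the range at roughly 198dB SPL when the bottom sits at the 6dB SPL atmospheric noise level.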


What about high and low frequencies?

• Frequencies in the lowest audio octave are sensed substantially by the body. The hearing apparatus has a high-pass filter, which is fortunate, because otherwise the “weather” would be deafeningly loud.

• 20kHz is not a firm “cutoff” for human hearing. Children appear to hear above 20kHz, as do some teens who haven’t been noise-exposed.

• Age and noise exposure reduce high-frequency hearing ability.

• At high power levels, ultrasonic signals are perceived on the skin. These levels are approached in sonar and the like; however, the only musical occurrences may be from percussion, and at close range.


What about nonlinear effects?

• The ear analyzes on a time-scale much like that of the cochlear filters. If a signal, or a signal-processing operation, lasts longer than the shortest cochlear filter, the effects of the nonuniform time/frequency scaling and detection must be considered.


An extreme example, pre-echo in audio codecs.


A potential, but unproven, issue with pre-echo.


Some conclusions

• Audible effects must be considered as analyzed by critical band filters. These filters determine both time and frequency sensitivity to artifacts.

• Altering waveform at low frequencies, or signal envelope at high frequencies, will create audible differences.


• 0dB SPL is a more than reasonable minimum level for presentations. More low-level response is only useful before the ear is involved.

• Recording engineers may meet levels peaking above 150dB or so, but such levels may not be either accurately recordable or reproducible.

• 20kHz is a reasonable limit for adult human beings, but is not a “hard limit”. A young individual may be able to hear above 20kHz. Other sensory modes are not generally active at high frequencies at levels that we hope to be exposed to.


• A variety of nonlinear effects may create audible differences due to small time or frequency changes in signals.

• In general, the farther removed from the original frequency that an artifact occurs, the more audible it will be, if it creates sensation or changes sensation at a point where signal energy on the basilar membrane is lower.