cs 414 multimedia systems design lecture 3 digital audio

34
CS 414 - Spring 2012 CS 414 Multimedia Systems Design Lecture 3 Digital Audio Representation Klara Nahrstedt Spring 2012

Upload: others

Post on 30-Jan-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

CS 414 - Spring 2012

CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio

Representation

Klara Nahrstedt

Spring 2012

Page 2: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

CS 414 - Spring 2012

Administrative

Form Groups for MPs

Deadline January 23 (today!!) to email TA

Page 3: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Integrating Aspects of Multimedia

CS 414 - Spring 2012

Image/Video

Capture

Image/Video Information

Representation

Media

Server

Storage

Transmission

Compression

Processing

Audio/Video

Presentation

Playback Audio/Video

Perception/

Playback

Audio Information

Representation

Transmission

Audio

Capture

A/V

Playback

Page 4: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Today Introduced Concepts

Analog to Digital Sound Conversion

Sampling, sampling rate, Nyquist Theorem

Quantization, pulse code modulation,

differentiated PCM, Adaptive PCM

Signal-to-Noise Ratio

Data Rates

CS 414 - Spring 2012

Page 5: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Key Questions

How can a continuous wave form be

converted into discrete samples?

How can discrete samples be converted

back into a continuous form?

CS 414 - Spring 2012

Page 6: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Lifecycle from Sound to Digital to Sound

Source: http://en.wikipedia.org/wiki/Digital_audio

Page 7: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Characteristics of Sound

Amplitude

Wavelength (w)

Frequency ( )

Timbre

Hearing: [20Hz – 20KHz]

Speech: [200Hz – 8KHz]

Page 8: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Digital Representation of Audio

Must convert wave

form to digital

sample

quantize

compress

Page 9: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Sampling (in time)

Measure amplitude at regular intervals

How many times should we sample?

CS 414 - Spring 2012

Page 10: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Nyquist Theorem

For lossless digitization, the sampling rate

should be at least twice the maximum

frequency response.

In mathematical terms:

fs > 2*fm

where fs is sampling frequency and fm is

the maximum frequency in the signal

CS 414 - Spring 2012

Page 11: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Nyquist Limit

max data rate = 2 H log2V bits/second, where

H = bandwidth (in Hz)

V = discrete levels (bits per signal change)

Shows the maximum number of bits that can be sent

per second on a noiseless channel with a bandwidth

of H, if V bits are sent per signal

Example: what is the maximum data rate for a 3kHz

channel that transmits data using 2 levels (binary) ?

Solution: H = 3kHz; V = 2;

max. data rate = 2x3,000xln2=6,000bits/second

CS 414 - Spring 2012

Page 12: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Limited Sampling

But what if one cannot sample fast

enough?

CS 414 - Spring 2012

Page 13: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Limited Sampling

Reduce signal frequency to half of

maximum sampling frequency

low-pass filter removes higher-frequencies

e.g., if max sampling frequency is 22kHz,

must low-pass filter a signal down to 11kHz

CS 414 - Spring 2012

Page 14: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Sampling Ranges

Auditory range 20Hz to 22.05 kHz

must sample up to to 44.1kHz

common examples are 8.000 kHz, 11.025

kHz, 16.000 kHz, 22.05 kHz, and 44.1 KHz

Speech frequency [200 Hz, 8 kHz]

sample up to 16 kHz

but typically 4 kHz to 11 kHz is used

CS 414 - Spring 2012

Page 15: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

CS 414 - Spring 2012

Page 16: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Quantization

-1.5

-1

-0.5

0

0.5

1

1.5

CS 414 - Spring 2012

Page 17: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Sampling and 4-bit quantization

Source: http://en.wikipedia.org/wiki/Digital_audio

Page 18: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Quantization

Typically use

8 bits = 256 levels

16 bits = 65,536 levels

How should the levels be distributed?

Linearly? (PCM)

Perceptually? (u-Law)

Differential? (DPCM)

Adaptively? (ADPCM)

CS 414 - Spring 2012

Page 19: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Pulse Code Modulation

Pulse modulation

Use discrete time samples of analog signals

Transmission is composed of analog

information sent at different times

Variation of pulse amplitude or pulse timing

allowed to vary continuously over all values

PCM

Analog signal is quantized into a number of

discrete levels

CS 414 - Spring 2012

Page 20: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

PCM Quantization and Digitization

CS 414 - Spring 2012

Digitization Quantization

Page 21: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Linear Quantization (PCM)

Divide amplitude spectrum into N units (for

log2N bit quantization)

0

50

100

150

200

250

300

0 50 100 150 200 250 300

Sound Intensity

Qu

anti

zati

on I

nd

ex

CS 414 - Spring 2012

Page 22: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Non-uniform Quantization

CS 414 - Spring 2012

Page 23: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Perceptual Quantization (u-Law)

Want intensity values logarithmically

mapped over N quantization units

0

50

100

150

200

250

300

0 50 100 150 200 250 300

Sound Intensity

Qu

anti

zati

on I

nd

ex

CS 414 - Spring 2012

Page 24: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Differential Pulse Code

Modulation (DPCM)

What if we look at sample differences, not

the samples themselves?

dt = xt-xt-1

Differences tend to be smaller

Use 4 bits instead of 12, maybe?

CS 414 - Spring 2012

Page 25: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Differential Pulse Code

Modulation (DPCM) Changes between adjacent samples small

Send value, then relative changes

value uses full bits, changes use fewer bits

E.g., 220, 218, 221, 219, 220, 221, 222, 218,.. (all

values between 218 and 222)

Difference sequence sent: 220, +2, -3, 2, -1, -1, -1,

+4....

Result: originally for encoding sequence 0..255

numbers need 8 bits;

Difference coding: need only 3 bits

CS 414 - Spring 2012

Page 26: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Adaptive Differential Pulse Code

Modulation (ADPCM) Adaptive similar to DPCM, but adjusts the width of the

quantization steps

Encode difference in 4 bits, but vary the mapping of bits

to difference dynamically

If rapid change, use large differences

If slow change, use small differences

CS 414 - Spring 2012

Page 27: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Signal-to-Noise Ratio (metric to quantify quality of digital audio)

CS 414 - Spring 2012

Page 28: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Signal To Noise Ratio

Measures strength of signal to noise

SNR (in DB)=

Given sound form with amplitude in [-A, A]

Signal energy =

)(log10 10

energyNoise

energySignal

-1

-0.75

-0.5

-0.25

0

0.25

0.5

0.75

1A

0

-A

2

2A

CS 414 - Spring 2012

Page 29: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Quantization Error

Difference between actual and sampled value

amplitude between [-A, A]

quantization levels = N

e.g., if A = 1,

N = 8, = 1/4

N

A2

-1

-0.75

-0.5

-0.25

0

0.25

0.5

0.75

1

CS 414 - Spring 2012

Page 30: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Compute Signal to Noise Ratio

Signal energy = ; Noise energy = ;

Noise energy =

Signal to noise =

Every bit increases SNR by ~ 6 decibels

12

2

N

A2

2

2

3 N

A

2

3log10

2N

2

2A

CS 414 - Spring 2012

Page 31: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Example

Consider a full load sinusoidal modulating signal of

amplitude A, which utilizes all the representation levels

provided

The average signal power is P= A2/2

The total range of quantizer is 2A because modulating

signal swings between –A and A. Therefore, if it is N=16

(4-bit quantizer), Δ = 2A/24 = A/8

The quantization noise is Δ2/12 = A2/768

The S/N ratio is (A2/2)/(A2/768) = 384; SNR (in dB) 25.8

dB

CS 414 - Spring 2012

Page 32: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Data Rates

Data rate = sample rate * quantization * channel

Compare rates for CD vs. mono audio

8000 samples/second * 8 bits/sample * 1 channel = 8 kBytes / second

44,100 samples/second * 16 bits/sample * 2 channel = 176 kBytes / second ~= 10MB / minute

CS 414 - Spring 2012

Page 33: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Comparison and

Sampling/Coding Techniques

CS 414 - Spring 2012

Page 34: CS 414 Multimedia Systems Design Lecture 3 Digital Audio

Summary

CS 414 - Spring 2012