digital sound and video chapter 10, exploring the digital domain

Digital Sound and Video

Chapter 10, Exploring the Digital Domain

Digital Sampling of Sound

Digital sound is sound that has been converted to, or created in, a discrete form (namely a set of numeric values)

Natural sound is a continuous phenomenon and is converted to digital form by sampling techniques

Properties of sound waves: amplitude (a measure of loudness) and frequency (a measure of pitch)

Continuous Sound Wave

Sampling Sound

Temporal sampling: sample rate, resolution

Analog-to-digital converters (ADCs)

Digital-to-analog converters (DACs)
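As a rough illustration of temporal sampling, the sketch below measures a sine wave at evenly spaced times, which is the job an ADC performs on a real signal. The function name `sample_sine` and the 440 Hz / 8 kHz values are assumptions chosen for the example, not taken from the slides.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return amplitudes of a sine wave measured at evenly spaced times."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# 10 ms of a 440 Hz tone sampled 8000 times per second
samples = sample_sine(freq_hz=440, sample_rate_hz=8000, duration_s=0.01)
print(len(samples))  # 80
```

A higher sample rate produces more values per second and follows the continuous wave more closely.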

Sampling of a Sound Wave Illustrated

Sampling of a Sound Wave Illustrated (cont’d)

Resolution and Dynamic Range

Resolution depends on how much memory is devoted to storing individual sample amplitudes: 8-bit sound, 16-bit sound

Dynamic range and clipping
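A minimal sketch of how resolution, dynamic range, and clipping relate, assuming amplitudes are scaled to the range -1.0 to 1.0 and stored as signed integers. Real file formats may store samples differently, and the helper names are hypothetical.

```python
import math

def quantize(sample, bits):
    """Map an amplitude in [-1.0, 1.0] to a signed integer of the given width."""
    max_level = 2 ** (bits - 1) - 1   # 127 for 8-bit, 32767 for 16-bit
    min_level = -(2 ** (bits - 1))    # -128 for 8-bit, -32768 for 16-bit
    level = round(sample * max_level)
    # Clipping: amplitudes beyond the representable range are flattened
    return max(min_level, min(max_level, level))

def dynamic_range_db(bits):
    """Approximate dynamic range in decibels (roughly 6 dB per bit)."""
    return 20 * math.log10(2 ** bits)

print(quantize(1.0, 8))              # 127
print(quantize(1.5, 8))              # 127 (clipped)
print(round(dynamic_range_db(8)))    # 48
print(round(dynamic_range_db(16)))   # 96
```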

Resolution Illustrated

Storing Digital Sound

Sample rate and resolution -- tradeoffs

Voice and speech: 8-bit resolution and 5-10 kHz sample rate

CD-quality music: 16-bit or higher resolution and 44.1 kHz sample rate
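The tradeoff can be made concrete with a little arithmetic: uncompressed size is sample rate times bytes per sample times channels times duration. The sketch below assumes mono speech at 8 kHz and stereo music at the standard CD rate of 44,100 samples per second; `audio_bytes` is a hypothetical helper.

```python
def audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed storage needed for a recording, in bytes."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

speech = audio_bytes(8000, 8, 1, 60)     # one minute of 8-bit mono speech
music = audio_bytes(44100, 16, 2, 60)    # one minute of 16-bit stereo music

print(speech)  # 480000
print(music)   # 10584000
```

Under these assumptions, a minute of CD-quality music needs about 22 times the storage of a minute of speech.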

Nyquist’s “Theorem” and aliasing

Audio file compression

Sound on the Web -- streaming audio
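On aliasing: a sampled tone is captured faithfully only if its frequency is below half the sample rate (the Nyquist limit). The sketch below computes the frequency a pure tone appears to have after sampling. It is an idealized model with no anti-aliasing filter, and the function name is an assumption.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

rate = 8000                          # Nyquist limit is rate / 2 = 4000 Hz
print(alias_frequency(3000, rate))   # 3000 (below the limit, unchanged)
print(alias_frequency(5000, rate))   # 3000 (above the limit, folds back)
print(alias_frequency(7000, rate))   # 1000
```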

Synthesizing Music

Uses simple waveforms and oscillators to “build” more complex sound waves

Various techniques are employed to do this “building”: subtractive synthesis, additive synthesis, FM synthesis, phase distortion synthesis, and integrated synthesis

Impose an ADSR (attack, decay, sustain, release) envelope to simulate instruments
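A minimal sketch of two of these ideas together: additive synthesis (summing sine oscillators at multiples of a fundamental) and a piecewise-linear ADSR envelope applied to the result. The function names, the harmonic amplitudes, and the linear envelope shape are assumptions for illustration. The envelope helper expects positive attack, decay, and release lengths that fit within the note.

```python
import math

def additive_tone(fundamental_hz, harmonic_amps, sample_rate_hz, n_samples):
    """Sum sine oscillators at integer multiples of the fundamental."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        out.append(sum(
            amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
            for k, amp in enumerate(harmonic_amps)
        ))
    return out

def adsr(n_samples, attack, decay, sustain_level, release):
    """Linear ADSR envelope; attack, decay, release are lengths in samples."""
    sustain = n_samples - attack - decay - release
    env = [i / attack for i in range(attack)]                           # rise 0 -> 1
    env += [1 - (1 - sustain_level) * i / decay for i in range(decay)]  # fall 1 -> sustain
    env += [sustain_level] * sustain                                    # hold
    env += [sustain_level * (1 - i / release) for i in range(release)]  # fall toward 0
    return env

tone = additive_tone(220, [1.0, 0.5, 0.25], sample_rate_hz=8000, n_samples=800)
envelope = adsr(800, attack=80, decay=160, sustain_level=0.6, release=240)
shaped = [s * e for s, e in zip(tone, envelope)]
```

Changing the attack and release lengths changes how plucked or bowed the note sounds, which is how an envelope helps simulate different instruments.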

Oscillators

ADSR Envelope

MIDI Instruments and Devices

MIDI is a standard interface between electronic musical instruments and synthesizers

Devices which are MIDI-compatible can communicate with each other

Advantages of MIDI: files are encoded and are much smaller than digitized sound files; files can be easily edited and mixed for multiple tracks
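To see why encoded MIDI data is so compact, consider that a MIDI Note On message is three bytes: a status byte carrying the channel, a note number, and a velocity. The sketch below builds such messages by hand; the helper names are hypothetical, and the size comparison ignores the timing and header data a real MIDI file also contains.

```python
def note_on(channel, note, velocity):
    """Three bytes of a Note On message (channel 0-15, note and velocity 0-127)."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three bytes of a Note Off message with release velocity 0."""
    return bytes([0x80 | channel, note, 0])

# Middle C (note 60) started and later stopped on channel 0
events = note_on(0, 60, 100) + note_off(0, 60)
print(len(events))     # 6 bytes of event data

# One second of 16-bit mono audio at 44,100 samples per second
print(44100 * 2)       # 88200 bytes
```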

Speech Synthesis

Speech synthesis involves creating speech from written text (or other encodings)

Two approaches: store digitized recordings of words, or analyze the written text by breaking it into phonemes

Digitized Word Approach

Parser separates text into words

Uses a binary search tree to look up words

Similar to a spelling dictionary, but used as an enunciation dictionary

Advantage is very realistic sound

Disadvantage is lack of speed

Works well if the number of words is limited: grocery checkout, alert systems

Not very useful for real-time “reading” where the words in the text are not known in advance
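The digitized word approach can be sketched as a parser plus a dictionary lookup. The slides mention a binary search tree; as a stand-in, this sketch uses binary search over a sorted list via Python's `bisect` module, which gives the same logarithmic lookup. The word list and the `.wav` file names are invented for illustration.

```python
import bisect

# Hypothetical enunciation dictionary: sorted words, each paired with a recording
WORDS = ["apples", "dollars", "five", "please", "total", "two"]
CLIPS = ["apples.wav", "dollars.wav", "five.wav",
         "please.wav", "total.wav", "two.wav"]

def lookup(word):
    """Binary-search the sorted word list; return the clip name or None."""
    i = bisect.bisect_left(WORDS, word)
    if i < len(WORDS) and WORDS[i] == word:
        return CLIPS[i]
    return None

def clips_for(text):
    """Split text into words and look up a recording for each one."""
    return [lookup(word) for word in text.lower().split()]

print(clips_for("Total five dollars"))
# ['total.wav', 'five.wav', 'dollars.wav']
print(lookup("zebra"))  # None
```

The `None` result for an unknown word shows the limitation noted above: the approach only works when every word has been recorded in advance.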

Rule-Based Phoneme Approach

Analysis of written text focuses on breaking the text into phonemes rather than words

Text is parsed for phonemes

Phonemes are identified

Enunciation is looked up in a small indexed file

English employs approximately 50 basic phonemes

Phoneme Approach (cont’d)

Rules allow a speech synthesis program to evaluate alternate pronunciations of phonemes appropriate for the context
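A toy example of a context rule, not drawn from any real synthesis system: the letter "c" is given a different sound depending on the letter that follows it. The rule and the sound labels are simplifying assumptions, and real English has many exceptions that a production rule set would need to handle.

```python
def c_sound(word, i):
    """Pick a sound for the letter 'c' at position i from the next letter."""
    next_letter = word[i + 1] if i + 1 < len(word) else ""
    if next_letter == "h":
        return "ch"   # as in "chip"
    if next_letter in ("e", "i", "y"):
        return "s"    # as in "city"
    return "k"        # as in "cat"

print(c_sound("cat", 0))   # k
print(c_sound("city", 0))  # s
print(c_sound("chip", 0))  # ch
```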

Such rule-based phoneme analysis produces excellent speech synthesis results

Speech Recognition

Speech recognition attempts to interpret digitized speech for meaning

The task is complicated by the differences among speakers and even the different ways a given speaker might pronounce the same word depending on mood, context, etc.

Speech Recognition: Illustrating the Difficulties

Speech Recognition (cont’d)

Some success has been achieved by tailoring/training a program to recognize a particular speaker

Some reasonably successful voice activation systems have been produced where the vocabulary is limited to a small number of words

Speech recognition remains a very challenging problem

Editing Digitized Sound

Digital sound editing is part of a larger field called digital signal processing

Once sound is digitized, it is in discrete numerical format

Numerical transformations on this data can be used to: change the sound’s pitch, change the sound’s amplitude, and add echoes and other special effects
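Because digitized sound is a list of numbers, each of these edits reduces to simple arithmetic on the samples. The sketch below shows deliberately crude versions of all three, with hypothetical function names. In particular, the pitch change simply discards samples, which also shortens the sound, whereas real editors use more careful resampling.

```python
def change_amplitude(samples, gain):
    """Scale every sample; gain > 1 is louder, gain < 1 is quieter."""
    return [s * gain for s in samples]

def add_echo(samples, delay, decay):
    """Mix in a copy delayed by `delay` samples and scaled by `decay`."""
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += s * decay
    return out

def raise_pitch(samples, factor):
    """Keep every `factor`-th sample; played at the same rate, pitch rises."""
    return samples[::factor]

print(change_amplitude([1.0, -0.5], 0.5))  # [0.5, -0.25]
print(add_echo([1.0, 0.0, 0.0], 2, 0.5))   # [1.0, 0.0, 0.5, 0.0, 0.0]
print(raise_pitch([0, 1, 2, 3, 4, 5], 2))  # [0, 2, 4]
```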

Summary -- Sound

Digital sound is produced by sampling sound waves over time

A digital sound file consists of sampled amplitudes at a number of discrete times within a given time interval

The number of samples per second is called the sample rate

The number of bits devoted to storing individual sampled amplitudes is called the resolution of the digitized sound: 8-bit, 16-bit and higher resolutions are used depending on the kind of sound being digitized

Fidelity will be largely determined by the sample rate and resolution

Producing Digital Video

Video capture, editing, playback

Digital Video: The Process Illustrated

Digital Video: Advantages and Disadvantages

Advantages: scalable to different playback systems, random access to frames, nonlinear editing, more playback options, potential for interactivity

Disadvantages: special hardware/software needed for production, high playback and storage requirements
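The storage requirement is easy to estimate for uncompressed video: width times height times bytes per pixel times frames per second. The frame size, color depth, and frame rate below are assumed example values, and `video_bytes_per_second` is a hypothetical helper.

```python
def video_bytes_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed data rate for video frames alone (no audio), in bytes."""
    return width * height * (bits_per_pixel // 8) * fps

rate = video_bytes_per_second(640, 480, 24, 30)
print(rate)        # 27648000 bytes for each second
print(rate * 60)   # 1658880000 bytes for one minute
```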

Compression: Coping with Large Files

Compression is an encoding process that filters the original file in several successive stages

Other Methods for Reducing Demands

Frame rate adjustment: adjusts for slower CPUs; helps keep video and audio synchronized

Lower resolution on individual frames: sometimes used in conjunction with a smaller display window
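Both methods amount to throwing away data before playback. A minimal sketch, assuming frames are held in a list and each frame is a list of pixel rows; the function names are hypothetical and real players use more sophisticated scaling.

```python
def drop_frames(frames, source_fps, target_fps):
    """Keep a subset of frames so playback needs only target_fps per second."""
    step = source_fps / target_fps
    count = int(len(frames) / step)
    return [frames[int(i * step)] for i in range(count)]

def halve_resolution(frame):
    """Keep every other pixel of every other row."""
    return [row[::2] for row in frame[::2]]

one_second = list(range(30))                    # stand-ins for 30 frames
print(len(drop_frames(one_second, 30, 15)))     # 15

frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(halve_resolution(frame))                  # [[1, 3], [9, 11]]
```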

The Desktop Video System: Basic Components

Analog source, video capture card, CPU, secondary storage, monitor, edit and playback control

Editing Digital Video

Clip logging

Assembling

Transitions: dissolves, wipes, etc.

Rotoscoping

Compositing: keying, titling
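As an example of how a transition works on digital frames, a dissolve can be computed as a weighted average of two frames, with the weight shifting from the first to the second. The sketch assumes each frame is a flat list of pixel brightness values; `dissolve` is a hypothetical helper, not a function from any editing package.

```python
def dissolve(frame_a, frame_b, steps):
    """Return steps + 1 frames that fade from frame_a to frame_b."""
    result = []
    for k in range(steps + 1):
        t = k / steps   # 0.0 at the start of the transition, 1.0 at the end
        result.append([
            round((1 - t) * a + t * b)
            for a, b in zip(frame_a, frame_b)
        ])
    return result

print(dissolve([0, 100], [100, 0], 2))
# [[0, 100], [50, 50], [100, 0]]
```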

Summary -- Video

Digital video: is scalable, allows nonlinear editing, has interactive potential

Digital video can be produced with desktop systems

Flexible editing and playback options are major advantages

The storage requirement is the biggest disadvantage
