Digital Sound and Video
Chapter 10, Exploring the Digital Domain
Digital Sampling of Sound
Digital sound is sound that has been converted to, or created in, a discrete form (namely a set of numeric values)
Natural sound is a continuous phenomenon and is converted to digital form by sampling techniques
Properties of sound waves: amplitude (measure of loudness), frequency (measure of pitch)
Continuous Sound Wave
Sampling Sound
Temporal sampling: sample rate, resolution
Analog-to-digital converters (ADCs)
Digital-to-analog converters (DACs)
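A minimal sketch of temporal sampling, assuming the continuous sound is a pure sine tone. The function name `sample_sine` and the 440 Hz / 8000 Hz values are illustrative choices, not taken from the slides.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Measure a sine wave's amplitude at evenly spaced instants in time."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# 10 milliseconds of a 440 Hz tone sampled 8000 times per second (assumed values)
samples = sample_sine(freq_hz=440, sample_rate_hz=8000, duration_s=0.01)
print(len(samples))  # number of discrete values that now stand in for the wave
```

An ADC does this measurement in hardware; a DAC performs the reverse step, turning the stored numbers back into a continuous signal.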
Sampling of a Sound Wave Illustrated
Resolution and Dynamic Range
Resolution depends on how much memory is devoted to storing individual sample amplitudes: 8-bit sound (256 levels), 16-bit sound (65,536 levels)
Dynamic range and clipping
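A rough sketch of how resolution, clipping, and dynamic range relate, assuming amplitudes are normalized to the range -1.0 to 1.0 and stored as signed integers. The helper names `quantize` and `dynamic_range_db` are hypothetical.

```python
import math

def quantize(sample, bits):
    """Map an amplitude in [-1.0, 1.0] to a signed integer of the given width.

    Amplitudes outside the range are clipped to the nearest storable level.
    """
    max_level = 2 ** (bits - 1) - 1   # 127 for 8-bit, 32767 for 16-bit
    min_level = -(2 ** (bits - 1))    # -128 for 8-bit, -32768 for 16-bit
    level = round(sample * max_level)
    return max(min_level, min(max_level, level))

def dynamic_range_db(bits):
    """Ratio of the largest to the smallest step, in decibels (about 6 dB per bit)."""
    return 20 * math.log10(2 ** bits)

print(quantize(1.7, 8))            # too loud for the range, so it is clipped
print(dynamic_range_db(8))         # roughly 48 dB
print(dynamic_range_db(16))        # roughly 96 dB
```

Clipping flattens the peaks of any wave that exceeds the storable range, which is heard as distortion.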
Resolution Illustrated
Storing Digital Sound
Sample rate and resolution -- tradeoffs
Voice and speech: 8-bit resolution and 5-10 kHz sample rate
CD-quality music: 16-bit or higher resolution and 44.1 kHz sample rate
Nyquist’s “Theorem” and aliasing
Audio file compression
Sound on the Web -- streaming audio
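The tradeoff can be made concrete with two small calculations: the storage cost of uncompressed sound, and the frequency that a tone appears to have once it is sampled. This is a sketch under stated assumptions: the function names are hypothetical, speech is taken to be mono at 8 kHz, and CD audio is taken to be stereo.

```python
def audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed size: samples per second x bytes per sample x channels x time."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency heard after sampling; tones above half the sample rate fold back."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

speech_minute = audio_bytes(8000, 8, 1, 60)    # 480,000 bytes
cd_minute = audio_bytes(44100, 16, 2, 60)      # 10,584,000 bytes
print(speech_minute, cd_minute)

print(apparent_frequency(3000, 8000))  # below 4000 Hz, so it is kept as 3000 Hz
print(apparent_frequency(6000, 8000))  # above 4000 Hz, so it aliases to 2000 Hz
```

Sampling at 44.1 kHz keeps everything up to about 22 kHz, which covers the range of human hearing; the lower rates used for speech save space at the cost of losing higher frequencies.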
Synthesizing Music
Uses simple waveforms and oscillators to “build” more complex sound waves
Various techniques are employed to do this “build”: subtractive synthesis, additive synthesis, FM synthesis, phase distortion synthesis, integrated synthesis
Impose an ADSR (attack, decay, sustain, release) envelope to simulate instruments
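As one illustration of building a complex wave from simple ones, here is a minimal sketch of additive synthesis: several sine oscillators at multiples of a fundamental frequency are summed. The function name, the 220 Hz fundamental, and the harmonic weights are assumptions chosen for the example.

```python
import math

def additive_tone(fundamental_hz, harmonic_amps, sample_rate_hz, n_samples):
    """Sum sine oscillators at 1x, 2x, 3x, ... the fundamental frequency."""
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        value = sum(
            amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
            for k, amp in enumerate(harmonic_amps)
        )
        samples.append(value)
    return samples

# Odd harmonics with weights 1, 1/3, 1/5 start to approximate a square wave.
harmonic_amps = [1.0, 0.0, 1 / 3, 0.0, 1 / 5]
tone = additive_tone(220, harmonic_amps, sample_rate_hz=8000, n_samples=400)
print(len(tone))
```

Subtractive synthesis works in the opposite direction, starting from a harmonically rich wave and filtering parts of it away.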
Oscillators
ADSR Envelope
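An ADSR envelope can be sketched as a list of gain values, one per sample, that is multiplied into the oscillator output. The piecewise-linear shape and the parameter names below are assumptions; real synthesizers often use curved segments.

```python
import math

def adsr_envelope(n_samples, attack, decay, sustain_level, release):
    """Build a gain curve; attack, decay and release are lengths in samples."""
    sustain = n_samples - attack - decay - release
    if sustain < 0:
        raise ValueError("attack + decay + release exceed the note length")
    env = [i / attack for i in range(attack)]                            # rise 0 -> 1
    env += [1 - (1 - sustain_level) * i / decay for i in range(decay)]   # fall 1 -> sustain
    env += [sustain_level] * sustain                                     # hold
    env += [sustain_level * (1 - i / release) for i in range(release)]   # fade toward 0
    return env

env = adsr_envelope(n_samples=100, attack=10, decay=20, sustain_level=0.6, release=30)

# Shape a plain 440 Hz sine (assumed test tone) with the envelope.
sine = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(100)]
shaped = [s * g for s, g in zip(sine, env)]
print(len(shaped), max(env))
```

A short attack with a quick decay suggests a plucked or struck instrument, while a slow attack and long sustain suggests a bowed or blown one.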
MIDI Instruments and Devices
MIDI is a standard interface between electronic musical instruments and synthesizers
Devices which are MIDI-compatible can communicate with each other
Advantages of MIDI: files are encoded and are much smaller than digitized sound files; files can be easily edited and mixed for multiple tracks
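The size advantage follows from what MIDI stores: instructions such as "start this note" rather than sampled amplitudes. The sketch below builds the raw bytes of two standard channel messages (Note On, status 0x90, and Note Off, status 0x80); the helper names are hypothetical, and a real MIDI file also stores timing data that is omitted here.

```python
def note_on(channel, note, velocity):
    """Three-byte Note On message: status byte, note number, velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three-byte Note Off message with a release velocity of 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("channel must be 0-15; note must be 0-127")
    return bytes([0x80 | channel, note, 0])

# Middle C (note number 60) started and stopped on the first channel.
messages = note_on(0, 60, 100) + note_off(0, 60)
print(len(messages))   # 6 bytes describe the whole note

# One second of 16-bit stereo sound at 44.1 kHz, for comparison.
print(44100 * 2 * 2)   # 176,400 bytes
```

Because each note is a separate event, editing a track means changing a few numbers rather than reprocessing a waveform.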
Speech Synthesis
Speech synthesis involves creating speech from written text (or other encodings)
Two approaches: store digitized recordings of words, or analyze the written text by breaking it into phonemes
Digitized Word Approach
Parser separates text into words
Uses a binary search tree to look up words
Similar to a spelling dictionary, but an enunciation dictionary instead
Advantage is very realistic sound
Disadvantage is lack of speed
Works well if the number of words is limited: grocery checkout, alert systems
Not very useful for real-time “reading” where the words in the text are not known in advance
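The digitized word approach might be sketched as follows: text is split into words, and each word is looked up in a binary search tree that maps it to a stored recording. The class and function names, the small checkout vocabulary, and the `.wav` filenames are all hypothetical stand-ins for real recordings.

```python
import re

class Node:
    """One entry of the enunciation dictionary: a word and its recording."""
    def __init__(self, word, clip):
        self.word, self.clip = word, clip
        self.left = self.right = None

def insert(root, word, clip):
    """Add a word to the binary search tree, ordered alphabetically."""
    if root is None:
        return Node(word, clip)
    if word < root.word:
        root.left = insert(root.left, word, clip)
    elif word > root.word:
        root.right = insert(root.right, word, clip)
    else:
        root.clip = clip
    return root

def lookup(root, word):
    """Walk down the tree; return the clip, or None if the word is unknown."""
    while root is not None:
        if word == root.word:
            return root.clip
        root = root.left if word < root.word else root.right
    return None

def speak(root, text):
    """Parse text into words and return the recording to play for each."""
    words = re.findall(r"[a-z']+", text.lower())
    return [lookup(root, w) for w in words]

root = None
for word in ["total", "is", "five", "dollars", "please", "pay"]:
    root = insert(root, word, f"{word}.wav")   # hypothetical clip filenames

print(speak(root, "Total is five dollars."))
print(speak(root, "seven"))   # a word outside the vocabulary has no recording
```

The second call shows the limitation noted above: any word that was not recorded in advance cannot be spoken.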
Rule-Based Phoneme Approach
Analysis of written text focuses on breaking the text into phonemes rather than words
Text is parsed for phonemes
Phonemes are identified
Enunciation is looked up in a small indexed file
English employs approximately 50 basic phonemes
Phoneme Approach (cont’d)
Rules allow a speech synthesis program to evaluate alternate pronunciations of phonemes appropriate for the context
Such rule-based phoneme analysis produces excellent speech synthesis results
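A toy sketch of context-dependent rules, assuming a drastically simplified rule set: a few two-letter combinations map to single sounds, and the letter "c" is pronounced as "s" before e, i, or y and as "k" otherwise. The rule table and the informal phoneme labels are illustrative only; a real system uses far more rules and a proper phonetic alphabet.

```python
# Hypothetical two-letter combinations that are spoken as one sound.
DIGRAPHS = {"ph": "f", "sh": "sh", "ch": "ch", "th": "th"}

def to_phonemes(word):
    """Convert a word to informal phoneme labels using context-sensitive rules."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        following = word[i + 1:i + 2]   # empty string at the end of the word
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
        elif word[i] == "c":
            # The pronunciation of "c" depends on the letter that follows it.
            phonemes.append("s" if following in ("e", "i", "y") else "k")
            i += 1
        else:
            phonemes.append(word[i])
            i += 1
    return phonemes

print(to_phonemes("cat"))
print(to_phonemes("city"))
print(to_phonemes("phone"))
```

Each resulting label would then be used as a key into the small indexed file of phoneme sounds.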
Speech Recognition
Speech recognition attempts to interpret digitized speech for meaning
The task is complicated by the differences among speakers and even the different ways a given speaker might pronounce the same word depending on mood, context, etc.
Speech Recognition: Illustrating the Difficulties
Speech Recognition (cont’d)
Some success has been achieved by tailoring/training a program to recognize a particular speaker
Some reasonably successful voice activation systems have been produced where the vocabulary is limited to a small number of words
Speech recognition remains a very challenging problem
Editing Digitized Sound
Digital sound editing is part of a larger field called digital signal processing
Once sound is digitized, it is in discrete numerical format
Numerical transformations on this data can be used to: change the sound’s pitch, change the sound’s amplitude, add echoes and other special effects
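Each of these edits reduces to arithmetic on the list of samples. The sketch below assumes samples are plain numbers in a list; the function names are hypothetical, and the pitch change uses crude nearest-sample resampling, which also changes the duration.

```python
def change_amplitude(samples, gain):
    """Scale every sample; a gain below 1 is quieter, above 1 is louder."""
    return [s * gain for s in samples]

def add_echo(samples, delay, decay):
    """Mix in a quieter copy of the sound, shifted later by `delay` samples."""
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += s * decay
    return out

def change_pitch(samples, factor):
    """Resample by skipping or repeating samples; factor 2.0 raises pitch one octave."""
    n_out = int(len(samples) / factor)
    last = len(samples) - 1
    return [samples[min(last, int(i * factor))] for i in range(n_out)]

print(change_amplitude([1, 2], 0.5))
print(add_echo([1.0, 0.0, 0.0], delay=2, decay=0.5))
print(change_pitch([0, 1, 2, 3, 4, 5, 6, 7], 2.0))
```

Production editors use more careful algorithms (for example, interpolation and filtering when resampling), but the principle of transforming the stored numbers is the same.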
Summary -- Sound
Digital sound is produced by sampling sound waves over time
A digital sound file consists of sampled amplitudes at a number of discrete times within a given time interval
The number of samples per second is called the sample rate
The number of bits devoted to storing individual sampled amplitudes is called the resolution of the digitized sound: 8-bit, 16-bit and higher resolutions are used depending on the kind of sound being digitized
Fidelity will be largely determined by the sample rate and resolution
Producing Digital Video
Video capture, editing, playback
Digital Video: The Process Illustrated
Digital Video: Advantages and Disadvantages
Advantages: scalable to different playback systems, random access to frames, nonlinear editing, more playback options, potential for interactivity
Disadvantages: special hardware/software needed for production, high playback and storage requirements
Compression: Coping with Large Files
Compression is an encoding process that filters the original file in several successive stages
Other Methods for Reducing Demands
Frame rate adjustment: adjusts for slower CPUs; helps keep video and audio synchronized
Lower resolution on individual frames: sometimes used in conjunction with a smaller display window
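The effect of these two adjustments on data rate can be estimated directly. The sketch assumes uncompressed frames with 24 bits per pixel; the function name and the frame sizes and rates are illustrative values, not figures from the slides.

```python
def video_bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed data rate: pixels per frame x bytes per pixel x frames per second."""
    return width * height * (bits_per_pixel // 8) * frames_per_second

full = video_bytes_per_second(640, 480, 24, 30)      # 27,648,000 bytes per second
reduced = video_bytes_per_second(320, 240, 24, 15)   # 3,456,000 bytes per second
print(full, reduced, full // reduced)
```

Halving each frame dimension and halving the frame rate cuts the data rate by a factor of eight before any compression is applied.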
The Desktop Video System: Basic Components
Analog Source
Video Capture Card
CPU
Secondary Storage
Monitor
Edit and Playback Control
Editing Digital Video
Clip Logging
Assembling
Transitions: dissolves, wipes, etc.
Rotoscoping
Compositing: keying, titling
Summary -- Video
Digital video: is scalable, allows nonlinear editing, has interactive potential
Digital video can be produced with desktop systems
Flexible editing and playback options are major advantages
Storage requirements are the biggest disadvantage