03/27/22 CSE5060 -- Multimedia on the Web -- Lecture11 1 Music Hath Charms to Sooth the Savage Beast Introduction to Sound Processing

Post on 19-Dec-2015


04/18/23 CSE5060 -- Multimedia on the Web -- Lecture11 1

Music Hath Charms to Sooth the Savage Beast

Introduction to Sound Processing


Some Sources Used

• Richard E Berg: Physics 102: PHYSICS OF MUSIC, University of Maryland
• Robert Jourdain (1997). Music, the Brain and Ecstasy. Quill.
• Various bits of Wikipedia
• Dolby Sound


Sound Is Analog

• So there’s infinite variation
• Like a rock thrown into a pond, there are waves:
  – Amplitude: how high the waves are -- Loudness
  – Frequency: how many waves per second -- Pitch
• Loudness is measured in decibels (dB)
  – This is a log scale, so 20 dB is ten times as intense as 10 dB, 30 dB is ten times as intense as 20 dB, and so forth. (Each 10 dB step sounds roughly twice as loud.)
  – You can distinguish from just over 0 dB to about 120 dB
    • 37 quiet office (no air-conditioning)
    • 59 conversation
    • 76 loud factory
    • 110 really loud night club or rave
    • 140 threshold of pain (well, for some)
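The log-scale arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function names are made up for this example, and it assumes the usual definition of a decibel difference as ten times the base-10 logarithm of an intensity ratio.

```python
import math

def intensity_ratio(db_step):
    """How many times more intense a sound is, given a difference in dB."""
    return 10 ** (db_step / 10)

def db_step(ratio):
    """The difference in dB that corresponds to an intensity ratio."""
    return 10 * math.log10(ratio)

# Levels taken from the list above.
quiet_office = 37
conversation = 59

print(intensity_ratio(20 - 10))                      # a 10 dB step
print(intensity_ratio(conversation - quiet_office))  # a 22 dB step
print(db_step(100))                                  # 100x the intensity
```

A 10 dB step comes out as a factor of ten and a factor of one hundred comes out as 20 dB, which is the "ten times, then ten times again" pattern described in the bullets.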


Sound Is Analog, 2

• Pitch is measured in Hz (Hertz, cycles per second) and kHz.
• You can distinguish between roughly 20 Hz and 15 - 20 kHz (this is age dependent)
  – Lowest note on piano 27.5 Hz
  – Highest note on piano 4,186 Hz (4.186 kHz)
  – Lowest vocal sound about 80 Hz
  – Highest vocal sound about 800 Hz
  – The A above middle C 440 Hz (used to be lower!)
• But just a sine wave at these frequencies sounds sterile: it lacks the overtones, the harmonics, produced by all natural sources of sound.


This is a sine wave, which may represent a “pure” (and so artificial) sound

Its frequency (tone) is the number of crests passing per second -- hertz. The closer together the crests, the higher the frequency.

Its amplitude (loudness) is the height of the crests -- decibels

With frequent sampling we can capture both frequency and amplitude in a single series of numbers

Two sine waves interacting

Any sound can be reproduced as a sum of overlaid sine waves (Fourier analysis)
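A minimal sketch of the idea, assuming nothing beyond the Python standard library: two sine waves are overlaid into one series of samples, and a plain (slow) discrete Fourier transform pulls the two frequencies back out. The sample rate, the 3 Hz and 7 Hz tones and the function names are arbitrary choices for illustration.

```python
import cmath
import math

SAMPLE_RATE = 64   # samples per second (illustrative value)
N = 64             # one second of samples, so bin k corresponds to k Hz

def two_tone(t):
    """Two overlaid sine waves: 3 Hz at full height plus 7 Hz at half height."""
    return math.sin(2 * math.pi * 3 * t) + 0.5 * math.sin(2 * math.pi * 7 * t)

samples = [two_tone(n / SAMPLE_RATE) for n in range(N)]

def dft_amplitude(signal, k):
    """Height of the sine component that completes k cycles in the signal."""
    total = sum(x * cmath.exp(-2j * math.pi * k * n / len(signal))
                for n, x in enumerate(signal))
    return 2 * abs(total) / len(signal)

amplitudes = [dft_amplitude(samples, k) for k in range(N // 2)]

# The two strongest components are the two sine waves we mixed together.
strongest = sorted(range(N // 2), key=lambda k: amplitudes[k], reverse=True)[:2]
print(sorted(strongest))
```

The single list `samples` holds both tones at once, yet the analysis finds them again with their original heights. Real software would use a fast Fourier transform rather than this direct sum.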


Notice How Complicated the Vibrations Are


Sound Is Analog, 3

• An instrument vibrating produces lots of sounds above the fundamental tone.
• These overtones (harmonics) sit at whole-number multiples of the fundamental; some of them are octaves above it
  – Octave = double the frequency
  – To get realistic sound we have to pick up at least the 4th harmonic, 4x the frequency of the fundamental.
  – So we have to pick up to about 16 kHz for, say, a realistic flute sound (where the highest fundamental is just under 4 kHz)
  – More is better until, say, 20 kHz, where a 5 year old’s hearing cuts out.
• So how frequently do we need to sample to get “realistic” sounds?


Poor Sampling Rate: Graphic


Nyquist-Shannon Sampling Theorem

• You have to sample at least twice as fast as the highest frequency you want to catch.
  – To catch everything up to 20 kHz we need at least 40,000 samples per second
• Remember, we are sampling the volume (loudness) of a sound consisting of lots of superimposed fundamental and harmonic frequencies.
• So on a CD there are 44,100 samples per second, each 2 bytes -- a 16 bit number between 0 and 64k

• Sampling Demo Program
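The sampling rule can be checked with a tiny experiment. This is a sketch, not the demo program mentioned above: the 8 samples-per-second rate and the helper names are invented for illustration, and it shows what goes wrong when a tone is faster than half the sampling rate.

```python
import math

def minimum_sample_rate(highest_freq_hz):
    """Lowest sampling rate that can still capture the given frequency."""
    return 2 * highest_freq_hz

def sample_sine(freq_hz, sample_rate_hz, count):
    """Sample a unit-height sine wave `count` times at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

SAMPLE_RATE = 8  # samples per second, deliberately tiny

# 3 Hz is below half the sample rate (4 Hz), so it is captured properly.
slow_enough = sample_sine(3, SAMPLE_RATE, 8)

# 5 Hz is above half the sample rate. Its samples match those of the 3 Hz
# wave apart from a flipped sign, so the recording cannot tell the two apart.
too_fast = sample_sine(5, SAMPLE_RATE, 8)

indistinguishable = all(abs(a + b) < 1e-9
                        for a, b in zip(slow_enough, too_fast))
print(minimum_sample_rate(20_000))
print(indistinguishable)
```

The 5 Hz tone shows up in the samples as a 3 Hz tone. This effect is called aliasing, and it is why the sampling rate has to be at least double the highest frequency we care about.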


MP3

• MPEG-1 Audio Layer 3 sound compression -- part of the MPEG-1 standard, which was intended for movies on CD
  – 90+% compression possible
  – A typical song (50 MB) goes to 5 MB
  – Is a lossy compression, so the quality goes down
  – No encryption in any way
  – No “watermark” (watermark = a secret pattern of bits somewhere which indicates the source of the copy)
  – Much music publisher panic with the sudden popularity of the format.
  – Much more music publisher panic with the iPod and friends
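The "50 MB goes to 5 MB" figure can be roughly reproduced with back-of-envelope arithmetic. The sketch below assumes a five-minute song and an MP3 bit rate of 128 kbit/s; both are illustrative assumptions, not figures from the lecture.

```python
CD_SAMPLE_RATE = 44_100      # samples per second
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2              # stereo

cd_bits_per_second = CD_SAMPLE_RATE * CD_BITS_PER_SAMPLE * CD_CHANNELS

# Assumed MP3 bit rate (a common encoder setting, not from the lecture).
MP3_BITS_PER_SECOND = 128_000

def size_megabytes(bits_per_second, seconds):
    """File size in megabytes (1 MB = 1,000,000 bytes) for a constant bit rate."""
    return bits_per_second * seconds / 8 / 1_000_000

song_seconds = 5 * 60        # assumed five-minute song
cd_size = size_megabytes(cd_bits_per_second, song_seconds)
mp3_size = size_megabytes(MP3_BITS_PER_SECOND, song_seconds)
saving = 1 - mp3_size / cd_size

print(round(cd_size, 1), round(mp3_size, 1), round(saving * 100))
```

Under these assumptions the uncompressed song is a little over 50 MB, the MP3 is a little under 5 MB, and the saving is just over 90%.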


MP3 Sound Isn’t Very Good…

• Having failed at my attempt to demonstrate how bad MP3 is in a tute

• I will now (fail again?) demonstrate the loss of quality in MP3 yet again…

• Roll it, monks…..


Sound Into Bits (ADC)

• Something that always confuses me:
  – The 16 bits used (0 - 64k) record the amplitude (loudness)
  – The differences between successive 16 bit samples contain the frequency (pitch)
  – Remember, a high wave also has a trough between peaks.
• So how often do we have to sample to get enough samples?
  – At 2x the highest frequency we want to catch.
  – And we want to catch frequencies up to 20 kHz, so we have to sample at least 40,000 times a second.
  – So your CDs contain music sampled at just over 44,000 times a second.
  – So the digital signal bandwidth must be 16 x 44,000 = 704,000 bits per second, or 88 kilobytes per second. With stereo sound, we have to have two such streams, so a 1x CD drive playing audio goes at 176 kilobytes per second, which we already knew!
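A minimal sketch of the conversion, following the slide's description of each sample as a number between 0 and 64k. The function names are invented for this example, and real CD audio stores signed 16 bit values rather than the unsigned codes used here.

```python
import math

SAMPLE_RATE = 44_100        # samples per second, as on a CD
BITS = 16
MAX_CODE = 2 ** BITS - 1    # 65,535, the "64k" in the notes

def quantize(level):
    """Map an analog level in the range -1.0..1.0 to a 16 bit code 0..65535."""
    clamped = max(-1.0, min(1.0, level))
    return round((clamped + 1.0) / 2.0 * MAX_CODE)

def digitize(freq_hz, count):
    """Sample a sine wave of the given pitch and quantize each sample."""
    return [quantize(math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(count)]

codes = digitize(440, 5)    # first five samples of the A above middle C
print(codes)

bits_per_second = SAMPLE_RATE * BITS             # one channel
bytes_per_second_stereo = bits_per_second * 2 // 8
print(bits_per_second, bytes_per_second_stereo)
```

Each code records only loudness at one instant. The pitch is visible only in how quickly the codes rise and fall from one sample to the next. With the exact CD rate of 44,100 the totals are 705,600 bits per second per channel and 176,400 bytes per second for stereo.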


Bits Into Sound (DAC)

• Amplifiers and speakers are analog devices, so
• The CD player does DAC and passes the results as an analog signal to your stereo system.
• It does the same if you listen to music off your CD-ROM drive.
• But where does your sound card/chip do the conversion?
• Hummmm…. Later is (far, far) better, because there’s lots of electrical interference inside your PC. Digital isn’t affected by this, but analog is.
• So the perfect system would be all digital inside the computer and have its DAC inside the speakers


So We Have DAC for both Sound and Video

• At the same time
• By two independent sets of hardware and software
• Working on two independent files
• How can we guarantee synchronisation???
• This is a problem with Flash!


MIDI & General MIDI

• MIDI = Musical Instrument Digital Interface
• MIDI is to sampling exactly what vector is to raster graphics
  – A language for describing sounds
• The notes
• The instruments, each of which has a number
  – 128 instruments
  – Plus drum kit
• The note characteristics
  – attack
  – decay
  – sustain
  – release
• 2+ ways of making those notes
  – FM synthesis
  – Wavetable
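To make "a language for describing sounds" concrete, here is a sketch of the raw bytes behind a few MIDI messages. The helper names are invented for this example; the status bytes (0x90 Note On, 0x80 Note Off, 0xC0 Program Change) and the convention that note 69 is the A at 440 Hz come from the MIDI 1.0 specification.

```python
def note_on(channel, note, velocity):
    """Three-byte Note On message (channel 0-15, note and velocity 0-127)."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three-byte Note Off message with a release velocity of 0."""
    return bytes([0x80 | channel, note, 0])

def program_change(channel, instrument):
    """Two-byte message selecting one of the 128 instruments (0-127)."""
    return bytes([0xC0 | channel, instrument])

def note_frequency(note):
    """Equal-tempered pitch in Hz, with note 69 tuned to A = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

# Pick instrument 0 on channel 0, then start and stop the A above middle C.
song = program_change(0, 0) + note_on(0, 69, 100) + note_off(0, 69)

print(song.hex(" "))
print(len(song), "bytes")
print(note_frequency(69))
```

The whole note is described in 8 bytes however long it is held, whereas one second of sampled CD audio takes 176,400 bytes. That is the vector versus raster comparison in numbers.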


A MIDI Studio


An Audioacoustic Editing Lab


The Parts of a MIDI Note

• From the MIDI Manufacturers Homepage


Making MIDI

• FM Synthesis
  – Sterile sine waves
  – What gave computer music a bad name
• Wavetable Sound Generation
  – The music gives the number of the instrument
  – Samples of the sound of that instrument are stored in ROM/RAM on the sound card
  – The samples are processed to give a far better illusion of the sound of the instrument
  – The more samples, the better, so 64 MB of samples on ROM are better than 512 KB.
  – Wavetables may also be downloaded from CD-ROMs
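A very simplified sketch of wavetable playback, under loud assumptions: the "recording" here is a single computed cycle standing in for a real instrument sample, and pitch is changed just by reading the table faster or slower. Real sound cards hold many recorded samples per instrument and do much more processing; the names below are invented for illustration.

```python
import math

TABLE_SIZE = 256

# Stand-in for a stored instrument sample: one cycle of a sine wave plus
# a weaker overtone at twice the frequency.
WAVETABLE = [math.sin(2 * math.pi * i / TABLE_SIZE)
             + 0.3 * math.sin(2 * math.pi * 2 * i / TABLE_SIZE)
             for i in range(TABLE_SIZE)]

def play(table, freq_hz, sample_rate_hz, count):
    """Read the table at whatever speed produces the requested pitch."""
    step = freq_hz * len(table) / sample_rate_hz  # table slots per output sample
    out = []
    position = 0.0
    for _ in range(count):
        i = int(position)
        frac = position - i
        nxt = (i + 1) % len(table)
        # Linear interpolation between neighbouring table entries.
        out.append(table[i] * (1 - frac) + table[nxt] * frac)
        position = (position + step) % len(table)
    return out

a440 = play(WAVETABLE, 440, 44_100, 1000)
a880 = play(WAVETABLE, 880, 44_100, 1000)  # same table, one octave higher
print(len(a440), len(a880))
```

The same stored table serves every pitch, so the MIDI data only has to say which instrument and which note.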


MIDI Quality

• Well, as always there’s the trade off:
  – Much smaller file size
  – Always somewhat less quality
  – Infinitely cheaper to create -- only one muso necessary
  – May require significant CPU processing


Channels, Voices and Streams

• A channel drives a speaker:
  – 2 channels for standard stereo
  – 4-5 channels for 3D sound (two may be faked)
  – 8+ channels for super sound in theatre movies
• A voice is an instrument, etc. on a channel
  – MIDI supports a large number of voices: 32, 64. This is polyphony
  – The voices are superimposed, in digital or analog form, and then sent to the speakers
  – Again, multiple voices may load down the CPU
• A stream is half voice and half channel
  – Lets you record a sound effect, a stream
  – When we need it, we superimpose it on top of the sound going to a channel
  – The sound card and/or CPU do the work
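Superimposing voices and streams digitally amounts to adding samples together. The sketch below assumes signed 16 bit samples and simply clamps anything that overflows; the function name and the sample values are invented for illustration.

```python
def mix(*voices):
    """Add voices sample by sample, clamping to the signed 16 bit range."""
    length = max(len(v) for v in voices)
    mixed = []
    for n in range(length):
        total = sum(v[n] for v in voices if n < len(v))
        mixed.append(max(-32768, min(32767, total)))
    return mixed

music = [1000, 2000, 3000, 2000, 1000]
explosion = [0, 30000, 31000]   # a short sound effect laid over the music

print(mix(music, explosion))    # → [1000, 32000, 32767, 2000, 1000]
```

The third sample would be 34,000, which does not fit in 16 bits, so it is clamped to 32,767. Every extra voice means another addition per sample, which is where the load on the sound card or CPU comes from.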


Channels, Voices and Streams, 2

• The higher the bandwidth into the sound card/chip
  – The more channels, voices and streams we can get at once
  – And the more processing work has to be done
  – So we either do more on-sound-card/chip processing or bog down the CPU
  – (Sound like the issues related to 3D accelerator cards?!)
• Evolution in sound cards/chips rather slow
• Most systems use sound chip on motherboard
• But if you want to play games….


Games and Computer Sound

• Games are one of several factors driving the evolution of graphics boards
• Games are almost the only factor driving the evolution of sound cards
  – Who is sneaking up behind me? We need 3D.
  – What kind of sound does that alien make when exploded? We need lots of streams superimposed.
• 3D illusion
  – Uses 3 speakers (woofer + 2 satellites) and an algorithm to fox the ear by marginally delaying one stereo channel
  – Developed by NASA for space flight simulators
  – Can work well if you don’t move your head
  – With 5 speakers, esp. with 4 channels, can work very well indeed
• Note: deep tones are non-directional, so we can use just one woofer
• As the musicians are never behind you, not necessary for music. Whoops, sorry Berlioz, Allegri, Tchaikovsky, etc.
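The "marginally delaying one stereo channel" trick can be sketched as follows. This is only the bare idea, not any vendor's algorithm: the function names are invented, and the delay value in the example is an arbitrary assumption chosen to keep the numbers small.

```python
def delay(samples, delay_samples):
    """Delay a channel by prepending silence, keeping the original length."""
    if delay_samples <= 0:
        return list(samples)
    return ([0] * delay_samples + list(samples))[:len(samples)]

def place_by_delay(mono, sample_rate_hz, delay_seconds, source_on_left=True):
    """Return (left, right) channels; the ear further from the source hears it late."""
    n = round(sample_rate_hz * delay_seconds)
    near = list(mono)
    far = delay(mono, n)
    return (near, far) if source_on_left else (far, near)

# Tiny illustrative numbers: 1,000 samples per second and a 2 ms delay.
left, right = place_by_delay([1, 2, 3, 4, 5], 1000, 0.002)
print(left)    # → [1, 2, 3, 4, 5]
print(right)   # → [0, 0, 1, 2, 3]
```

The right channel carries the same sound two samples late, so the sound seems to come from the left. Real positional audio also adjusts loudness and filtering, which this sketch leaves out.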


Games and Computer Sound, 2

• Competing 3D positional audio standards
  – A3D
    • From Aureal Semiconductor
    • On their widely used Vortex audio chips
  – Environmental Audio Extensions (EAX)
    • From Creative Labs (who brought us Sound Blaster)
  – DirectSound3D
    • From Microsoft
    • Part of the DirectX set of Windows APIs/extensions, including Direct3D
• Currently the first of these is the standard, but watch out for the rest.