chapter 3 representation

55
Chapter 3 Representation

Upload: wynter-malone

Post on 31-Dec-2015

62 views

Category:

Documents


1 download

DESCRIPTION

Chapter 3 Representation. Key Concepts. Digital vs Analog How many bits? Some standard representations Compression Methods. Chapter Goals. Distinguish between analog and digital information. Explain data compression and calculate compression ratios. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Chapter 3 Representation

Chapter 3

Representation

Page 2: Chapter 3 Representation

Key Concepts

• Digital vs Analog

• How many bits?

• Some standard representations

• Compression Methods

3-2

Page 3: Chapter 3 Representation

Chapter Goals

• Distinguish between analog and digital information.

• Explain data compression and calculate compression ratios.

• Describe the characteristics of the ASCII and Unicode character sets.

3-3

Page 4: Chapter 3 Representation

Chapter Goals

• Perform various types of text compression.

• Explain the nature of sound and its representation.

3-4

Page 5: Chapter 3 Representation

Chapter Goals

• Explain how RGB values define a color.

• Distinguish between raster and vector graphics.

3-5

Page 6: Chapter 3 Representation

Data and Computers

• Computers are multimedia devices

• We need to represent many kinds of data, using only bits– Numbers– Text– Audio– Images and graphics– Video

3-6

Page 7: Chapter 3 Representation

Analog and Digital Information

• Computers are finite.

• Computer memory has only so much room to store a certain amount of data.

• The goal, is to represent enough of the world to satisfy our needs.

3-7

Page 8: Chapter 3 Representation

Analog and Digital Information

• Information can be represented in one of two ways: analog or digital.

Analog data A continuous representation, analogous to the actual information it represents.

INFINITE # of values

Digital data A discrete representation, breaking the information up into separate elements.

FINITE # of values

3-8

Page 9: Chapter 3 Representation

Analog and Digital Information

3-9

Figure 3.1 A mercury thermometer continually rises in direct proportion to the temperature

Page 10: Chapter 3 Representation

Digitization of Analog Information

• We digitize information by breaking it into pieces and representing those pieces separately.

• Basically, analog data becomes digital when we measure it.

• Why digitize?– Because computers use bits– Digital data can be kept “perfect”

3-10

Page 11: Chapter 3 Representation

Electronic Signals (Cont’d)

• Periodically, a digital signal is reclocked to regain its original shape.

3-11

Figure 3.2 An analog and a digital signal

Figure 3.3 Degradation of analog and digital signals

Page 12: Chapter 3 Representation

3-12

Page 13: Chapter 3 Representation

Binary Representations

• We need to REPRESENT different kinds of information using only bits

• So, how many bits do we need?? It depends on how many things we need to represent.

Who has a pet bunny?

3-13

Page 14: Chapter 3 Representation

Binary Representations

• One bit can be either 0 or 1. Therefore, one bit can represent only two things.

• Two bits can represent four things because there are four combinations of 0 and 1 that can be made from two bits:

00011011

3-14

Page 15: Chapter 3 Representation

Binary Representations

• Three bits can represent eight things because there are eight combinations of 0 and 1 that can be made from three bits.

000 001010 011100 101110 111

3-15

Page 16: Chapter 3 Representation

Binary Representations

3-16

Figure 3.4 Bit combinations

Page 17: Chapter 3 Representation

Binary Representations

• In general, bits can represent 2 things because there are 2n combinations of 0 and 1 that can be made from n bits.

• Note that every time we increase the number of bits by 1, we double the number of things we can represent.

3-17

Page 18: Chapter 3 Representation

Representing Text

• We need to represent every possible character.

• The general approach is to list them all and assign each an integer value.

• A character set is a list of characters and the integer codes used to represent each one.

• This works because we already know how to represent the integers (Chapter 2).

3-18

Page 19: Chapter 3 Representation

The ASCII Character Set

• ASCII stands for American Standard Code for Information Interchange.

• ASCII uses eight bits to represent 256 characters.

3-19

Page 20: Chapter 3 Representation

The ASCII Character Set

3-20

Page 21: Chapter 3 Representation

The ASCII Character Set

• Note that the first 32 characters in the ASCII character chart do not have a simple character representation that you could print to the screen.

• Thus, they don’t show up in Notepad.

3-21

Page 22: Chapter 3 Representation

The Unicode Character Set

• 8 bit ASCII is not enough. It can represent 2^8 = 256 characters.

• 16 bit Unicode can represent 2^16 = , over 65,536 characters.

• Unicode was designed to be a superset of ASCII. That is, the first 256 characters in the Unicode character set correspond exactly to the extended ASCII character set.

3-22

Page 23: Chapter 3 Representation

The Unicode Character Set

3-23Figure 3.6 A few characters in the Unicode character set

Page 24: Chapter 3 Representation

Data Compression

• Data compression Reduction in the amount of space needed to store a piece of data.

• Compression ratio The size of the compressed data divided by the size of the original data.

• A data compression techniques can be– lossless, which means the data can be retrieved without

any loss of the original information,– lossy, which means some information may be lost in the

process of compaction.

3-24

Page 25: Chapter 3 Representation

Text Compression

• It is important that we find ways to store and transmit text efficiently, which means we must find ways to compress text.– keyword encoding– run-length encoding– Huffman encoding

3-25

Page 26: Chapter 3 Representation

Keyword Encoding

• Frequently used words are replaced with a single character. For example,

3-26

Page 27: Chapter 3 Representation

Example: Uncompressed

The human body is composed of many independent systems, such as the circulatory system, the respiratory system, and the reproductive system. Not only must all systems work independently, they must interact and cooperate as well. Overall health is a function of the well-being of separate systems, as well as how these separate systems work in concert.

3-27

Page 28: Chapter 3 Representation

Compressed with Keyword Encoding

The human body is composed of many independent systems, such ^ ~ circulatory system, ~ respiratory system, + ~ reproductive system. Not only & each system work independently, they & interact + cooperate ^ %. Overall health is a function of ~ %- being of separate systems, ^ % ^ how # separate systems work in concert.

3-28

Page 29: Chapter 3 Representation

Keyword Encoding

• 349 characters in the • 314 characters in the compressedthus; • The compression ratio for this example is

314/349 or approximately 0.9• Or about 10% smaller. Not real good

3-29

Page 30: Chapter 3 Representation

Run-Length Encoding (RLE)• Sometimes bytes repeat in a long sequence.

– This type of repetition doesn’t generally take place in English text, but often occurs in large data streams.

• In RLE, a sequence of repeated characters is replaced by:

<flag><repeated character><number of repeats>

3-30

Page 31: Chapter 3 Representation

Run-Length Encoding Examples• AAAAAAA would be encoded as *A7

• *n5*x9ccc*h6 some other text *k8eee would be decoded into the following original textnnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee

• Original: 51 characters• Compressed 35 characters• Compression ratio: of 35/51 or approximately 0.68.

• Better than keyword, and can be more better for certain kinds of data

3-31

Page 32: Chapter 3 Representation

Huffman Encoding

• Why should the character “x”, which is seldom used in text, take up the same number of bits as the “e”, which is used very frequently?

• Huffman codes using variable-length bit strings to represent each character.

• A few characters may be represented by five bits, and another few by six bits, and yet another few by seven bits, and so forth.

3-32

Page 33: Chapter 3 Representation

Huffman Encoding

• If we use only a few bits to represent characters that appear often and reserve longer bit strings for characters that don’t appear often, the overall size of the document being represented is smaller.

3-33

Page 34: Chapter 3 Representation

Huffman Encoding

• For example

3-34

Page 35: Chapter 3 Representation

Huffman Encoding

• DOORBELL would be encode in binary as 1011110110111101001100100.

• Doorbell in binary: 8*8 = 64 bits• Doorbell in Huffman: 25 bits• Compression: 25/64, or approximately 0.39

3-35

Page 36: Chapter 3 Representation

Representing Audio Information

3-36

Page 37: Chapter 3 Representation

Representing Audio Information

• To digitize the signal we periodically measure the level of the signal and record the appropriate numeric value. The process is called sampling.

• In general, a sampling rate of around 40,000 times per second is enough to create a reasonable sound reproduction.

3-37

Page 38: Chapter 3 Representation

Representing Audio Information

3-38Figure 3.8 Sampling an audio signal

Page 39: Chapter 3 Representation

How big is an Audio File?

• Assume we want a 16 bit resolution on each sample

• And there are 40,000 samples per second

• How big (in bytes) is a 3 minute song file?

3-39

Page 40: Chapter 3 Representation

COOL PICTURE, HUH!?

3-40

Figure 3.9 A CD player reading binary information

Page 41: Chapter 3 Representation

Audio Formats

• Audio Formats– WAV, AU, AIFF, VQF, and MP3. – Some are compressed, some are not

• MP3 is common – MP3 employs both lossy and lossless compression.

First it analyzes the frequency spread and compares it to mathematical models of human psychoacoustics (the study of the interrelation between the ear and the brain), then it discards information that can’t be heard by humans. Then the bit stream is compressed using a form of Huffman encoding to achieve additional compression. Your results may vary. Don’t use MP3’s if you suffer from a preference for quality music. Ask your doctor if MP3’s are right for you.

3-41

Page 42: Chapter 3 Representation

Graphics

• Raster

• Vector

3-42

Page 43: Chapter 3 Representation

Raster Graphics

• Digitizing a picture is the act of representing it as a collection of individual dots called pixels.

• 2 resolutions involved:• The number of pixels used to represent a

picture is called the pixel resolution.• Color resolution – how many bits are used for

each pixel

3-43

Page 44: Chapter 3 Representation

Raster Graphics

• The storage of image information on a pixel-by-pixel basis is called a raster-graphics format. Several popular raster file formats including bitmap (BMP), GIF, and JPEG.

3-44

Page 45: Chapter 3 Representation

Digitized Images and Graphics

3-45

Figure 3.12 A digitized picture composed of many individual pixels

Page 46: Chapter 3 Representation

Digitized Images and Graphics

3-46

Figure 3.12 A digitized picture composed of many individual pixels

Page 47: Chapter 3 Representation

Representing Graphic Colors

• Color is our perception of the various frequencies of light that reach the retinas of our eyes.

• Our retinas have three types of color photoreceptor cone cells that respond to different sets of frequencies. These photoreceptor categories correspond to the colors of red, green, and blue.

3-47

Page 48: Chapter 3 Representation

Representing Graphic Colors

• Color is often expressed in a computer as an RGB (red-green-blue) value, which is actually three numbers that indicate the relative contribution of each of these three primary colors.

• For example, an RGB value of (255, 255, 0) maximizes the contribution of red and green, and minimizes the contribution of blue, which results in a bright yellow.

3-48

Page 49: Chapter 3 Representation

Representing Images and Graphics

3-49Figure 3.10 Three-dimensional color space

Page 50: Chapter 3 Representation

Representing Images and Graphics

• The number of bits that is used to represent a color is called the color depth.

• TrueColor indicates a 24-bit color depth. – 8 bits are used for Red– 8 bits are used for Green– 8 bits are used for Blue

3-50

Page 51: Chapter 3 Representation

Representing Images and Graphics

3-51

Page 52: Chapter 3 Representation

How big is a Raster Graphic File?

• Assume we want a 24 bit color depth

• And the image is X pixels by Y pixels

• How big is the file?

3-52

Page 53: Chapter 3 Representation

Vector Graphics

• Vector-graphics format describe an image in terms of lines and geometric shapes.

• Keywords describe a lines:– Direction– Thickness– Color– Etc…– Every pixel does not have to be accounted for.

3-53

Page 54: Chapter 3 Representation

Vector Graphics

• Vector graphics can be resized mathematically, and these changes can be calculated dynamically as needed.(example: True-Type fonts)

• However, vector graphics is not good for representing real-world images (photographs).

3-54

Page 55: Chapter 3 Representation

Vector Graphics

3-55