
L02 - Information, Comp411 – Spring 2007, 1/16/2007

What is “Computation”?

Computation is about “processing information”
- Transforming information from one form to another
- Deriving new information from old
- Finding information associated with a given input
- “Computation” describes the motion of information through time
- “Communication” describes the motion of information through space


What is “Information”?

information, n. Knowledge communicated or received concerning a particular fact or circumstance.

A Computer Scientist’s Definition:

Information resolves uncertainty. Information is simply that which cannot be predicted.

The less predictable a message is, the more information it conveys!

“Carolina won again.” “Tell me something new…”

“10 Problem sets, 2 quizzes, and a final!”


Real-World Information

Why do unexpected messages get allocated the biggest headlines?

… because they carry the most information.


What Does A Computer Process?

• A Toaster processes bread and bagels
• A Blender processes smoothies and margaritas
• What does a computer process?
• 2 allowable answers:
  – Information
  – Bits
• What is the mapping from information to bits?


Quantifying Information (Claude Shannon, 1948)

Suppose you’re faced with N equally probable choices, and I give you a fact that narrows it down to M choices. Then I’ve given you

log2(N/M) bits of information

Examples:
- information in one coin flip: log2(2/1) = 1 bit
- roll of a single die: log2(6/1) = ~2.6 bits
- outcome of a Football game: 1 bit (well, actually, “they won” may convey more information than “they lost”…)

Information is measured in bits (binary digits) = number of 0/1’s required to encode choice(s)
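The formula above can be tried out directly. This is a minimal Python sketch; the function name `info_bits` and the "roll was even" case are illustrative additions, not part of the lecture.

```python
from math import log2

def info_bits(n_choices, m_remaining=1):
    """Bits of information gained when N equally likely choices narrow to M."""
    return log2(n_choices / m_remaining)

coin = info_bits(2)          # one fair coin flip
die = info_bits(6)           # one roll of a fair die
even_die = info_bits(6, 3)   # hypothetical: learning only that the roll was even
```

Narrowing six choices down to three gives exactly as much information as a coin flip, because N/M is 2 in both cases.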


Example: Sum of 2 dice

i2 = log2(36/1) = 5.170 bits
i3 = log2(36/2) = 4.170 bits
i4 = log2(36/3) = 3.585 bits
i5 = log2(36/4) = 3.170 bits
i6 = log2(36/5) = 2.848 bits
i7 = log2(36/6) = 2.585 bits
i8 = log2(36/5) = 2.848 bits
i9 = log2(36/4) = 3.170 bits
i10 = log2(36/3) = 3.585 bits
i11 = log2(36/2) = 4.170 bits
i12 = log2(36/1) = 5.170 bits

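The average information provided by the sum of 2 dice (the Entropy):

$$ i_{ave} = \sum_{i=2}^{12} \frac{M_i}{N} \log_2\!\left(\frac{N}{M_i}\right) = \sum_{i=2}^{12} p_i \log_2\!\left(\frac{1}{p_i}\right) = 3.274 \text{ bits} $$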
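As a check on the numbers for the sum of 2 dice, the per-sum information and the average can be recomputed by enumerating all 36 outcomes. This is a small Python sketch; the variable names are illustrative.

```python
from collections import Counter
from math import log2

# Count how many of the 36 equally likely (die1, die2) outcomes give each sum.
ways = Counter(a + b for a in range(1, 7) for b in range(1, 7))
N = 36

# Information carried by each sum: i = log2(N / M_i)
info = {s: log2(N / m) for s, m in ways.items()}

# Average information (entropy): sum of (M_i / N) * log2(N / M_i)
entropy = sum((m / N) * log2(N / m) for m in ways.values())
```

Here `info[7]` is the smallest value because 7 is the most likely sum, and `entropy` weights each sum's information by how often that sum occurs.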


Show Me the Bits!

Can the sum of two dice REALLY be represented using 3.274 bits? If so, how?

The fact is, the average information content is a strict *lower-bound* on how small of a representation that we can achieve.

In practice, it is difficult to reach this bound. But, we can come very close.


Variable-Length Encoding

• Of course we can use differing numbers of “bits” to represent each item of data
• This is particularly useful if all items are *not* equally likely
• Equally likely items lead to fixed length encodings:
  – Ex: Encode a particular roll of 5?
  – {(1,4), (2,3), (3,2), (4,1)} which are equally likely if we use fair dice
  – Entropy = $\sum_{i=1}^{4} p(roll_i \mid roll = 5) \log_2\!\left(\frac{1}{p(roll_i \mid roll = 5)}\right) = \sum_{i=1}^{4} \frac{1}{4} \log_2(4) = 2$ bits
  – 00 – (1,4), 01 – (2,3), 10 – (3,2), 11 – (4,1)
• Back to the original problem. Let’s use this encoding:

 2 - 10011    3 - 0101    4 - 011    5 - 001
 6 - 111      7 - 101     8 - 110    9 - 000
10 - 1000    11 - 0100   12 - 10010


Variable-Length Encoding

• Taking a closer look

 2 - 10011    3 - 0101    4 - 011    5 - 001
 6 - 111      7 - 101     8 - 110    9 - 000
10 - 1000    11 - 0100   12 - 10010

Unlikely rolls are encoded using more bits

More likely rolls use fewer bits

• Decoding

Example Stream: 1001100101011110011100101
Decoded rolls: 2 5 3 6 5 8 3
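Decoding works because no codeword in this table is the beginning of another codeword, so a decoder can read bits left to right and emit a roll as soon as the bits read so far match a table entry. A minimal Python sketch of that idea follows; the names `CODE` and `decode` are illustrative.

```python
# Encoding table from the slide: roll -> bit string.
CODE = {2: "10011", 3: "0101", 4: "011", 5: "001", 6: "111", 7: "101",
        8: "110", 9: "000", 10: "1000", 11: "0100", 12: "10010"}

def decode(stream, code=CODE):
    """Decode a bit string by matching codewords left to right."""
    lookup = {bits: roll for roll, bits in code.items()}
    rolls, current = [], ""
    for bit in stream:
        current += bit
        if current in lookup:
            # No codeword is a prefix of another, so the first match is the only one.
            rolls.append(lookup[current])
            current = ""
    if current:
        raise ValueError("stream ends in the middle of a codeword")
    return rolls

rolls = decode("1001100101011110011100101")
```

Running it on the example stream recovers the rolls listed on the slide.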


Huffman Coding

• A simple *greedy* algorithm for approximating an entropy efficient encoding
  1. Find the 2 items with the smallest probabilities
  2. Join them into a new *meta* item whose probability is the sum
  3. Remove the two items and insert the new meta item
  4. Repeat from step 1 until there is only one item

[Figure: Huffman decoding tree. The leaves are the sums 2 through 12 with probabilities 1/36 through 6/36; the meta items have probabilities 2/36, 4/36, 5/36, 7/36, 8/36, 10/36, 11/36, 15/36 and 21/36, with 36/36 at the root.]
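The four steps of the greedy algorithm can be sketched in Python with a priority queue. This is one possible implementation under stated assumptions, not the lecture's own code: the names `huffman_code` and `ways` are illustrative, and ties between equal probabilities are broken arbitrarily, so the bit patterns may differ from the slide's table even though the average code length comes out the same.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_code(weights):
    """Return {symbol: bit string} for a dict {symbol: weight}."""
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Steps 1 and 3: remove the two items with the smallest weights.
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        # Step 2: join them into a meta item; prefix 0 on one side, 1 on the other.
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]  # Step 4: one item left, holding every symbol's code

# Number of ways (out of 36) to roll each sum of two dice.
ways = {s: 6 - abs(s - 7) for s in range(2, 13)}
code = huffman_code(ways)
avg_len = sum(Fraction(ways[s], 36) * len(code[s]) for s in ways)
```

Building the codes bottom-up while merging avoids a separate tree walk; each merge simply prepends one more bit to every symbol underneath the new meta item.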


Converting Tree to Encoding

[Figure: the Huffman decoding tree from the previous slide, with each pair of edges below a meta item labeled 0 and 1.]

Once the *tree* is constructed, label its edges consistently and follow the paths from the largest *meta* item to each of the real items to find the encoding.

 2 - 10011    3 - 0101    4 - 011    5 - 001
 6 - 111      7 - 101     8 - 110    9 - 000
10 - 1000    11 - 0100   12 - 10010


Encoding Efficiency

How does this encoding strategy compare to the information content of the roll?

$$ b_{ave} = \tfrac{1}{36}5 + \tfrac{2}{36}4 + \tfrac{3}{36}3 + \tfrac{4}{36}3 + \tfrac{5}{36}3 + \tfrac{6}{36}3 + \tfrac{5}{36}3 + \tfrac{4}{36}3 + \tfrac{3}{36}4 + \tfrac{2}{36}4 + \tfrac{1}{36}5 = 3.306 \text{ bits} $$

Pretty close. Recall that the lower bound was 3.274 bits. However, an efficient encoding (as defined by having an average code size close to the information content) is not always what we want!
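The comparison can be reproduced numerically: weight each codeword's length by the probability of its roll and compare the result with the entropy. A short Python sketch, with illustrative variable names:

```python
from math import log2

# Encoding table from the slides: roll -> bit string.
CODE = {2: "10011", 3: "0101", 4: "011", 5: "001", 6: "111", 7: "101",
        8: "110", 9: "000", 10: "1000", 11: "0100", 12: "10010"}

# Number of ways (out of 36) to roll each sum: 1, 2, ..., 6, ..., 2, 1.
ways = {s: 6 - abs(s - 7) for s in CODE}

b_ave = sum(ways[s] / 36 * len(CODE[s]) for s in CODE)          # average code length
entropy = sum(ways[s] / 36 * log2(36 / ways[s]) for s in CODE)  # lower bound
overhead = b_ave - entropy                                      # bits lost per roll
```

The overhead is a few hundredths of a bit per roll, which is the sense in which the encoding is "pretty close" to the bound.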


Encoding Considerations

- Encoding schemes that attempt to match the information content of a data stream remove redundancy. They are data compression techniques.
- However, sometimes our goal in encoding information is to increase redundancy, rather than remove it. Why?
  - Make the information easier to manipulate (fixed-sized encodings)
  - Make the data stream resilient to noise (error detecting and correcting codes)
- Data compression allows us to store our entire music collection in a pocketable device
- Data redundancy enables us to store that *same* information *reliably* on a hard drive


Error detection using parity

Sometimes we add extra redundancy so that we can detect errors. For instance, this encoding detects any single-bit error:

 2 - 1111000    3 - 1111101    4 - 0011    5 - 0101
 6 - 0110       7 - 0000       8 - 1001    9 - 1010
10 - 1100      11 - 1111110   12 - 1111011

There’s something peculiar about those codings

ex: 001111100000101

Same bitstream – w/ 4 possible interpretations if we allow for only one error (the * appears to mark the roll whose encoding contains the error):

1: 001111100000101 as 4 6* 7
2: 001111100000101 as 4 9* 7
3: 001111100000101 as 4 10* 7
4: 001111100000101 as 4 2* 5


Property 1: Parity

The sum of the bits in each symbol is even.

(this is how errors are detected)

 2 - 1111000 = 1 + 1 + 1 + 1 + 0 + 0 + 0 = 4
 3 - 1111101 = 1 + 1 + 1 + 1 + 1 + 0 + 1 = 6
 4 - 0011 = 0 + 0 + 1 + 1 = 2
 5 - 0101 = 0 + 1 + 0 + 1 = 2
 6 - 0110 = 0 + 1 + 1 + 0 = 2
 7 - 0000 = 0 + 0 + 0 + 0 = 0
 8 - 1001 = 1 + 0 + 0 + 1 = 2
 9 - 1010 = 1 + 0 + 1 + 0 = 2
10 - 1100 = 1 + 1 + 0 + 0 = 2
11 - 1111110 = 1 + 1 + 1 + 1 + 1 + 1 + 0 = 6
12 - 1111011 = 1 + 1 + 1 + 1 + 0 + 1 + 1 = 6

How much information is in the last bit?
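The parity property can be checked mechanically. The Python sketch below assumes the receiver already knows where each symbol starts and ends, which the slides point out is not guaranteed for this variable-length code; the helper names are illustrative.

```python
# Parity encoding from the slide: roll -> bit string.
PARITY_CODE = {2: "1111000", 3: "1111101", 4: "0011", 5: "0101", 6: "0110",
               7: "0000", 8: "1001", 9: "1010", 10: "1100",
               11: "1111110", 12: "1111011"}

def has_even_parity(bits):
    """True when the number of 1 bits is even."""
    return bits.count("1") % 2 == 0

def flip(bits, i):
    """Return bits with position i inverted, modeling a single-bit error."""
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]

# Every valid symbol has even parity.
all_valid = all(has_even_parity(w) for w in PARITY_CODE.values())

# Flipping any one bit of any symbol makes the parity odd, so the error shows.
all_single_errors_caught = all(
    not has_even_parity(flip(w, i))
    for w in PARITY_CODE.values() for i in range(len(w)))
```

A single flipped bit always changes the count of 1s by exactly one, which is why the check cannot miss it; two flipped bits restore even parity and go unnoticed.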


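Property 2: Separation

Each encoding differs from all others by at least two bits in their overlapping parts (x marks a bit position where the two encodings differ):

|    |         | 3       | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11      | 12      |
|    |         | 1111101 | 0011 | 0101 | 0110 | 0000 | 1001 | 1010 | 1100 | 1111110 | 1111011 |
| 2  | 1111000 | 1111x0x | xx11 | x1x1 | x11x | xxxx | 1xx1 | 1x1x | 11xx | 1111xx0 | 11110xx |
| 3  | 1111101 |         | xx11 | x1x1 | x11x | xxxx | 1xx1 | 1x1x | 11xx | 11111xx | 1111xx1 |
| 4  | 0011    |         |      | 0xx1 | 0x1x | 00xx | x0x1 | x01x | xxxx | xx11    | xx11    |
| 5  | 0101    |         |      |      | 01xx | 0x0x | xx01 | xxxx | x10x | x1x1    | x1x1    |
| 6  | 0110    |         |      |      |      | 0xx0 | xxxx | xx10 | x1x0 | x11x    | x11x    |
| 7  | 0000    |         |      |      |      |      | x00x | x0x0 | xx00 | xxxx    | xxxx    |
| 8  | 1001    |         |      |      |      |      |      | 10xx | 1x0x | 1xx1    | 1xx1    |
| 9  | 1010    |         |      |      |      |      |      |      | 1xx0 | 1x1x    | 1x1x    |
| 10 | 1100    |         |      |      |      |      |      |      |      | 11xx    | 11xx    |
| 11 | 1111110 |         |      |      |      |      |      |      |      |         | 1111x1x |

This difference is called the “Hamming distance”

“A Hamming distance of 1 is needed to uniquely identify an encoding”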
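The separation property can be verified by computing the Hamming distance for every pair of encodings over the bit positions they share. A minimal Python sketch, with illustrative names:

```python
from itertools import combinations

# Parity encoding from the slides: roll -> bit string.
PARITY_CODE = {2: "1111000", 3: "1111101", 4: "0011", 5: "0101", 6: "0110",
               7: "0000", 8: "1001", 9: "1010", 10: "1100",
               11: "1111110", 12: "1111011"}

def overlap_distance(a, b):
    """Hamming distance over the positions both codewords share."""
    # zip stops at the end of the shorter codeword, i.e. the overlapping part.
    return sum(x != y for x, y in zip(a, b))

min_distance = min(overlap_distance(a, b)
                   for a, b in combinations(PARITY_CODE.values(), 2))
```

Each count of differing positions corresponds to the number of x entries in one cell of the table.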


A Short Detour

It is illuminating to consider Hamming distances geometrically.

Given 2 bits the largest Hamming distance that we can achieve between 2 encodings is 2. This allows us to detect 1-bit errors if we encode 1 bit of information using 2 bits.

[Figure: the 2-bit patterns 00, 10, 01, 11.]

With 3 bits we can find 4 encodings with a Hamming distance of 2, allowing the detection of 1-bit errors when 3 bits are used to encode 2 bits of information.

[Figure: the 3-bit patterns 000 through 111. Caption: Encodings separated by a Hamming distance of 2.]

We can also identify 2 encodings with Hamming distance 3. This extra distance allows us to detect 2-bit errors. However, we could use this extra separation differently.

[Figure: the 3-bit patterns 000 through 111. Caption: Encodings separated by a Hamming distance of 3.]
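Both 3-bit claims can be confirmed by brute force over the eight patterns. In this Python sketch the four encodings at distance 2 are taken to be the even-parity patterns, which is one valid choice rather than necessarily the set drawn on the slide.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# All eight 3-bit patterns, "000" through "111".
words = ["".join(bits) for bits in product("01", repeat=3)]

# Four encodings with pairwise distance 2: the even-parity patterns.
even = [w for w in words if w.count("1") % 2 == 0]
even_distances = {hamming(a, b) for a, b in combinations(even, 2)}

# Pairs of encodings at distance 3: each pattern and its bitwise complement.
far_pairs = [(a, b) for a, b in combinations(words, 2) if hamming(a, b) == 3]
```

Any single-bit error applied to an even-parity pattern lands on an odd-parity pattern, which is not a valid encoding and is therefore detected.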


Error correcting codes

We can actually correct 1-bit errors in encodings separated by a Hamming distance of 3. This is possible because the sets of bit patterns located a Hamming distance of 1 from our encodings are distinct.

This is just a voting scheme

[Figure: the 3-bit patterns 000, 001, 010, 100, 011, 111, 101, 110.]

However, attempting error correction with such a small separation is dangerous. Suppose we have a 2-bit error. Our error correction scheme will then misinterpret the encoding. Misinterpretations also occurred when we had 2-bit errors in our 1-bit-error-detection (parity) schemes.

A safe 1-bit error correction scheme would correct all 1-bit errors and detect all 2-bit errors. What Hamming distance is needed between encodings to accomplish this?
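For the two encodings 000 and 111, which are a Hamming distance of 3 apart, the voting scheme amounts to a majority vote over the three received bits. The Python sketch below (the function name `correct` is illustrative) shows a 1-bit error being repaired and a 2-bit error being silently misinterpreted.

```python
def correct(received):
    """Decode a 3-bit word to the nearer of the two encodings 000 and 111."""
    # Majority vote: two or more 1 bits means 111 is the closer encoding.
    return "111" if received.count("1") >= 2 else "000"

one_bit_error = correct("010")   # 000 sent, middle bit flipped: repaired
two_bit_error = correct("011")   # 000 sent, two bits flipped: decoded as 111
```

The second call returns a valid encoding with no indication that anything went wrong, which is the danger the slide describes.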


Information Summary

Information resolves uncertainty

• Choices equally probable:
  • N choices down to M
    log2(N/M) bits of information
• Choices not equally probable:
  • choice_i with probability p_i
    log2(1/p_i) bits of information
  • average number of bits = Σ p_i log2(1/p_i)
  • use variable-length encodings


Computer Abstractions and Technology

1. Layer Cakes
2. Computers are translators
3. Switches and Wires

(Read Chapter 1)


Computers Everywhere

The computers we are used to
- Desktops
- Laptops

Embedded processors
- Cars
- Mobile phones
- Toasters, irons, wristwatches, happy-meal toys


A Computer System

What is a computer system? Where does it start? Where does it end?

[Figure: the levels of a computer system, from top to bottom]
- Compiler: for (i = 0; i < 3; i++) m += i*i;
- Assembler and Linker: addi $8, $6, $6; sll $8, $8, 4
- CPU
- Module: ALU (inputs A, B)
- Cells: FA (A, B, CO, CI, S)
- Gates
- Transistors


Computer Layer Cake

- Applications
- Systems software
- Shared libraries
- Operating System
- Hardware – the bare metal

[Figure: layer cake with the layers Hardware, Operating System, Libraries, Systems S/W, Apps.]

Computers are digital Chameleons


Computers are Translators

- User-Interface (visual programming)
- High-Level Languages
  - Compilers
  - Interpreters
- Assembly Language
- Machine Language

High-level language:

    int x, y;
    y = (x-3)*(y+123456);

Assembly language:

    x: .word 0
    y: .word 0
    c: .word 123456
    ...
    lw $t0, x
    addi $t0, $t0, -3
    lw $t1, y
    lw $t2, c
    add $t1, $t1, $t2
    mul $t0, $t0, $t1
    sw $t0, y
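To see that the assembly sequence computes the same value as the high-level statement, each instruction can be stepped through by hand. The Python sketch below uses a dict as a stand-in for the register file; it is an illustration only, and unlike real 32-bit registers, Python integers never overflow.

```python
def run(x, y, c=123456):
    """Step through the assembly sequence; returns the value stored back to y."""
    reg = {}
    reg["t0"] = x                       # lw   $t0, x
    reg["t0"] = reg["t0"] - 3           # addi $t0, $t0, -3
    reg["t1"] = y                       # lw   $t1, y
    reg["t2"] = c                       # lw   $t2, c
    reg["t1"] = reg["t1"] + reg["t2"]   # add  $t1, $t1, $t2
    reg["t0"] = reg["t0"] * reg["t1"]   # mul  $t0, $t0, $t1
    return reg["t0"]                    # sw   $t0, y

new_y = run(x=10, y=5)
```

The constant 123456 is loaded from memory (the word labeled c) rather than written into the add instruction, which is why the one-line expression expands into a separate load.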


Machine language:

    0x04030201 0x08070605 0x00000001 0x00000002
    0x00000003 0x00000004 0x706d6f43


Why So Many Languages?

Application Specific
- Historically: COBOL vs. Fortran
- Today: C# vs. Java
- Visual Basic vs. Matlab
- Static vs. Dynamic

Code Maintainability
- High-level specifications are easier to understand and modify

Code Reuse

Code Portability
- Virtual Machines


Under the Covers

- Input
- Output
- Storage
- Processing
  - Datapath
  - Control



Cathode Ray Tube (CRT)
- The “last vacuum tube”
- Now nearing extinction


Liquid Crystal Displays (LCDs)


Intel Pentium III Xeon


Implementation Technology

- Relays
- Vacuum Tubes
- Transistors
- Integrated Circuits
  - Gate-level integration
  - Medium Scale Integration (PALs)
  - Large Scale Integration (Processing unit on a chip)
  - Today (Multiple CPUs on a chip)
- Nanotubes??
- Quantum-Effect Devices??


Implementation Technology

Common Links?
- A controllable switch
- Computers are wires and switches

[Figure: a switch shown open and closed, with a control input.]


Chips

Silicon Wafers

Chip manufacturers build many copies of the same circuit onto a single wafer. Only a certain percentage of the chips will work; those that work will run at different speeds. The yield decreases as the size of the chips increases and the feature size decreases.

Wafers are processed by automated fabrication lines. To minimize the chance of contaminants ruining a process step, great care is taken to maintain a meticulously clean environment.


Field Effect Transistors (FETs)

Modern silicon fabrication technology is optimized to build a particular type of transistor. The flow of electrons from the source to the drain is controlled by a gate voltage.

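[Figure: cross-section of a FET with Source, Gate, Drain and Bulk terminals; n+ source and drain regions in a p-type bulk; annotations IDS = 0 and VDS.]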


Chips

Silicon Wafers

[Figure: IBM photomicrograph (Si has been removed!) with labels Metal 2, M1/M2 via, Metal 1, Polysilicon, Diffusion, Mosfet (under polysilicon gate).]


How Hardware WAS Designed

20 years ago
- I/O Specification
  - Truth tables
  - State diagrams
- Logic design
- Circuit design
- Circuit Layout


How Hardware IS Designed

Today (with software)
- High-level hardware specification languages
  - Verilog
  - VHDL


Reconfigurable Chips

Programmable Array Logic (PALs)
- Fixed logic / programmable wires

Field Programmable Gate Arrays (FPGAs)
- Repeated reconfigurable logic cells


Next Time

Computer Representations
- How is X represented in computers?
  - X = text
  - X = numbers
  - X = anything else

Encoding Information