chapter 3 arithmetic for computers. spring 2005 elec 5200/6200 from patterson/hennessey slides...

Chapter 3Arithmetic for Computers

Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides

Taxonomy of Computer Information Information

Instructions AddressesData

Numeric Non-numeric

Fixed-point Floating-point Character(ASCII)

Boolean

Unsigned(ordinal)

Signed

Sign-magnitude 2’s complement

Single-precision

Other

Double-precision


Number Format Considerations

Type of numbers (integer, fraction, real, complex) Range of values

between smallest and largest values wider in floating-point formats

Precision of values (max. accuracy) usually related to number of bits allocated

n bits => represent 2n values/levels Value/weight of least-significant bit

Cost of hardware to store & process numbers (some formats more difficult to add, multiply/divide, etc.)


Unsigned Integers

Positional number system:

an-1an-2 …. a2a1a0 =

an-1x2n-1 + an-2x2n-2 + … + a2x22 + a1x21 + a0x20

Range = [(2n -1) … 0]

Carry out of MSB has weight 2n

Fixed-point fraction:

0.a-1a-2 …. a-n = a-1x2-1 + a-2x2-2 + … + a-nx2-n


Signed Integers

Sign-magnitude format (n-bit values):

A = San-2 …. a2a1a0 (S = sign bit)

Range = [-(2n-1-1) … +(2n-1-1)] Addition/subtraction difficult (multiply easy) Redundant representations of 0

2’s complement format (n-bit values):

-A represented by 2n – A Range = [-(2n-1) … +(2n-1-1)] Addition/subtraction easier (multiply harder) Single representation of 0


Computing the 2’s Complement

To compute the 2’s complement of A:

Let A = an-1an-2 …. a2a1a0

2n - A = 2n -1 + 1 – A = (2n -1) - A + 1

1 1 … 1 1 1

- an-1 an-2 …. a2 a1 a0 + 1

-------------------------------------

a’n-1a’n-2 …. a’2a’1a’0 + 1 (one’s complement + 1)


2’s Complement Arithmetic

Let (2n-1-1) ≥ A ≥ 0 and (2n-1-1) ≥ B ≥ 0

Case 1: A + B (2n-2) ≥ (A + B) ≥ 0

Since result < 2n, there is no carry out of the MSB Valid result if (A + B) < 2n-1

MSB (sign bit) = 0 Overflow if (A + B) ≥ 2n-1

MSB (sign bit) = 1 if result ≥ 2n-1

Carry into MSB



Case 2: A - B Compute by adding: A + (-B) 2’s complement: A + (2n - B) -2n-1 < result < 2n-1 (no overflow possible) If A ≥ B: 2n + (A - B) ≥ 2n

Weight of adder carry output = 2n

Discard carry (2n), keeping (A-B), which is ≥ 0 If A < B: 2n + (A - B) < 2n

Adder carry output = 0 Result is 2n - (B - A) =

2’s complement representation of -(B-A)



Case 3: -A - B Compute by adding: (-A) + (-B) 2’s complement: (2n - A) + (2n - B) = 2n + 2n - (A + B) Discard carry (2n), making result 2n - (A + B)

= 2’s complement representation of -(A + B) 0 ≥ result ≥ -2n Overflow if -(A + B) < -2n-1

MSB (sign bit) = 0 if 2n - (A + B) < 2n-1 no carry into MSB


Relational OperatorsCompute A-B & test ALU “flags” to compare A vs. B

ZF = result zero OF = 2’s complement overflowSF = sign bit of result CF = adder carry output

Signed Unsigned

A = B ZF = 1 ZF = 1

A <> B ZF = 0 ZF = 0

A ≥ B (SF OF) = 0 CF = 1 (no borrow)

A > B (SF OF) + ZF = 0 CF ZF’ = 1

A ≤ B (SF OF) + ZF = 1 CF ZF’ = 0

A < B (SF OF) = 1 CF = 0 (borrow)


MIPS Overflow Detection

An exception (interrupt) occurs when overflow detected for add,addi,sub Control jumps to predefined address for exception Interrupted address is saved for possible resumption

Details based on software system / language example: flight control vs. homework assignment

Don't always want to detect overflow— new MIPS instructions: addu, addiu, subu

note: addiu still sign-extends!note: sltu, sltiu for unsigned comparisons


Designing the Arithmetic & Logic Unit (ALU)

Provide arithmetic and logical functions as needed by the instruction set

Consider tradeoffs of area vs. performance

(Material from Appendix B)

32

32

32

operation

result

a

b

ALU


Not easy to decide the “best” way to build something Don't want too many inputs to a single gate (fan in) Don’t want to have to go through too many gates (delay) For our purposes, ease of comprehension is important

Let's look at a 1-bit ALU for addition:

How could we build a 1-bit ALU for add, and, and or? How could we build a 32-bit ALU?

Different Implementations

cout = a b + a cin + b cin

sum = a xor b xor cin

Sum

CarryIn

CarryOut

a

b


Building a 32 bit ALU

b

0

2

Result

Operation

a

1

CarryIn

CarryOut

Result31a31

b31

Result0

CarryIn

a0

b0

Result1a1

b1

Result2a2

b2

Operation

ALU0

CarryIn

CarryOut

ALU1

CarryIn

CarryOut

ALU2

CarryIn

CarryOut

ALU31

CarryIn


Two's complement approach: just negate b and add.

What about subtraction (a – b) ?

0

2

Result

Operation

a

1

CarryIn

CarryOut

0

1

Binvert

b


Adding a NOR function

Can also choose to invert a. How do we get “a NOR b” ?

Binvert

a

b

CarryIn

CarryOut

Operation

1

0

2+

Result

1

0

Ainvert

1

0


Need to support the set-on-less-than instruction

(slt)

remember: slt is an arithmetic instruction

produces a 1 if rs < rt and 0 otherwise

use subtraction: (a-b) < 0 implies a < b

Need to support test for equality (beq $t5, $t6, $t7)

use subtraction: (a-b) = 0 implies a = b

Tailoring the ALU to the MIPS


Supporting slt

Use this ALU for most significant bitall other bits

Binvert

a

b

CarryIn

Operation

1

0

2+

Result

1

0

3Less

Overflowdetection

Set

Overflow

Ainvert

1

0

Binvert

a

b

CarryIn

CarryOut

Operation

1

0

2+

Result

1

0

Ainvert

1

0

3Less


Supporting slt

a0

Operation

CarryInALU0Less

CarryOut

b0

CarryIn

a1 CarryInALU1Less

CarryOut

b1

Result0

Result1

a2 CarryInALU2Less

CarryOut

b2

a31 CarryInALU31Less

b31

Result2

Result31

......

...

BinvertAinvert

0

0

0 Overflow

Set

CarryIn


Test for equality

Notice control lines:

0000 = and0001 = or0010 = add0110 = subtract0111 = slt1100 = NOR•Note: zero is a 1 when the result is zero!

a0

Operation

CarryInALU0Less

CarryOut

b0

a1 CarryInALU1Less

CarryOut

b1

Result0

Result1

a2 CarryInALU2Less

CarryOut

b2

a31 CarryInALU31Less

b31

Result2

Result31

......

...

Bnegate

Ainvert

0

0

0 Overflow

Set

CarryIn...

...Zero


Conclusion

We can build an ALU to support the MIPS instruction set key idea: use multiplexor to select the output we want we can efficiently perform subtraction using two’s complement we can replicate a 1-bit ALU to produce a 32-bit ALU

Important points about hardware all of the gates are always working the speed of a gate is affected by the number of inputs to the gate the speed of a circuit is affected by the number of gates in series

(on the “critical path” or the “deepest level of logic”) Our primary focus: comprehension, however,

Clever changes to organization can improve performance(similar to using better algorithms in software)

We saw this in multiplication, let’s look at addition now


Is a 32-bit ALU as fast as a 1-bit ALU? Is there more than one way to do addition?

two extremes: ripple carry and sum-of-products

Can you see the ripple? How could you get rid of it?

c1 = b0c0 + a0c0 + a0b0

c2 = b1c1 + a1c1 + a1b1 c2 =

c3 = b2c2 + a2c2 + a2b2 c3 =

c4 = b3c3 + a3c3 + a3b3 c4 = Not feasible! Why?

Problem: ripple carry adder is slow


One-bit Full-Adder Circuit

ai

bi

XOR

AND

XOR

ANDOR

ci

sumi

Ci+1

FAi


32-bit Ripple-Carry Adder

FA0

FA1

FA2

FA31

c0 a0 b0

a1 b1

a2 b2

a31 b31

sum0

sum1

sum2

sum31


How Fast is Ripple-Carry Adder?

Longest delay path (critical path) runs from cin to sum31.

Suppose delay of full-adder is 100ps. Critical path delay = 3,200ps Clock rate cannot be higher than 1012/3,200 =

312MHz. Must use more efficient ways to handle carry.


Fast Adders In general, any output of a 32-bit adder can be evaluated

as a logic expression in terms of all 65 inputs. Levels of logic in the circuit can be reduced to log2N for

N-bit adder. Ripple-carry has N levels. More gates are needed, about log2N times that of ripple-

carry design. Fastest design is known as carry lookahead adder.


N-bit Adder Design Options

Type of adder Time complexity

(delay)

Space complexity

(size)

Ripple-carry O(N) O(N)

Carry-lookahead O(log2N) O(N log2N)

Carry-skip O(√N) O(N)

Carry-select O(√N) O(N)

Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture:A Quantitative Approach, Second Edition, San Francisco, California,1990.


An approach in-between our two extremes Motivation:

If we didn't know the value of carry-in, what could we do? When would we always generate a carry? gi = aibi

When would we propagate the carry? pi = ai + bi

Did we get rid of the ripple?c1 = g0 + p0c0

c2 = g1 + p1c1 c2 = g1 + p1 g0 + p1p0c0

c3 = g2 + p2c2 c3 = …

c4 = g3 + p3c3 c4 = …

Feasible! Why?

Carry-lookahead adder


Can’t build a 16 bit adder this way... (too big) Could use ripple carry of 4-bit CLA adders Better: use the CLA principle again!

Use principle to build bigger adders

a4 CarryIn

ALU1 P1 G1

b4a5b5a6b6a7b7

a0 CarryIn

ALU0 P0 G0

b0

Carry-lookahead unit

a1b1a2b2a3b3

CarryIn

Result0–3

pigi

ci + 1

pi + 1

gi + 1

C1

Result4–7

a8 CarryIn

ALU2 P2 G2

b8a9b9

a10b10a11b11

ci + 2

pi + 2

gi + 2

C2

Result8–11

a12 CarryIn

ALU3 P3 G3

b12a13b13a14b14a15b15

ci + 3

pi + 3

gi + 3

C3

Result12–15

ci + 4C4

CarryOut


Carry-Select Adder16-bit ripple carry adder

a0-a15

b0-b15

cin

sum0-sum15

16-bit ripple carry adder

a16-a31

b16-b31

0

16-bit ripple carry adder

a16-a31

b16-b31

1

Mu

ltip

lex

er

sum16-sum31

0

1This is known ascarry-select adder


ALU Summary

We can build an ALU to support MIPS addition Our focus is on comprehension, not performance Real processors use more sophisticated techniques for arithmetic Where performance is not critical, hardware description languages

allow designers to completely automate the creation of hardware!


Multiplication

More complicated than addition accomplished via shifting and addition

More time and more area Let's look at 3 versions based on a gradeschool

algorithm

0010 (multiplicand)

__x_1011 (multiplier)

Negative numbers: convert and multiply there are better techniques, we won’t look at them


Multiplication: Implementation

DatapathControl

MultiplicandShift left

64 bits

64-bit ALU

ProductWrite

64 bits

Control test

MultiplierShift right

32 bits

32nd repetition?

1a. Add multiplicand to product and

place the result in Product register

Multiplier0 = 01. Test

Multiplier0

Start

Multiplier0 = 1

2. Shift the Multiplicand register left 1 bit

3. Shift the Multiplier register right 1 bit

No: < 32 repetitions

Yes: 32 repetitions

Done


Final Version

Multiplicand

32 bits

32-bit ALU

ProductWrite

64 bits

Controltest

Shift right

32nd repetition?

Product0 = 01. Test

Product0

Start

Product0 = 1

3. Shift the Product register right 1 bit

No: < 32 repetitions

Yes: 32 repetitions

Done

What goes here?

•Multiplier starts in right half of product


Multiplying Signed Numbers withBoothe’s Algorithm Consider A x B where A and B are signed integers

(2’s complemented format) Decompose B into the sum B1 + B2 + … + Bn

A x B = A x (B1 + B2 + … + Bn)= (A x B1) + (A x B2) + … (A x Bn)

Let each Bi be a single string of 1’s embedded in 0’s:

…0011…1100… Example:

0110010011100 = 0110000000000 + 0000010000000 + 0000000011100


Boothe’s Algorithm

Scanning from right to left, bit number u is the first 1 bit of the string and bit v is the first 0 left of the string:

v u

Bi = 0 … 0 1 … 1 0 … 0

= 0 … 0 1 … 1 1 … 1 (2v – 1)

- 0 … 0 0 … 0 1 … 1 (2u – 1)

= (2v – 1) - (2u – 1)

= 2v – 2u


Boothe’s Algorithm Decomposing B: A x B = A x (B1 + B2 + … ) = A x [(2v1 – 2u1) + (2v2 – 2u2) + …] = (A x 2v1) – (A x 2u1) + (A x 2v2) – (A x 2u2) …

A x B can be computed by adding and subtracted shifted values of A: Scan bits right to left, shifting A once per bit When the bit string changes from 0 to 1, subtract

shifted A from the current product P – (A x 2u) When the bit string changes from 1 to 0, add shifted A

to the current product P + (A x 2v)


Floating Point Numbers

We need a way to represent a wide range of numbers numbers with fractions, e.g., 3.1416 large number: 976,000,000,000,000 = 9.76 × 1014

small number: 0.0000000000000976 = 9.76 × 10-14

Representation: sign, exponent, significand: (–1)sign ×significand ×2exponent

more bits for significand gives more accuracy more bits for exponent increases range


Scientific Notation

Scientific notation: 0.525×105 = 5.25×104 = 52.5×103 5.25 ×104 is in normalized scientific notation.

position of decimal point fixed leading digit non-zero

Binary numbers 5.25 = 101.01 = 1.0101×22

Binary point multiplication by 2 moves the point to the left division by 2 moves the point to the right

Known as floating point format.


Binary to Decimal Conversion

Binary (-1)S (1.b1b2b3b4) × 2E

Decimal (-1)S × (1 + b1×2-1 + b2×2-2 + b3×2-3 + b4×2-4) × 2E

Example: -1.1100 × 2-2 (binary) = - (1 + 2-1 + 2-2) ×2-2

= - (1 + 0.5 + 0.25)/4

= - 1.75/4

= - 0.4375 (decimal)


IEEE Std. 754 Floating-Point Format

S E: 8-bit Exponent F: 23-bit Fraction

bits 0-22 bits 23-30 bit 31

S E: 11-bit Exponent F: 52-bit Fraction +

bits 0-19 bits 20-30 bit 31

Continuation of 52-bit Fraction

bits 0-31

Single-Precision

Double-Precision


IEEE 754 floating-point standard

Represented value = (–1)sign ×F) ×2exponent – bias Exponent is “biased” (excess-K format) to make sorting easier

bias of 127 for single precision and 1023 for double precision E values in [1 .. 254] (0 and 255 reserved) Range = 2-126 to 2+127 (1038 to 10+38)

Significand in sign-magnitude, normalized form Significand = (1 + F) = 1. b-1b-1b-1…b-23

Suppress storage of leading 1

Overflow: Exponent requiring more than 8 bits. Number can be positive or negative.

Underflow: Fraction requiring more than 23 bits. Number can be positive or negative.


IEEE 754 floating-point standard

Example:

Decimal: -5.75 = - ( 4 + 1 + ½ + ¼ ) Binary: -101.11 = -1.0111 x 22

Floating point: exponent = 129 = 10000001

IEEE single precision: 110000001011100000000000000000000


Examples

1.1010001 × 210100 = 0 10010011 10100010000000000000000 = 1.6328125 × 220

-1.1010001 × 210100 = 1 10010011 10100010000000000000000 = -1.6328125 × 220

1.1010001 × 2-10100 = 0 01101011 10100010000000000000000 = 1.6328125 × 2-20

-1.1010001 × 2-10100 = 1 01101011 10100010000000000000000 = -1.6328125 × 2-20

Biased exponent (0-255), bias 127 (01111111) to be subtracted

0.50.1250.00781250.6328125


NegativeOverflow

PositiveOverflow

Expressible numbers

Numbers in 32-bit Formats Two’s complement integers

Floating point numbers

Ref: W. Stallings, Computer Organization and Architecture, Sixth Edition, Upper Saddle River, NJ: Prentice-Hall.

-231 231-10

Expressible negativenumbers

Expressible positivenumbers

0-2-127 2-127

Positive underflowNegative underflow

(2 – 2-23)×2127- (2 – 2-23)×2127


IEEE 754 Special Codes

± 1.0 × 2-127

Smallest positive number in single-precision IEEE 754 standard. Interpreted as positive/negative zero. Exponent less than -127 is positive underflow (regard as zero).

S 00000000 00000000000000000000000

S 11111111 00000000000000000000000

Zero

Infinity ± 1.0 × 2128 Largest positive number in single-precision IEEE 754 standard. Interpreted as ± ∞ If true exponent = 128 and fraction ≠ 0, then the number is greater

than ∞. It is called “not a number” or NaN (interpret as ∞).


Addition and Subtraction

Addition/subtraction of two floating-point numbers:

Example: 2 x 103 Align mantissas: 0.2 x 104

+ 3 x 104 + 3 x 104

3.2 x 104

General Case:

m1 x 2e1 ± m2 x 2e2 = (m1 ± m2 x 2e2-e1) x 2e1 for e1 > e2

= (m1 x 2e1-e2 ± m2) x 2e2 for e2 > e1

Shift smaller mantissa right by | e1 - e2| bits to align the mantissas.


Addition/Subtraction Algorithm

0. Zero check- Change the sign of subtrahend

- If either operand is 0, the other is the result

1. Significand alignment: right shift smaller significand until two exponents are identical.

2. Addition: add significands and report exception if overflow occurs.

3. Normalization- Shift significand bits to normalize.

- report overflow or underflow if exponent goes out of range.

4. Rounding


FP Add/Subtract (PH Text Figs. 3.16/17)


Example Subtraction: 0.5ten- 0.4375ten

Step 0: Floating point numbers to be added

1.000two×2-1 and -1.110two×2-2

Step 1: Significand of lesser exponent is shifted right until exponents match

-1.110two×2-2 → - 0.111two×2-1

Step 2: Add significands, 1.000two + (- 0.111two)

Result is 0.001two ×2-1


Example (Continued) Step 3: Normalize, 1.000two× 2-4

No overflow/underflow since

127 ≥ exponent ≥ -126 Step 4: Rounding, no change since the sum

fits in 4 bits.

1.000two ×2-4 = (1+0)/16 = 0.0625ten


FP Multiplication: Basic Idea

(m1 x 2e1) x (m2 x 2e2) = (m1 x m2) x 2e1+e2

Separate signs Add exponents Multiply significands Normalize, round, check overflow Replace sign


FP Multiplication Algorithm

P-H Figure 3.18


FP Mult. Illustration Multiply 0.5ten and -0.4375ten (answer = - 0.21875ten) or

Multiply 1.000two×2-1 and -1.110two×2-2

Step 1: Add exponents-1 + (-2) = -3

Step 2: Multiply significands 1.000

×1.110 0000 1000 100010001110000 Product is 1.110000


FP Mult. Illustration (Cont.)

Step 3: Normalization: If necessary, shift significand right and increment

exponent.

Normalized product is 1.110000 × 2-3

Check overflow/underflow: 127 ≥ exponent ≥ -126 Step 4: Rounding: 1.110 × 2-3

Step 5: Sign: Operands have opposite signs,

Product is -1.110 × 2-3

Decimal value = - (1+0.5+0.25)/8 = - 0.21875ten


FP Division: Basic Idea

Separate sign. Check for zeros and infinity. Subtract exponents. Divide significands. Normalize/overflow/underflow. Rounding. Replace sign.


MIPS Floating Point 32 floating point registers, $f0, . . . , $f31 FP registers used in pairs for double precision; $f0

denotes double precision content of $f0,$f1 Data transfer instructions:

lwc1 $f1, 100($s2) # $f1←Mem[$s1+100] swc1 $f1, 100($s2) # Mem[$s2+100]←$f1

Arithmetic instructions: (xxx=add, sub, mul, div) xxx.s single precision xxx.d double precision


Floating Point Complexities

Operations are somewhat more complicated (see text)

In addition to overflow we can have “underflow”

Accuracy can be a big problem

IEEE 754 keeps two extra bits, guard and round

four rounding modes

positive divided by zero yields “infinity”

zero divide by zero yields “not a number”

other complexities Implementing the standard can be tricky Not using the standard can be even worse

see text for description of 80x86 and Pentium bug!


Chapter Three Summary

Computer arithmetic is constrained by limited precision Bit patterns have no inherent meaning but standards do exist

two’s complement IEEE 754 floating point

Computer instructions determine “meaning” of the bit patterns Performance and accuracy are important so there are many

complexities in real machines Algorithm choice is important and may lead to hardware

optimizations for both space and time (e.g., multiplication)

You may want to look back (Section 3.10 is great reading!)

chapter 3 arithmetic for computers. spring 2005 elec 5200/6200 from patterson/hennessey slides...

Documents