comp arc a

8/9/2019 Comp Arc A

1/52

Arithmetic 1

ARITHMETIC

INTRODUCTION

A basic operation in all digital computers is the addition or subtraction of two numbers.

Arithmetic operations occur at the machine instruction level.

They are implemented, along with basic logic functions such as AND, OR, NOT, and EXCLUSIVE-OR, in the arithmetic and logic unit (ALU) subsystem of the processor.

The time needed to perform an addition operation affects the processor's performance.

Multiply and divide operations, which require more complex circuitry than either addition or subtraction operations, also affect performance.

Compared with arithmetic operations, logic operations are simple to implement using combinational circuitry. They require only independent Boolean operations on individual bit positions of the operands, whereas carry/borrow lateral signals are required in arithmetic operations.


REVIEW OF NUMBER REPRESENTATIONS

Three systems are widely used for representing both positive and negative numbers:

1. Sign and magnitude

2. 1's complement

3. 2's complement

For all three systems, the leftmost bit is 0 for positive numbers and 1 for negative numbers.

Binary, Signed Integer Representations

b3 b2 b1 b0    Sign and Magnitude    1's Complement    2's Complement
0  0  0  0            +0                  +0                +0
0  0  0  1            +1                  +1                +1
0  0  1  0            +2                  +2                +2
0  0  1  1            +3                  +3                +3
0  1  0  0            +4                  +4                +4
0  1  0  1            +5                  +5                +5
0  1  1  0            +6                  +6                +6
0  1  1  1            +7                  +7                +7
1  0  0  0            -0                  -7                -8
1  0  0  1            -1                  -6                -7
1  0  1  0            -2                  -5                -6
1  0  1  1            -3                  -4                -5
1  1  0  0            -4                  -3                -4
1  1  0  1            -5                  -2                -3
1  1  1  0            -6                  -1                -2
1  1  1  1            -7                  -0                -1


Positive values have identical representations in all systems.

In the sign-and-magnitude system, negative values are represented by changing the MSB of the positive representation to 1.

Example:

+5 = 0101

-5 = 1101

In the 1's-complement system, negative values are obtained by complementing each bit of the corresponding positive representation.

Example:

+3 = 0011

-3 = 1100

Clearly, the same operation, bit complementing, is done in converting a negative number to the corresponding positive value.


In the 2's-complement system, a negative number is obtained by subtracting the corresponding positive number from $2^n$, where n is the number of bits.

Since the 1's complement of a positive number is that number subtracted from $2^n - 1$, the 2's-complement representation is obtained by adding 1 to the 1's-complement representation.

Example:

+3 = 0011

-3 = 1101

Note that there are two distinct representations for +0 and -0 in both sign-and-magnitude and 1's complement, but the 2's complement has only one representation for 0. For 4-bit numbers, the value -8 is representable in the 2's-complement system but not in the others.

Among the three systems, the 2's-complement system yields the most efficient logic circuit implementation for addition and subtraction operations. It is the one most often used in computers.
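The three encodings described above can be compared with a small sketch. The function name `encode` and the system labels are illustrative choices, not part of the slides, and no range checking is done.

```python
def encode(value, n, system):
    """Return the n-bit pattern (MSB first) of value in the given system.

    system is one of "sign-magnitude", "ones", "twos" (labels chosen
    for this sketch). The caller is assumed to pass a value in range.
    """
    mask = (1 << n) - 1
    if value >= 0:
        bits = value                        # positive values are identical
    elif system == "sign-magnitude":
        bits = (1 << (n - 1)) | (-value)    # set the MSB of the magnitude
    elif system == "ones":
        bits = ~(-value) & mask             # complement every bit
    elif system == "twos":
        bits = (1 << n) + value             # 2^n minus the magnitude
    else:
        raise ValueError(system)
    return format(bits & mask, f"0{n}b")


for system in ("sign-magnitude", "ones", "twos"):
    print(system, encode(+3, 4, system), encode(-3, 4, system))
```

The 2's-complement branch is the only one that can encode -8 in 4 bits, which matches the last row of the table.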


ADDITION OF POSITIVE NUMBERS

Addition of 1-bit numbers

      0        1        0        1
    + 0      + 0      + 1      + 1
    ---      ---      ---      ---
      0        1        1      1 0
                               ^
                               carry-out

The sum of 1 and 1 requires the 2-bit vector 1 0 to represent the value 2.

In order to add multiple-bit numbers, bit pairs are added starting from the low-order (right) end of the bit vectors, propagating carries toward the high-order (left) end.

Example:

      X        7        0 1 1 1
    + Y      + 6      + 0 1 1 0
    ---      ---      ---------
      Z       13        1 1 0 1

    Carries (c4 ... c0):  0 1 1 0 0

Legend for stage i: the inputs are $x_i$, $y_i$, and the carry-in $c_i$; the outputs are the sum bit $s_i$ and the carry-out $c_{i+1}$.


The truth table for the sum and carry-out functions for adding two equally weighted bits $x_i$ and $y_i$:

x_i  y_i  c_i    s_i  c_{i+1}
 0    0    0      0     0
 0    0    1      1     0
 0    1    0      1     0
 0    1    1      0     1
 1    0    0      1     0
 1    0    1      0     1
 1    1    0      0     1
 1    1    1      1     1

The logic equation for the sum bit $s_i$ is given by:

$s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$

The logic equation for the carry-out bit $c_{i+1}$ is given by:

$c_{i+1} = y_i c_i + x_i c_i + x_i y_i$
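As a quick check, the two equations above can be evaluated for all eight input combinations and compared with ordinary integer addition. This is a minimal sketch; the function name `full_adder` is an illustrative choice.

```python
def full_adder(x, y, c):
    """Evaluate the sum and carry-out equations for one bit position."""
    nx, ny, nc = 1 - x, 1 - y, 1 - c            # complemented inputs
    # Sum: the four minterms with an odd number of 1s among x, y, c.
    s = (nx & ny & c) | (nx & y & nc) | (x & ny & nc) | (x & y & c)
    # Carry-out: at least two of the three inputs are 1.
    c_out = (y & c) | (x & c) | (x & y)
    return s, c_out


for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            s, c_out = full_adder(x, y, c)
            # The pair (c_out, s) is the 2-bit binary value of x + y + c.
            assert 2 * c_out + s == x + y + c
            print(x, y, c, "->", s, c_out)
```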


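A straightforward, 2-level, combinational logic circuit implementation of the truth table for addition is shown below:

[Figure: two-level AND-OR logic. Four 3-input AND gates on $x_i$, $y_i$, $c_i$ (and their complements) feed an OR gate that produces $s_i$; three 2-input AND gates on the pairs ($x_i$, $y_i$), ($x_i$, $c_i$), ($y_i$, $c_i$) feed an OR gate that produces $c_{i+1}$.]

Simplified block representation of an adder (sometimes called a full adder):

[Figure: block labeled Adder (A) with inputs $x_i$, $y_i$, and $c_i$ and outputs $s_i$ and $c_{i+1}$.]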


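A cascaded connection of n adder blocks can be used to add two n-bit numbers.

Since the carries must propagate, or ripple, through this cascade, the configuration is called an n-bit ripple-carry adder.

[Figure: n-bit ripple-carry adder. Adder (A) blocks are chained from the LSB position (inputs $x_0$, $y_0$, $c_0$; outputs $s_0$, $c_1$) to the MSB position (inputs $x_{n-1}$, $y_{n-1}$, $c_{n-1}$; outputs $s_{n-1}$, $c_n$).]

A cascade of k n-bit adders:

[Figure: k n-bit adder blocks in cascade. The first block takes $x_{n-1} \ldots x_0$, $y_{n-1} \ldots y_0$, and $c_0$ and produces $s_{n-1} \ldots s_0$ and $c_n$; the second takes $x_{2n-1} \ldots x_n$ and $y_{2n-1} \ldots y_n$; the last takes $x_{kn-1} \ldots$ and $y_{kn-1} \ldots$ and produces $s_{kn-1} \ldots s_{(k-1)n}$ and $c_{kn}$.]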
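The ripple-carry cascade can be modeled directly in software: each stage consumes the carry produced by the stage to its right. This is a minimal sketch; the function names and the LSB-first list convention are assumptions of the example, not part of the slides.

```python
def full_adder(x, y, c):
    """One adder stage: sum bit and carry-out for bits x, y and carry-in c."""
    s = x ^ y ^ c
    c_out = (x & y) | (y & c) | (x & c)
    return s, c_out


def ripple_carry_add(x_bits, y_bits, c0=0):
    """Add two equal-length bit lists given LSB first.

    Returns (sum bits LSB first, final carry-out c_n).
    """
    carry = c0
    s_bits = []
    for x, y in zip(x_bits, y_bits):        # stage 0 first, stage n-1 last
        s, carry = full_adder(x, y, carry)  # the carry ripples to the next stage
        s_bits.append(s)
    return s_bits, carry


# 7 + 6 = 13 with 4-bit operands (lists are LSB first).
print(ripple_carry_add([1, 1, 1, 0], [0, 1, 1, 0]))
```

Because each loop iteration depends on the carry from the previous one, the loop mirrors the serial delay that motivates the fast adder designs.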


DESIGN OF FAST ADDERS

The n-bit ripple-carry adder may have too much delay in developing its outputs, $s_0$ through $s_{n-1}$ and $c_n$.

Suppose that the delay from $c_i$ to $c_{i+1}$ of any Adder block is 1 ns (assuming a gate delay of 0.5 ns). An n-bit addition can be performed in the time it takes the carry signal to reach the $c_{n-1}$ position plus the time it takes to develop $s_{n-1}$. Assuming the last delay is 1.5 ns, a 32-bit addition takes (31 x 1 ns) + 1.5 ns = 32.5 ns.

Two approaches can be taken to reduce this delay to the desired 10-ns range. The first approach is to use faster electronic circuit technology in implementing the ripple-carry logic design. The second approach is to use a different logic network structure.

Logic structures for fast adder design must speed up the generation of the carry signals. The logic expressions for $s_i$ (sum) and $c_{i+1}$ (carry-out) of stage i are:

$s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$

$c_{i+1} = x_i y_i + y_i c_i + x_i c_i$


Factoring the carry-out expression gives:

$c_{i+1} = x_i y_i + (y_i + x_i) c_i$

Let $G_i = x_i y_i$ and $P_i = x_i + y_i$. Then:

$c_{i+1} = G_i + P_i c_i$

The expressions $G_i$ and $P_i$ are called the generate and propagate functions for stage i. Take note that both functions depend only on the X and Y inputs and not on any carry input.

Expanding $c_i$ in terms of variables with subscript i - 1:

$c_i = x_{i-1} y_{i-1} + (y_{i-1} + x_{i-1}) c_{i-1}$

$c_i = G_{i-1} + P_{i-1} c_{i-1}$

Substituting into the $c_{i+1}$ equation:

$c_{i+1} = G_i + P_i c_i$

$c_{i+1} = G_i + P_i [G_{i-1} + P_{i-1} c_{i-1}]$

$c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} c_{i-1}$


Continuing this type of expansion, the final expression for any carry variable is:

$c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 c_0$

Thus all carries can be obtained three gate delays after the input operands X, Y, and $c_0$ are available, because only one gate delay is needed to develop all $P_i$ and $G_i$ signals, followed by two gate delays in the AND-OR circuit for $c_{i+1}$.

After another three gate delays (one delay to invert each carry and two further delays to form each sum bit as shown earlier), all sum bits are available.

Therefore, independent of n, the n-bit addition process requires only six levels of logic.


Example:

Assume a 4-bit adder:

$c_1 = G_0 + P_0 c_0$

$c_2 = G_1 + P_1 G_0 + P_1 P_0 c_0$

$c_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$

$c_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$

It takes one gate delay (0.5 ns) to produce all the generate and propagate functions after the X and Y inputs become available.

After the generate and propagate functions become available, it takes two gate delays (1 ns) to produce all the carry signals ($c_1$, $c_2$, $c_3$, $c_4$).

After all the carry signals become available, it takes three gate delays (1.5 ns) to produce all the sum bits.

Therefore the total time for a 4-bit addition is only 3 ns. In principle, any n-bit addition takes the same time.
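The expanded carry expressions can be evaluated directly from the generate and propagate signals, without waiting for a lower-order carry. The sketch below does this for any width and checks the 4-bit case exhaustively; the function names and the LSB-first bit lists are assumptions of the example.

```python
def lookahead_carries(x_bits, y_bits, c0=0):
    """Return [c0, c1, ..., cn], each carry computed from G, P and c0 only."""
    g = [x & y for x, y in zip(x_bits, y_bits)]   # G_i = x_i y_i
    p = [x | y for x, y in zip(x_bits, y_bits)]   # P_i = x_i + y_i
    carries = [c0]
    for i in range(len(g)):
        c_next = 0
        # Terms G_j P_{j+1} ... P_i for j = i down to 0.
        for j in range(i, -1, -1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c_next |= term
        # Final term P_i P_{i-1} ... P_0 c_0.
        term = c0
        for k in range(i + 1):
            term &= p[k]
        c_next |= term
        carries.append(c_next)
    return carries


def cla_add(x_bits, y_bits, c0=0):
    """Add LSB-first bit lists using lookahead carries."""
    carries = lookahead_carries(x_bits, y_bits, c0)
    s_bits = [x ^ y ^ c for x, y, c in zip(x_bits, y_bits, carries)]
    return s_bits, carries[-1]


def to_bits(value, n):
    """LSB-first list of the n low-order bits of value."""
    return [(value >> i) & 1 for i in range(n)]


# Exhaustive check of the 4-bit case against integer addition.
for a in range(16):
    for b in range(16):
        s_bits, c4 = cla_add(to_bits(a, 4), to_bits(b, 4))
        total = sum(bit << i for i, bit in enumerate(s_bits)) + (c4 << 4)
        assert total == a + b
print("4-bit carry-lookahead sums verified")
```

In hardware every carry term is evaluated in parallel; the nested loops here only enumerate the same product terms one at a time.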


Fast adders that form carry functions in this way are called carry-lookahead adders.

A practical problem with the carry-lookahead approach is the gate fan-in constraint. The expression for $c_{i+1}$ requires i + 2 inputs to the largest AND term and i + 2 inputs to the OR term.

Consider the addition of two 8-bit numbers. For the ripple-carry adder, the total time for an 8-bit addition is (7 x 1 ns) + 1.5 ns = 8.5 ns. For the carry-lookahead adder, the total time is 3 ns. However, the carry-lookahead adder requires a fan-in of nine for the basic gates.

A solution to this problem is to implement the carry-lookahead technique on a per-block basis.


For the 8-bit addition:

[Figure: two 4-bit adder blocks in cascade. The right block takes $x_3 \ldots x_0$, $y_3 \ldots y_0$, and $c_0$ and produces $s_3 \ldots s_0$ and $c_4$; the left block takes $x_7 \ldots x_4$, $y_7 \ldots y_4$, and $c_4$ and produces $s_7 \ldots s_4$ and $c_8$.]

The carries inside each 4-bit adder block are formed by lookahead circuits. However, they still ripple between blocks.

Because of the lookahead circuits in the rightmost adder, $c_4$ will be formed after 3 gate delays (1.5 ns).

After $c_4$ becomes available, the leftmost block will produce the remaining part of the sum in 2.5 ns (two gate delays for its carries and three for its sum bits, because of the lookahead circuits). Therefore, the total time for an 8-bit addition is 4.0 ns.


SIGNED ADDITION AND SUBTRACTION

Of the three methods of representing signed numbers, the 2's-complement representation is the best in terms of efficiently implementing addition and subtraction.

The rules governing the addition and subtraction of n-bit signed numbers using the 2's-complement representation system are:

1. To add two numbers, add their representations in an n-bit adder, ignoring the carry-out signal from the MSB position. The sum will be the algebraically correct value in 2's-complement representation as long as the answer is in the range $-2^{n-1}$ through $+2^{n-1} - 1$.

2. To subtract two numbers X and Y, that is, to perform X - Y, form the 2's complement of Y and then add it to X, as in rule 1. Again, the result will be the algebraically correct value in 2's-complement representation as long as the answer is in the range $-2^{n-1}$ through $+2^{n-1} - 1$.


Examples:

Addition (the carry-out from the MSB position is ignored):

    (+2) + (+3):    0010 + 0011 = 0101    (+5)
    (+4) + (-6):    0100 + 1010 = 1110    (-2)
    (-5) + (-2):    1011 + 1110 = 1001    (-7)
    (+7) + (-3):    0111 + 1101 = 0100    (+4)

Subtraction (the subtrahend is replaced by its 2's complement and added):

    (+2) - (+4):    0010 - 0100  ->  0010 + 1100 = 1110    (-2)
    (+6) - (+3):    0110 - 0011  ->  0110 + 1101 = 0011    (+3)
    (-7) - (-5):    1001 - 1011  ->  1001 + 0101 = 1110    (-2)
    (-7) - (+1):    1001 - 0001  ->  1001 + 1111 = 1000    (-8)
    (+2) - (-3):    0010 - 1101  ->  0010 + 0011 = 0101    (+5)
    (-3) - (-7):    1101 - 1001  ->  1101 + 0111 = 0100    (+4)
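The two rules can be reproduced on 4-bit patterns with a few lines of code. This is a sketch under the assumption that operands are passed as unsigned n-bit patterns; the helper names are illustrative.

```python
def to_signed(pattern, n=4):
    """Interpret an n-bit pattern as a 2's-complement value."""
    return pattern - (1 << n) if pattern & (1 << (n - 1)) else pattern


def add(x, y, n=4):
    """Rule 1: add the patterns and ignore the carry-out from the MSB."""
    return (x + y) & ((1 << n) - 1)


def sub(x, y, n=4):
    """Rule 2: form the 2's complement of y, then add as in rule 1."""
    mask = (1 << n) - 1
    y_neg = (~y + 1) & mask     # complement every bit, then add 1
    return add(x, y_neg, n)


# (+4) + (-6) and (-7) - (+1) from the examples above.
r1 = add(0b0100, 0b1010)
r2 = sub(0b1001, 0b0001)
print(format(r1, "04b"), to_signed(r1))
print(format(r2, "04b"), to_signed(r2))
```

No overflow detection is attempted here; results outside the range -8 through +7 would simply wrap around.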


Binary addition-subtraction logic network:

[Figure: an n-bit adder with inputs $x_{n-1} \ldots x_1\ x_0$ applied directly and inputs $y_{n-1} \ldots y_1\ y_0$ each passed through an EX-OR gate whose other input is the Add/Sub control line. The same control line drives $c_0$. The outputs are $s_{n-1} \ldots s_1\ s_0$ and $c_n$.]

The Add/Sub control wire is set to 0 for addition. This allows the Y vector to be applied unchanged to one of the adder inputs, along with a carry-in signal, $c_0$, of 0.

When the Add/Sub control wire is set to 1 for subtraction, the Y vector is 1's complemented (that is, each bit is complemented) by the EX-OR gates, and $c_0$ is set to 1 to complete the 2's complementation of Y.


When the result does not fall within the representable range, an arithmetic overflow has occurred.

Take note that overflow can occur only when adding two numbers that have the same sign. When both operands X and Y have the same sign, an overflow occurs when the sign of S does not agree with the signs of X and Y. In an n-bit adder, the overflow signal is defined by the logical expression:

$Overflow = x_{n-1} y_{n-1} \bar{s}_{n-1} + \bar{x}_{n-1} \bar{y}_{n-1} s_{n-1}$

A computer must be able to detect an overflow. It is customary to dedicate a condition code flag as the indicator, and it is possible to have this flag cause an interrupt when an Add or Subtract instruction results in an overflow. Then the programmer must decide how to correct the problem.

Another overflow indicator:

There is an overflow when there is a carry into the MSB and no carry out of the MSB, or vice versa. For subtraction, the indicator is set to 1 when the MSB needs a borrow and there is no borrow from the MSB, or vice versa.
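Both overflow indicators can be computed side by side and compared against a plain range check. The following is a sketch for addition of n-bit patterns; the function name and return layout are assumptions of the example.

```python
def add_with_overflow(x, y, n=4):
    """Add two n-bit patterns; return (sum, sign-rule flag, carry-rule flag)."""
    mask = (1 << n) - 1
    s = (x + y) & mask
    xs, ys, ss = (x >> (n - 1)) & 1, (y >> (n - 1)) & 1, (s >> (n - 1)) & 1
    # Sign rule: operands agree in sign but the sum does not.
    by_sign = (xs & ys & (1 - ss)) | ((1 - xs) & (1 - ys) & ss)
    # Carry rule: carry into the MSB differs from carry out of the MSB.
    low = mask >> 1
    carry_in = ((x & low) + (y & low)) >> (n - 1)
    carry_out = (x + y) >> n
    by_carry = carry_in ^ carry_out
    return s, by_sign, by_carry


def to_signed(pattern, n=4):
    """Interpret an n-bit pattern as a 2's-complement value."""
    return pattern - (1 << n) if pattern & (1 << (n - 1)) else pattern


# Exhaustive 4-bit check: both flags agree with a direct range test.
for x in range(16):
    for y in range(16):
        _, by_sign, by_carry = add_with_overflow(x, y)
        true_sum = to_signed(x) + to_signed(y)
        expected = int(not -8 <= true_sum <= 7)
        assert by_sign == by_carry == expected
print("both overflow indicators agree for all 4-bit operand pairs")
```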


MULTIPLICATION OF POSITIVE NUMBERS

The usual algorithm for multiplying integers by hand, applied to the binary system:

            1 1 0 1    (13)   Multiplicand M
          x 1 0 1 1    (11)   Multiplier Q
            1 1 0 1
          1 1 0 1
        0 0 0 0
      1 1 0 1
    1 0 0 0 1 1 1 1    (143)  Product P

This algorithm applies to unsigned numbers and to positive numbers.

The product of two n-digit numbers can be accommodated in 2n digits, so the product of two 4-bit numbers fits into 8 bits.

In the binary system, multiplication of the multiplicand by one bit of the multiplier is easy.

If the multiplier bit is 1, the multiplicand is entered in the appropriate position to be added to the partial product. If the multiplier bit is 0, then 0s are entered.

In early computers, because of logic costs, the adder circuitry in the ALU was used to perform multiplication sequentially.


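Block diagram for the sequential circuit binary multiplier:

[Figure: register A ($a_{n-1} \ldots a_0$, initially 0) and the multiplier register Q ($q_{n-1} \ldots q_0$) form a shift-right register pair, with the carry flip-flop C at the left end. An n-bit adder adds A to the output of a multiplexer (MPX) that selects either the multiplicand M ($m_{n-1} \ldots m_0$) or 0 under the Add/Noadd control derived from $q_0$. A control sequencer drives the add and shift steps.]

Example (M = 1 1 0 1, Q = 1 0 1 1):

                 C   A         Q
    Initial      0   0 0 0 0   1 0 1 1
    1st cycle
      Add        0   1 1 0 1   1 0 1 1
      Shift      0   0 1 1 0   1 1 0 1
    2nd cycle
      Add        1   0 0 1 1   1 1 0 1
      Shift      0   1 0 0 1   1 1 1 0
    3rd cycle
      No add     0   1 0 0 1   1 1 1 0
      Shift      0   0 1 0 0   1 1 1 1
    4th cycle
      Add        1   0 0 0 1   1 1 1 1
      Shift      0   1 0 0 0   1 1 1 1

    Product = 1 0 0 0 1 1 1 1 (A and Q combined)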


The sequential multiplier circuit performs multiplication by using a single adder n times.

Registers A and Q combined hold partial product PPi while multiplier bit $q_i$ generates the signal Add/Noadd. This signal controls the addition of the multiplicand, M, to PPi to generate PP(i + 1).

The product is computed in n cycles.

The partial product grows in length by 1 bit per cycle from the initial vector PP0, of n 0s in register A.

At the start, the multiplier is loaded into register Q, the multiplicand into register M, and C and A are cleared to 0. At the end of each cycle, C, A, and Q are shifted right one bit position to allow for growth of the partial product as the multiplier is shifted out of register Q.

Because of this shifting, multiplier bit $q_i$ appears at the LSB position of Q to generate the Add/Noadd signal at the correct time. After they are used, the multiplier bits are discarded by the right-shift operation.

After n cycles, the high-order half of the product is held in register A and the low-order half is in register Q.
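The add-and-shift cycle described above can be traced with a short register-level model. This is a minimal sketch for unsigned operands; the function name and the use of Python integers as registers are assumptions of the example.

```python
def sequential_multiply(m, q, n=4):
    """Multiply two unsigned n-bit values with n add/shift cycles.

    c, a, q model the C flip-flop and the A and Q registers.
    """
    mask = (1 << n) - 1
    c, a = 0, 0
    for _ in range(n):
        if q & 1:                       # Add/Noadd is driven by the LSB of Q
            total = a + m
            c, a = total >> n, total & mask
        # Shift C, A, Q right one position as a single long register.
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
    return (a << n) | q                 # high half in A, low half in Q


print(sequential_multiply(0b1101, 0b1011))   # 13 x 11
```

The intermediate values of `a` and `q` for 13 x 11 follow the same sequence as the register trace given for the block diagram.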


SIGNED-OPERAND MULTIPLICATION

Consider first the case of a positive multiplier and a negative multiplicand. In adding a negative multiplicand to a partial product, it is necessary to extend the sign-bit value of the multiplicand to the left as far as the product will extend.

Example:

              1 0 0 1 1    (-13)
            x 0 1 0 1 1    (+11)
    1 1 1 1 1 1 0 0 1 1
      1 1 1 1 1 0 0 1 1
        0 0 0 0 0 0 0 0
          1 1 1 0 0 1 1
            0 0 0 0 0 0
    1 1 0 1 1 1 0 0 0 1    (-143)

For a negative multiplier, a straightforward solution is to form the 2's complement of both the multiplier and the multiplicand and proceed as in the case of a positive multiplier.

This is possible because complementation of both operands does not change the value or the sign of the product.


Booth Algorithm

A powerful algorithm for signed-number multiplication, the Booth algorithm generates a 2n-bit product and treats both positive and negative numbers uniformly.

Consider a multiplication operation in which the multiplier is positive and has a single block of 1s, such as 0011110. To derive the product, one can add four appropriately shifted versions of the multiplicand, as in the standard procedure.

However, the number of required operations can be reduced by regarding the multiplier as the difference between two numbers:

      0100000    (32)
    - 0000010     (2)
      0011110    (30)

This suggests that the product can be generated by adding $2^5$ times the multiplicand to the 2's complement of $2^1$ times the multiplicand.

For convenience, the sequence of required operations can be described by recoding the preceding multiplier as 0 +1 0 0 0 -1 0.


Normal multiplication algorithm:

                  0 1 0 1 1 0 1
                x 0 0 1 1 1 1 0
                  0 0 0 0 0 0 0
                0 1 0 1 1 0 1
              0 1 0 1 1 0 1
            0 1 0 1 1 0 1
          0 1 0 1 1 0 1
        0 0 0 0 0 0 0
      0 0 0 0 0 0 0
    0 0 0 1 0 1 0 1 0 0 0 1 1 0

Booth multiplication algorithm:

    Multiplicand:        0 1 0 1 1 0 1
    Recoded multiplier:  0 +1 0 0 0 -1 0

    0 0 0 0 0 0 0 0 0 0 0 0 0 0
      1 1 1 1 1 1 1 0 1 0 0 1 1      (2's complement of the multiplicand)
        0 0 0 0 0 0 0 0 0 0 0 0
          0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 0 0 0 0 0 0
              0 0 0 1 0 1 1 0 1
                0 0 0 0 0 0 0 0
    0 0 0 1 0 1 0 1 0 0 0 1 1 0


In general, in the Booth scheme, -1 times the shifted multiplicand is selected when moving from 0 to 1, and +1 times the shifted multiplicand is selected when moving from 1 to 0, as the multiplier is scanned from right to left.

    Multiplier            Version of multiplicand
    Bit i    Bit i-1      selected by bit i
      0        0             0 x M
      0        1            +1 x M
      1        0            -1 x M
      1        1             0 x M

The Booth algorithm clearly extends to any number of blocks of 1s in a multiplier, including the situation in which a single 1 is considered a block.

Example:

    Multiplier:  0  0  1  0  1  1  0  0  1  1  1  0  1  0  1  1  0  0
    Recoded:     0 +1 -1 +1  0 -1  0 +1  0  0 -1 +1 -1 +1  0 -1  0  0
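The recoding table reduces to a one-line rule: the digit selected by bit i is (bit i-1) minus (bit i), with an implied 0 to the right of the LSB. The sketch below applies that rule; the function names and the string input format are assumptions of the example.

```python
def booth_recode(bits):
    """Booth-recode a multiplier given as a bit string, MSB first.

    Returns the recoded digits (each -1, 0 or +1), MSB first.
    """
    b = [int(ch) for ch in bits] + [0]      # implied 0 to the right of the LSB
    # Digit for each position: right-hand neighbor minus the bit itself.
    return [b[i + 1] - b[i] for i in range(len(bits))]


def booth_multiply(multiplicand, bits):
    """Sum the shifted summands selected by the recoded multiplier."""
    digits = booth_recode(bits)
    n = len(digits)
    return sum(d * (multiplicand << (n - 1 - i)) for i, d in enumerate(digits))


print(booth_recode("0011110"))
print(booth_recode("001011001110101100"))
print(booth_multiply(13, "11010"))          # 11010 is -6 in 2's complement
```

Because the recoded digits carry the sign information, `booth_multiply` treats the bit string as a 2's-complement multiplier without any special case for negative values.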


The Booth algorithm can also be used for negative multipliers.

              0 1 1 0 1    (+13)
            x 1 1 0 1 0    (-6)

    Multiplicand:        0 1 1 0 1
    Recoded multiplier:  0 -1 +1 -1 0

    0 0 0 0 0 0 0 0 0 0
      1 1 1 1 1 0 0 1 1
        0 0 0 0 1 1 0 1
          1 1 1 0 0 1 1
            0 0 0 0 0 0
    1 1 1 0 1 1 0 0 1 0    (-78)

The transformation 011...110 -> +100...0-10 is called skipping over 1s. This term is derived from the case in which the multiplier has its 1s grouped into a few contiguous blocks; only a few versions of the multiplicand, that is, the summands, must be added to generate the product, thus speeding up the multiplication operation.


However, in the worst case, that of alternating 1s and 0s in the multiplier, each bit of the multiplier selects a summand. In fact, this results in more summands than if the Booth algorithm were not used.

Examples:

Worst-case Multiplier:

     0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
    +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1

Ordinary Multiplier:

     1  1  0  0  0  1  0  1  1  0  1  1  1  1  0  0
     0 -1  0  0 +1 -1 +1  0 -1 +1  0  0  0 -1  0  0

Good Multiplier:

     0  0  0  0  1  1  1  1  1  0  0  0  0  1  1  1
     0  0  0 +1  0  0  0  0 -1  0  0  0 +1  0  0 -1


FAST MULTIPLICATION

A multiplication speedup technique will now be discussed that guarantees at most n/2 summands and that uniformly handles the signed-operand case. This is twice as fast as the worst-case Booth algorithm situation.

Recall that in the Booth technique multiplier bit $q_i$ selects a summand as a function of the bit $q_{i-1}$ on its right. The summand selected is shifted i binary positions to the left of the LSB position of the product before the summand is added. Call this summand position i (SPi).

The basic idea of the speedup technique is to use the bits i + 1 and i to select a summand, as a function of bit i - 1, to be added at SPi. The n/2 summands are thus selected by bit pairs ($x_1$, $x_0$), ($x_3$, $x_2$), ($x_5$, $x_4$), and so on. The technique is called the bit-pair recoding method.


Example:

Consider the multiplier 11010. Its Booth representation is given as 0 -1 +1 -1 0. By grouping the Booth-recoded selectors, one can obtain a single, appropriately shifted summand for each pair.

The rightmost Booth pair, (-1, 0), is equivalent to -2 x (multiplicand M) at SP0. The next pair, (-1, +1), is equivalent to -1 x M at SP2; finally, the leftmost pair of zeros is equivalent to 0 x M at SP4.

Restating these selections in terms of the original multiplier bits, the case of (1,0) with 0 on the right selects -2 x M; the case of (1,0) with 1 on the right selects -1 x M; and the case of (1,1) with 1 on the right selects 0 x M.

    Multiplier with sign extension and implied 0:   1  1  1  0  1  0 | 0
                                                    ^                  ^
                                       sign extension      implied 0 to the right of LSB

    Booth recoding:                                 0  0 -1 +1 -1  0

    Bit-pair recoding:                                 0    -1    -2


The complete set of eight cases is shown below:

    Multiplier Bit Pair    Multiplier Bit on the Right    Multiplicand Selected
      i + 1      i                   i - 1                      at SPi
        0        0                     0                        0 x M
        0        0                     1                       +1 x M
        0        1                     0                       +1 x M
        0        1                     1                       +2 x M
        1        0                     0                       -2 x M
        1        0                     1                       -1 x M
        1        1                     0                       -1 x M
        1        1                     1                        0 x M

Examples (each 5-bit multiplier is sign-extended by one bit before pairing):

    0 1 1 0 1    is recoded as    +1  -1  +1

    1 0 1 1 1    is recoded as    -1  +2  -1


Multiplication Examples:

              0 1 1 0 1    (+13)
            x 1 1 0 1 0    (-6)

    Multiplicand:        0 1 1 0 1
    Recoded multiplier:  0  -1  -2

    1 1 1 1 1 0 0 1 1 0
        1 1 1 1 0 0 1 1
            0 0 0 0 0 0
    1 1 1 0 1 1 0 0 1 0    (-78)

              0 1 0 1 0    (+10)
            x 1 0 1 1 1    (-9)

    Multiplicand:        0 1 0 1 0
    Recoded multiplier:  -1  +2  -1

    1 1 1 1 1 1 0 1 1 0
        0 0 0 1 0 1 0 0
            1 1 0 1 1 0
    1 1 1 0 1 0 0 1 1 0    (-90)
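The eight-case table can be used as a lookup to recode a multiplier two bits at a time. The sketch below sign-extends odd-length multipliers by one bit, as in the examples above; the table constant, function names, and string input format are assumptions of the example.

```python
# (bit i+1, bit i, bit i-1) -> multiple of M selected at SP i
BIT_PAIR_TABLE = {
    (0, 0, 0): 0, (0, 0, 1): +1, (0, 1, 0): +1, (0, 1, 1): +2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def bit_pair_recode(bits):
    """Recode a 2's-complement multiplier (bit string, MSB first)."""
    b = [int(ch) for ch in bits]
    if len(b) % 2:
        b = [b[0]] + b          # sign-extend to an even number of bits
    b = b + [0]                 # implied 0 to the right of the LSB
    # Each pair (b[i], b[i+1]) is examined with its right-hand neighbor b[i+2].
    return [BIT_PAIR_TABLE[(b[i], b[i + 1], b[i + 2])]
            for i in range(0, len(b) - 1, 2)]


def bit_pair_multiply(multiplicand, bits):
    """Add the selected summands; digit k from the right sits at SP 2k."""
    digits = bit_pair_recode(bits)
    return sum(d * multiplicand * 4 ** k
               for k, d in enumerate(reversed(digits)))


print(bit_pair_recode("11010"), bit_pair_multiply(13, "11010"))
print(bit_pair_recode("10111"), bit_pair_multiply(10, "10111"))
```

Each recoded digit covers two multiplier bits, so an n-bit multiplier yields about n/2 summands regardless of its bit pattern.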


INTEGER DIVISION

Longhand division examples:

    Decimal:                  Binary:

           21                            1 0 1 0 1
    13 ) 274                  1 1 0 1 ) 1 0 0 0 1 0 0 1 0
         26                               1 1 0 1
          14                              1 0 0 0 0
          13                                1 1 0 1
           1                                  1 1 1 0
                                              1 1 0 1
                                                    1

A circuit that implements division by this longhand method operates as follows:

It positions the divisor appropriately with respect to the dividend and performs a subtraction. If the remainder is zero or positive, a quotient bit of 1 is determined, the remainder is extended by another bit of the dividend, the divisor is repositioned, and another subtraction is performed.

On the other hand, if the remainder is negative, a quotient bit of 0 is determined, the dividend is restored by adding back the divisor, and the divisor is repositioned for another subtraction.


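The logic circuit arrangement that implements this restoring-division technique is shown below:

[Figure: register A ($a_n\ a_{n-1} \ldots a_0$) and register Q (dividend, $q_{n-1} \ldots q_0$) form a shift-left register pair. An (n+1)-bit adder combines A with the divisor M (a leading 0 followed by $m_{n-1} \ldots m_0$) under Add/Subtract control. A control sequencer drives the Shift Left, Add/Subtract, and Quotient Setting ($q_0$) signals.]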


An n-bit divisor is loaded into register M and an n-bit positive dividend is loaded into register Q at the start of the operation. Register A is set to 0.

After the division is complete, the n-bit quotient is in register Q and the remainder is in register A. The required subtractions are facilitated by using 2's-complement arithmetic. The extra bit position at the left end of both A and M accommodates the sign bit during subtractions.

The algorithm that follows performs the division:

Do the following n times:

Shift A and Q left one binary position.

Subtract M from A, and place the answer back in A.

If the sign of A is 1, set $q_0$ to 0 and add M back to A (that is, restore A); otherwise, set $q_0$ to 1.
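The three steps of the loop map directly onto code. This is a minimal sketch in which register A is modeled as a signed Python integer rather than an (n+1)-bit 2's-complement register, so "the sign of A is 1" becomes `a < 0`; the function name is illustrative.

```python
def restoring_divide(dividend, divisor, n=4):
    """Restoring division of an n-bit positive dividend by a positive divisor.

    Returns (quotient, remainder) as left in registers Q and A.
    """
    mask = (1 << n) - 1
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # Shift A and Q left one binary position (MSB of Q moves into A).
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Subtract M from A.
        a -= m
        if a < 0:
            a += m          # restore A; q0 stays 0
        else:
            q |= 1          # set q0 to 1
    return q, a


print(restoring_divide(0b1000, 0b0011))     # 8 divided by 3
```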


A restoring-division example (dividend 1 0 0 0, divisor M = 0 0 0 1 1, so -M = 1 1 1 0 1):

                     A            Q
    Initially        0 0 0 0 0    1 0 0 0

    1st cycle
      Shift          0 0 0 0 1    0 0 0 _
      Subtract       1 1 1 1 0
      Set q0                      0 0 0 0
      Restore        0 0 0 0 1    0 0 0 0

    2nd cycle
      Shift          0 0 0 1 0    0 0 0 _
      Subtract       1 1 1 1 1
      Set q0                      0 0 0 0
      Restore        0 0 0 1 0    0 0 0 0

    3rd cycle
      Shift          0 0 1 0 0    0 0 0 _
      Subtract       0 0 0 0 1
      Set q0         0 0 0 0 1    0 0 0 1

    4th cycle
      Shift          0 0 0 1 0    0 0 1 _
      Subtract       1 1 1 1 1
      Set q0                      0 0 1 0
      Restore        0 0 0 1 0    0 0 1 0

    Remainder = 0 0 0 1 0 (in A), Quotient = 0 0 1 0 (in Q)


This algorithm can be improved by avoiding the need for restoring A after an unsuccessful subtraction (a subtraction is said to be unsuccessful if the result is negative).

Consider the sequence of operations that takes place after the subtraction operation in the preceding algorithm. If A is positive, we shift left and subtract M; that is, we perform 2A - M. If A is negative, we restore it by performing A + M, and then we shift it left and subtract M. This is equivalent to performing 2A + M. The $q_0$ bit is appropriately set to 0 or 1 after the correct operation has been performed. All of this can be summarized in the following nonrestoring-division algorithm:

Step 1: Do the following n times:

If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A.

If the sign of A is now 0, set $q_0$ to 1; otherwise, set $q_0$ to 0.

Step 2: If the sign of A is 1, add M to A.
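The two steps above can be sketched as follows. As an assumption of the example, register A is modeled as a signed Python integer, so the sign test is `a < 0` and the left shift of a negative A is simply a doubling; the function name is illustrative.

```python
def nonrestoring_divide(dividend, divisor, n=4):
    """Nonrestoring division of an n-bit positive dividend by a positive divisor.

    Returns (quotient, remainder) as left in registers Q and A.
    """
    mask = (1 << n) - 1
    a, q, m = 0, dividend, divisor
    for _ in range(n):                           # Step 1
        was_negative = a < 0                     # sign of A before the shift
        a = 2 * a + ((q >> (n - 1)) & 1)         # shift A and Q left
        q = (q << 1) & mask
        a = a + m if was_negative else a - m     # add or subtract M
        if a >= 0:
            q |= 1                               # set q0 to 1, otherwise leave 0
    if a < 0:                                    # Step 2: final restore
        a += m
    return q, a


print(nonrestoring_divide(0b1000, 0b0011))       # 8 divided by 3
```

Compared with the restoring version, each cycle performs exactly one addition or subtraction; only the single correction in Step 2 may be needed at the end.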


A nonrestoring-division example (dividend 1 0 0 0, divisor M = 0 0 0 1 1, so -M = 1 1 1 0 1):

                     A            Q
    Initially        0 0 0 0 0    1 0 0 0

    1st cycle
      Shift          0 0 0 0 1    0 0 0 _
      Subtract       1 1 1 1 0
      Set q0         1 1 1 1 0    0 0 0 0

    2nd cycle
      Shift          1 1 1 0 0    0 0 0 _
      Add            1 1 1 1 1
      Set q0         1 1 1 1 1    0 0 0 0

    3rd cycle
      Shift          1 1 1 1 0    0 0 0 _
      Add            0 0 0 0 1
      Set q0         0 0 0 0 1    0 0 0 1

    4th cycle
      Shift          0 0 0 1 0    0 0 1 _
      Subtract       1 1 1 1 1
      Set q0         1 1 1 1 1    0 0 1 0

    Restore remainder
      Add            0 0 0 1 0

    Remainder = 0 0 0 1 0 (in A), Quotient = 0 0 1 0 (in Q)


FLOATING-POINT NUMBERS AND OPERATIONS

Until now, the discussions have dealt with signed, fixed-point numbers and have conveniently considered them integers, that is, having an implied binary point at the right end of each number.

It is also possible to assume that the binary point is just to the right of the sign bit, thus representing a fraction.

In the 2's-complement system, the signed value F, represented by the n-bit binary fraction:

$B = b_0 . b_{-1} b_{-2} \cdots b_{-(n-1)}$

is given by:

$F(B) = -b_0 \times 2^0 + b_{-1} \times 2^{-1} + b_{-2} \times 2^{-2} + \cdots + b_{-(n-1)} \times 2^{-(n-1)}$

where the range of F is:

$-1 \le F \le 1 - 2^{-(n-1)}$

Consider the range of values representable in a 32-bit, signed, fixed-point format. Interpreted as integers, the value range is approximately 0 to $\pm 2.15 \times 10^{9}$. If interpreted as fractions, the range is approximately $\pm 4.66 \times 10^{-10}$ to $\pm 1$.


Neither of these ranges is sufficient for scientific calculations, which might involve parameters like Avogadro's number ($6.0247 \times 10^{23}$) or Planck's constant ($6.6254 \times 10^{-27}$).

This means that a computer must be able to represent numbers and operate on them in such a way that the position of the binary point is variable and is automatically adjusted as computation proceeds.

In such a case, the binary point is said to float, and the numbers are called floating-point numbers. This distinguishes them from fixed-point numbers, whose binary point is always in the same position.

In decimal scientific notation, numbers may be written as $6.0247 \times 10^{23}$, $6.6254 \times 10^{-27}$, $-1.0341 \times 10^{2}$, $-7.3000 \times 10^{-14}$, and so on. These numbers are said to be given to five significant digits. The scale factors ($10^{23}$, $10^{-27}$, and so on) indicate the position of the decimal point with respect to the significant digits.

By convention, when the decimal point is placed to the right of the first (nonzero) significant digit, the number is said to be normalized.

A floating-point representation is one in which a number is represented by its sign, a string of significant digits, commonly called the mantissa, and an exponent to an implied base for the scale factor.


The standard for representing floating-point numbers in 32 bits has been developed and specified in detail by the IEEE.

    |  S  |        E'        |              M              |
    1 bit        8 bits                  23 bits             (32 bits in total)

    S:  sign of number (0 signifies +, 1 signifies -)
    E': 8-bit signed exponent in excess-127 representation
    M:  23-bit mantissa fraction

$\text{value represented} = \pm 1.M \times 2^{E' - 127}$

The sign of the number is given in the first bit, followed by a representation for the exponent (to the base 2) of the scale factor. Instead of the signed exponent, E, the value actually stored in the exponent field is an unsigned integer E' = E + 127. This is called the excess-127 format. Thus E' is in the range $0 \le E' \le 255$.


Examples:

    Exponent E' (Binary)    Exponent E' (Decimal)    Actual Exponent E
        00000001                     1                    -126
        00000010                     2                    -125
           ...                      ...                    ...
        01111111                   127                       0
        10000000                   128                      +1
        10000001                   129                      +2
           ...                      ...                    ...
        11111101                   253                    +126
        11111110                   254                    +127

The end values of E', namely, 0 and 255, are used to indicate floating-point values of exact 0 and infinity, respectively. Thus, the range of E' for normal values is 0 < E' < 255. This means that the actual exponent, E, is in the range $-126 \le E \le 127$.

The excess-x representation for exponents enables efficient comparison of the relative sizes of two floating-point numbers.


The last 23 bits represent the mantissa. Since binary normalization is used, the most significant bit of the mantissa is always equal to 1. This bit is not explicitly represented; it is assumed to be to the immediate left of the binary point. The 23 bits stored in the M field represent the fractional part of the mantissa, that is, the bits to the right of the binary point.

Example:

    S    E'          M
    0    00101000    001010 ... 0

$\text{value represented} = 1.001010 \ldots 0 \times 2^{-87}$

The 32-bit standard representation given is called a single-precision representation because it occupies a single 32-bit word. The scale factor of this representation has a range of $2^{-126}$ to $2^{+127}$, which is approximately $10^{\pm 38}$. The 24-bit mantissa provides approximately the same precision as a 7-digit decimal value.
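The field layout and the value formula can be checked against the machine's own single-precision interpretation. This is a minimal sketch that handles only normal values (0 < E' < 255); the function name `decode_single` is an illustrative choice, and `struct` is used only to cross-check the result.

```python
import struct


def decode_single(bits):
    """Decode a 32-bit pattern using value = (+/-) 1.M x 2^(E' - 127)."""
    s = bits >> 31                  # sign bit
    e = (bits >> 23) & 0xFF         # excess-127 exponent field E'
    m = bits & 0x7FFFFF             # 23-bit mantissa fraction M
    if not 0 < e < 255:
        raise ValueError("only normal values are handled in this sketch")
    return (-1) ** s * (1 + m / 2 ** 23) * 2.0 ** (e - 127)


# The example above: S = 0, E' = 00101000, M = 001010...0
bits = (0 << 31) | (0b00101000 << 23) | (0b001010 << 17)
value = decode_single(bits)
reference = struct.unpack(">f", struct.pack(">I", bits))[0]
print(value, value == reference)
```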


To provide more precision and range for floating-point numbers, the IEEE standard also specifies a double-precision format for floating-point number representation:

    |  S  |        E'        |              M              |
    1 bit       11 bits                  52 bits             (64 bits in total)

    S:  sign of number (0 signifies +, 1 signifies -)
    E': 11-bit signed exponent in excess-1023 representation
    M:  52-bit mantissa fraction

The 11-bit excess-1023 exponent E' of the double-precision format has the range 0 < E' < 2047 for normal values. Thus the actual exponent E is in the range $-1022 \le E \le 1023$, providing scale factors that range from $2^{-1022}$ to $2^{1023}$ (approximately $10^{\pm 308}$).

The 53-bit mantissa provides a precision equivalent to about 16 decimal digits.


If a number is not normalized, it can always be put in normalized form by shifting the fraction and adjusting the exponent.

Example:

Unnormalized Value

    S    E' (excess-127 exponent)    M
    0    10001000                    0010110 ...

$\text{value represented} = +0.0010110 \ldots \times 2^{9}$

(There is no implicit 1 to the left of the binary point.)

Normalized Value

    S    E'          M
    0    10000101    0110 ...

$\text{value represented} = +1.0110 \ldots \times 2^{6}$


Arithmetic Operations on Floating-Point Numbers

Example of a floating-point addition:

$2.9400 \times 10^{2} + 4.3100 \times 10^{4}$

Since the exponents of the two numbers differ, the mantissas must be shifted with respect to each other before they are added or subtracted. Normally, the number with the smaller exponent is adjusted so that it will have an exponent equal to that of the other number.

The result, of course, will have an exponent equal to the higher exponent.

      0.0294 x 10^4
    + 4.3100 x 10^4
    ---------------
      4.3394 x 10^4


Example: 6-bit to 3-bit truncation

All fractions in the range $0.b_{-1}b_{-2}b_{-3}000$ to $0.b_{-1}b_{-2}b_{-3}111$ are truncated to $0.b_{-1}b_{-2}b_{-3}$.

2. Von Neumann Rounding. If the bits to be removed are all 0s, they are simply dropped, with no changes in the retained bits. However, if any of the bits to be removed are 1, the least significant bit of the retained bits is set to 1.

Example: 6-bit to 3-bit truncation

All 6-bit fractions with $b_{-4}b_{-5}b_{-6}$ not equal to 000 are truncated to $0.b_{-1}b_{-2}1$.

3. Rounding. A 1 is added to the LSB position of the bits retained if there is a 1 in the MSB position of the bits to be removed.

Example:

$0.b_{-1}b_{-2}b_{-3}1$ is rounded to $0.b_{-1}b_{-2}b_{-3} + 0.001$

$0.b_{-1}b_{-2}b_{-3}0$ is rounded to $0.b_{-1}b_{-2}b_{-3}$
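The three behaviors in the examples above can be compared on 6-bit fractions held as integers (the fraction 0.101100 is stored as 0b101100). This is a sketch; the function names, including `drop_bits` for the plain truncation of the first example, are labels chosen here and are not taken from the slides.

```python
def drop_bits(f, drop=3):
    """Plain truncation: discard the low-order bits."""
    return f >> drop


def von_neumann(f, drop=3):
    """Set the LSB of the kept bits to 1 if any removed bit is 1."""
    kept = f >> drop
    if f & ((1 << drop) - 1):
        kept |= 1
    return kept


def round_nearest(f, drop=3):
    """Add 1 at the kept LSB if the MSB of the removed bits is 1.

    The result can need one extra bit (0.111100 rounds up to 1.000).
    """
    return (f >> drop) + ((f >> (drop - 1)) & 1)


for f in (0b101000, 0b101001, 0b101100, 0b101111):
    print(format(f, "06b"),
          format(drop_bits(f), "03b"),
          format(von_neumann(f), "03b"),
          format(round_nearest(f), "03b"))
```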


Implementation of Floating-Point Operations

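[Figure: floating-point addition-subtraction unit for the 32-bit operands A: SA, E'A, MA and B: SB, E'B, MB. An 8-bit subtractor takes E'A and E'B and produces the sign of the difference and n = |E'A - E'B|. The sign controls the SWAP network, which sends the M of the number with the smaller E' to the SHIFTER (n bits to the right) and the M of the number with the larger E' directly to the mantissa adder/subtractor. The combinational CONTROL network takes SA, SB, and the Add/Subtract command and produces the Add/Subtract control for the mantissa adder/subtractor and the Sign of the result. A two-way multiplexer (MPX) selects E'A or E'B as the tentative exponent E'. The magnitude M from the mantissa adder/subtractor goes to a leading zeros detector, which produces X, and to the normalize and round unit. A second 8-bit subtractor forms E' - X. The 32-bit result is R = A + B, with fields R: SR, E'R, MR.]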


Let the signs, exponents, and mantissas of operands A and B be represented by $S_A$, $E'_A$, $M_A$ and $S_B$, $E'_B$, $M_B$, respectively.

The first step is to compare exponents to determine how far to shift the mantissa of the number with the smaller exponent. This shift-count value n is determined by the 8-bit subtractor circuit in the upper left corner of the figure.

The magnitude of the difference $E'_A - E'_B$, which is n, is sent to the SHIFTER unit. The sign of the difference resulting from the exponent comparison determines which mantissa is to be shifted. Therefore the sign is sent to the SWAP network in the upper right corner of the figure. If the sign is 0, then $E'_A \ge E'_B$, and the mantissa $M_B$ is sent to the SHIFTER since the exponent of B is smaller (or the exponents are equal and no shift is needed). Otherwise ($E'_A < E'_B$), the mantissa $M_A$ is sent to the SHIFTER.

Step 2 is performed by the two-way multiplexer, MPX, in the bottom left corner of the figure. The exponent of the result, E', is tentatively determined as $E'_A$ if $E'_A \ge E'_B$, or $E'_B$ if $E'_A < E'_B$. This is determined by the sign of the difference resulting from the exponent comparison operation in step 1.


Step 3 involves the major component, the mantissa adder-subtractor in the middle of the figure. The CONTROL logic determines whether the mantissas are to be added or subtracted. This is decided by the signs of the operands ($S_A$ and $S_B$) and the operation (add or subtract) that is to be performed on the operands. The CONTROL logic also determines the sign of the result, $S_R$.

Example:

If A is negative ($S_A$ = 1), B is positive ($S_B$ = 0), and the operation is A - B, then the mantissas are added (the CONTROL logic will output an ADD operation to the adder-subtractor, since the magnitudes of -|A| - |B| combine) and the sign of the result is negative ($S_R$ = 1).

On the other hand, if A and B are both positive, and the operation is A - B, then the mantissas are subtracted. The sign of the result, $S_R$, now depends on the mantissa subtraction operation.

Step 4 of the Add/Subtract rule consists of normalizing the result of step 3, mantissa M. The number of leading zeros in M determines the number of bit shifts, X, to be applied to M. The normalized value is truncated to generate the 24-bit mantissa, $M_R$, of the result. The value X is also subtracted from the tentative result exponent E' to generate the true exponent, $E'_R$.


Multiplication and division are somewhat easier than addition and subtraction, in that no alignment of mantissas is needed.

Multiply Rule

1. Add the exponents and subtract 127.

2. Multiply the mantissas and determine the sign of the result.

3. Normalize the resulting value if necessary.

Divide Rule

1. Subtract the exponents and add 127.

2. Divide the mantissas and determine the sign of the result.

3. Normalize the resulting value if necessary.

The addition or subtraction of 127 in the Multiply and Divide rules results from using the excess-127 notation for exponents: each stored exponent already includes one bias of 127, so the sum of two stored exponents carries the bias twice and the difference carries none.
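The Multiply rule can be followed step by step on the single-precision fields. The sketch below assumes normal operands, a result exponent that stays in the normal range, and truncation of the extra product bits; the function names are illustrative, and `struct` is used only to split and reassemble the 32-bit patterns.

```python
import struct


def fields(x):
    """Split a Python float, stored as single precision, into (S, E', M)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def fp_multiply(a, b):
    """Multiply two normal single-precision values using the Multiply rule."""
    sa, ea, ma = fields(a)
    sb, eb, mb = fields(b)
    e = ea + eb - 127                               # rule 1: remove one bias
    prod = ((1 << 23) | ma) * ((1 << 23) | mb)      # rule 2: 24-bit x 24-bit
    s = sa ^ sb                                     # sign of the result
    if prod >> 47:                                  # rule 3: product is in [2, 4)
        prod >>= 1
        e += 1
    m = (prod >> 23) & 0x7FFFFF                     # truncate, drop the hidden 1
    bits = (s << 31) | (e << 23) | m
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(fp_multiply(1.5, 2.5), fp_multiply(-3.0, 0.5), fp_multiply(3.0, 3.0))
```

The Divide rule follows the same pattern with the exponents subtracted, 127 added back, and the mantissas divided instead of multiplied.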
