
7/28/2019 Lect 2b - IEEE Floating Point Adder Arch


1/8/2007 - L25 Floating Point Adder, Copyright 2006 - Joanne DeGroat, ECE, OSU

IEEE Floating Point Adder

Using the IEEE Floating Point Standard for an add/subtract execution unit



Lecture overview

The Interface
Part by part
A floating point adder design




Adder is double precision

Double Precision

Value of bits in word representation is:

If $e = 2047$ and $f \neq 0$, then $v$ is NaN regardless of $s$
If $e = 2047$ and $f = 0$, then $v = (-1)^s \, \infty$
If $0 < e < 2047$, then $v = (-1)^s \, 2^{e-1023} \, (1.f)$ (normalized number)
If $e = 0$ and $f \neq 0$, then $v = (-1)^s \, 2^{-1022} \, (0.f)$ (denormalized numbers allow for graceful underflow)
If $e = 0$ and $f = 0$, then $v = (-1)^s \, 0$ (zero)

Word layout: s (1 bit) | e (11 bits) | f (52 bits)
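The case analysis above maps directly onto a few comparisons of the e and f fields. The following is a minimal sketch, not part of the lecture's design: the entity name FP_Classify and all of its port names are hypothetical, and the field positions assume the usual bit 63 = s, bits 62..52 = e, bits 51..0 = f packing.

```vhdl
-- Hypothetical helper (not part of the lecture's design): classify a
-- 64-bit IEEE 754 double-precision word from its e and f fields.
entity FP_Classify is
  port (X         : in  BIT_VECTOR(63 downto 0);
        Is_NaN    : out BIT;   -- e = 2047, f /= 0
        Is_Inf    : out BIT;   -- e = 2047, f  = 0
        Is_Norm   : out BIT;   -- 0 < e < 2047
        Is_Denorm : out BIT;   -- e = 0, f /= 0
        Is_Zero   : out BIT);  -- e = 0, f  = 0
end FP_Classify;

architecture behavioral of FP_Classify is
  constant EXP_ONES   : BIT_VECTOR(10 downto 0) := (others => '1');
  constant EXP_ZEROS  : BIT_VECTOR(10 downto 0) := (others => '0');
  constant FRAC_ZEROS : BIT_VECTOR(51 downto 0) := (others => '0');
  signal e_all1, e_all0, f_zero : BOOLEAN;
begin
  -- Assumed field layout: s = X(63), e = X(62 downto 52), f = X(51 downto 0)
  e_all1 <= X(62 downto 52) = EXP_ONES;
  e_all0 <= X(62 downto 52) = EXP_ZEROS;
  f_zero <= X(51 downto 0)  = FRAC_ZEROS;

  Is_NaN    <= '1' when e_all1 and not f_zero else '0';
  Is_Inf    <= '1' when e_all1 and f_zero else '0';
  Is_Norm   <= '1' when not e_all1 and not e_all0 else '0';
  Is_Denorm <= '1' when e_all0 and not f_zero else '0';
  Is_Zero   <= '1' when e_all0 and f_zero else '0';
end behavioral;
```

The sign bit is deliberately ignored, since each class holds for either sign.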




Specification of a FPA

Floating Point Add/Subtract Unit

Specification
Inputs in IEEE 754 Double Precision
Must perform both addition and subtraction
Must handle the full floating point standard
    Normalized numbers
    Not a Number values (NaNs)
    +/- Infinity
    Denormalized numbers




Specifications continued

Result will be an IEEE 754 Double Precision representation
Unit will correctly handle the invalid operation of adding +infinity and -infinity, which yields NaN per the standard
Unit latches its inputs into registers from parallel 64-bit data busses.
There is a separate signal line that indicates the operation, add or subtract




Specifications continued

Outputs
    The correctly represented result
    Flags that are output are
        Zero result
        Overflow to infinity from normalized numbers as inputs
        NaN result
        Overshift (result is the larger of the two operands)
        Denormalized result
        Inexact (result was rounded)
        Invalid operation for addition




High level block diagram

Basic architecture interface
    Data: 64 bit A, B, & C Busses
    Control signals: Latch, Add/Sub, Asel, Drive
    Condition Flags Output: 7 Flag signals
    Clocks: Phi1 and Phi2 (a 2 phase clocked architecture)

[Figure: Floating Point Adder Unit block with Abus and Bbus inputs, Cbus and Flags outputs, control inputs Add/Sub, Latch, Asel, Drive, and clocks Phi1, Phi2]




Start the VHDL

The entity interface

[Figure: Floating Point Adder block with inputs A_Main, B_Main, Add_or_sub, Latch, Phi1, Phi2, Drive, Asel and outputs C_Main, A_Out, Flags]

entity Floating_Point_Adder is
  port (A_Main     : in  BIT_VECTOR;
        B_Main     : in  BIT_VECTOR;
        C_Main     : out BIT_VECTOR;
        A_Out      : out BIT_VECTOR;
        Flags      : out BIT_VECTOR;
        Add_or_sub : in  BIT;
        Latch      : in  BIT;
        Drive      : in  BIT;
        Phi1       : in  BIT;
        Phi2       : in  BIT;
        Asel       : in  BIT);
end Floating_Point_Adder;




Basic design

Can be divided into functional sub-blocks
First latch and drive

[Figure: INPUT LATCHES, RESULT LATCHES and OUTPUT DRIVERS blocks, with the A/S control input]




What goes in the other blocks

From adjusting the inputs to prepare to add
To add
To renormalize
To round

[Figure: block diagram with INPUT LATCHES, Input Adjust, Add Mantissas, Normalize Result, Round according to selected scheme, RESULT LATCHES and OUTPUT DRIVERS blocks, with the A/S control input]




VHDL coding for the latches

A first cut
The input latches
Note 2 phase

b1: block ((Phi2 and Latch) = '1')

begin

A_temp




And on the output

Drivers
Note use of guarded blocks

out_latch1 : block ((Drive and Phi2) = '1')

begin

Flags




And what goes in between?

In the final design lots goes in between, but
You first want to make sure that the latches are working properly
So just pass one input to the output and check
And once this works properly you can move on with the design

signout
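The latch check described on this slide could be set up along the following lines. This is only a sketch under stated assumptions: the entity name Latch_Check, the internal signals A_temp and C_temp, the 64-bit widths, and the choice of Phi1 for the result latch are all assumptions. Only the two guard expressions (Phi2 and Latch) and (Drive and Phi2) are taken from the slides.

```vhdl
-- Sketch only: entity name, internal signal names, 64-bit widths and the
-- Phi1 result latch are assumptions, not the lecture's code.
entity Latch_Check is
  port (A_Main : in  BIT_VECTOR(63 downto 0);
        C_Main : out BIT_VECTOR(63 downto 0);
        Latch, Drive, Phi1, Phi2 : in BIT);
end Latch_Check;

architecture first_cut of Latch_Check is
  signal A_temp, C_temp : BIT_VECTOR(63 downto 0);
begin
  -- Input latch: follows A_Main while Phi2 and Latch are both '1',
  -- holds its last value otherwise.
  b1 : block ((Phi2 and Latch) = '1')
  begin
    A_temp <= guarded A_Main;
  end block b1;

  -- Result latch (assumed to be on the other phase); nothing in between yet,
  -- so the input is simply passed through.
  b2 : block (Phi1 = '1')
  begin
    C_temp <= guarded A_temp;
  end block b2;

  -- Output driver: updates C_Main only while Drive and Phi2 are both '1'.
  out_latch1 : block ((Drive and Phi2) = '1')
  begin
    C_Main <= guarded C_temp;
  end block out_latch1;
end first_cut;
```

Because the targets are ordinary (non-bus, non-register) signals, a false guard simply leaves the previous value in place, which gives the latch behavior.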



The first section

Prepare to add
Identify type of inputs and appropriately adjust operands

[Figure: prepare-to-add datapath. Exponent Processing Logic (inputs Aexp, Bexp; outputs Shift Dist and exponent comparison signals) and Mantissa Processing Logic (inputs Amantissa, Bmantissa; outputs mantissa comparison signals) control the 2x2 crossbar elements, 2-1 muxes with hardwired "Zero" and "NaN" inputs, a Right Linear Shifter, and the ADDER, whose carry input is driven from SignA xor SignB. Outputs: Sign Out (63) to output latch and Adder Output (to normalize unit).]




The exponent unit portion

Must get the larger exponent
And the difference between the exponents, which is the shift distance
Also several control signals
    Exponent all 0s and all 1s
    Exponent A>B, A
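The exponent unit's outputs can be sketched behaviorally as below. The entity and port names are hypothetical, and the sketch uses the standard ieee.numeric_bit package for unsigned comparison and subtraction. It also ignores the adjustment needed when one operand is denormalized (exponent field 0 but scale $2^{-1022}$), which a complete design would have to fold into the shift distance.

```vhdl
library ieee;
use ieee.numeric_bit.all;

-- Sketch only: entity and port names are assumptions, not the lecture's code.
entity Exp_Unit_Sketch is
  port (Aexp, Bexp     : in  BIT_VECTOR(10 downto 0);
        Larger_Exp     : out BIT_VECTOR(10 downto 0);
        Shift_Dist     : out BIT_VECTOR(10 downto 0);
        E_gt, E_eq     : out BIT;   -- Aexp > Bexp, Aexp = Bexp
        A_all0, A_all1 : out BIT;
        B_all0, B_all1 : out BIT);
end Exp_Unit_Sketch;

architecture behavioral of Exp_Unit_Sketch is
  constant ZEROS : BIT_VECTOR(10 downto 0) := (others => '0');
  constant ONES  : BIT_VECTOR(10 downto 0) := (others => '1');
  signal a_ge_b  : BOOLEAN;
begin
  a_ge_b <= unsigned(Aexp) >= unsigned(Bexp);

  E_gt <= '1' when unsigned(Aexp) > unsigned(Bexp) else '0';
  E_eq <= '1' when Aexp = Bexp else '0';

  -- Larger exponent and the (non-negative) difference between the two.
  Larger_Exp <= Aexp when a_ge_b else Bexp;
  Shift_Dist <= BIT_VECTOR(unsigned(Aexp) - unsigned(Bexp)) when a_ge_b
           else BIT_VECTOR(unsigned(Bexp) - unsigned(Aexp));

  -- All-0s / all-1s detection for each exponent.
  A_all0 <= '1' when Aexp = ZEROS else '0';
  A_all1 <= '1' when Aexp = ONES  else '0';
  B_all0 <= '1' when Bexp = ZEROS else '0';
  B_all1 <= '1' when Bexp = ONES  else '0';
end behavioral;
```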




Mantissa Processing Logic

Need to examine the two fractional parts and generate several control signals that are required to prepare the operands
Need relational signals M>, M=, M<
The crossbar control is E> + (E= * M>): route the A input to the right side if the exponent of A is the larger OR the exponents are equal and the fractional part of A is larger
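The routing rule stated above (A goes to the right side when its exponent is larger, or when the exponents are equal and its fractional part is larger) can be sketched as a single control signal driving a 2x2 crossbar. The entity name, port names, and the 53-bit operand width (hidden bit plus the 52-bit fraction) are assumptions for illustration.

```vhdl
-- Sketch only: entity, port names and the 53-bit width are assumptions.
entity Xbar_Sketch is
  port (A_man, B_man        : in  BIT_VECTOR(52 downto 0);
        E_gt, E_eq, M_gt    : in  BIT;  -- comparisons of A against B
        Left_man, Right_man : out BIT_VECTOR(52 downto 0));
end Xbar_Sketch;

architecture behavioral of Xbar_Sketch is
  signal swap : BIT;
begin
  -- A goes right when its exponent is larger, or when the exponents are
  -- equal and its fractional part is larger.
  swap <= E_gt or (E_eq and M_gt);

  -- 2x2 crossbar: larger operand on the right path, smaller on the left.
  Right_man <= A_man when swap = '1' else B_man;
  Left_man  <= B_man when swap = '1' else A_man;
end behavioral;
```

When the two operands have equal magnitude, swap is '0' and B is routed right; either routing is acceptable in that case since the values match.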




The next multiplexers

Now have the smaller on the left path and the larger on the right path.
On the left path, if either exponent is all 1s then that operand is NaN or infinity and has been crossbarred to, or is equal to, the right path operand. In this case want to simply pass it through to the output by adding 0 to it. So a 0 is one choice of the left path mux.
On the right path, select the right path value or mux in a hardwired NaN for an illegal operation




Linear shifting

Next step is to linear shift the left operand
The exponent unit generates the exponent > signals by subtracting the exponents: ExpA-ExpB and ExpB-ExpA
Then, with the help of the control signals, the exponent difference is known and this value is sent to the shifter.




One last multiplexer

The right path operand, the larger, is simply input to the ADDER.
On the left path the output of the linear shifter is sent to the ADDER for a + operation
OR
The ones complement of the value is sent to the ADDER for a - operation. In this case the input carry is handled appropriately.




Code for this section - behavioral

Most of code is generation of various signals and movement of data in muxes

--expgt expb) else '0';expeq




Xbar code highlight

Code

swap




Hard code NaN VHDL code

The code

-- Control equation for mux

in_mux_r_man




Now add the mantissas

Simply add the two mantissas.
As the sign of the B input was XORed with the operation, i.e., inverted if it was a subtract operation, the carry in is the XOR of the two signs. If the signs are different then a subtract is being performed and a 1 is input to the carry in of the adder.
The adder does twos complement addition.
Inputs are of the form x.xxxxxxx or 54 bits.
The output is of the form xx.xxxxxxx or 58 bits
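The interaction between the sign XOR, the ones complement of the smaller operand, and the carry in can be sketched as follows. This is an illustrative model only: the entity and port names are hypothetical, SignB_eff stands for the B sign after it has been XORed with the operation, and the 55-bit sum is a simplification that carries no guard bits (the slide's datapath is wider). It uses the standard ieee.numeric_bit package.

```vhdl
library ieee;
use ieee.numeric_bit.all;

-- Sketch only: names and the 55-bit result width are assumptions.
entity Man_Add_Sketch is
  port (Right_man : in  BIT_VECTOR(53 downto 0);  -- larger operand
        Left_man  : in  BIT_VECTOR(53 downto 0);  -- smaller operand, already shifted
        SignA     : in  BIT;
        SignB_eff : in  BIT;                      -- SignB xor operation
        Sum       : out BIT_VECTOR(54 downto 0));
end Man_Add_Sketch;

architecture behavioral of Man_Add_Sketch is
begin
  process (Right_man, Left_man, SignA, SignB_eff)
    variable r, l, s : unsigned(54 downto 0);
    variable sub     : BIT;
  begin
    sub := SignA xor SignB_eff;        -- '1' means an effective subtract
    r   := '0' & unsigned(Right_man);
    if sub = '1' then
      -- Ones complement of the smaller operand plus a carry in of 1
      -- gives the twos complement, so s = r - l.
      l := not ('0' & unsigned(Left_man));
      s := r + l + 1;
    else
      l := '0' & unsigned(Left_man);
      s := r + l;                      -- bit 54 holds the carry out
    end if;
    Sum <= BIT_VECTOR(s);
  end process;
end behavioral;
```

Because the larger operand is always on the right path, the subtract case never produces a negative result, so the top bit of Sum is '0' after a subtract.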




On to the next challenge

This is perhaps the hardest part: renormalization of the result
Have a result exponent (the exponent of the larger) and a mantissa in the form xx.xxxxxxxxxx
The following slide shows the processing needed




Renormalization Unit

Have exponent and mantissa to deal with.

[Figure: renormalization unit block diagram. Inputs are the Larger Exponent and the Adder Output (XX.XXXXXX form). Exponent path: +1 incrementer, adder, inverters, detect all 1's, detect all 0's, and a 4 to 1 mux (selects c1-c4) producing exp_norm. Mantissa path: Right Shift 1 blocks, a Left Linear Shifter driven by the leading-1 position (Ld1pos), a 2-1 mux controlled by UF, and a 5 to 1 mux (selects c1-c5) producing man_norm. A Result Signal Generation block is associated with the zero, UF, all0, all1 and fract0 signals.]




Many choices to deal with

May need to shift the mantissa 1 position to the right on a fixed binary point.
May be OK as is
May have to shift left; then need to know the position of the leading 1.
In a behavioral model can simply shift left once, increment a counter and then check.
In hardware need a leading 1 detector that gives the position of the leading 1 so that the mantissa can be shifted left.
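The behavioral approach mentioned above (shift left one position at a time and check) can be sketched as a loop. Everything here is an assumption for illustration: the entity and port names, the 55-bit mantissa with bits 54 and 53 as the two integer bits of the xx.xxx form, and the simplified treatment of denormalized results. Guard bits and overflow to infinity are left out. The sketch uses the standard ieee.numeric_bit package.

```vhdl
library ieee;
use ieee.numeric_bit.all;

-- Sketch only: names, widths and the simplified special cases are assumptions.
entity Norm_Sketch is
  port (Man_in  : in  BIT_VECTOR(54 downto 0);  -- xx.xxx: bits 54, 53 are integer bits
        Exp_in  : in  BIT_VECTOR(10 downto 0);  -- exponent of the larger operand
        Man_out : out BIT_VECTOR(54 downto 0);
        Exp_out : out BIT_VECTOR(10 downto 0));
end Norm_Sketch;

architecture behavioral of Norm_Sketch is
begin
  process (Man_in, Exp_in)
    variable m : unsigned(54 downto 0);
    variable e : unsigned(10 downto 0);
  begin
    m := unsigned(Man_in);
    e := unsigned(Exp_in);
    if e = 0 then
      e := to_unsigned(1, 11);   -- denormalized inputs use the scale 2**(-1022)
    end if;

    if m = 0 then
      e := (others => '0');      -- hardwired zero
    elsif m(54) = '1' then
      m := shift_right(m, 1);    -- 1x.xxx becomes 01.xxx (the LSB is dropped here)
      e := e + 1;
    else
      -- Shift left until the leading 1 reaches bit 53 or the exponent bottoms out.
      while m(53) = '0' and e > 1 loop
        m := shift_left(m, 1);
        e := e - 1;
      end loop;
      if m(53) = '0' then
        e := (others => '0');    -- leading 1 not reached: denormalized result
      end if;
    end if;

    Man_out <= BIT_VECTOR(m);
    Exp_out <= BIT_VECTOR(e);
  end process;
end behavioral;
```

A hardware version would replace the loop with a leading 1 detector feeding a left linear shifter, as the slide notes.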




Interactions

All shifts of mantissa result in exponent adjustment.
There are 4 choices on the exponent
    As is
    Incremented by 1
    Adjusted down by some amount depending on shift
    Zero




Interactions

There are 5 choices on the mantissa
    As is
    Right shifted by 1 (increment exp by 1)
    Left shifted for leading 1
    Left shifted and then right shifted by 1
    Hardwired 0
This part is the same for both addition and multiplication. Easy to do algorithmically.




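Rounding Unit

Once done with renormalization will look at the guard bits to determine rounding.
Standard specifies several rounding modes.
Can also just truncate.

[Figure: rounding unit block diagram. Inputs are exp_norm and man_norm. Round Logic examines the 5 lsb of man_norm and produces Round. The 53 msb pass through a +1 incrementer and a 2-1 mux to the Mantissa Output; msbin and msbout are the mantissa msb before and after the incrementer. exp_norm passes through a +1 incrementer and a 2-1 mux, selected by Round(msbin xor msbout), to the Exponent output.]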




Rounding

Can result in changes to both the mantissa and the exponent.
After rounding final result is output in normalized form.
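How rounding can change both the mantissa and the exponent is easiest to see in a small model. The sketch below assumes one particular mode, round to nearest even, which is only one of the modes the standard specifies; the lecture does not say which scheme its design selects. The entity and port names are hypothetical, and the 58-bit input is read as 53 kept bits plus 5 guard bits. Overflow of the exponent to infinity is not handled. The sketch uses the standard ieee.numeric_bit package.

```vhdl
library ieee;
use ieee.numeric_bit.all;

-- Sketch only: names, the 53 + 5 bit split and the rounding mode are assumptions.
entity Round_Sketch is
  port (Man_norm : in  BIT_VECTOR(57 downto 0);  -- 53 msb kept, 5 lsb guard bits
        Exp_norm : in  BIT_VECTOR(10 downto 0);
        Man_out  : out BIT_VECTOR(52 downto 0);
        Exp_out  : out BIT_VECTOR(10 downto 0);
        Inexact  : out BIT);
end Round_Sketch;

architecture behavioral of Round_Sketch is
begin
  process (Man_norm, Exp_norm)
    variable kept   : unsigned(52 downto 0);
    variable guard  : unsigned(4 downto 0);
    variable e      : unsigned(10 downto 0);
    variable msb_in : BIT;
  begin
    kept   := unsigned(Man_norm(57 downto 5));
    guard  := unsigned(Man_norm(4 downto 0));
    e      := unsigned(Exp_norm);
    msb_in := kept(52);

    -- Round up when above the halfway point, or exactly halfway with an odd lsb.
    if guard > 16 or (guard = 16 and kept(0) = '1') then
      kept := kept + 1;                     -- +1 incrementer on the mantissa
      if (msb_in xor kept(52)) = '1' then   -- msb changed: bump the exponent
        e := e + 1;
        kept(52) := '1';                    -- result is 1.000...0
      end if;
    end if;

    if guard /= 0 then
      Inexact <= '1';                       -- discarded bits were not all zero
    else
      Inexact <= '0';
    end if;
    Man_out <= BIT_VECTOR(kept);
    Exp_out <= BIT_VECTOR(e);
  end process;
end behavioral;
```

The msb comparison covers two cases: a normalized mantissa of all 1s wrapping around, and a denormalized mantissa rounding up into the smallest normalized number.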




And don't forget the flags

Any arithmetic unit outputs flags on the status and validity of the result.
The flags to be generated are output from various control signals or combinations of various control signals.

zero




To test (verify) the design

Must test for normal operation and boundary conditions
Will check A by B

    A               B
    NaN             NaN
    +/- infinity    +/- infinity
    +/- 0           +/- 0
    Denorm          Denorm
    Norm            Norm

For both direct and all crossed pairings




Boundary conditions

Wish to check several boundary conditions
    Denorm + Denorm = Max Denorm
    Denorm + Denorm = Min Norm
    Norm - Norm = Max Denorm
    Rounding using first guard bit
    Rounding using 1st and 2nd guard bits




Testing

Testing of the design code is not necessarily the same as the testing that would be done on the chip.
The testing of the design is called verification and must ensure that all possible input combinations produce the specified output.




Scan of entire architecture




Scan of the chip
