7/28/2019 Lect 2b -IEEE Floating Point Adder Arch
1/8/2007 - L25 Floating Point Adder Copyright 2006 - Joanne DeGroat, ECE, OSU
IEEE Floating Point Adder
Using the IEEE Floating Point Standard for an add/subtract execution unit
-
Lecture overview
The Interface
Part by part
A floating point adder design
-
Adder is double precision
Double Precision
Word layout: s (1 bit) | e (11 bits) | f (52 bits)
Value of bits in word representation is:
If e = 2047 and f /= 0, then v is NaN regardless of s
If e = 2047 and f = 0, then v = (-1)^s * infinity
If 0 < e < 2047, then v = (-1)^s * 2^(e-1023) * (1.f), a normalized number
If e = 0 and f /= 0, then v = (-1)^s * 2^(-1022) * (0.f), a denormalized number
Denormalized numbers allow for graceful underflow
If e = 0 and f = 0, then v = (-1)^s * 0 (zero)
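The cases above can be exercised with a small behavioral sketch. This is an illustration in Python rather than part of the lecture's VHDL; the function names `decode_double`, `double_value`, and `to_bits` are hypothetical helpers introduced here.

```python
import struct

FRAC_MASK = (1 << 52) - 1

def fields(bits: int):
    """Split a 64-bit pattern into sign (1 bit), exponent (11 bits), fraction (52 bits)."""
    return (bits >> 63) & 0x1, (bits >> 52) & 0x7FF, bits & FRAC_MASK

def decode_double(bits: int) -> str:
    """Classify a double according to the e and f cases listed above."""
    s, e, f = fields(bits)
    if e == 2047:
        return "NaN" if f != 0 else ("-inf" if s else "+inf")
    if e == 0:
        if f == 0:
            return "-0" if s else "+0"
        return "denormalized"
    return "normalized"

def double_value(bits: int) -> float:
    """Evaluate v for finite numbers only (normalized, denormalized, zero)."""
    s, e, f = fields(bits)
    if e == 2047:
        raise ValueError("NaN and infinity have no finite value")
    sign = -1.0 if s else 1.0
    if e == 0:
        return sign * (f / 2**52) * 2.0**-1022        # (0.f) * 2^-1022
    return sign * (1 + f / 2**52) * 2.0**(e - 1023)   # (1.f) * 2^(e-1023)

def to_bits(x: float) -> int:
    """Bit pattern of a Python float, which is an IEEE 754 double."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]
```

For example, `decode_double(to_bits(5e-324))` classifies the smallest positive double as denormalized, and `double_value` reproduces its value from the fields.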
-
Specification of a FPA
Floating Point Add/Subtract Unit Specification
Inputs in IEEE 754 Double Precision
Must perform both addition and subtraction
Must handle the full floating point standard
  Normalized numbers
  Not a Numbers (NaNs)
  +/- Infinity
  Denormalized numbers
-
Specifications continued
Result will be an IEEE 754 Double Precision representation
Unit will correctly handle the invalid operation of adding +infinity and -infinity = NaN per the standard
Unit latches its inputs into registers from parallel 64-bit data busses.
There is a separate signal line that indicates the operation, add or subtract
-
Specifications continued
Outputs
The correctly represented result
Flags that are output are
  Zero result
  Overflow to infinity from normalized numbers as inputs
  NaN result
  Overshift (result is the larger of the two operands)
  Denormalized result
  Inexact (result was rounded)
  Invalid operation for addition
-
High level block diagram
Basic architecture interface
Data: 64 bit A, B, & C Busses
Control signals: Latch, Add/Sub, Asel, Drive
Condition Flags Output: 7 Flag signals
Clocks: Phi1 and Phi2 (a 2 phase clocked architecture)
[Figure: Floating Point Adder Unit block with Abus, Bbus, Add/Sub, Latch, Asel, Drive, Phi1, and Phi2 going in, and Cbus and Flags coming out]
-
Start the VHDL
The entity interface
[Figure: Floating Point Adder block with A_Main, B_Main, Add_or_sub, Latch, Phi1, Phi2, Drive, and Asel going in, and C_Main, A_Out, and Flags coming out]
entity Floating_Point_Adder is
  port (A_Main     : in  BIT_VECTOR;  -- A operand bus
        B_Main     : in  BIT_VECTOR;  -- B operand bus
        C_Main     : out BIT_VECTOR;  -- result bus
        A_Out      : out BIT_VECTOR;
        Flags      : out BIT_VECTOR;  -- condition flags
        Add_or_sub : in  BIT;         -- operation select
        Latch      : in  BIT;
        Drive      : in  BIT;
        Phi1       : in  BIT;         -- two phase clock
        Phi2       : in  BIT;
        Asel       : in  BIT);
end Floating_Point_Adder;
-
Basic design
Can be divided into functional sub-blocks
First latch and drive
[Figure: block diagram with INPUT LATCHES, RESULT LATCHES, and OUTPUT DRIVERS, plus the A/S control input]
-
What goes in the other blocks
From adjusting the inputs to prepare to add
To add
To renormalize
To round
[Figure: block diagram with INPUT LATCHES, Input Adjust, Add Mantissas, Normalize Result, Round according to selected scheme, RESULT LATCHES, and OUTPUT DRIVERS, plus the A/S control input]
-
VHDL coding for the latches
A first cut
The input latches
Note 2 phase
b1: block ((Phi2 and Latch) = '1')
begin
A_temp
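A guarded block behaves like a level-sensitive latch: a guarded assignment updates its target only while the guard expression is true and otherwise holds the old value. The following Python sketch models that behavior under the guard shown above (Phi2 and Latch). The class `GuardedLatch` and the sample values are hypothetical and are not the lecture's code.

```python
class GuardedLatch:
    """Level-sensitive latch: follows its input while the guard is 1, else holds."""

    def __init__(self, initial=0):
        self.q = initial

    def update(self, d, guard):
        if guard:        # guard true: transparent, output follows the input
            self.q = d
        return self.q    # guard false: the previous value is held

a_temp = GuardedLatch()
history = []
# Each tuple is (Phi2, Latch, value on the A bus); the guard is Phi2 and Latch.
for phi2, latch, a_main in [(1, 1, 0xAB), (0, 1, 0xCD), (1, 0, 0xEF)]:
    history.append(a_temp.update(a_main, phi2 & latch))
```

Only the first step has both Phi2 and Latch high, so the latch captures the first value and holds it through the other two steps.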
-
And on the output
Drivers
Note use of guarded blocks
out_latch1 : block ((Drive and Phi2) = '1')
begin
Flags
-
And what goes in between?
In the final design lots goes in between, but
You first want to make sure that the latches are working properly
So just pass one input to the output and check
And once this works properly can move on with the design
signout
-
The first section
Prepare to add
Identify type of inputs and appropriately adjust operands
[Figure: input adjust data path. Amantissa and Bmantissa feed the Mantissa Processing Logic, and Aexp and Bexp feed the Exponent Processing Logic, which produces Shift Dist. The relational signals control 2x2 crossbar elements. 2-1 muxes with hardwired "Zero" and "NaN" inputs, the Right Linear Shifter, and a 2-1 mux selected by SignA xor SignB feed the ADDER. Outputs are Sign Out (63) to the output latch and Adder Output (to normalize unit).]
-
The exponent unit portion
Must get the larger exponent
And the difference between the exponents, which is the shift distance
Also several control signals
  Exponent all 0s and all 1s
  Exponent A>B, A
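A behavioral sketch of the exponent unit, written in Python for illustration. The function name `exponent_unit` and the signal names in the returned dictionary are assumptions, and the sketch ignores the denormal adjustment (a stored exponent of 0 has the same scale as a stored exponent of 1).

```python
EXP_ALL1 = 0x7FF  # 11-bit exponent with every bit set

def exponent_unit(exp_a: int, exp_b: int) -> dict:
    """Return the larger exponent, the shift distance, and the control signals."""
    diff_ab = exp_a - exp_b   # both subtractions are formed, as in the hardware
    diff_ba = exp_b - exp_a
    e_gt = diff_ab > 0
    return {
        "larger": exp_a if e_gt else exp_b,
        "shift_dist": diff_ab if e_gt else diff_ba,
        "e_gt": e_gt,
        "e_eq": diff_ab == 0,
        "e_lt": diff_ab < 0,
        "a_all0": exp_a == 0,
        "a_all1": exp_a == EXP_ALL1,
        "b_all0": exp_b == 0,
        "b_all1": exp_b == EXP_ALL1,
    }
```

The shift distance is always the non-negative difference, so it can be fed directly to a right shifter on the operand with the smaller exponent.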
-
Mantissa Processing Logic
Need to examine the two fractional parts and generate several control signals that are required to prepare the operands
Need relational signals M>, M=, M [...]
[...] + (E= * M>), or shift the A input to the right side if the exponent of A is the larger OR the exponents are equal and the fractional part of A is larger
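The routing rule described above (A goes to the right side if its exponent is larger, or if the exponents are equal and its fraction is larger) can be sketched as follows. This is an illustrative Python model; the function name `crossbar` and its tuple layout are hypothetical.

```python
def crossbar(exp_a: int, man_a: int, exp_b: int, man_b: int):
    """Route the smaller operand to the left path and the larger to the right path.

    Returns (left, right, a_to_right), where left and right are (exp, man) pairs.
    """
    e_gt = exp_a > exp_b
    e_eq = exp_a == exp_b
    m_gt = man_a > man_b
    a_to_right = e_gt or (e_eq and m_gt)   # E> + (E= * M>)
    if a_to_right:
        return (exp_b, man_b), (exp_a, man_a), True
    return (exp_a, man_a), (exp_b, man_b), False
```

With the larger magnitude always on the right path, only the left operand ever needs to be shifted right to align the binary points.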
-
The next multiplexers
Now have the smaller on the left path and the larger on the right path.
On the left path, if either exponent is all 1s then that operand is NaN or infinity and has been crossbarred, or is equal, to the right path operand. In this case want to simply pass it through to the output by adding 0 to it. So a 0 is one choice of the left path mux.
On the right path select the right path value or mux in a hardwired NaN for an illegal operation
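A minimal Python sketch of these two muxes. The constant `QNAN` is an assumed bit pattern for the hardwired NaN (any pattern with an all 1s exponent and a nonzero fraction would qualify), and the function names are hypothetical.

```python
EXP_ALL1 = 0x7FF
QNAN = 0x7FF8000000000000  # assumption: exponent all 1s, nonzero fraction

def left_mux(left_man: int, exp_a: int, exp_b: int) -> int:
    """Left path: select 0 when either exponent is all 1s, so the right
    path operand (the NaN or infinity) passes through the adder unchanged."""
    if exp_a == EXP_ALL1 or exp_b == EXP_ALL1:
        return 0
    return left_man

def is_illegal(sign_a, exp_a, frac_a, sign_b_eff, exp_b, frac_b) -> bool:
    """Illegal operation: both operands infinite with opposite effective signs."""
    a_inf = exp_a == EXP_ALL1 and frac_a == 0
    b_inf = exp_b == EXP_ALL1 and frac_b == 0
    return a_inf and b_inf and sign_a != sign_b_eff

def right_mux(right_word: int, illegal_op: bool) -> int:
    """Right path: pass the operand, or substitute the hardwired NaN."""
    return QNAN if illegal_op else right_word
```

Here `sign_b_eff` stands for the sign of B after it has been combined with the add/subtract control, so that infinity minus infinity of like sign is caught as well.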
-
Linear shifting
Next step is to linear shift the left operand
The exponent unit generates the exponent > signals by subtracting the exponents: ExpA-ExpB and ExpB-ExpA
Then with the help of the control signals the exponent difference is known and this value is sent to the shifter.
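The alignment shift can be sketched in Python as below. The `width` parameter and the overshift rule (distance at least as large as the word) are assumptions made for the illustration; the lecture's flag list only says that on overshift the result is the larger of the two operands.

```python
def right_linear_shift(mantissa: int, dist: int, width: int = 58):
    """Shift the left path operand right by the exponent difference.

    Returns (shifted value, overshift). When the distance reaches the word
    width nothing of the smaller operand survives, so the sum is simply the
    larger operand.
    """
    overshift = dist >= width
    shifted = 0 if overshift else mantissa >> dist
    return shifted, overshift
```

For example, with an 8-bit word, shifting `0b11000` by 3 gives `0b11`, while a distance of 8 or more reports overshift and returns 0.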
-
One last multiplexer
The right path operand, the larger, is simply input to the ADDER.
On the left path the output of the linear shifter is sent to the ADDER for a + operation
OR
The ones complement of the value is sent to the ADDER for a - operation. In this case the input carry is handled appropriately.
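Sending the ones complement with a carry in of 1 is the usual way to subtract with a twos complement adder, since -x equals (not x) + 1. A small Python sketch with an assumed 8-bit width and a hypothetical function name:

```python
def add_or_subtract(right: int, left: int, subtract: bool, width: int = 8) -> int:
    """Compute right + left or right - left using only an adder.

    Assumes right >= left, which the crossbar guarantees for magnitudes.
    """
    mask = (1 << width) - 1
    operand = (~left & mask) if subtract else left   # ones complement for subtraction
    carry_in = 1 if subtract else 0                  # completes the twos complement
    return (right + operand + carry_in) & mask       # carry out of the top bit is dropped
```

With `right = 0b1100` (12) and `left = 0b0101` (5), the subtract path yields 7 and the add path yields 17.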
-
Code for this section - behavioral
Most of code is generation of various signals and movement of data in muxes
--expgt expb) else '0';expeq
-
Xbar code highlight
Code
swap
-
Hard code NaN VHDL code
The code
-- Control equation for mux
in_mux_r_man
-
Now add the mantissas
Simply add the two mantissas.
As the sign of the B input was XORed with the operation, i.e., inverted if it was a subtract operation, the carry in is the XOR of the two signs. If the signs are different then a subtract is being performed and a 1 is being input to the carry in of the adder.
The adder does twos complement addition.
Inputs are of the form x.xxxxxxx or 54 bits.
The output is of the form xx.xxxxxxx or 58 bits
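The sign and carry in logic described above reduces to two XORs. A Python sketch follows; the function name `adder_controls` and the encoding (1 for negative, 1 for subtract) are assumptions.

```python
def adder_controls(sign_a: int, sign_b: int, subtract: int):
    """Return (effective sign of B, adder carry in)."""
    sign_b_eff = sign_b ^ subtract    # B's sign is inverted for a subtract operation
    carry_in = sign_a ^ sign_b_eff    # signs differ: a magnitude subtraction is needed
    return sign_b_eff, carry_in

# Enumerate all eight input combinations as (sign_a, sign_b, subtract) -> result.
table = {
    (sa, sb, op): adder_controls(sa, sb, op)
    for sa in (0, 1) for sb in (0, 1) for op in (0, 1)
}
```

Note that subtracting a negative B from a positive A gives a carry in of 0, because the operation is effectively an addition of magnitudes.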
-
On to the next challenge
This is perhaps the hardest part: renormalization of the result
Have a result exponent (the exponent of the larger) and a mantissa in the form xx.xxxxxxxxxx
The following slide shows the processing needed
-
Renormalization Unit
Have exponent and mantissa to deal with.
[Figure: renormalization unit. Inputs are Larger Exponent and Adder Output (XX.XXXXXX). The exponent path contains a +1 incrementer, an Adder with inverters, detect all 1's and detect all 0's logic, and a 4 to 1 Mux (controls c1 to c4) producing exp_norm. The mantissa path contains Right Shift 1 blocks, a Left Linear Shifter driven by the leading 1 position (Ld1pos), and a 5 to 1 Mux (controls c1 to c5) producing man_norm. Result Signal Generation uses signals such as zero, UF, all0, all1, and fract0.]
-
Many choices to deal with
May need to shift the mantissa 1 position to the right on a fixed binary point.
May be OK as is
May have to shift left, then need to know the position of the leading 1.
In a behavioral model can simply shift left once, increment a counter and then check.
In hardware need a leading 1 detector that gives the position of the leading 1 so that the mantissa can be shifted left.
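The two approaches can be compared in a short Python sketch. Both functions are hypothetical illustrations: the first mirrors the behavioral shift-and-count loop, and the second computes the shift distance in one step, as a leading 1 detector would.

```python
def normalize_by_loop(mantissa: int, width: int):
    """Behavioral style: shift left one position at a time and count the shifts."""
    if mantissa == 0:
        return 0, 0                  # no leading 1 to find
    msb = 1 << (width - 1)
    count = 0
    while not (mantissa & msb):
        mantissa <<= 1
        count += 1
    return mantissa, count

def leading_one_shift(mantissa: int, width: int) -> int:
    """Hardware style: shift distance taken directly from the leading 1 position."""
    return width - mantissa.bit_length() if mantissa else 0
```

Both give the same shift distance; the count from the loop is also the amount by which the exponent has to be adjusted down.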
-
Interactions
All shifts of mantissa result in exponent adjustment.
There are 4 choices on the exponent
  As is
  Incremented by 1
  Adjusted down by some amount depending on shift
  Zero
-
Interactions
There are 5 choices on the mantissa
  As is
  Right shifted by 1 (increment exp by 1)
  Left shifted for leading 1
  Left shifted and then right shifted by 1
  Hardwired 0
This part is the same for both addition and multiplication. Easy to do algorithmically.
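One way to read the mantissa and exponent choices algorithmically is sketched below in Python. Several points are assumptions rather than the lecture's design: the word is `width` bits in the form xx.xxx, `exp` is the effective exponent (at least 1, with denormal inputs using 1), the fifth choice is interpreted as the denormalized result case, bits shifted out on the right are simply dropped, and overflow to infinity is not handled.

```python
def renormalize(exp: int, man: int, width: int = 8):
    """Return (stored exponent, mantissa, choice taken)."""
    carry = 1 << (width - 1)    # the 2's position of xx.xxx
    one = 1 << (width - 2)      # the 1's position of xx.xxx
    if man == 0:
        return 0, 0, "hardwired 0"
    if man & carry:
        return exp + 1, man >> 1, "right shifted by 1"
    if man & one:
        return exp, man, "as is"
    # Distance needed to bring the leading 1 up to the 1's position.
    n = (width - 1) - man.bit_length()
    if exp - n >= 1:
        return exp - n, man << n, "left shifted for leading 1"
    # Not enough exponent range: the result is denormalized, stored exponent 0.
    return 0, (man << exp) >> 1, "left shifted and then right shifted by 1"
```

The four exponent outcomes (as is, incremented by 1, adjusted down, zero) and the five mantissa outcomes each appear as one return path.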
-
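Rounding Unit
Once done with renormalization will look at the guard bits to determine rounding.
Standard specifies several rounding modes.
Can also just truncate.
[Figure: rounding unit. man_norm is split into the 53 msb and the 5 lsb. The 5 lsb drive the Round Logic, which produces Round. The 53 msb go through a +1 incrementer and a 2-1 Mux selected by Round to give Mantissa Output; msbin and msbout are taken around the incrementer. exp_norm goes through a +1 incrementer and a 2-1 Mux selected by Round (msbin xor msbout) to give Exponent output.]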
-
Rounding
Can result in changes to both the mantissa and the exponent.
After rounding final result is output in normalized form.
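A Python sketch of how rounding can change both fields. It assumes round to nearest even as the selected scheme and a 58-bit normalized mantissa made of 53 result bits followed by 5 guard bits; the function name `round_nearest_even` is hypothetical.

```python
GUARD_BITS = 5
RESULT_BITS = 53

def round_nearest_even(exp: int, man_norm: int):
    """Return (exponent, 53-bit mantissa, inexact flag)."""
    kept = man_norm >> GUARD_BITS                   # the 53 msb
    guard = man_norm & ((1 << GUARD_BITS) - 1)      # the 5 lsb
    half = 1 << (GUARD_BITS - 1)
    # Round up above the halfway point, or at exactly halfway when kept is odd.
    round_up = guard > half or (guard == half and (kept & 1) == 1)
    if round_up:
        kept += 1
        if kept >> RESULT_BITS:     # 1.11...1 rounded up to 10.00...0
            kept >>= 1              # renormalize the mantissa
            exp += 1                # and bump the exponent
    return exp, kept, guard != 0
```

The mantissa overflow case is the one where the exponent also changes, which matches the need for an incrementer on the exponent path.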
-
And don't forget the flags
Any arithmetic unit outputs flags on the status and validity of the result.
The flags to be generated are output from various control signals or combinations of various control signals.
zero
-
To test (verify) the design
Must test for normal operation and boundary conditions
Will check A by B
NaN            NaN
+/- infinity   +/- infinity
+/- 0          +/- 0
Denorm         Denorm
Norm           Norm
For both direct and all crossed pairings
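The crossed pairings can be generated mechanically. The Python sketch below picks one representative bit pattern per class and sign (the specific patterns are assumptions chosen for illustration) and uses the host's own double arithmetic as the reference result, which is also an assumption about the test setup.

```python
import itertools
import math
import struct

def from_bits(bits: int) -> float:
    """Interpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

CLASSES = {
    "NaN":    [0x7FF8000000000000],
    "inf":    [0x7FF0000000000000, 0xFFF0000000000000],
    "zero":   [0x0000000000000000, 0x8000000000000000],
    "denorm": [0x0000000000000001, 0x8000000000000001],
    "norm":   [0x3FF0000000000000, 0xBFF0000000000000],
}

def test_vectors():
    """Yield (class_a, bits_a, class_b, bits_b, expected sum) for every pairing."""
    items = [(name, bits) for name, values in CLASSES.items() for bits in values]
    for (ca, a), (cb, b) in itertools.product(items, repeat=2):
        yield ca, a, cb, b, from_bits(a) + from_bits(b)

vectors = list(test_vectors())
```

Nine representative operands give 81 A by B pairings, including the invalid case of +infinity plus -infinity, whose expected result is NaN.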
-
Boundary conditions
Wish to check several boundary conditions
Denorm + Denorm = Max Denorm
Denorm + Denorm = Min Norm
Norm - Norm = Max Denorm
Rounding using first guard bit
Rounding using 1st and 2nd guard bits
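Concrete operands for the first three boundary cases can be written down directly as bit patterns. The Python sketch below uses the host's double arithmetic as a reference model and assumes the host does not flush denormals to zero; the helper names are hypothetical.

```python
import struct

def from_bits(bits: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def to_bits(x: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", x))[0]

MIN_DENORM = 0x0000000000000001   # e = 0, f = 1
MAX_DENORM = 0x000FFFFFFFFFFFFF   # e = 0, f = all 1s
MIN_NORM   = 0x0010000000000000   # e = 1, f = 0

def add_bits(a: int, b: int) -> int:
    """Reference sum of two doubles given as bit patterns."""
    return to_bits(from_bits(a) + from_bits(b))

def sub_bits(a: int, b: int) -> int:
    """Reference difference of two doubles given as bit patterns."""
    return to_bits(from_bits(a) - from_bits(b))
```

For instance, `add_bits(MAX_DENORM, MIN_DENORM)` crosses from the denormalized range into the normalized range and yields `MIN_NORM`, and `sub_bits(MIN_NORM, MIN_DENORM)` crosses back to `MAX_DENORM`.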
-
Testing
Testing of the design code is not necessarily the same as the testing that would be done on the chip.
The testing of the design is called verification and must ensure that all possible input combinations produce the specified output.
-
Scan of entire architecture
-
Scan of the chip