Spring 2006 EE 5324 - VLSI Design II - © Kia Bazargan 1
EE 5324 – VLSI Design II
Kia Bazargan
University of Minnesota
Part VII: Floating Point Arithmetic
Floating-Point vs. Fixed-Point Numbers
• Fixed point has limitations
  x = 0000 0000.0000 1001_2
  y = 1001 0000.0000 0000_2
  Rounding? Overflow? (x^2 and y^2 under/overflow)
• Floating point: represent numbers in two fixed-width fields: “magnitude” and “exponent”
  Magnitude: more bits = more accuracy
  Exponent: more bits = wider range of numbers
  X = [ s (±) | e (Exponent) | m (Magnitude) ]
Floating Point Number Representation
• Sign field:
  When 0: positive number; when 1: negative
• Exponent:
  Usually presented as unsigned by adding an offset
  Example: 4 bits of exponent, offset = 8
  o Exp = 1001_2 → e = 1001_2 - 1000_2 = 0001_2
  o Exp = 0010_2 → e = 0010_2 - 1000_2 = 1010_2 = -6
• Magnitude (also called significand, mantissa)
  Shift the number to get: 1.xxxx
  Magnitude is the fractional part (hidden ‘1’)
  Example: 6 bits of mantissa
  o Number = 110.0101 → shift: 1.100101 → mantissa = 100101
  o Number = 0.0001011 → shift: 1.011 → mantissa = 011000
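The two examples above can be checked with a minimal Python sketch. It assumes the slide's 4-bit exponent with offset 8 and a 6-bit mantissa; the names `decode_exponent` and `normalize` are illustrative, and extra mantissa bits are truncated rather than rounded (an assumption, since the slide does not say).

```python
from fractions import Fraction

def decode_exponent(field, bias=8):
    """Turn the stored (unsigned) exponent field into the true exponent."""
    return field - bias

def normalize(value, mant_bits=6):
    """Shift a positive value into the form 1.xxxx * 2**e.

    Returns (e, mantissa) where mantissa is the fractional part as a bit
    string of length mant_bits (hidden 1 removed, extra bits truncated).
    """
    v = Fraction(value)
    if v <= 0:
        raise ValueError("sketch handles positive values only")
    e = 0
    while v >= 2:          # shift right until the integer part is 1
        v /= 2
        e += 1
    while v < 1:           # shift left until the integer part is 1
        v *= 2
        e -= 1
    frac = v - 1           # drop the hidden 1
    bits = ""
    for _ in range(mant_bits):
        frac *= 2
        if frac >= 1:
            bits += "1"
            frac -= 1
        else:
            bits += "0"
    return e, bits

print(decode_exponent(0b1001), decode_exponent(0b0010))
print(normalize(Fraction(101, 16)))   # 110.0101 in binary
print(normalize(Fraction(11, 128)))   # 0.0001011 in binary
```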
Floating Point Numbers: Example
X = ± 1.m × 2^e, stored as [ s (±) | e + bias (Exponent) | m (Magnitude) ]
  X1 = +1.0011101 × 2^2        →  0 | 1010 | 0011101
  X2 = +1.1 × 2^-6             →  0 | 0010 | 1000000
  X3 = -1.0000001 × 2^3        →  1 | 1011 | 0000001
  X4 = +1.0000000 × 2^-8 = 0   →  0 | 0000 | 0000000
  X5 = +1.0000000 × 2^7 = +∞   →  0 | 1111 | 0000000
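As a rough illustration of the field layout in these examples, the following Python sketch packs and unpacks a 12-bit word with 1 sign bit, 4 exponent bits (bias 8) and 7 magnitude bits. The helper names are hypothetical, and the sketch deliberately ignores the reserved patterns shown for X4 and X5.

```python
BIAS = 8        # offset from the 4-bit exponent example
EXP_BITS = 4
MANT_BITS = 7

def pack(sign, exponent, mantissa):
    """Build the word from sign (0/1), true exponent, and 7-bit fraction."""
    field = exponent + BIAS
    if not 0 <= field < (1 << EXP_BITS):
        raise ValueError("exponent does not fit in the exponent field")
    return (sign << (EXP_BITS + MANT_BITS)) | (field << MANT_BITS) | mantissa

def unpack(word):
    """Split the word back into (sign, true exponent, fraction bits)."""
    sign = word >> (EXP_BITS + MANT_BITS)
    exponent = ((word >> MANT_BITS) & ((1 << EXP_BITS) - 1)) - BIAS
    mantissa = word & ((1 << MANT_BITS) - 1)
    return sign, exponent, mantissa

def value(word):
    """Numeric value of a normal (non-reserved) word: +/- 1.m * 2**e."""
    sign, exponent, mantissa = unpack(word)
    return (-1) ** sign * (1 + mantissa / 2 ** MANT_BITS) * 2.0 ** exponent

print(format(pack(0, 2, 0b0011101), "012b"))   # X1
print(format(pack(0, -6, 0b1000000), "012b"))  # X2
print(format(pack(1, 3, 0b0000001), "012b"))   # X3
```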
Floating Point Number Range
[Figure: number line from -∞ to +∞. Overflow region below -max; negative numbers FLP- in [-max, -min], sparser toward -max and denser toward -min; underflow regions on both sides of 0; positive numbers FLP+ in [min, max], denser toward min and sparser toward max; overflow region above max. © Oxford U Press]
• Range: [-max, -min] ∪ [min, max]
  Min = smallest magnitude × 2^(smallest exponent)
  Max = largest magnitude × 2^(largest exponent)
• What happens if:
  We increase # bits for exponent?
  Increase # bits for magnitude?
• Ref:
  http://steve.hollasch.net/cgindex/coding/ieeefloat.html
  ftp://download.intel.com/technology/itj/q41999/pdf/ia64fpbf.pdf
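One way to explore the "what happens if" question is a small Python sketch of the Min/Max formulas. It assumes a hidden-1 magnitude in [1, 2) and that every exponent code is usable (no codes reserved for zero or infinity); `fp_range` is an illustrative name.

```python
def fp_range(exp_bits, mant_bits, bias):
    """Return (min, max) positive magnitudes for a hidden-1 format.

    Assumption: all exponent codes 0 .. 2**exp_bits - 1 encode normal numbers.
    """
    smallest_exp = 0 - bias
    largest_exp = (2 ** exp_bits - 1) - bias
    smallest_mag = 1.0                      # 1.000...0
    largest_mag = 2.0 - 2.0 ** -mant_bits   # 1.111...1
    return smallest_mag * 2.0 ** smallest_exp, largest_mag * 2.0 ** largest_exp

print(fp_range(4, 7, 8))    # the 12-bit toy format
print(fp_range(5, 7, 16))   # one more exponent bit
print(fp_range(4, 8, 8))    # one more magnitude bit
```

Under these assumptions, an extra exponent bit changes the range by orders of magnitude, while an extra magnitude bit barely moves max and instead halves the spacing between neighboring numbers.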
Floating Point Operations
• Addition/subtraction, multiplication/division, function evaluations, ...
• Basic operations
  Adding exponents / magnitudes
  Multiplying magnitudes
  Aligning magnitudes (shifting, adjusting the exponent)
  Rounding
  Checking for overflow/underflow
  Normalization (shifting, adjusting the exponent)
Floating Point Addition
• More difficult than multiplication!
• Operations:
  Align magnitudes (so that exponents are equal)
  Add (and round)
  Normalize (result in the form of 1.xxx)
  x   = +1.0011101 × 2^3   →  0 | 1011 | 0011101
  y   = +1.1010011 × 2^0   →  0 | 1000 | 1010011
  y   = +0.0011010 × 2^3   →  0 | 1011 | 0011010   (aligned)
  x+y = +1.0110111 × 2^3   →  0 | 1011 | 0110111   (no need to normalize in this case)
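The align/add/normalize sequence can be sketched in Python for the 12-bit example format (1 sign bit, 4 exponent bits, 7 magnitude bits). This is only a sketch under stated assumptions: both operands are positive, shifted-out bits are truncated instead of rounded, and exponent overflow is not checked. The function names are hypothetical.

```python
MANT_BITS = 7
EXP_MASK = 0xF

def unpack(word):
    """Return (exponent field, significand with the hidden 1 inserted)."""
    exp = (word >> MANT_BITS) & EXP_MASK
    sig = (1 << MANT_BITS) | (word & ((1 << MANT_BITS) - 1))
    return exp, sig

def fp_add_positive(a, b):
    """Add two positive numbers given as 12-bit words; returns a 12-bit word."""
    ea, sa = unpack(a)
    eb, sb = unpack(b)
    if ea < eb:                      # swap so that a has the larger exponent
        ea, sa, eb, sb = eb, sb, ea, sa
    sb >>= (ea - eb)                 # align: shift the smaller operand right
    total = sa + sb                  # add magnitudes
    exp = ea
    if total >> (MANT_BITS + 1):     # carry out: sum in [2, 4), shift right once
        total >>= 1
        exp += 1
    # pack: remove the hidden 1 (sign bit stays 0)
    return (exp << MANT_BITS) | (total & ((1 << MANT_BITS) - 1))

x = 0b010110011101   # +1.0011101 * 2^3
y = 0b010001010011   # +1.1010011 * 2^0
print(format(fp_add_positive(x, y), "012b"))
```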
Floating Point Adder Architecture
[Figure: floating-point adder block diagram. Blocks: Unpack; Subtract Exponents; Complement/swap; Align Magnitudes; Add Magnitudes (+/-, Cin, Cout); Sign Logic; Normalize; Round/Complement; Normalize; Pack; an Adjust Exponent block accompanies each Normalize block. © Oxford U Press]
Floating Point Adder Components
• Unpacking
  Inserting the “hidden 1”
  Checking for special inputs (NaN, zero)
• Exponent difference
  Used in aligning the magnitudes
  A few bits enough for subtraction
  o If 32-bit magnitude adder, 8 bits of exponent, only 5 bits involved in subtraction
  If negative difference, swap, use positive diff
  o How to compute the positive diff?
• Pre-shifting and swap
  Shift/complement provided for one operand only
  Swap if needed
Floating Point Adder Components (cont.)
• Rounding
  Three extra bits used for rounding
• Post-shifting
  Result in the range (-4, 4): z = Cout z_1 z_0 . z_-1 z_-2 …
  Right shift: 1 bit max
  o If Cout ≠ z_1, right shift
  Left shift: up to # of bits in magnitude
  o Determine # of consecutive 0’s (1’s) in z, beginning with z_1
  Adjust exponent accordingly
• Packing
  Check for special results (zero, under-/overflow)
  Remove the hidden 1
Counting vs. Predicting Leading Zeros/Ones
[Figure: two post-normalization datapaths. Counting: Magnitude Adder → Count Leading 0/1 → shift amount → Post-Shifter and Adjust Exponent. Predicting: a Predict Leading 0/1 block supplies the shift amount to the Post-Shifter and Adjust Exponent blocks that follow the Magnitude Adder. © Oxford U Press]
• Counting: Simpler but on the critical path
• Predicting: More complex architecture
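The counting approach can be illustrated with a short Python sketch: count the leading zeros of the adder output, left-shift by that amount, and lower the exponent to match. It assumes an unsigned magnitude of a known width; `count_leading_zeros` and `post_normalize` are illustrative names, and the predicting variant is not modeled.

```python
def count_leading_zeros(z, width):
    """Number of 0 bits above the most significant 1 in a width-bit value."""
    if z < 0 or z >> width:
        raise ValueError("z must fit in 'width' bits")
    return width - z.bit_length()

def post_normalize(sig, exp, width):
    """Left-shift sig until its top bit is 1 and adjust the exponent."""
    if sig == 0:
        return 0, exp            # zero needs special handling when packing
    shift = count_leading_zeros(sig, width)
    return sig << shift, exp - shift

print(count_leading_zeros(0b00010110, 8))
print(post_normalize(0b00010110, 5, 8))
```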
Floating Point Multiplication
• Simpler than floating-point addition
• Operation:
  Inputs: z1 = ± 1.m1 × 2^e1, z2 = ± 1.m2 × 2^e2
  Output = ± (1.m1 × 1.m2) × 2^(e1+e2)
  Sign: XOR
  Exponent:
  o Tentatively computed as e1+e2
  o Subtract the bias (=127) HOW?
  o Adjusted after normalization
  Magnitude
  o Result in the range [1,4) (inputs in the range [1,2))
  o Normalization: 1- or 2-bit shift right, depending on rounding
  o Result is 2×(1+m) bits, should be rounded to (1+m) bits
  o Rounding can gradually discard bits, instead of one last stage
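A minimal Python sketch of these steps, assuming the single-precision layout implied by the bias of 127 (1 sign bit, 8 exponent bits, 23 magnitude bits). It truncates instead of rounding and ignores zero, subnormals, infinities, NaN and exponent overflow; `fp_mul`, `f2b` and `b2f` are hypothetical helper names.

```python
import struct

MANT_BITS = 23
BIAS = 127
MANT_MASK = (1 << MANT_BITS) - 1

def f2b(x):
    """Bit pattern of x as a 32-bit float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def b2f(w):
    """Float value of a 32-bit pattern."""
    return struct.unpack(">f", struct.pack(">I", w))[0]

def fp_mul(a, b):
    """Multiply two normal numbers given as 32-bit patterns (truncating)."""
    sign = (a >> 31) ^ (b >> 31)                  # sign: XOR
    ea = (a >> MANT_BITS) & 0xFF
    eb = (b >> MANT_BITS) & 0xFF
    exp = ea + eb - BIAS                          # both fields carry the bias once
    sa = (1 << MANT_BITS) | (a & MANT_MASK)       # insert hidden 1
    sb = (1 << MANT_BITS) | (b & MANT_MASK)
    prod = sa * sb                                # 2*(1+m) bits, value in [1, 4)
    if prod >> (2 * MANT_BITS + 1):               # product in [2, 4): shift right
        prod >>= 1
        exp += 1
    mant = (prod >> MANT_BITS) & MANT_MASK        # drop low bits and hidden 1
    return (sign << 31) | (exp << MANT_BITS) | mant

print(b2f(fp_mul(f2b(1.5), f2b(2.5))))
print(b2f(fp_mul(f2b(-3.0), f2b(3.0))))
```

Since each stored exponent field already includes the bias, adding the two fields counts it twice, which is why one bias is subtracted.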
Floating Point Multiplier Architecture
[Figure: floating-point multiplier block diagram. Floating-point operands → Unpack → XOR (signs), Add Exponents, Multiply Magnitudes → Normalize (Adjust Exponent) → Round → Normalize (Adjust Exponent) → Pack → Product. © Oxford U Press]
Note: Pipelining is used within the magnitude multiplier, as well as at block boundaries
Square-Rooting
• Most important elementary function
• In IEEE standard, specified as a basic operation (alongside +, -, *, /)
• Very similar to division
• Pencil-and-paper method:
  Radicand: z = z_{2k-1} z_{2k-2} … z_1 z_0
  Square root: q = q_{k-1} q_{k-2} … q_1 q_0
  Remainder (z - q^2): s = s_k s_{k-1} s_{k-2} … s_1 s_0 (k+1 digits)
Square Rooting: Example
• Example: sqrt(9 52 41)
      q2  q1  q0    (= q)                                  q(0) = 0
       9  52  41    (= z)
       9                                        q2 = 3     q(1) = 3
      ----------
       0  52          6q1 × q1 ≤ 52             q1 = 0     q(2) = 30
          00
      ----------
          52  41      60q0 × q0 ≤ 5241          q0 = 8     q(3) = 308
          48  64
      ----------
          03  77                                s = 377    q = 308
  Annotations: ×2 (double the partial root: 3 → 6, 30 → 60), then append digits
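The pencil-and-paper procedure can be written as a short Python sketch, processing two decimal digits of the radicand per step. The function name `paper_sqrt` is illustrative; the trial expression `(20*q + d) * d` is the "double the partial root, append the digit, multiply by the digit" step.

```python
def paper_sqrt(z):
    """Digit-by-digit decimal square root. Returns (root, remainder)."""
    digits = str(z)
    if len(digits) % 2:
        digits = "0" + digits            # pad to whole digit pairs
    q, s = 0, 0
    for i in range(0, len(digits), 2):
        s = 100 * s + int(digits[i:i + 2])   # bring down the next pair
        d = 9
        while (20 * q + d) * d > s:          # largest d with (2q)d x d <= s
            d -= 1
        s -= (20 * q + d) * d
        q = 10 * q + d                       # append the new root digit
    return q, s

print(paper_sqrt(95241))
```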
Square Rooting: Example (cont.)
• Why double the partial root?
  Partial root after step 2 is: q(2) = 30
  Appending the next digit q0 gives 10 × q(2) + q0
  Square of which is 100×(q(2))^2 + 20×q(2)×q0 + q0^2
  The term 100×(q(2))^2 is already subtracted
  Find q0 such that (10×(2×q(2)) + q0) × q0 is the max number ≤ partial remainder
• The binary case:
  Square of 2×q(2) + q0 is: 4×(q(2))^2 + 4×q(2)×q0 + q0^2
  Find q0 such that (4×q(2) + q0) × q0 is ≤ partial remainder
  For q0 = 1, the expression becomes 4×q(2) + 1 (i.e., append “01” to the partial root)
Square Rooting: Example Base 2
• Example: sqrt(01110110_2) = sqrt(118)
      q3  q2  q1  q0    (= q)                               q(0) = 0
      01  11  01  10    = z = (118)_10           q3 = 1     q(1) = 1
      01
      --------------
      00  11            ≥ 101?    No             q2 = 0     q(2) = 10
       0  00
      --------------
       0  11  01        ≥ 1001?   Yes            q1 = 1     q(3) = 101
          10  01
      --------------
          01  00  10    ≥ 10101?  No             q0 = 0     q(4) = 1010
          00  00  00
      --------------
           1  00  10    s = (18)_10              q = (1010)_2 = (10)_10
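The base-2 example follows the same pattern, with the trial value formed by appending "01" to the partial root. A minimal Python sketch, assuming an unsigned radicand of 2k bits (the name `binary_sqrt` is illustrative):

```python
def binary_sqrt(z, k):
    """Restoring binary square root of a 2k-bit radicand.

    Returns (root, remainder) with root having k bits.
    """
    q, s = 0, 0
    for i in range(k - 1, -1, -1):
        s = (s << 2) | ((z >> (2 * i)) & 0b11)   # bring down the next bit pair
        trial = (q << 2) | 1                     # partial root with "01" appended
        if s >= trial:
            s -= trial
            q = (q << 1) | 1                     # root bit is 1
        else:
            q <<= 1                              # root bit is 0, keep remainder
    return q, s

print(binary_sqrt(0b01110110, 4))
```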
Sequential Shift/Subtract Square Rooter Architecture
[Figure: sequential shift/subtract square rooter. Labeled elements: Square root register; Partial Remainder register (put z - 1 here at the outset); Load; sub; Complement; (l+2)-bit adder with Cin and Cout; Trial Difference (l+2 bits); Select Root Digit, with the MSB of 2s(j-1) among its inputs, producing q_-j. © Oxford U Press]
Other Methods for Square Rooting
• Restoring vs. non-restoring
  We looked at the restoring algorithm (after subtraction, restore partial remainder if the result is negative)
  Non-restoring: use a different encoding (digits {-1,1} instead of {0,1}) to avoid restoring
• High-radix
  Similar to modified Booth encoding multiplication: take care of more bits at a time
  More complex circuit, but faster
Other Methods for Square Rooting (cont.)
• Convergence methods
  Use the Newton method on a function whose root is the desired value:
  f(x) = x^2 - z has root x = √z, OR
  f(x) = 1/x^2 - z has root x = 1/√z (multiply by z to get √z)
  Iteratively improve the accuracy
  Can use lookup table for the first iteration
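For the second choice, applying the Newton step x - f(x)/f'(x) to f(x) = 1/x^2 - z gives x(i+1) = 0.5 × x(i) × (3 - z × x(i)^2), which needs no division. The Python sketch below assumes a starting guess close enough to 1/√z (here a hand-picked 0.6 for z = 2.4, standing in for a table lookup); the function name is illustrative.

```python
def sqrt_via_reciprocal(z, x0, iterations):
    """Approximate sqrt(z) by iterating toward 1/sqrt(z), then multiplying by z."""
    x = x0
    for _ in range(iterations):
        x = 0.5 * x * (3.0 - z * x * x)   # Newton step for f(x) = 1/x^2 - z
    return z * x                          # z * (1/sqrt(z)) = sqrt(z)

r = sqrt_via_reciprocal(2.4, 0.6, 6)
print(r, r * r)
```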
Square Rooting: Abstract Notation
       q
  z
  -q3 (q(0) 0 q3) 2^6
  -q2 (q(1) 0 q2) 2^4
  -q1 (q(2) 0 q1) 2^2
  -q0 (q(3) 0 q0) 2^0
  s
Floating point format:
  - Shift left (not right)
  - Powers of 2 decreasing
Restoring Floating-Point Square Root Calc.
  z                         0 1 . 1 1 0 1 1 0   (118/64)
  s(0) = z - 1            0 0 0 . 1 1 0 1 1 0   q_0 = 1     q(0) = 1.
  2s(0)                   0 0 1 . 1 0 1 1 0 0
  -[2×(1.) + 2^-1]          1 0 . 1
  s(1)                    1 1 1 . 0 0 1 1 0 0   q_-1 = 0    q(1) = 1.0
  s(1) = 2s(0)            0 0 1 . 1 0 1 1 0 0   Restore
  2s(1)                   0 1 1 . 0 1 1 0 0 0
  -[2×(1.0) + 2^-2]         1 0 . 0 1
  s(2)                    0 0 1 . 0 0 1 0 0 0   q_-2 = 1    q(2) = 1.01
  2s(2)                   0 1 0 . 0 1 0 0 0 0
  -[2×(1.01) + 2^-3]        1 0 . 1 0 1
  s(3)                    1 1 1 . 1 0 1 0 0 0   q_-3 = 0    q(3) = 1.010
  s(3) = 2s(2)            0 1 0 . 0 1 0 0 0 0   Restore
  2s(3)                   1 0 0 . 1 0 0 0 0 0
  -[2×(1.010) + 2^-4]       1 0 . 1 0 0 1
[© Oxford U Press]
Restoring Floating-Point Sq. Root Calc. (cont.)
  s(4)                    0 0 1 . 1 1 1 1 0 0   q_-4 = 1    q(4) = 1.0101
  2s(4)                   0 1 1 . 1 1 1 0 0 0
  -[2×(1.0101) + 2^-5]      1 0 . 1 0 1 0 1
  s(5)                    0 0 1 . 0 0 1 1 1 0   q_-5 = 1    q(5) = 1.01011
  2s(5)                   0 1 0 . 0 1 1 1 0 0
  -[2×(1.01011) + 2^-6]     1 0 . 1 0 1 1 0 1
  s(6)                    1 1 1 . 1 0 1 1 1 1   q_-6 = 0    q(6) = 1.010110
  s(6) = 2s(5)            0 1 0 . 0 1 1 1 0 0   Restore     (156/64)
  s (true remainder)          0 . 0 0 0 0 1 0 0 1 1 1 0 0   (156/64^2)
  q                           1 . 0 1 0 1 1 0   (86/64)
[© Oxford U Press]
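The restoring recurrence in this table can be reproduced with exact fractions. The Python sketch below assumes a radicand in [1, 4), so that q_0 = 1 and s(0) = z - 1 as in the example; `restoring_sqrt` is an illustrative name, not a description of the hardware.

```python
from fractions import Fraction

def restoring_sqrt(z, n):
    """Restoring square root with n fractional root bits, z in [1, 4).

    Returns (q, true remainder) where true remainder = z - q*q.
    """
    z = Fraction(z)
    q = Fraction(1)               # q_0 = 1
    s = z - 1                     # s(0) = z - 1
    for j in range(1, n + 1):
        ulp = Fraction(1, 2 ** j)
        trial = 2 * s - (2 * q + ulp)     # 2s(j-1) - [2 q(j-1) + 2^-j]
        if trial >= 0:
            s = trial
            q += ulp                      # root digit 1
        else:
            s = 2 * s                     # restore: root digit 0
    return q, s / 2 ** n                  # s(n) is scaled by 2^n

q, rem = restoring_sqrt(Fraction(118, 64), 6)
print(q, rem)
```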
Nonrestoring Floating-Point Square Root Calc.
  z                         0 1 . 1 1 0 1 1 0   (118/64)
  s(0) = z - 1            0 0 0 . 1 1 0 1 1 0   q_0 = 1     q(0) = 1.
  2s(0)                   0 0 1 . 1 0 1 1 0 0   q_-1 = 1    q(1) = 1.1
  -[2×(1.) + 2^-1]          1 0 . 1
  s(1)                    1 1 1 . 0 0 1 1 0 0   q_-2 = -1   q(2) = 1.01
  2s(1)                   1 1 0 . 0 1 1 0 0 0
  +[2×(1.1) - 2^-2]         1 0 . 1 1
  s(2)                    0 0 1 . 0 0 1 0 0 0   q_-3 = 1    q(3) = 1.011
  2s(2)                   0 1 0 . 0 1 0 0 0 0
  -[2×(1.01) + 2^-3]        1 0 . 1 0 1
  s(3)                    1 1 1 . 1 0 1 0 0 0   q_-4 = -1   q(4) = 1.0101
  2s(3)                   1 1 1 . 0 1 0 0 0 0
  +[2×(1.011) - 2^-4]       1 0 . 1 0 1 1
  s(4)                    0 0 1 . 1 1 1 1 0 0   q_-5 = 1    q(5) = 1.01011
  2s(4)                   0 1 1 . 1 1 1 0 0 0
  -[2×(1.0101) + 2^-5]      1 0 . 1 0 1 0 1
Nonrestoring FP Square Root Calc. (cont.)
  s(5)                    0 0 1 . 0 0 1 1 1 0   q_-6 = 1    q(6) = 1.010111
  2s(5)                   0 1 0 . 0 1 1 1 0 0
  -[2×(1.01011) + 2^-6]     1 0 . 1 0 1 1 0 1
  s(6)                    1 1 1 . 1 0 1 1 1 1   Negative (-17/64)
  +[2×(1.01011) + 2^-6]     1 0 . 1 0 1 1 0 1   Correct
  s(6) (corrected)        0 1 0 . 0 1 1 1 0 0   (156/64), s(6) = 2s(5)
  s (true remainder)          0 . 0 0 0 0 1 0 0 1 1 1 0 0   (156/64^2)
  q (signed-digit)            1 . 1 -1 1 -1 1 1   (87/64)
  q (corrected bin)           1 . 0 1 0 1 1 0   (86/64)
If final s is negative, drop the last ‘1’ in q, and restore the remainder to the last positive value.
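The nonrestoring recurrence, including the final correction, can be sketched in the same style with exact fractions. As before this assumes z in [1, 4); the correction step uses the identity that lowering q by one unit in the last place raises the scaled remainder by 2q + 2^-n (with q the corrected root), which matches the "Correct" row in the table. The function name is illustrative.

```python
from fractions import Fraction

def nonrestoring_sqrt(z, n):
    """Nonrestoring square root with digits in {-1, 1}, z in [1, 4).

    Returns (signed-digit q, corrected q, true remainder).
    """
    z = Fraction(z)
    q = Fraction(1)               # q_0 = 1
    s = z - 1                     # s(0) = z - 1
    for j in range(1, n + 1):
        ulp = Fraction(1, 2 ** j)
        if s >= 0:                # digit +1: subtract [2q + 2^-j]
            s = 2 * s - (2 * q + ulp)
            q += ulp
        else:                     # digit -1: add [2q - 2^-j]
            s = 2 * s + (2 * q - ulp)
            q -= ulp
    q_signed = q
    if s < 0:                     # final correction step
        ulp = Fraction(1, 2 ** n)
        q -= ulp
        s += 2 * q + ulp
    return q_signed, q, s / 2 ** n

print(nonrestoring_sqrt(Fraction(118, 64), 6))
```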
Square Root Through Convergence
• Newton-Raphson method:
  Choose f(x) = x^2 - z
  x(i+1) = x(i) - f(x(i)) / f'(x(i))
  x(i+1) = 0.5 (x(i) + z / x(i))
• Example: compute square root of z = (2.4)_10
  x(0) read out from table    = 1.5             accurate to 10^-1
  x(1) = 0.5(x(0) + 2.4/x(0)) = 1.550 000 000   accurate to 10^-2
  x(2) = 0.5(x(1) + 2.4/x(1)) = 1.549 193 548   accurate to 10^-4
  x(3) = 0.5(x(2) + 2.4/x(2)) = 1.549 193 338   accurate to 10^-8
  [Par00] p354
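The iteration in this example is easy to reproduce. A minimal Python sketch, with the table lookup replaced by passing the starting value 1.5 directly (the function name `newton_sqrt` is illustrative):

```python
def newton_sqrt(z, x0, iterations):
    """Return the list [x(0), x(1), ...] of Newton iterates for sqrt(z)."""
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(0.5 * (x + z / x))   # x(i+1) = 0.5 (x(i) + z / x(i))
    return xs

for i, x in enumerate(newton_sqrt(2.4, 1.5, 3)):
    print(f"x({i}) = {x:.9f}")
```

The number of correct digits roughly doubles per step, which is why a modest table for x(0) is enough to reach full precision in a few iterations.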
Non-Restoring Parallel Square Rooter
[Figure: array of cells, each built from a full adder (FA) and an XOR gate. Radicand bits z_-1 … z_-8 enter in pairs, root digits q_-1 … q_-4 are produced row by row, and remainder bits s_-1 … s_-8 appear at the bottom. © Oxford U Press]
Function Evaluation
• We looked at square root calculation
  Direct hardware implementation (binary, BSD, high-radix)
  o Serial
  o Parallel
  Approximation (Newton method)
• What about other functions?
  Direct implementation
  o Example: log_2 x can be directly implemented in hardware (using square root as a sub-component)
  Polynomial approximation
  Table look-up
  o Either as part of calculation or for the full calculation
Table Lookup
[Figure, two panels. Direct table-lookup implementation: u operand bits address a 2^u × v table that outputs v result bits. Table-lookup with pre- and post-processing: u operand bits → preprocessing logic → smaller table(s) → post-processing logic → v result bits. © Oxford U Press]
Linear Interpolation Using Four Subintervals
[Figure: f(x) over [x_min, x_max] approximated by four line segments a(i) + b(i)Δx, i = 0..3. In the hardware, a 2-bit address taken from x (next to the radix point) selects a(i) and b(i)/4 from 4-entry tables; 4Δx is multiplied by b(i)/4 and the product is added to a(i) to give f(x). © Oxford U Press]
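A software analogue of this scheme can be sketched in Python. The choices here are assumptions for illustration only: the function is log2 on [1, 2], the segments are chords through the subinterval endpoints, and the names `build_tables` and `interpolate` are hypothetical. The figure does not say how a(i) and b(i) are chosen.

```python
import math

def build_tables(f, x_min, x_max, n=4):
    """Build n-entry tables a(i), b(i) so that f(x) ~ a(i) + b(i) * dx."""
    h = (x_max - x_min) / n
    a = [f(x_min + i * h) for i in range(n)]
    b = [(f(x_min + (i + 1) * h) - f(x_min + i * h)) / h for i in range(n)]
    return a, b, h

def interpolate(x, x_min, tables):
    """Select the subinterval (the 'address'), then evaluate a(i) + b(i) * dx."""
    a, b, h = tables
    i = min(int((x - x_min) / h), len(a) - 1)   # clamp so x_max uses the last entry
    dx = x - (x_min + i * h)
    return a[i] + b[i] * dx

tables = build_tables(math.log2, 1.0, 2.0)
worst = max(abs(interpolate(1 + k / 1000, 1.0, tables) - math.log2(1 + k / 1000))
            for k in range(1001))
print(worst)
```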
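Piecewise Table Lookup
[Figure, two panels, each producing a d-bit output z mod p from a b-bit input z using two tables. Panel 1 labels: input split into b-h and h bits; Table 1 (output v, d* bits); Adder (d* bits, split into d*-h and h); Table 2 (m*); d-bit output. Panel 2 labels: input split into b-g and g bits; Table 1 and Table 2 with d-bit outputs v_H and v_L; Adder (d+1 bits); Adder with -p; Sign; Mux; d-bit output. © Oxford U Press]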
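One plausible reading of the second panel, sketched in Python: split z into high and low fields, look up each field's residue in its own table, add, and subtract p once if the sum is not smaller than p. The table contents below are an assumption based on that reading, and the function names are hypothetical.

```python
def build_mod_tables(b, g, p):
    """Table 1 holds (z_high * 2^g) mod p, Table 2 holds z_low mod p."""
    table_hi = [(zh << g) % p for zh in range(1 << (b - g))]
    table_lo = [zl % p for zl in range(1 << g)]
    return table_hi, table_lo

def mod_lookup(z, g, p, tables):
    """Compute z mod p with two lookups, one add, and one conditional subtract."""
    table_hi, table_lo = tables
    v_h = table_hi[z >> g]                 # residue of the high field
    v_l = table_lo[z & ((1 << g) - 1)]     # residue of the low field
    total = v_h + v_l                      # less than 2p, so d+1 bits suffice
    diff = total - p
    return total if diff < 0 else diff     # mux controlled by the sign of total - p

tables = build_mod_tables(8, 4, 13)
print(mod_lookup(200, 4, 13, tables), 200 % 13)
```

Two tables of 2^(b-g) and 2^g entries replace a single table of 2^b entries, at the cost of the adders and the mux.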
Accuracy vs. Lookup Table Size Trade-off
[Figure: worst-case absolute error (log scale, 10^-1 down to 10^-9) versus number of address bits h (0 to 10), with curves labeled Linear, 2nd-degree, and 3rd-degree. © Oxford U Press]
Useful Links
• M. E. Phair, “Free Floating-Point Madness!”, http://www.hmc.edu/chips/