9_high accuracy fixed-width booth multipliers with
8/14/2019 9_High Accuracy Fixed-Width Booth Multipliers With
High Accuracy Fixed-width Booth Multipliers with
Probabilistic Estimation Compensated Method
Yuan-Ho Chen1, Hsin-Chen Chiang2, Tsin-Yuan Chang2, Chih-Wen Lu1, and Pei-Yi Lai Li2
1Department of Engineering and System Science, 2Department of Electrical Engineering,
National Tsing Hua University, Hsinchu 30013, Taiwan, R.O.C.
Email: [email protected]
Abstract: In this research, a probabilistic estimation compensation (PEC) method for fixed-width Booth multipliers is proposed. From a probabilistic analysis of the truncation part, a formula is obtained that yields the compensation value directly. For long bit widths, the PEC method is implemented by a simple compensation circuit and achieves high accuracy without exhaustive simulation. Compared with previous works, the proposed method achieves better accuracy. To verify the performance of PEC multipliers in a real application, the method is implemented in an 8 × 8 two-dimensional (2-D) discrete cosine transform (DCT). The results show that the proposed PEC method saves 23% of the area with a 4 dB peak signal-to-noise ratio (PSNR) penalty.
I. INTRODUCTION
Fixed-width multipliers are important in digital signal
processing (DSP) systems. In many applications, it is desirable
to keep the same width for the basic arithmetic operations.
For this reason, and to reduce circuit area, a fixed-width
multiplier keeps only the most significant half of the product.
Doing so produces a large error, so many compensation methods
have been proposed to address this problem [1]-[3]. Compared
with the traditional multiplier, the Booth multiplier reduces
the number of partial product rows and therefore performs
better as a fixed-width multiplier. Recently, many researchers
have worked to reduce the truncation error of Booth
multipliers [4]-[9].
Jou et al. present a statistical analysis in [4] that reduces
the hardware complexity, but it does not achieve a low
compensation error. In contrast, the compensation error can
be reduced, at the cost of more hardware, by using a threshold
value to adjust the compensation value [5]. Cho et al. in [6]
use more information from the Booth encoder to improve the
accuracy. In [7], Wang et al. slightly modify the partial
product rows of the Booth multiplication and derive an error
compensation function. Although this function obtains smaller
mean and mean-square errors, it relies on time-consuming
simulation, so the bit width of the application has to be
fixed. Therefore, a probabilistic estimation bias (PEB) was
proposed in [9] to compensate the truncation value without
exhaustive simulation.
This research uses the modified partial product rows of [10]
and proposes a probabilistic estimation compensation (PEC)
method that improves accuracy for long bit widths and reduces
the complexity, and hence the area, of the circuit. The PEC
formula is obtained by an expected value analysis. Thus, the
compensation value follows directly from the formula and the
bit width and can be calculated by hand. The time needed to
establish the compensation is therefore reduced, especially
for long bit widths. As a result, the proposed PEC circuit
achieves a high-accuracy, low-complexity Booth multiplier.

Fig. 1. Traditional 8 × 8 Booth multiplier. (Partial product array for inputs m7..m0 and n3..n0: rows p_{i,8}..p_{i,0} with complement bits s0..s3, sign-extension bits, and a rounding bit 1; product bits P15..P0. Columns P15..P8 form the main part (MP); columns P7..P0 form the truncation part (TP), split into TPmajor and TPminor.)

Fig. 2. Modified partial product rows and sign extension. (Same array as Fig. 1 with the last row modified: s3 and p3,0 are merged in advance, and the sign-extension bits of the first row are replaced by w2, w1, w0. The TPminor columns alternate between the groups TPso and TPse.)
This paper is organized as follows. Section II briefly
describes the modified Booth multiplier and the regular
expected values. Section III presents the compensation values
and the circuit implementation. Section IV compares the
accuracy of different compensation methods. Finally,
conclusions are drawn in Section V.
II. FIXED-WIDTH MODIFIED BOOTH MULTIPLIER
Modified Booth encoding is an effective method to reduce
the partial product rows. The two x-bit signed numbers M and N and the 2x-bit product P can be expressed in two's complement representation as below:

$$M = -m_{x-1}2^{x-1} + \sum_{i=0}^{x-2} m_i 2^i$$

$$N = -n_{x-1}2^{x-1} + \sum_{i=0}^{x-2} n_i 2^i$$

$$P = M \times N. \tag{1}$$

TABLE I
BOOTH ENCODING

n_{2i+1}   n_{2i}   n_{2i-1}   n'_i   s_i
   0         0         0         0     0
   0         0         1         1     0
   0         1         0         1     0
   0         1         1         2     0
   1         0         0        -2     1
   1         0         1        -1     1
   1         1         0        -1     1
   1         1         1         0     0
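The mapping in Table I can be sketched as a small lookup. The following Python is a minimal illustration rather than the authors' implementation; the function name `booth_radix4_digits` and the convention that the bit below position 0 is 0 are assumptions made for this sketch.

```python
# Radix-4 (modified) Booth recoding following Table I. Illustrative sketch.
# Keys are (n_{2i+1}, n_{2i}, n_{2i-1}); values are (encoded digit, s_i).
BOOTH_TABLE = {
    (0, 0, 0): (0, 0),
    (0, 0, 1): (1, 0),
    (0, 1, 0): (1, 0),
    (0, 1, 1): (2, 0),
    (1, 0, 0): (-2, 1),
    (1, 0, 1): (-1, 1),
    (1, 1, 0): (-1, 1),
    (1, 1, 1): (0, 0),
}


def booth_radix4_digits(n, x):
    """Return [(digit, s), ...] for an x-bit two's complement integer n.

    The list has Q = x / 2 entries, least significant digit first.
    Assumption: the bit below position 0 (n_{-1}) is taken as 0.
    """
    if x <= 0 or x % 2 != 0:
        raise ValueError("x must be a positive even width")
    if not -(1 << (x - 1)) <= n < (1 << (x - 1)):
        raise ValueError("n does not fit in x bits")
    bits = n & ((1 << x) - 1)          # two's complement bit pattern
    digits = []
    prev = 0                            # n_{2i-1} for i = 0
    for i in range(x // 2):
        b0 = (bits >> (2 * i)) & 1      # n_{2i}
        b1 = (bits >> (2 * i + 1)) & 1  # n_{2i+1}
        digits.append(BOOTH_TABLE[(b1, b0, prev)])
        prev = b1
    return digits
```

For example, `booth_radix4_digits(7, 4)` returns `[(-1, 1), (2, 0)]`, and weighting the digits by powers of 4 gives -1 + 2·4 = 7, so the Q digits reproduce the original operand.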
The Booth encoder maps three successive bits $n_{2i+1}$, $n_{2i}$, and $n_{2i-1}$ into $n'_i$, as tabulated in Table I. For an even width x, there are Q = x/2 partial product rows after Booth encoding.
Taking the 8 × 8 Booth multiplier as an example, the calculation process is shown in Fig. 1, where $s_i$ is the complement bit: $s_i$ is 1 when $n'_i$ is negative, and 0 otherwise. The constant 1 is a rounding bit that brings the result closer to the correct answer. After all of the
calculation is finished, the least significant 8 bits are truncated with rounding for better accuracy. To achieve better performance, we
use the method proposed in [7] to modify the last
partial product row, as shown in Fig. 2. First, $s_3$ and $p_{3,0}$ are summed in advance to generate a sum $\lambda_{3,0}$ and a carry $C_{x0}$ at
the (x-2)th and (x-1)th bit positions, respectively. Then
the carry $C_{x0}$ and the 1 are added to generate a sum $\theta$ and a carry $C_{x1}$, which is added to the first of the $p_{0,8}$ bits to generate
$w_0$, $w_1$, and $w_2$:

$$\{C_{x0}, \lambda_{3,0}\} = p_{3,0} + s_3, \qquad \{C_{x1}, \theta\} = C_{x0} + 1$$

$$\{w_2, w_1, w_0\} = \{p_{0,8}, p_{0,8}, p_{0,8}\} + C_{x1} \tag{2}$$
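The two single-bit pre-additions in (2) behave like half adders. The sketch below illustrates only that arithmetic; the names `half_add`, `premerge_last_row`, `sum0`, `carry0`, `sum1`, and `carry1` are hypothetical labels for this example (they stand for the two sum and carry outputs of (2)), and the sketch does not model the final addition that produces w0, w1, and w2.

```python
def half_add(a, b):
    """Add two single bits; return (carry, sum)."""
    return a & b, a ^ b


def premerge_last_row(p30, s3):
    """Single-bit pre-additions of (2) for the last partial product row.

    carry0, sum0: result of p3,0 + s3
    carry1, sum1: result of carry0 + 1 (the 1 is the rounding bit)
    """
    carry0, sum0 = half_add(p30, s3)
    carry1, sum1 = half_add(carry0, 1)
    return carry0, sum0, carry1, sum1


if __name__ == "__main__":
    for p30 in (0, 1):
        for s3 in (0, 1):
            print(p30, s3, premerge_last_row(p30, s3))
```

Because one operand of the second addition is the constant 1, its carry always equals `carry0` and its sum is the inverse of `carry0`, which suggests that this stage needs little more than an inverter.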
The product in (1) can be expressed in two parts: the main
part (MP) and the truncation part (TP). In the 8 × 8 example, the MP comprises the eight
most significant columns (MSCs) and the TP comprises the eight least
significant columns (LSCs), so P can be rewritten as below:

$$P = MP + TP. \tag{3}$$

The TP is omitted, and a compensation bias $\sigma$ based on a
probabilistic estimation is added to the MP. Therefore, the
quantized product $P_q$ can be written as below:

$$P \approx P_q = MP + \sigma \times 2^{x}. \tag{4}$$
where $\sigma$, representing the PEC, can be decomposed further into $TP_{major}$ and $TP_{minor}$ parts as in the following equation:

$$\sigma = (TP_{major} + TP_{minor})_R \tag{5}$$

where $(\cdot)_R$ denotes rounding to the nearest integer. The product is affected much more by $TP_{major}$ than by $TP_{minor}$ because of the
weight of its bit position. We compute $TP_{major}$ exactly and derive $TP_{minor}$ by probabilistic estimation. By modifying the last partial product row, we can estimate
more accurately because the number of unknowns in $TP_{minor}$ decreases. Therefore, the compensation value is obtained by calculating
$TP_{major}$ and estimating $TP_{minor}$. The $TP_{major}$ and $TP_{minor}$ in (5) are given below:

$$TP_{major} = \frac{1}{2}\left(p_{0,x-1} + p_{1,x-3} + \cdots + p_{Q-1,1} + \theta\right) = \frac{1}{2}\left(\sum_{j=0}^{Q-1} p_{j,x-1-2j} + \theta\right) \tag{6}$$

$$TP_{minor} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right). \tag{7}$$

TABLE III
THE CALCULATION OF ERROR COMPENSATED VALUE

              x = 8    x = 10   x = 12   x = 16   x = 32
(3Q-2)/16     0.625    0.8125   1.0      1.375    2.125
I             0        0        1        1        2
D             1        1        0        0        0
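As a numeric illustration of the decomposition in (3) and the quantized product in (4), the sketch below splits an exact product into a main part and a truncation part. It is a simplification under stated assumptions: TP is taken here as the low x bits of the final product, whereas in the paper TP is the sum of the truncated partial-product columns; the function names and the `ideal_compensation` helper (rounding half up) are hypothetical.

```python
def split_product(m, n, x):
    """Return (mp, tp) with m * n == mp + tp and 0 <= tp < 2**x.

    mp keeps its weight, i.e. it is a multiple of 2**x, as in P = MP + TP.
    """
    p = m * n
    mp = (p >> x) << x      # arithmetic shift: floor division by 2**x
    tp = p - mp
    return mp, tp


def ideal_compensation(tp, x):
    """Round tp / 2**x to the nearest integer (half rounds up).

    Assumption: this uses full knowledge of tp, which a fixed-width
    multiplier does not have; it only shows what the bias approximates.
    """
    return (tp + (1 << (x - 1))) >> x


def quantized_product(mp, compensation, x):
    """Quantized product in the form of (4): MP plus compensation * 2**x."""
    return mp + compensation * (1 << x)


if __name__ == "__main__":
    x = 8
    mp, tp = split_product(-77, 53, x)
    pq = quantized_product(mp, ideal_compensation(tp, x), x)
    print(mp, tp, pq, abs(-77 * 53 - pq))
```

With full knowledge of TP the error never exceeds 2^(x-1); the compensation methods compared in this paper try to approach that bound while generating only part of the truncated columns.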
III. PROPOSED PROBABILISTIC ESTIMATION COMPENSATION (PEC)
A. Derivation of error-compensation value
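We divide $TP_{minor}$ into two groups. One is the sum of odd columns $TP_{so}$, and the other is the sum of even columns $TP_{se}$, as shown in Fig. 2.

$$TP_{so} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{16}\left(p_{0,x-4} + \cdots + p_{Q-2,0} + s_{Q-2}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right) \tag{8}$$

$$TP_{se} = \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \frac{1}{32}\left(p_{0,x-5} + \cdots + p_{Q-3,1}\right) + \cdots + 2^{-x+1}\,p_{0,1} \tag{9}$$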
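The column structure behind (8) and (9) can be enumerated programmatically. The sketch below is an illustration inferred from those two equations and Fig. 2, not code from the paper: bit p_{i,j} is assumed to sit at bit position j + 2i, the complement bit s_i at position 2i, and the pre-merged bit of the last row (labelled `merged` here, a hypothetical name) at position x - 2. The function names are also assumptions.

```python
def tp_minor_columns(x):
    """Map each bit position 0 .. x-2 to the labels of the bits in it.

    Assumes an even width x with Q = x / 2 partial product rows and the
    modified last row of Fig. 2.
    """
    q = x // 2
    columns = {c: [] for c in range(x - 1)}
    for i in range(q - 1):                    # rows 0 .. Q-2
        for j in range(x - 1 - 2 * i):        # keep j + 2i <= x - 2
            columns[j + 2 * i].append(f"p{i},{j}")
        columns[2 * i].append(f"s{i}")        # complement bit of row i
    columns[x - 2].append(f"merged{q - 1},0")  # pre-added last-row bit
    return columns


def column_group(c):
    """Return 'so' for the columns summed in (8) and 'se' for those in (9).

    With x even, the weights 1/4, 1/16, ... fall on even bit positions.
    """
    return "so" if c % 2 == 0 else "se"


if __name__ == "__main__":
    x = 8
    for c, bits in sorted(tp_minor_columns(x).items(), reverse=True):
        print(f"weight 2^-{x - c} ({column_group(c)}):", " + ".join(bits))
```

For x = 8 the first printed line lists p0,6, p1,4, p2,2 and the merged bit at weight 2^-2, matching the first bracket of (8).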
In the regular expected value analysis of Booth multipliers
[9], the expected values of the partial products are obtained as
shown in Table II, where E{·} denotes the expected value of its argument.

TABLE II
EXPECTED VALUE OF EACH PARTIAL PRODUCT

E{p_{i,j} = 1}   E{s_i = 1}   E{p_{0,0} = 1}   E{s_0 = 1}   E{λ_{3,0} = 1}
(i, j ≠ 0)       (i ≠ 0)
3/8              3/8          1/2              1/2          1/4

Fig. 3. Expected value of truncation part. (The TPmajor column holds p0,7, p1,5, p2,3, and p3,1; every TPminor entry is replaced by its expected value from Table II: 3/8 for ordinary bits, 1/2 for p0,0 and s0, and 1/4 for the merged bit of the last row.)

The expected value of $TP_{minor}$ shown in Fig. 3 can be derived as below:

$$E[TP_{minor}] = E[TP_{so} + TP_{se}] = \frac{3}{8}\sum_{n=1}^{Q-1}\left(\frac{3}{2}Q + 2 - \frac{3}{2}n\right)2^{-2n} - \frac{1}{8} + 2^{-x} = \frac{3Q-2}{16} + 2^{-2(Q+1)} \tag{10}$$

We keep only the first term, $(3Q-2)/16$, because $2^{-2(Q+1)}$ is negligible compared with it. The
longer the width x, the more pronounced this is. As a result, the value of $TP_{minor}$ can be approximated as below:

$$E[TP_{minor}] \approx \left(\frac{3Q-2}{16}\right)_R = (I.d)_R = I + (d)_R = I + \left(\frac{D}{2}\right)_R \tag{11}$$
where I is the integer part and d is the fractional part of
$(3Q-2)/16$, and D is a binary number (0 or 1) used for rounding. The values of I and D for different widths are calculated and tabulated in Table III. We carry the term D/2 from (11) into (5) and round it together with $TP_{major}$, rather than rounding $(3Q-2)/16$ on its own, because the
approximation is then more accurate. The PEC formula is summarized as below:

$$\sigma = \left(TP_{major} + \frac{3Q-2}{16}\right)_R = \left(TP_{major} + \frac{D}{2}\right)_R + I \tag{12}$$
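The closed form in (10) and the constants I and D can be checked with exact rational arithmetic. The sketch below is illustrative only. It assumes the column layout implied by (8) and (9) (bit p_{i,j} at position j + 2i, s_i at position 2i, the merged last-row bit at position x - 2), uses the expected values of Table II, and assumes that D is 1 when the fractional part d is at least 1/2 and 0 otherwise, a rule chosen because it reproduces the x = 8 to x = 16 columns of Table III. All function names are hypothetical.

```python
from fractions import Fraction
import math

# Expected values taken from Table II.
E_P = Fraction(3, 8)        # ordinary partial product bit
E_S = Fraction(3, 8)        # complement bit s_i, i != 0
E_P00 = Fraction(1, 2)      # p0,0
E_S0 = Fraction(1, 2)       # s0
E_MERGED = Fraction(1, 4)   # pre-added bit of the last row


def expected_tp_minor(x):
    """Sum the expected values of all TPminor bits, column by column."""
    q = x // 2
    total = Fraction(0)
    for i in range(q - 1):                       # rows 0 .. Q-2
        for j in range(x - 1 - 2 * i):
            e = E_P00 if (i, j) == (0, 0) else E_P
            total += e / 2 ** (x - (j + 2 * i))  # weight 2^-(x - position)
        total += (E_S0 if i == 0 else E_S) / 2 ** (x - 2 * i)
    return total + E_MERGED / 4                  # merged bit has weight 1/4


def closed_form(x):
    """Right-hand side of (10)."""
    q = x // 2
    return Fraction(3 * q - 2, 16) + Fraction(1, 2 ** (2 * (q + 1)))


def compensation_constants(x):
    """Return (I, D) for width x. Assumed rule: D = 1 if d >= 1/2 else 0."""
    value = Fraction(3 * (x // 2) - 2, 16)
    i_part = math.floor(value)
    d = value - i_part
    return i_part, (1 if d >= Fraction(1, 2) else 0)


if __name__ == "__main__":
    for x in (8, 10, 12, 16):
        print(x, expected_tp_minor(x) == closed_form(x),
              compensation_constants(x))
```

Under these assumptions the column-by-column sum agrees with (10) for every even width tried, and the constants match Table III for x = 8, 10, 12, and 16. For x = 32 the expression (3Q-2)/16 with Q = 16 evaluates to 46/16 = 2.875, whereas Table III lists 2.125, so the sketch is checked only against the first four columns.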
B. Proposed PEC circuit
The circuit is easy to implement with full adders (FAs)
and half adders (HAs), as shown in Fig. 4. I and D are also
easily obtained from the derived formula. The I part is added
directly to the P8 column, and the D part to the P7 column.
To demonstrate a long bit-width implementation, the
proposed 32-bit PEC compensation circuit is designed as shown in Fig. 5,
where I = 2 and D = 0 as listed in Table III. Therefore,
the fixed-width Booth multiplier can be easily implemented by
using the proposed PEC compensation circuit for long bit-width
applications.

Fig. 4. The Proposed PEC 8 × 8 Booth multiplier. (Tree of FAs and HAs followed by an RCA that produces P15..P8; the compensation inputs are D = 1 and I = 0, and the TPmajor bits p0,7, p1,5, p2,3, and p3,1 feed the compensation adders.)

Fig. 5. The proposed 32-b PEC compensation circuit. (FAs sum the TPmajor bits p0,31, p1,29, p2,27, p3,25, p4,23, ..., p15,1 with I = 2 and produce the carries c0, c1, ..., c8, c9, c10.)
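The behaviour of the compensation circuit can be sketched at the arithmetic level: the bits in the TPmajor column each carry weight 1/2, D is added at the same weight (the P7 column), and I is added at weight 1 (the P8 column). The sketch below follows (12) and is not a gate-level model of Fig. 4 or Fig. 5. The function names, the round-half-up convention for the rounding operator, and the rule D = 1 when the fractional part of (3Q-2)/16 is at least 1/2 are assumptions.

```python
from fractions import Fraction
import math


def round_half_up(value):
    """Round to the nearest integer; exact halves round up (assumed)."""
    return math.floor(value + Fraction(1, 2))


def constants(q):
    """Return (I, D) from (3Q-2)/16. Assumed rule: D = 1 if d >= 1/2."""
    value = Fraction(3 * q - 2, 16)
    i_part = math.floor(value)
    return i_part, (1 if value - i_part >= Fraction(1, 2) else 0)


def pec_compensation(major_bits, i_part, big_d):
    """Right-hand form of (12): round(TPmajor + D/2) + I."""
    tp_major = Fraction(sum(major_bits), 2)
    return round_half_up(tp_major + Fraction(big_d, 2)) + i_part


def pec_compensation_direct(major_bits, q):
    """Left-hand form of (12): round(TPmajor + (3Q-2)/16)."""
    tp_major = Fraction(sum(major_bits), 2)
    return round_half_up(tp_major + Fraction(3 * q - 2, 16))


if __name__ == "__main__":
    q = 4                                   # x = 8
    i_part, big_d = constants(q)            # (0, 1)
    bits = [1, 0, 1, 1, 0]                  # four p bits plus the sum bit
    print(pec_compensation(bits, i_part, big_d),
          pec_compensation_direct(bits, q))
```

With these conventions the two forms of (12) agree for every count of set bits, which is why only the two constants I and D need to be wired into the adder columns.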
IV. COMPARISONS AND DISCUSSIONS
A. Comparison with other multipliers
To compare accuracy, we use the absolute average error,
defined as below:

$$\varepsilon = \mathrm{Avg}\{|P - P_q|\} \tag{13}$$
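The metric in (13) can be evaluated exhaustively for small widths. The sketch below is a generic harness, not the evaluation code behind Table IV: `average_absolute_error` and `truncate_product` are hypothetical names, and the example quantizer simply clears the low x bits of the exact product, which differs from the Direct-T multiplier of Table IV (that design drops the truncated partial-product columns before they are summed).

```python
from itertools import product


def average_absolute_error(x, quantize):
    """Average |P - Pq| over all pairs of x-bit two's complement inputs.

    quantize(m, n, x) must return the quantized product Pq as an integer
    at full scale (a multiple of 2**x for a fixed-width design).
    """
    lo, hi = -(1 << (x - 1)), 1 << (x - 1)
    total = 0
    count = 0
    for m, n in product(range(lo, hi), repeat=2):
        total += abs(m * n - quantize(m, n, x))
        count += 1
    return total / count


def truncate_product(m, n, x):
    """Example quantizer (assumption): clear the low x bits of m * n."""
    return ((m * n) >> x) << x


if __name__ == "__main__":
    print(average_absolute_error(6, truncate_product))
```

Any other quantizer, for example a bit-accurate model of a compensated multiplier, can be passed in the same way, so the percentages of one method relative to another follow from two calls.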
The comparison of $\varepsilon$ for different methodologies is shown
in Table IV. The Direct-T entry in Table IV truncates
the least significant x bits directly without any calculation.
Thus, the Direct-T multiplier is the least accurate of the fixed-width
multipliers, and we express the accuracy as a percentage
normalized to the Direct-T method. The gate counts
(Gc) are also important for performance and are shown
in Table V. The P-T entry in Table V truncates the least
significant x bits after all of the calculation, which gives the
largest area among the fixed-width multiplier designs. Therefore, the
percentages of Gc are normalized to the Gc of the P-T Booth
multiplier. The accuracy of the proposed PEC is much better
than that of [4] and [6] with about the same Gc. Although the errors of
[5] and [7] are smaller than ours, their area is larger. Compared with
the other approaches, the proposed PEC also has the advantage
of a short time to establish the compensation for long bit-width applications.

TABLE IV
COMPARISONS OF ABSOLUTE AVERAGE ERROR WITH OTHER METHODS

Methods            x = 8     x = 10    x = 12    x = 16
Direct-T           100%      100%      100%      100%
                   (384)     (1920)    (9216)    (196608)
Jou et al. [4]     27.85%    24.85%    22.61%    19.49%
Cho et al. [5]     22.00%    18.26%    15.85%    12.70%
Song et al. [6]    26.84%    24.48%    22.49%    19.48%
Wang et al. [7]    20.14%    17.05%    14.75%    11.92%
PEB [9]            23.10%    21.15%    17.95%    15.90%
PEC                22.42%    18.56%    18.70%    14.02%
TABLE V
COMPARISONS OF GATE COUNTS Gc WITH OTHER METHODS

Methods            x = 8     x = 10    x = 12    x = 16
P-T                100%      100%      100%      100%
                   (655)     (991)     (1394)    (2406)
Jou et al. [4]     57%       56%       55%       54%
Cho et al. [5]     65%       62%       60%       58%
Song et al. [6]    58%       57%       56%       55%
Wang et al. [7]    64%       62%       60%       59%
PEB [9]            58%       57%       56%       55%
PEC                59%       57%       55%       54%
B. Application of DCT
To verify the performance of the proposed PEC
method in a real application, the proposed PEC is implemented
in a two-dimensional (2-D) DCT [11]. The peak signal-to-noise
ratio (PSNR) is an important measure of the accuracy
of a DCT core. We chose five test images
for the comparison. Each consists of 512 × 512 pixels
with 8-bit (256 gray level) data per pixel. Table VI shows
the comparison of the PSNR and the gate counts (Gc).
Compared with the P-T Booth multiplier, the proposed PEC
saves 23% of Gc with a 4 dB penalty. Furthermore, compared with the Direct-T Booth multiplier, the PEC design improves the PSNR by 17.5 dB at a Gc overhead of only 2%.
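The paper reports PSNR values without stating the formula used. The sketch below assumes the common definition for 8-bit data, PSNR = 10·log10(255² / MSE), and takes flat lists of pixel values; the function name `psnr_8bit` is hypothetical and the result is not guaranteed to match the authors' measurement setup.

```python
import math


def psnr_8bit(reference, test):
    """PSNR in dB between two equally long sequences of 8-bit pixel values.

    Assumes the usual definition with a peak value of 255. Returns
    math.inf when the two sequences are identical.
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(255 ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    degraded = [11, 19, 31, 39]      # every pixel off by one gray level
    print(round(psnr_8bit(original, degraded), 2))
```

Under this definition, an error of one gray level on every pixel corresponds to about 48.13 dB, which gives a sense of scale for the 52 dB and 56 dB figures in Table VI.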
Furthermore, the RTL design of the 2-D DCT with four PEC multipliers
is synthesized with the Synopsys Design Compiler,
and the Cadence SOC Encounter is adopted for placement and
routing (P&R). Implemented in a TSMC 0.18-μm CMOS
process, the 8 × 8 2-D DCT core operates at 55 MHz and
consumes 11.8 mW. The core layout and simulated
characteristics are shown in Fig. 6.
TABLE VI
COMPARISONS OF ACCURACY AND GATE COUNTS Gc IN DCT APPLICATIONS

                 P-T       Direct-T   PEC
PSNR (dB)
  Lena           56.1      34.6       52.0
  Baboon         56.0      34.6       52.1
  Peppers        56.1      34.6       52.0
  Elain          56.1      34.6       52.1
  Barb           56.1      34.6       52.1
  Average        56.1      34.6       52.1
Gc               22.3K     16.7K      17.2K
                 (100%)    (75%)      (77%)

Fig. 6. The core layout and characteristic of 2-D DCT. (Layout blocks: Shift-Register Array and 1-D DCT Kernel.)

Characteristic
Technology       0.18 μm
Supply power     1.8 V
Die size         532 μm x 532 μm
Gate Count       17.2 K
Max Freq.        55 MHz
Power            11.8 mW @ 55 MHz
V. CONCLUSION
A simple, high-accuracy PEC Booth multiplier is
proposed in this research. Because the compensation value is derived
by probability, the time spent on exhaustive
simulation is avoided for long bit-width multiplication. The experimental
results demonstrate that the proposed PEC Booth multiplier
achieves smaller area than [5] and [7] and higher accuracy
than [4] and [6].
ACKNOWLEDGMENT
The authors would like to thank the National Chip Implementation
Center (CIC), Taiwan, for providing the electronic
design automation tools. This work was supported in part
by National Science Council under project number
NSC-100-2221-E-007-092.
REFERENCES
[1] L. D. Van and C. C. Yang, "Generalized low-error area-efficient fixed-width multipliers," IEEE Trans. Circuits Syst. I, vol. 52, no. 8, pp. 1608-1619, Aug. 2005.
[2] C. H. Chang and R. K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE Trans. VLSI Syst., vol. 18, no. 12, pp. 1767-1771, Dec. 2010.
[3] N. Petra, D. D. Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Truncated binary multipliers with variable correction and minimum mean square error," IEEE Trans. Circuits Syst. I, vol. 57, no. 6, pp. 1312-1325, Jun. 2010.
[4] S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, vol. 50, no. 11, pp. 1470-1474, Nov. 2003.
[5] K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522-531, May 2004.
[6] M. A. Song, L. D. Van, and S. Y. Kuo, "Adaptive low-error fixed-width Booth multipliers," IEICE Trans. Fundamentals, vol. E90-A, no. 6, pp. 1180-1187, Jun. 2007.
[7] J. P. Wang, S. R. Kuang, and S. C. Liang, "High-accuracy fixed-width modified Booth multipliers for lossy applications," IEEE Trans. VLSI Syst., vol. 19, no. 1, pp. 52-60, Jan. 2011.
[8] Y. H. Chen, T. Y. Chang, and R. Y. Jou, "A statistical error-compensated Booth multiplier and its DCT applications," in Proc. IEEE Region 10 Conf., 2010, pp. 1146-1149.
[9] C. Y. Li, Y. H. Chen, T. Y. Chang, and J. N. Chen, "A probabilistic estimation bias circuit for fixed-width Booth multiplier and its DCT applications," IEEE Trans. Circuits Syst. II, vol. 58, no. 4, pp. 215-219, Apr. 2011.
[10] S. R. Kuang, J. P. Wang, and C. Y. Guo, "Modified Booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, vol. 56, no. 5, pp. 404-408, May 2009.
[11] S. C. Hsia and S. H. Wang, "Shift-register-based data transposition for cost-effective discrete cosine transform," IEEE Trans. VLSI Syst., vol. 15, no. 6, pp. 725-728, Jun. 2007.