
High Accuracy Fixed-Width Booth Multipliers with Probabilistic Estimation Compensated Method

Yuan-Ho Chen¹, Hsin-Chen Chiang², Tsin-Yuan Chang², Chih-Wen Lu¹, and Pei-Yi Lai Li²

¹Department of Engineering and System Science, ²Department of Electrical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan, R.O.C.

Email: [email protected]

Abstract: This paper proposes a probabilistic estimation compensation (PEC) method for fixed-width Booth multipliers. A probabilistic analysis of the truncation part yields a formula from which the compensation value is easily calculated. For long bit widths, the PEC method is implemented with a simple compensation circuit and achieves high accuracy without exhaustive simulation. Compared with previous works, the proposed method achieves better accuracy. To verify the performance of PEC multipliers in a real application, the method is implemented in an 8 × 8 two-dimensional (2-D) discrete cosine transform (DCT). The results show that the proposed PEC method saves 23% of the area with a 4 dB peak signal-to-noise ratio (PSNR) penalty.

I. INTRODUCTION

Fixed-width multipliers are important in digital signal processing (DSP) systems. In many applications, it is desirable to keep the same bit width across the basic arithmetic operations. For this reason, and to reduce circuit area, a fixed-width multiplier keeps only the most significant half of the product. Doing so introduces a large error, so many compensation methods have been proposed to address this problem [1]-[3]. Compared with a traditional multiplier, the Booth multiplier reduces the number of partial product rows and therefore performs better as a fixed-width multiplier. Recently, much research has focused on reducing the truncation error of Booth multipliers [4]-[9].

Jou et al. [4] present a statistical analysis that reduces hardware complexity, but it does not achieve a low compensation error. In contrast, the compensation error can be reduced, at the cost of more hardware, by using a threshold value to adjust the compensation value [5]. Cho et al. [6] use more information from the Booth encoder to improve accuracy. In [7], Wang et al. slightly modify the partial product rows of the Booth multiplication and derive an error compensation function. Although this function yields smaller mean and mean-square errors, the bit width of the application has to be fixed because of the time-consuming simulation. Therefore, a probabilistic estimation bias (PEB) was proposed in [9] to compensate for the truncated value without exhaustive simulation.

This research adopts the modified partial product rows of [10] and proposes a probabilistic estimation compensation (PEC) method that improves accuracy for long bit widths and reduces the complexity, and hence the area, of the circuit. The PEC formula is obtained by an expected value analysis, so the compensation value for a given bit width can be calculated directly by hand from the formula. The time needed to establish the compensation value is therefore reduced, especially for long bit widths. As a result, the proposed PEC circuit achieves a high-accuracy, low-complexity Booth multiplier.

Fig. 1. Traditional 8 × 8 Booth multiplier. [Figure: the multiplicand bits m7 to m0, the four Booth-encoded multiplier digits, the partial product bits p0,j to p3,j with their sign-extension bits pi,8, the complement bits s0 to s3, and the rounding bit 1. Product bits P15 to P8 form the main part (MP); P7 to P0 form the truncation part (TP), which is divided into TPmajor and TPminor.]

Fig. 2. Modified partial product rows and sign extension. [Figure: the array of Fig. 1 with the last row modified so that p3,0 and s3 are merged into a single sum bit, and with the sign-extension bits w2, w1, and w0. The TPminor columns are labelled alternately TPso and TPse.]

This paper is organized as follows. Section II briefly describes the modified Booth multiplier and the regular expected values. Section III presents the compensation values and the circuit implementation. Section IV compares the accuracy of different compensation methods. Finally, conclusions are drawn in Section V.

II. FIXED-WIDTH MODIFIED BOOTH MULTIPLIER

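Modified Booth encoding is an effective method to reduce the number of partial product rows. The two $x$-bit signed numbers $M$ and $N$ and their $2x$-bit product $P$ can be expressed in two's complement representation as below:

$$M = -m_{x-1}2^{x-1} + \sum_{i=0}^{x-2} m_i 2^i$$

$$N = -n_{x-1}2^{x-1} + \sum_{i=0}^{x-2} n_i 2^i$$

$$P = M \times N. \tag{1}$$

TABLE I
BOOTH ENCODING

| $n_{2i+1}$ | $n_{2i}$ | $n_{2i-1}$ | $n'_i$ | $s_i$ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 |
| 0 | 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 2 | 0 |
| 1 | 0 | 0 | -2 | 1 |
| 1 | 0 | 1 | -1 | 1 |
| 1 | 1 | 0 | -1 | 1 |
| 1 | 1 | 1 | 0 | 0 |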

The Booth encoder maps three successive bits $n_{2i+1}$, $n_{2i}$, and $n_{2i-1}$ into $n'_i$, as tabulated in Table I. For an even width $x$, Booth encoding produces $Q = x/2$ partial product rows. Taking the 8 × 8 Booth multiplier as an example, the calculation process is shown in Fig. 1, where $s_i$ is the complement bit: $s_i$ is 1 when $n'_i$ is negative and 0 otherwise. The constant 1 is a rounding bit that brings the truncated result closer to the correct answer. After the whole calculation is finished, the least significant 8 bits are truncated, with rounding for better accuracy. To achieve better performance, we use the method proposed in [7] to modify the last row of partial products, as shown in Fig. 2. First, $s_3$ and $p_{3,0}$ are summed in advance to generate a sum bit, denoted here by $\lambda_{3,0}$, and a carry $C_{x0}$ at the $(x-2)$th and $(x-1)$th bit positions, respectively. Then the carry $C_{x0}$ and the rounding bit 1 are added to generate a sum bit, denoted here by $\theta$, and a carry $C_{x1}$, which is added to the first of the $p_{0,8}$ bits to generate $w_0$, $w_1$, and $w_2$:

$$\{C_{x0}, \lambda_{3,0}\} = p_{3,0} + s_3, \qquad \{C_{x1}, \theta\} = C_{x0} + 1$$

$$\{w_2, w_1, w_0\} = \{p_{0,8}, p_{0,8}, p_{0,8}\} + C_{x1} \tag{2}$$
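The mapping in Table I can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `booth_digits` is hypothetical, and the digit signs follow the standard radix-4 Booth rule $n'_i = -2n_{2i+1} + n_{2i} + n_{2i-1}$, which is assumed here to be what Table I tabulates.

```python
def booth_digits(n: int, x: int) -> list[tuple[int, int]]:
    """Radix-4 Booth recoding of an x-bit two's complement number.

    Returns one (digit, s) pair per group, least significant group first.
    """
    if x % 2 != 0:
        raise ValueError("x must be even")
    u = n & ((1 << x) - 1)                          # two's complement bit pattern
    bits = [0] + [(u >> j) & 1 for j in range(x)]   # bits[0] plays the role of n_{-1} = 0
    pairs = []
    for i in range(x // 2):
        lo, mid, hi = bits[2 * i], bits[2 * i + 1], bits[2 * i + 2]
        digit = -2 * hi + mid + lo                  # value in {-2, -1, 0, 1, 2}
        s = 1 if digit < 0 else 0                   # complement bit, as in Table I
        pairs.append((digit, s))
    return pairs
```

For example, `booth_digits(6, 4)` gives `[(-2, 1), (2, 0)]`, and indeed $-2 + 2 \cdot 4 = 6$. An $x$-bit multiplier always yields $Q = x/2$ digits, one per partial product row.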

The product in (1) can be expressed as two parts: the main part (MP) and the truncation part (TP). In the 8 × 8 example, the MP comprises the eight most significant columns (MSCs) and the TP comprises the eight least significant columns (LSCs), so $P$ can be rewritten as below:

$$P = MP + TP. \tag{3}$$

The TP is omitted, and a compensation bias based on a probabilistic estimation is added to the MP. The quantized product $P_q$ can therefore be written as below:

$$P \approx P_q = MP + \sigma \cdot 2^x, \tag{4}$$

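where $\sigma$, representing the PEC, can be decomposed further into $TP_{major}$ and $TP_{minor}$ parts as in the following equation:

$$\sigma = (TP_{major} + TP_{minor})_R \tag{5}$$

where $(\cdot)_R$ denotes rounding to the nearest integer. Because of the bit-position weights, the product is affected much more by $TP_{major}$ than by $TP_{minor}$. We therefore compute $TP_{major}$ exactly and estimate $TP_{minor}$ probabilistically. Modifying the last row of partial products reduces the number of unknowns in $TP_{minor}$, which makes the estimate more accurate. The compensation value is thus obtained by calculating $TP_{major}$ and estimating $TP_{minor}$. With $\theta$ and $\lambda_{3,0}$ denoting the sum bits generated in (2), $TP_{major}$ and $TP_{minor}$ in (5) are given below:

$$TP_{major} = \frac{1}{2}\left(p_{0,x-1} + p_{1,x-3} + \cdots + p_{Q-1,1} + \theta\right) = \frac{1}{2}\left(\sum_{j=0}^{Q-1} p_{j,x-1-2j} + \theta\right) \tag{6}$$

$$TP_{minor} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right). \tag{7}$$

TABLE III
THE CALCULATION OF ERROR COMPENSATED VALUE

| | $x = 8$ | $x = 10$ | $x = 12$ | $x = 16$ | $x = 32$ |
|---|---|---|---|---|---|
| $(3Q-2)/16$ | 0.625 | 0.8125 | 1.0 | 1.375 | 2.125 |
| $I$ | 0 | 0 | 1 | 1 | 2 |
| $D$ | 1 | 1 | 0 | 0 | 0 |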
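As a worked instance, (6) and (7) can be written out for $x = 8$ ($Q = 4$). The symbols $\theta$ and $\lambda_{3,0}$ stand for the two sum bits generated in (2); these names are assumed here for illustration. The column contents follow the layout of Fig. 2:

```latex
\begin{aligned}
TP_{major} &= \tfrac{1}{2}\,(p_{0,7} + p_{1,5} + p_{2,3} + p_{3,1} + \theta) \\
TP_{minor} &= \tfrac{1}{4}\,(p_{0,6} + p_{1,4} + p_{2,2} + \lambda_{3,0})
            + \tfrac{1}{8}\,(p_{0,5} + p_{1,3} + p_{2,1}) \\
           &\quad + \tfrac{1}{16}\,(p_{0,4} + p_{1,2} + p_{2,0} + s_2)
            + \tfrac{1}{32}\,(p_{0,3} + p_{1,1}) \\
           &\quad + \tfrac{1}{64}\,(p_{0,2} + p_{1,0} + s_1)
            + \tfrac{1}{128}\,p_{0,1}
            + \tfrac{1}{256}\,(p_{0,0} + s_0)
\end{aligned}
```

Each group collects the bits of one truncated column, weighted by that column's position relative to $2^x$.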

III. PROPOSED PROBABILISTIC ESTIMATION COMPENSATION (PEC)

A. Derivation of error-compensation value

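We divide $TP_{minor}$ into two groups: the sum of the odd columns, $TP_{so}$, and the sum of the even columns, $TP_{se}$, as shown in Fig. 2.

$$TP_{so} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{16}\left(p_{0,x-4} + \cdots + p_{Q-2,0} + s_{Q-2}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right) \tag{8}$$

$$TP_{se} = \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \frac{1}{32}\left(p_{0,x-5} + \cdots + p_{Q-3,1}\right) + \cdots + 2^{-x+1}p_{0,1} \tag{9}$$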

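From the regular expected value analysis of Booth multipliers [9], we obtain the expected values of the partial product bits shown in Table II, where $E\{\cdot\}$ denotes the expected value. The expected value of $TP_{minor}$, illustrated in Fig. 3, can then be derived as below:

$$\begin{aligned}
E[TP_{minor}] &= E[TP_{so} + TP_{se}] \\
&= \frac{3}{8}\sum_{n=1}^{Q-1}\left(\frac{3}{2}Q + 2 - \frac{3}{2}n\right)2^{-2n} - \frac{1}{8} + 2^{-x} \\
&= \frac{3Q-2}{16} + 2^{-2(Q+1)}
\end{aligned} \tag{10}$$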
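The closed form in (10) can be checked by brute force for small widths. The sketch below is only an illustrative model under stated assumptions: inputs are uniformly distributed, negative Booth digits are modelled as a one's complement row plus the complement bit $s_i$, and in the last row $p_{Q-1,0} + s_{Q-1}$ is replaced by its sum bit as in Fig. 2. All function names are hypothetical.

```python
from fractions import Fraction


def booth_digits(n, x):
    """Radix-4 Booth digits of the x-bit two's complement number n (Table I)."""
    u = n & ((1 << x) - 1)
    bits = [0] + [(u >> j) & 1 for j in range(x)]   # bits[0] plays the role of n_{-1} = 0
    return [-2 * bits[2 * i + 2] + bits[2 * i + 1] + bits[2 * i]
            for i in range(x // 2)]


def pp_row(m, digit, x):
    """Return (bits, s): the (x+1)-bit partial product row and its complement bit."""
    mask = (1 << (x + 1)) - 1
    row = (abs(digit) * m) & mask
    if digit < 0:
        return (~row) & mask, 1      # one's complement row, corrected by s = 1
    return row, 0


def tp_minor_scaled(m, n, x):
    """TP_minor of (7) multiplied by 2**x, i.e. an integer sum over columns 0..x-2."""
    q = x // 2
    total = 0
    for i, digit in enumerate(booth_digits(n, x)):
        bits, s = pp_row(m, digit, x)
        if i == q - 1:
            # Last row: only the merged sum bit of p_{Q-1,0} + s_{Q-1} lies in TP_minor.
            total += ((bits & 1) ^ s) << (x - 2)
            continue
        width = x - 1 - 2 * i        # number of row bits that fall in columns 0..x-2
        total += (bits & ((1 << width) - 1)) << (2 * i)
        total += s << (2 * i)
    return total


def expected_tp_minor(x):
    """Exact average of TP_minor over all pairs of x-bit signed inputs."""
    lo, hi = -(1 << (x - 1)), 1 << (x - 1)
    acc = sum(tp_minor_scaled(m, n, x)
              for m in range(lo, hi) for n in range(lo, hi))
    return Fraction(acc, (hi - lo) ** 2 * (1 << x))
```

Under these assumptions, `expected_tp_minor(8)` can be compared with `Fraction(3 * 4 - 2, 16) + Fraction(1, 2 ** 10)`, the right-hand side of (10) for $Q = 4$. Exact rational arithmetic is used so that the comparison is not blurred by floating-point rounding.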

TABLE II
EXPECTED VALUE OF EACH PARTIAL PRODUCT

| $E\{p_{i,j} = 1\}$ $(i, j \neq 0)$ | $E\{s_i = 1\}$ $(i \neq 0)$ | $E\{p_{0,0} = 1\}$ | $E\{s_0 = 1\}$ | $E\{\lambda_{3,0} = 1\}$ |
|---|---|---|---|---|
| 3/8 | 3/8 | 1/2 | 1/2 | 1/4 |

Fig. 3. Expected value of truncation part. [Figure: for x = 8, each TPminor bit is replaced by its expected value (3/8 in general, 1/2 for p0,0 and s0, and 1/4 for the merged last-row bit), while the TPmajor bits p0,7, p1,5, p2,3, and p3,1 are kept.]

We keep only the first term, $(3Q-2)/16$, because $2^{-2(Q+1)}$ is negligibly small by comparison; the longer the width $x$, the more pronounced this becomes. As a result, the value of $TP_{minor}$ can be approximated as below:

$$E[TP_{minor}] \approx \left(\frac{3Q-2}{16}\right)_R = (I.d)_R = I + (d)_R = I + \left(\frac{D}{2}\right)_R \tag{11}$$

where $I$ is the integer part and $d$ is the fractional part of $(3Q-2)/16$, and $D$ is a binary digit (0 or 1) used for rounding. The values of $I$ and $D$ for different widths are calculated and tabulated in Table III. We keep the term $(D/2)_R$ when substituting (11) into (5), rather than rounding the estimate directly, because this makes the approximation more accurate. The PEC formula is then summarized as below:

$$\sigma = \left(TP_{major} + \frac{3Q-2}{16}\right)_R = \left(TP_{major} + \frac{D}{2}\right)_R + I \tag{12}$$
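The constants $I$ and $D$ follow directly from $(3Q-2)/16$, so they can be computed for any even width. The snippet below is a minimal sketch with a hypothetical function name; it assumes that $D$ is 1 exactly when the fractional part $d$ is at least 1/2, which reproduces the $x = 8$ to $x = 16$ columns of Table III.

```python
from fractions import Fraction
from math import floor


def pec_constants(x: int) -> tuple[Fraction, int, int]:
    """Return ((3Q-2)/16, I, D) for an even bit width x, with Q = x/2."""
    if x % 2 != 0:
        raise ValueError("x must be even")
    q = x // 2
    value = Fraction(3 * q - 2, 16)
    i_part = floor(value)                          # integer part I
    frac = value - i_part                          # fractional part d
    d_bit = 1 if frac >= Fraction(1, 2) else 0     # assumed rounding rule for D
    return value, i_part, d_bit


for width in (8, 10, 12, 16):
    value, i_part, d_bit = pec_constants(width)
    print(f"x={width:2d}  (3Q-2)/16={float(value)}  I={i_part}  D={d_bit}")
```

Evaluating the same expression at $x = 32$ ($Q = 16$) gives $46/16 = 2.875$, whereas Table III lists 2.125, so that column may be worth rechecking against the published version.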

B. Proposed PEC circuit

Fig. 4. The proposed PEC 8 × 8 Booth multiplier. [Figure: the main-part partial product bits and the sign-extension bits w2, w1, and w0 are reduced by full adders (FAs) and half adders (HAs) and summed by an RCA to give P15 to P8. The TPmajor bits p0,7, p1,5, p2,3, and p3,1 are added together with D = 1 and I = 0.]

The circuit is easily implemented with full adders (FAs) and half adders (HAs), as shown in Fig. 4. $I$ and $D$ are also easily obtained from the derived formula: $I$ is added directly to the $P_8$ column and $D$ to the $P_7$ column. To demonstrate a long bit-width implementation, the proposed 32-bit PEC compensation circuit is designed as shown in Fig. 5, where $I = 2$ and $D = 0$ as listed in Table III. The fixed-width Booth multiplier can therefore be implemented easily for long bit-width applications by using the proposed PEC compensation circuit.

Fig. 5. The proposed 32-b PEC compensation circuit. [Figure: the TPmajor bits p0,31, p1,29, p2,27, p3,25, p4,23, ..., p15,1 are reduced by FAs into the carries c0 to c10, with I = 2 added.]

IV. COMPARISONS AND DISCUSSIONS

A. Comparison with other multipliers

To compare accuracy, we use the absolute average error, defined as below:

$$\varepsilon = \mathrm{Avg}\{|P - P_q|\} \tag{13}$$

The comparison of $\varepsilon$ for different methods is shown in Table IV. Direct-T in Table IV truncates the least significant $x$ bits directly, without any further calculation. Direct-T is therefore the least accurate of the fixed-width multipliers, and the errors are expressed as percentages normalized to the Direct-T method. The gate counts ($G_c$), shown in Table V, are also important for evaluating performance. P-T in Table V truncates the least significant $x$ bits after the full calculation and therefore has the largest area among the fixed-width multiplier designs. The $G_c$ values are accordingly expressed as percentages normalized to the $G_c$ of the P-T Booth multiplier. The proposed PEC is much more accurate than [4] and [6] at about the same $G_c$. Although the errors of [5] and [7] are smaller than ours, their areas are larger. Compared with the other approaches, the proposed PEC also has the advantage of requiring little time to establish the compensation value in long bit-width applications.

TABLE IV
COMPARISONS OF ABSOLUTE AVERAGE ERROR WITH OTHER METHODS

| Methods | $x = 8$ | $x = 10$ | $x = 12$ | $x = 16$ |
|---|---|---|---|---|
| Direct-T | 100% (384) | 100% (1920) | 100% (9216) | 100% (196608) |
| Jou et al. [4] | 27.85% | 24.85% | 22.61% | 19.49% |
| Cho et al. [5] | 22.00% | 18.26% | 15.85% | 12.70% |
| Song et al. [6] | 26.84% | 24.48% | 22.49% | 19.48% |
| Wang et al. [7] | 20.14% | 17.05% | 14.75% | 11.92% |
| PEB [9] | 23.10% | 21.15% | 17.95% | 15.90% |
| PEC | 22.42% | 18.56% | 18.70% | 14.02% |
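The Direct-T baseline of (13) can be approximated with a small exhaustive model. The sketch below rests on assumptions that the text does not spell out: the Direct-T error is taken to be the value of the $x$ truncated columns of the unmodified partial product array (without the rounding bit), inputs are uniformly distributed, and negative Booth digits are modelled as a one's complement row plus $s_i$. The function names are hypothetical.

```python
from fractions import Fraction


def booth_digits(n, x):
    """Radix-4 Booth digits of the x-bit two's complement number n (Table I)."""
    u = n & ((1 << x) - 1)
    bits = [0] + [(u >> j) & 1 for j in range(x)]   # bits[0] plays the role of n_{-1} = 0
    return [-2 * bits[2 * i + 2] + bits[2 * i + 1] + bits[2 * i]
            for i in range(x // 2)]


def pp_row(m, digit, x):
    """Return (bits, s): the (x+1)-bit partial product row and its complement bit."""
    mask = (1 << (x + 1)) - 1
    row = (abs(digit) * m) & mask
    if digit < 0:
        return (~row) & mask, 1      # one's complement row, corrected by s = 1
    return row, 0


def truncation_part(m, n, x):
    """Value of the x least significant columns of the partial product array."""
    total = 0
    for i, digit in enumerate(booth_digits(n, x)):
        bits, s = pp_row(m, digit, x)
        width = x - 2 * i            # number of row bits that fall in columns 0..x-1
        total += ((bits & ((1 << width) - 1)) + s) << (2 * i)
    return total


def direct_t_error(x):
    """Average |P - P_q| when the truncated columns are simply dropped."""
    lo, hi = -(1 << (x - 1)), 1 << (x - 1)
    acc = 0
    for m in range(lo, hi):
        for n in range(lo, hi):
            tp = truncation_part(m, n, x)
            # What remains after removing TP lies entirely in the main part.
            assert (m * n - tp) % (1 << x) == 0
            acc += tp
    return Fraction(acc, (hi - lo) ** 2)
```

Under these assumptions the model yields an average of 384.25 for $x = 8$, which is close to the value of 384 given in parentheses for Direct-T in Table IV.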

TABLE V
COMPARISONS OF GATE COUNTS $G_c$ WITH OTHER METHODS

| Methods | $x = 8$ | $x = 10$ | $x = 12$ | $x = 16$ |
|---|---|---|---|---|
| P-T | 100% (655) | 100% (991) | 100% (1394) | 100% (2406) |
| Jou et al. [4] | 57% | 56% | 55% | 54% |
| Cho et al. [5] | 65% | 62% | 60% | 58% |
| Song et al. [6] | 58% | 57% | 56% | 55% |
| Wang et al. [7] | 64% | 62% | 60% | 59% |
| PEB [9] | 58% | 57% | 56% | 55% |
| PEC | 59% | 57% | 55% | 54% |

B. Application of DCT

To verify the performance of the proposed PEC method in a real application, the proposed PEC is implemented in a two-dimensional (2-D) DCT [11]. The peak signal-to-noise ratio (PSNR) is an important measure of the accuracy of the DCT core. Five test images were chosen for the comparison, each comprising 512 × 512 pixels with 8-bit (256-level) grayscale data per pixel. Table VI shows the comparison results for the PSNR and the gate counts ($G_c$). Compared with the P-T Booth multiplier, the proposed PEC saves 23% of the $G_c$ with a 4 dB PSNR penalty. Furthermore, with only 2% $G_c$ overhead, its PSNR is 17.5 dB higher than that of the Direct-T Booth multiplier.
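The text does not define the PSNR explicitly. A minimal sketch of the usual definition for 8-bit grayscale images, $10\log_{10}(255^2/\mathrm{MSE})$, is given below; it is assumed, not confirmed, that this is the definition used for Table VI, and the function name is hypothetical.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of integer pixel values.
    """
    total = 0
    count = 0
    for row_ref, row_test in zip(reference, test):
        for a, b in zip(row_ref, row_test):
            total += (a - b) ** 2
            count += 1
    if total == 0:
        return math.inf              # identical images
    mse = total / count              # mean squared error per pixel
    return 10 * math.log10(peak ** 2 / mse)
```

With this definition, an image whose every pixel is off by exactly one gray level has an MSE of 1 and therefore a PSNR of $20\log_{10}255$ dB.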

The 2-D DCT with four PEC multipliers is synthesized from the RTL design with Synopsys Design Compiler, and Cadence SOC Encounter is adopted for placement and routing (P&R). Implemented in a TSMC 0.18-µm CMOS process, the 8 × 8 2-D DCT core operates at 55 MHz and consumes 11.8 mW. The core layout and simulated characteristics are shown in Fig. 6.

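TABLE VI
COMPARISONS OF ACCURACY AND GATE COUNTS $G_c$ IN DCT APPLICATIONS

| | P-T | Direct-T | PEC |
|---|---|---|---|
| PSNR (dB), Lena | 56.1 | 34.6 | 52.0 |
| PSNR (dB), Baboon | 56.0 | 34.6 | 52.1 |
| PSNR (dB), Peppers | 56.1 | 34.6 | 52.0 |
| PSNR (dB), Elaine | 56.1 | 34.6 | 52.1 |
| PSNR (dB), Barb | 56.1 | 34.6 | 52.1 |
| PSNR (dB), Average | 56.1 | 34.6 | 52.1 |
| $G_c$ | 22.3K (100%) | 16.7K (75%) | 17.2K (77%) |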

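| Characteristic | Value |
|---|---|
| Technology | 0.18 µm |
| Supply voltage | 1.8 V |
| Die size | 532 µm × 532 µm |
| Gate count | 17.2 K |
| Max. freq. | 55 MHz |
| Power | 11.8 mW @ 55 MHz |

Fig. 6. The core layout and characteristic of 2-D DCT. [Figure: core layout showing the shift-register array and the 1-D DCT kernel.]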

V. CONCLUSION

A high-accuracy and simple PEC Booth multiplier is proposed in this research. Because the compensation value is derived probabilistically, time-consuming exhaustive simulation is avoided for long bit-width multiplication. The experimental results demonstrate that the proposed PEC Booth multiplier achieves a smaller area than [5] and [7] and higher accuracy than [4] and [6].

ACKNOWLEDGMENT

The authors would like to thank the National Chip Implementation Center (CIC), Taiwan, for providing the electronic design automation tools. This work was supported in part by the National Science Council under project number NSC-100-2221-E-007-092.

REFERENCES

[1] L. D. Van and C. C. Yang, "Generalized low-error area-efficient fixed-width multipliers," IEEE Trans. Circuits Syst. I, vol. 52, no. 8, pp. 1608-1619, Aug. 2005.

[2] C. H. Chang and R. K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE Trans. VLSI Syst., vol. 18, no. 12, pp. 1767-1771, Dec. 2010.

[3] N. Petra, D. D. Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Truncated binary multipliers with variable correction and minimum mean square error," IEEE Trans. Circuits Syst. I, vol. 57, no. 6, pp. 1312-1325, Jun. 2010.

[4] S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, vol. 50, no. 11, pp. 1470-1474, Nov. 2003.

[5] K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522-531, May 2004.

[6] M. A. Song, L. D. Van, and S. Y. Kuo, "Adaptive low-error fixed-width Booth multipliers," IEICE Trans. Fundamentals, vol. E90-A, no. 6, pp. 1180-1187, Jun. 2007.

[7] J. P. Wang, S. R. Kuang, and S. C. Liang, "High-accuracy fixed-width modified Booth multipliers for lossy applications," IEEE Trans. VLSI Syst., vol. 19, no. 1, pp. 52-60, Jan. 2011.

[8] Y. H. Chen, T. Y. Chang, and R. Y. Jou, "A statistical error-compensated Booth multiplier and its DCT applications," in Proc. IEEE Region 10 Conf., 2010, pp. 1146-1149.

[9] C. Y. Li, Y. H. Chen, T. Y. Chang, and J. N. Chen, "A probabilistic estimation bias circuit for fixed-width Booth multiplier and its DCT applications," IEEE Trans. Circuits Syst. II, vol. 58, no. 4, pp. 215-219, Apr. 2011.

[10] S. R. Kuang, J. P. Wang, and C. Y. Guo, "Modified Booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, vol. 56, no. 5, pp. 404-408, May 2009.

[11] S. C. Hsia and S. H. Wang, "Shift-register-based data transposition for cost-effective discrete cosine transform," IEEE Trans. VLSI Syst., vol. 15, no. 6, pp. 725-728, Jun. 2007.
