9_high accuracy fixed-width booth multipliers with

Upload: anindyasaha

Post on 04-Jun-2018


  • 8/14/2019 9_High Accuracy Fixed-Width Booth Multipliers With

    1/4

    High Accuracy Fixed-width Booth Multipliers with

    Probabilistic Estimation Compensated Method

    Yuan-Ho Chen1, Hsin-Chen Chiang2, Tsin-Yuan Chang2, Chih-Wen Lu1, and Pei-Yi Lai Li2

    1Department of Engineering and System Science, 2Department of Electrical Engineering,

    National Tsing Hua University, Hsinchu 30013, Taiwan, R.O.C.

    Email: [email protected]

    Abstract: In this research, a probabilistic estimation compensation (PEC) method for fixed-width Booth multipliers is proposed. From a probabilistic analysis of the truncation part, a formula is obtained that makes the compensation value easy to calculate. For long bit widths, the PEC method is implemented by a simple compensation circuit and achieves high accuracy without exhaustive simulation. Compared with previous works, the proposed method achieves better accuracy. To verify the performance of PEC multipliers in a real application, the method is implemented in an 8 × 8 two-dimensional (2-D) discrete cosine transform (DCT). The results show that the proposed PEC method saves 23% of the area with a 4 dB peak signal-to-noise ratio (PSNR) penalty.

    I. INTRODUCTION

    Fixed-width multipliers are important in digital signal processing (DSP) systems. In many applications, it is desirable to keep the same bit width across the basic arithmetic operations. For this reason, and to reduce circuit area, a fixed-width multiplier keeps only the most significant half of the product. This truncation introduces a large error, so many compensation methods have been proposed to address the problem [1]-[3]. Compared with the traditional multiplier, the Booth multiplier reduces the number of partial product rows and therefore achieves better performance in fixed-width multipliers. Recently, many researchers have worked to reduce the truncation error of Booth multipliers [4]-[9].

    Jou et al. present a statistical analysis in [4] that reduces hardware complexity but does not achieve a low compensation error. Conversely, the compensation error can be reduced at the cost of more hardware by using a threshold value to adjust the compensation value [5]. Song et al. in [6] use more information from the Booth encoder to improve accuracy. In [7], Wang et al. slightly modify the partial product rows of the Booth multiplication and derive an error compensation function. Although this function yields smaller mean and mean-square errors, the bit width of the application has to be fixed in advance because the required simulation is time-consuming. Therefore, a probabilistic estimation bias (PEB) is proposed in [9] to compensate the truncation value without exhaustive simulation.

    This research utilizes the modified partial product rows method of [10] and proposes a probabilistic estimation compensation (PEC) method to improve accuracy for long bit widths and to reduce the complexity of

    [Figure: partial product array of the 8 × 8 Booth multiplier, with multiplicand bits m7 to m0, four Booth-encoded rows of partial product bits p(i,j), sign-extension bits p(i,8), complement bits s0 to s3, and a rounding 1. The product bits P15 to P0 are divided into the main part (MP) and the truncation part (TP), and the TP is divided into TPmajor and TPminor.]

    Fig. 1. Traditional 8 × 8 Booth multiplier.

    [Figure: the partial product array of Fig. 1 with the last row modified. The bits s3 and p3,0 are merged into a single sum bit, the first-row sign-extension bits become w2, w1, and w0, and the TPminor columns are labeled alternately TPso and TPse.]

    Fig. 2. Modified partial product rows and sign extension.

    the circuit, and hence its area. The PEC formula is obtained by an expected value analysis, so the compensation value for a given bit width can be calculated directly by hand from the formula. This shortens the time needed to establish the compensation value, especially for long bit widths. The proposed PEC circuit therefore achieves a Booth multiplier with high accuracy and low complexity.

    This paper is organized as follows. Section II briefly describes the modified Booth multiplier and the regular expected values. Section III presents the compensation values and the circuit implementation. Section IV compares the accuracy of different compensation methods. Finally, conclusions are drawn in Section V.

    II. FIXED-WIDTH MODIFIED BOOTH MULTIPLIER

    Modified Booth encoding is an effective method to reduce the number of partial product rows. Two x-bit signed numbers M and N and their 2x-bit product P can be expressed in two's complement representation as follows:

    $$M = -m_{x-1} 2^{x-1} + \sum_{i=0}^{x-2} m_i 2^i$$

    $$N = -n_{x-1} 2^{x-1} + \sum_{i=0}^{x-2} n_i 2^i$$

    $$P = M \times N. \tag{1}$$

    TABLE I
    BOOTH ENCODING

    n_{2i+1}   n_{2i}   n_{2i-1}   n'_i   s_i
       0         0         0         0     0
       0         0         1         1     0
       0         1         0         1     0
       0         1         1         2     0
       1         0         0        -2     1
       1         0         1        -1     1
       1         1         0        -1     1
       1         1         1         0     0

    The Booth encoder maps three successive bits $n_{2i+1}$, $n_{2i}$, and $n_{2i-1}$ into $n'_i$, as tabulated in Table I. For an even width x, there are Q = x/2 partial product rows after Booth encoding. Taking the 8 × 8 Booth multiplier as an example, the calculation process is shown in Fig. 1. Here $s_i$ is the complement bit: $s_i$ is 1 when $n'_i$ is negative and 0 otherwise. The single 1 is a rounding bit that brings the result closer to the correct answer. After the full calculation, the least significant 8 bits are truncated with rounding for better accuracy. To achieve better performance, we use the method proposed in [7] to modify the last row of partial products, as shown in Fig. 2. First, $s_3$ and $p_{3,0}$ are summed in advance to generate a sum, denoted here $\lambda_{3,0}$, and a carry $C_{x0}$ at the (x-2)th and (x-1)th bit positions, respectively. Then the carry $C_{x0}$ and the rounding 1 are added to generate a sum $\lambda$ and a carry $C_{x1}$, which is added to the first-row sign-extension bits $p_{0,8}$ to generate $w_0$, $w_1$, and $w_2$:

    $$\{C_{x0}, \lambda_{3,0}\} = p_{3,0} + s_3, \qquad \{C_{x1}, \lambda\} = C_{x0} + 1$$

    $$\{w_2, w_1, w_0\} = \{p_{0,8}, p_{0,8}, p_{0,8}\} + C_{x1} \tag{2}$$
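The mapping in Table I can be sketched in Python as follows. This is a minimal illustration rather than the authors' encoder circuit: the function names `booth_encode` and `booth_value` are hypothetical, and the negative digits are taken from the standard radix-4 Booth recoding $n'_i = -2n_{2i+1} + n_{2i} + n_{2i-1}$ with $n_{-1} = 0$.

```python
def booth_encode(n: int, x: int) -> list[tuple[int, int]]:
    """Radix-4 Booth encoding of an x-bit two's complement integer n.

    Returns Q = x/2 pairs (digit, s), least significant row first, where
    digit is in {-2, -1, 0, 1, 2} and s is 1 only for a negative digit.
    """
    if x % 2 or not -(1 << (x - 1)) <= n < (1 << (x - 1)):
        raise ValueError("x must be even and n must fit in x bits")
    bits = n & ((1 << x) - 1)            # two's complement bit pattern of n
    rows = []
    prev = 0                             # n_{-1} is defined as 0
    for i in range(x // 2):
        b0 = (bits >> (2 * i)) & 1       # n_{2i}
        b1 = (bits >> (2 * i + 1)) & 1   # n_{2i+1}
        digit = -2 * b1 + b0 + prev      # one row of Table I
        rows.append((digit, 1 if digit < 0 else 0))
        prev = b1
    return rows


def booth_value(rows: list[tuple[int, int]]) -> int:
    """Recombine the encoded digits with weights 4**i."""
    return sum(digit * 4**i for i, (digit, _) in enumerate(rows))
```

Recombining the digits with weights $4^i$ recovers N, which is why Q = x/2 partial product rows are enough.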

    The product in (1) can be expressed as two parts: the main part (MP) and the truncation part (TP). The MP comprises the eight most significant columns (MSCs) and the TP comprises the eight least significant columns (LSCs), so P can be rewritten as

    $$P = MP + TP. \tag{3}$$

    The TP is omitted, and a compensation bias based on a probabilistic estimation is added to the MP. The quantized product $P_q$ can therefore be written as

    $$P \approx P_q = MP + \sigma \times 2^x. \tag{4}$$

    where $\sigma$, which represents the PEC value, can be decomposed further into $TP_{major}$ and $TP_{minor}$ parts:

    $$\sigma = (TP_{major} + TP_{minor})_R \tag{5}$$

    where $(\cdot)_R$ denotes rounding to the nearest integer. The product is affected much more by $TP_{major}$ than by $TP_{minor}$ because of the column weights. We therefore compute $TP_{major}$ exactly and estimate $TP_{minor}$ probabilistically. Modifying the last row of partial products reduces the number of unknowns in $TP_{minor}$ and makes the estimate more accurate. The compensation value is thus obtained by calculating $TP_{major}$ and estimating $TP_{minor}$, which are given below, where $\lambda$ and $\lambda_{3,0}$ are the sum bits generated in (2):

    $$TP_{major} = \frac{1}{2}\left(p_{0,x-1} + p_{1,x-3} + \cdots + p_{Q-1,1} + \lambda\right) = \frac{1}{2}\left(\sum_{j=0}^{Q-1} p_{j,x-1-2j} + \lambda\right) \tag{6}$$

    $$TP_{minor} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right). \tag{7}$$

    TABLE III
    THE CALCULATION OF ERROR COMPENSATED VALUE

                  x = 8    x = 10   x = 12   x = 16   x = 32
    (3Q-2)/16     0.625    0.8125   1.0      1.375    2.125
    I             0        0        1        1        2
    D             1        1        0        0        0
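The index pattern in (6) and (7) can be made concrete by listing which bits fall in each truncated column. The sketch below assumes the array layout of Fig. 2: bit $p_{i,j}$ sits in column 2i + j, complement bit $s_i$ sits in column 2i, and the last row is merged as in (2). The function name and the string labels, including `lam` for the two sum bits of (2), are hypothetical.

```python
def truncation_columns(x: int) -> dict[int, list[str]]:
    """Bits of the truncation part, keyed by column index 0 .. x-1.

    Column x-1 is TP_major (weight 1/2 relative to 2**x); columns
    x-2 .. 0 make up TP_minor (weights 1/4, 1/8, ..., 2**-x).
    """
    if x % 2 or x < 4:
        raise ValueError("x must be an even width of at least 4")
    q = x // 2
    cols: dict[int, list[str]] = {c: [] for c in range(x)}
    for i in range(q):
        for j in range(x - 2 * i):        # keep only columns below x
            if i == q - 1 and j == 0:
                continue                  # p_{Q-1,0} is merged into a sum bit
            cols[2 * i + j].append(f"p{i},{j}")
        if i < q - 1:
            cols[2 * i].append(f"s{i}")   # complement bit of row i
    cols[x - 2].append(f"lam{q - 1},0")   # sum bit of p_{Q-1,0} + s_{Q-1}
    cols[x - 1].append("lam")             # sum bit of C_x0 + 1
    return cols
```

For x = 8, column 7 contains p0,7, p1,5, p2,3, p3,1, and the sum bit, which matches the terms of (6); column 6 matches the first group of (7).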

    III. PROPOSED PROBABILISTIC ESTIMATION COMPENSATION (PEC)

    A. Derivation of error-compensation value

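    We divide $TP_{minor}$ into two groups: the sum of the odd columns, $TP_{so}$, and the sum of the even columns, $TP_{se}$, as shown in Fig. 2, with $\lambda_{3,0}$ the sum bit from (2):

    $$TP_{so} = \frac{1}{4}\left(p_{0,x-2} + \cdots + p_{Q-2,2} + \lambda_{3,0}\right) + \frac{1}{16}\left(p_{0,x-4} + \cdots + p_{Q-2,0} + s_{Q-2}\right) + \cdots + 2^{-x}\left(p_{0,0} + s_0\right) \tag{8}$$

    $$TP_{se} = \frac{1}{8}\left(p_{0,x-3} + \cdots + p_{Q-2,1}\right) + \frac{1}{32}\left(p_{0,x-5} + \cdots + p_{Q-3,1}\right) + \cdots + 2^{-x+1} p_{0,1} \tag{9}$$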

    From the regular expected value analysis of Booth multipliers [9], we obtain the expected values of the partial products shown in Table II, where $E\{\cdot\}$ denotes the expected value. The expected value of $TP_{minor}$, illustrated in Fig. 3, can then be derived as

    $$E[TP_{minor}] = E[TP_{so} + TP_{se}] = \frac{3}{8}\sum_{n=1}^{Q-1}\left(\frac{3}{2}Q + 2 - \frac{3}{2}n\right)2^{-2n} - \frac{1}{8} + 2^{-x} = \frac{3Q-2}{16} + 2^{-2(Q+1)} \tag{10}$$
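Equation (10) can be checked with exact rational arithmetic. The sketch below assumes the array layout of Fig. 2 and the expected values of Table II (3/8 for a general bit, 1/2 for $p_{0,0}$ and $s_0$, 1/4 for the merged sum bit in column x-2), and compares the bit-by-bit expectation with both forms of (10). The function names are hypothetical.

```python
from fractions import Fraction


def expected_tp_minor(x: int) -> Fraction:
    """Sum E{bit} * weight over every bit in columns 0 .. x-2."""
    q = x // 2
    total = Fraction(0)
    for i in range(q - 1):                       # rows 0 .. Q-2
        for j in range(x - 1 - 2 * i):           # bits landing in columns <= x-2
            e_bit = Fraction(1, 2) if (i, j) == (0, 0) else Fraction(3, 8)
            total += e_bit * Fraction(2) ** (2 * i + j - x)
        e_s = Fraction(1, 2) if i == 0 else Fraction(3, 8)
        total += e_s * Fraction(2) ** (2 * i - x)  # complement bit s_i
    total += Fraction(1, 4) * Fraction(1, 4)     # merged sum bit, weight 1/4
    return total


def sum_form(q: int) -> Fraction:
    """Middle expression of (10)."""
    s = sum((Fraction(3 * q, 2) + 2 - Fraction(3 * n, 2)) * Fraction(1, 4**n)
            for n in range(1, q))
    return Fraction(3, 8) * s - Fraction(1, 8) + Fraction(1, 2 ** (2 * q))


def closed_form(q: int) -> Fraction:
    """Right-hand side of (10)."""
    return Fraction(3 * q - 2, 16) + Fraction(1, 2 ** (2 * (q + 1)))
```

Under these assumptions the three quantities agree exactly for even widths; for x = 8 each equals 641/1024, that is, 0.625 plus the small term $2^{-10}$.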

    TABLE II
    EXPECTED VALUE OF EACH PARTIAL PRODUCT

    Quantity                                  Expected value
    E{p_{i,j} = 1}, (i, j) ≠ (0, 0)           3/8
    E{s_i = 1}, i ≠ 0                         3/8
    E{p_{0,0} = 1}                            1/2
    E{s_0 = 1}                                1/2
    E{λ_{3,0} = 1}                            1/4

    [Figure: truncation part of the 8 × 8 array, with the TPmajor column holding the bits p0,7, p1,5, p2,3, and p3,1 and every TPminor bit replaced by its expected value from Table II.]

    Fig. 3. Expected value of truncation part.

    We keep only the first term, $(3Q-2)/16$, because $2^{-2(Q+1)}$ is small enough to be ignored. The longer the bit width x, the more negligible this second term becomes. As a result, the value of $TP_{minor}$ can be approximated as

    $$E[TP_{minor}] \approx \left(\frac{3Q-2}{16}\right)_R = (I.d)_R = I + (d)_R = I + \left(\frac{D}{2}\right)_R \tag{11}$$

    where I is the integer part and d is the fractional part of $(3Q-2)/16$, and D is a binary digit (0 or 1) used for rounding. The values of I and D for different bit widths are calculated and tabulated in Table III. We keep $(D/2)_R$ when substituting (11) into (5), rather than computing the rounded value directly, because the resulting approximation is more accurate. The PEC formula is then summarized as

    $$\sigma = \left(TP_{major} + \frac{3Q-2}{16}\right)_R = \left(TP_{major} + \frac{D}{2}\right)_R + I \tag{12}$$
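The entries of Table III follow directly from $(3Q-2)/16$. In the sketch below, the function name is hypothetical, and D is assumed to be the first fractional bit of $(3Q-2)/16$ (that is, D = 1 when d is at least 1/2). This reading reproduces the Table III entries for x = 8, 10, 12, and 16, but the text does not state the rule explicitly.

```python
from fractions import Fraction


def pec_constants(x: int) -> tuple[Fraction, int, int]:
    """Return ((3Q-2)/16, I, D) for an even bit width x, with Q = x/2."""
    if x % 2:
        raise ValueError("x must be even")
    q = x // 2
    value = Fraction(3 * q - 2, 16)
    i_part = value.numerator // value.denominator   # integer part I
    d = value - i_part                               # fractional part d
    d_bit = 1 if d >= Fraction(1, 2) else 0          # assumed rule for D
    return value, i_part, d_bit
```

For example, `pec_constants(8)` gives 5/8 with I = 0 and D = 1, so for an 8-bit multiplier only the half-weight constant D enters the compensation.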

    B. Proposed PEC circuit

    [Figure: adder array built from full adders (FA), half adders (HA), and an RCA that produces the product bits P15 to P8. The TPmajor bits p0,7, p1,5, p2,3, and p3,1 enter the array together with the constants D = 1 and I = 0.]

    Fig. 4. The proposed PEC 8 × 8 Booth multiplier.

    The circuit is easy to implement using full adders (FAs) and half adders (HAs), as shown in Fig. 4. I and D are also easily obtained from the derived formula. The I part is added directly to the P8 column and the D part to the P7 column. To demonstrate a long bit-width implementation, the proposed 32-bit PEC compensation circuit is designed as shown in Fig. 5, where I = 2 and D = 0 as listed in Table III. The fixed-width Booth multiplier can therefore be implemented easily for long bit-width applications by using the proposed PEC compensation circuit.

    [Figure: FA-based compensation circuit for x = 32 that combines the TPmajor bits p0,31, p1,29, p2,27, p3,25, p4,23, ..., p15,1 with the constant I = 2 and produces the carries c0 to c10.]

    Fig. 5. The proposed 32-b PEC compensation circuit.

    IV. COMPARISONS AND DISCUSSIONS

    A. Comparison with other multipliers

    To compare accuracy, we use the absolute average error, defined as

    $$\varepsilon = \mathrm{Avg}\{|P - P_q|\} \tag{13}$$
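Equation (13) can be evaluated exhaustively for short bit widths. The sketch below assumes the average is taken uniformly over all pairs of x-bit signed inputs. The names `absolute_average_error` and `round_after_multiply` are hypothetical, and the quantizer shown rounds the full product (a P-T style reference); it is not a model of the PEC circuit.

```python
from itertools import product
from typing import Callable


def absolute_average_error(x: int, quantize: Callable[[int, int], int]) -> float:
    """Average of |P - Pq| over all pairs of x-bit two's complement inputs."""
    lo, hi = -(1 << (x - 1)), 1 << (x - 1)
    total = 0
    for m, n in product(range(lo, hi), repeat=2):
        total += abs(m * n - quantize(m, n))
    return total / (hi - lo) ** 2


def round_after_multiply(x: int) -> Callable[[int, int], int]:
    """Quantizer that computes the full product, then rounds away x bits."""
    def quantize(m: int, n: int) -> int:
        # Add half an output LSB, drop the x low bits, restore the weight.
        return ((m * n + (1 << (x - 1))) >> x) << x
    return quantize
```

Any other fixed-width scheme can be compared by passing a different `quantize` function that returns its quantized product at full product weight.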

    The comparison of $\varepsilon$ for the different methods is shown in Table IV. Direct-T in Table IV truncates the least significant x bits directly, without any further calculation. Direct-T is therefore the least accurate of the fixed-width multipliers, and we express accuracy as a percentage normalized to the Direct-T method. The gate counts (Gc) are also important for performance and are shown in Table V. P-T in Table V truncates the least significant x bits after the full calculation and therefore has the largest area among the fixed-width multiplier designs. The Gc percentages are normalized to the Gc of the P-T Booth multiplier. The accuracy of the proposed PEC is much better than that of [4] and [6] with about the same Gc. Although the errors of [5] and [7] are smaller than ours, their area is larger. Compared with the other approaches, the proposed PEC also has the advantage that little time is needed to establish the compensation value in long bit-width applications.

    TABLE IV
    COMPARISONS OF ABSOLUTE AVERAGE ERROR WITH OTHER METHODS

    Methods            x = 8     x = 10    x = 12    x = 16
    Direct-T           100%      100%      100%      100%
                       (384)     (1920)    (9216)    (196608)
    Jou et al. [4]     27.85%    24.85%    22.61%    19.49%
    Cho et al. [5]     22.00%    18.26%    15.85%    12.70%
    Song et al. [6]    26.84%    24.48%    22.49%    19.48%
    Wang et al. [7]    20.14%    17.05%    14.75%    11.92%
    PEB [9]            23.10%    21.15%    17.95%    15.90%
    PEC                22.42%    18.56%    18.70%    14.02%

    TABLE V
    COMPARISONS OF GATE COUNTS Gc WITH OTHER METHODS

    Methods            x = 8     x = 10    x = 12    x = 16
    P-T                100%      100%      100%      100%
                       (655)     (991)     (1394)    (2406)
    Jou et al. [4]     57%       56%       55%       54%
    Cho et al. [5]     65%       62%       60%       58%
    Song et al. [6]    58%       57%       56%       55%
    Wang et al. [7]    64%       62%       60%       59%
    PEB [9]            58%       57%       56%       55%
    PEC                59%       57%       55%       54%

    B. Application of DCT

    To verify the performance of the proposed PEC method in a real application, the proposed PEC is implemented in a two-dimensional (2-D) DCT [11]. The peak signal-to-noise ratio (PSNR) is used to evaluate the accuracy of the DCT core. Five test images are chosen for the comparison, each comprising 512 × 512 pixels with 8-bit (256-level) gray-scale data per pixel. Table VI shows the PSNR and gate count (Gc) results. Compared with the P-T Booth multiplier, the proposed PEC saves 23% of the Gc with a 4 dB PSNR penalty. Moreover, it needs only 2% Gc overhead to obtain a PSNR that is 17.5 dB higher than that of the Direct-T Booth multiplier.
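For reference, PSNR for 8-bit images is conventionally computed from the mean squared error (MSE) as $10\log_{10}(255^2/\mathrm{MSE})$. The sketch below uses that standard definition. The function name and the flat-sequence image representation are assumptions, and the paper does not spell out the exact formula it uses.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], test: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel values; peak is the
    largest possible pixel value (255 for 8-bit gray-scale data).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf                   # identical images
    return 10 * math.log10(peak**2 / mse)
```

With this definition, a higher value means the reconstructed image is closer to the reference, which is the sense in which the PSNR figures in Table VI are compared.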

    Furthermore, the RTL design of the 2-D DCT with four PEC multipliers is synthesized with Synopsys Design Compiler, and Cadence SOC Encounter is adopted for placement and routing (P&R). Implemented in a TSMC 0.18-μm CMOS process, the 8 × 8 2-D DCT core operates at 55 MHz and consumes 11.8 mW. The core layout and simulated characteristics are shown in Fig. 6.

    TABLE VI
    COMPARISONS OF ACCURACY AND GATE COUNTS Gc IN DCT APPLICATIONS

                          P-T       Direct-T   PEC
    PSNR (dB)  Lena       56.1      34.6       52.0
               Baboon     56.0      34.6       52.1
               Peppers    56.1      34.6       52.0
               Elaine     56.1      34.6       52.1
               Barb       56.1      34.6       52.1
               Average    56.1      34.6       52.1
    Gc                    22.3K     16.7K      17.2K
                          (100%)    (75%)      (77%)

    [Figure: core layout of the 2-D DCT, showing the shift-register array and the 1-D DCT kernel, with the following characteristics.]

    Technology        0.18 μm
    Supply voltage    1.8 V
    Die size          532 μm × 532 μm
    Gate count        17.2 K
    Max. frequency    55 MHz
    Power             11.8 mW @ 55 MHz

    Fig. 6. The core layout and characteristic of 2-D DCT.

    V. CONCLUSION

    A simple, high-accuracy PEC Booth multiplier is proposed in this research. Because the compensation value is derived by probability, time-consuming exhaustive simulation is avoided for long bit-width multiplication. The experimental results demonstrate that the proposed PEC Booth multiplier achieves a smaller area than [5] and [7] and higher accuracy than [4] and [6].

    ACKNOWLEDGMENT

    The authors would like to thank the National Chip Implementation Center (CIC), Taiwan, for providing the electronic design automation tools. This work was supported in part by National Science Council under project number NSC-100-2221-E-007-092.

    REFERENCES

    [1] L. D. Van and C. C. Yang, "Generalized low-error area-efficient fixed-width multipliers," IEEE Trans. Circuits Syst. I, vol. 52, no. 8, pp. 1608-1619, Aug. 2005.

    [2] C. H. Chang and R. K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE Trans. VLSI Syst., vol. 18, no. 12, pp. 1767-1771, Dec. 2010.

    [3] N. Petra, D. D. Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Truncated binary multipliers with variable correction and minimum mean square error," IEEE Trans. Circuits Syst. I, vol. 57, no. 6, pp. 1312-1325, Jun. 2010.

    [4] S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, vol. 50, no. 11, pp. 1470-1474, Nov. 2003.

    [5] K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522-531, May 2004.

    [6] M. A. Song, L. D. Van, and S. Y. Kuo, "Adaptive low-error fixed-width Booth multipliers," IEICE Trans. Fundamentals, vol. E90-A, no. 6, pp. 1180-1187, Jun. 2007.

    [7] J. P. Wang, S. R. Kuang, and S. C. Liang, "High-accuracy fixed-width modified Booth multipliers for lossy applications," IEEE Trans. VLSI Syst., vol. 19, no. 1, pp. 52-60, Jan. 2011.

    [8] Y. H. Chen, T. Y. Chang, and R. Y. Jou, "A statistical error-compensated Booth multiplier and its DCT applications," in Proc. IEEE Region 10 Conf., 2010, pp. 1146-1149.

    [9] C. Y. Li, Y. H. Chen, T. Y. Chang, and J. N. Chen, "A probabilistic estimation bias circuit for fixed-width Booth multiplier and its DCT applications," IEEE Trans. Circuits Syst. II, vol. 58, no. 4, pp. 215-219, Apr. 2011.

    [10] S. R. Kuang, J. P. Wang, and C. Y. Guo, "Modified Booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, vol. 56, no. 5, pp. 404-408, May 2009.

    [11] S. C. Hsia and S. H. Wang, "Shift-register-based data transposition for cost-effective discrete cosine transform," IEEE Trans. VLSI Syst., vol. 15, no. 6, pp. 725-728, Jun. 2007.