
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 47, NO. 9, SEPTEMBER 2000 935

In summary, the MAXLINE2 algorithm is the clear winner in terms of data buffer requirements and average execution time on i.i.d. data. However, for applications involving correlated input data or for real-time applications where worst-case execution time is important, the MAXTREE and MAXTREE2 algorithms are the preferred choice. Finally, it should be noted that although the Pitas algorithm from [5] is, as noted above, uncompetitive as a software algorithm, it remains a good choice for a parallel hardware implementation since it requires

only 

  registers and  

 

  comparators.

ACKNOWLEDGMENT

The author would like to thank the anonymous reviewers and Dr. J. Chambers for their helpful comments, which substantially improved this paper.

REFERENCES

[1] G. R. Arce and M. P. McLoughlin, “Theoretical analysis of the max-median filter,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 60–69, Jan. 1987.

[2] P. A. Maragos and R. W. Schafer, “Morphological filters—Part I: Their set theoretic analysis and relations to linear shift invariant filters,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1153–1184, Aug. 1987.

[3] R. Martin, “Spectral subtraction based on minimum statistics,” in Proc. EUSIPCO-94, Edinburgh, Scotland, Sept. 1994, pp. 1182–1185.

[4] I. Pitas, “Fast algorithms for running ordering and max/min calculation,” IEEE Trans. Circuits Syst., vol. 36, pp. 795–804, 1989.

[5] D. Coltuc and I. Pitas, “On fast running max-min filtering,” IEEE Trans. Circuits Syst. II, vol. 44, pp. 660–664, Aug. 1997.

[6] J. Garofolo et al., “DARPA TIMIT acoustic-phonetic continuous speech corpus (CD-ROM),” National Institute of Standards and Technology, 1990.

A CMOS Buffer Without Short-Circuit Power Consumption

Changsik Yoo

Abstract—A new CMOS buffer without short-circuit power consumption is proposed. The gate-driving signal of the output pull-up (pull-down) transistor is fed back to the output pull-down (pull-up) transistor to obtain a momentary tri-state output, eliminating the short-circuit power consumption. HSPICE simulation results verify the operation of the proposed buffer and show that its power-delay product is about 15% smaller than that of a conventional tapered CMOS buffer.

 Index Terms—CMOS buffer, short-circuit power consumption.

I. INTRODUCTION

With the high integration level of CMOS very large scale integration (VLSI), the capacitive load of periodic signals such as the clock has become very large. With such a large capacitive load, driving circuits consume a relatively large portion of the total power of a VLSI chip. The power consumption of a CMOS buffer driving a capacitive load consists of dynamic switching power and short-circuit power. While the switching power consumption is unavoidable when driving a capacitive load, short-circuit power is a waste of current and should be minimized or even eliminated for low-power operation.

A conventional tapered CMOS buffer, shown in Fig. 1(a), consumes both dynamic switching power and short-circuit power due to the simultaneous turn-on of the pull-up and pull-down transistors, as illustrated in Fig. 1(b) [1]. Short-circuit power consumption can be eliminated by tri-stating the output node momentarily before every output signal transition. In [2], asymmetric inverters were used as waveform shapers to obtain a momentary tri-state output period, but the asymmetric inverters increase the propagation delay. As an alternative, a feedback-controlled split-path (FS) CMOS buffer was proposed, in which the output signal is fed back to control the output pull-up and pull-down transistors, as shown in Fig. 2, tri-stating the output momentarily and thereby eliminating the short-circuit power consumption [3]. In the FS CMOS buffer, however, the logic states of the split output-stage drivers change twice for every output signal transition, increasing the power consumption. In the charge-transfer feedback-controlled split-path (CFS) CMOS buffer, this additional power consumption is minimized by transferring the large charge stored in the output-stage driver to the output node [4]. In both the FS and CFS buffers, the two feedback delays must be controlled very well, because if they are too small the output transistors can be turned off before the output transition is complete. These feedback delays depend on the load capacitance, which complicates their control.

In this brief, a new CMOS buffer without short-circuit power consumption is proposed. The output pull-up and pull-down transistors are driven by separate driving signals, generated so that the pull-up and pull-down transistors do not turn on simultaneously.

Manuscript received June 1999; revised June 2000. This paper was recommended by Associate Editor M. Bayoumi.

The author was with the Integrated Systems Laboratory (IIS), Swiss Federal Institute of Technology, Zurich, Switzerland. He is now with Samsung Electronics, Kiheung, Korea.

Publisher Item Identifier S 1057-7130(00)07752-1.

1057–7130/00$10.00 © 2000 IEEE

Fig. 1. (a) Tapered CMOS buffer and (b) its timing diagram.

Fig. 2. (a) Feedback-controlled split-path CMOS buffer and (b) its timing diagram.

Fig. 3. (a) Proposed CMOS buffer without short-circuit power consumption. (b) Timing diagram of the buffer.

Fig. 4. Simulated waveform of the proposed CMOS buffer.

TABLE I. TRANSISTOR SIZES OF THE PROPOSED CMOS BUFFER IN FIG. 3

Fig. 5. (a) Total power consumption and (b) propagation delay of CMOS buffers as a function of load capacitance.

TABLE II. TOTAL ACTIVE AREA OF CMOS BUFFERS

II. A CMOS BUFFER WITHOUT SHORT-CIRCUIT POWER CONSUMPTION

The schematic and timing diagram of the proposed CMOS buffer are shown in Fig. 3(a) and (b), respectively. Whereas the output signal itself is fed back in the FS and CFS buffers, here the gate-driving signal of the output pull-up (pull-down) transistor is fed back to the output pull-down (pull-up) transistor to obtain a momentary tri-state output, eliminating the short-circuit power consumption. The logic states of the output-stage driver change only once for each output transition in the proposed buffer, as opposed to twice in the FS and CFS buffers. Since the gate-driving signals are fed back instead of the output signal itself, the feedback delay is independent of the output capacitive load, making the optimization of the circuit much easier. The pull-up and pull-down operations are explained in turn below.

 A. Output Pull-Up Operation

When the input signal rises from 0 to the supply voltage, the internal node driving the gate of the output pull-down transistor falls from the supply voltage to zero, turning that transistor off. Then an intermediate node rises from zero to the supply voltage and, after some delay, the node driving the gate of the output pull-up transistor falls from the supply voltage to zero. The output pull-up transistor is now turned on, and the output voltage begins to rise from zero to the supply voltage. Since the pull-down gate node is driven to zero before the pull-up gate node, the pull-down transistor is turned off before the pull-up transistor is turned on. Therefore, there is no period when both the pull-up and pull-down transistors are turned on simultaneously, and thus no short-circuit power consumption.

 B. Output Pull-Down Operation

When the input signal falls from the supply voltage to 0, the node driving the gate of the output pull-up transistor is driven to the supply voltage, turning that transistor off. Then an intermediate node falls from the supply voltage to zero and, after some delay, the node driving the gate of the output pull-down transistor rises from zero to the supply voltage. The output pull-down transistor is now turned on, and the output voltage begins to fall from the supply voltage to zero. Since the pull-up gate node is driven to the supply voltage before the pull-down gate node, there is no period when both the pull-up and pull-down transistors are turned on simultaneously, and thus no short-circuit power consumption in this case either.
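The ordering argument of the two operations above can be checked with a minimal event-level sketch. It is not a circuit simulation: the names `gate_events` and `overlap_time`, the delay values, and the idea of lumping each path into a turn-off delay `d_off` plus an extra feedback delay `d_fb` are illustrative assumptions, not quantities taken from the brief.

```python
def gate_events(input_edges, d_off=1.0, d_fb=0.5):
    """Turn each input edge into two gate events (time, device, is_on).

    Assumed model: the transistor that must stop conducting is switched
    off d_off after the input edge, and the opposite transistor is
    switched on a further d_fb later (the fed-back gate-driving signal).
    """
    events = []
    for t, rising in input_edges:
        if rising:
            off_dev, on_dev = "pull_down", "pull_up"
        else:
            off_dev, on_dev = "pull_up", "pull_down"
        events.append((t + d_off, off_dev, False))
        events.append((t + d_off + d_fb, on_dev, True))
    return sorted(events)


def overlap_time(events, pull_up_on=False, pull_down_on=True):
    """Total time during which both output transistors conduct."""
    state = {"pull_up": pull_up_on, "pull_down": pull_down_on}
    total, last_t = 0.0, None
    for t, dev, on in events:
        if last_t is not None and state["pull_up"] and state["pull_down"]:
            total += t - last_t
        state[dev] = on
        last_t = t
    return total


# Input rises at t=0, falls at t=10, rises again at t=20 (arbitrary units).
edges = [(0.0, True), (10.0, False), (20.0, True)]
print(overlap_time(gate_events(edges)))  # 0.0: conduction intervals never overlap
```

With any positive `d_fb` the overlap stays at zero regardless of the load, which mirrors the claim that the turn-off always precedes the turn-on. Passing a negative `d_fb` to this toy model produces a nonzero overlap, a rough stand-in for a driver whose two output transistors conduct at the same time.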

III. SIMULATION RESULTS

The proposed CMOS buffer has been simulated with HSPICE using 0.25-µm, 2.5-V CMOS parameters. The sizes of the transistors are listed in Table I. The simulated waveforms of the proposed buffer are shown in Fig. 4 for an input clock frequency of 200 MHz and an output load capacitance of 50 pF. It can be seen that the gate-driving signals are generated so that the output pull-up and pull-down transistors do not turn on simultaneously, eliminating the short-circuit power consumption. The total power consumption and the propagation delay of the proposed buffer are compared with those of the conventional tapered CMOS buffer and the FS buffer in Fig. 5. The figure shows that the power consumption of the proposed buffer is smaller than that of the earlier designs. The power-delay product of the proposed buffer is about 15% smaller than that of the conventional tapered CMOS buffer, although the proposed buffer occupies a larger silicon area because of the separate control of the pull-up and pull-down paths, as compared in Table II.
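For scale, the unavoidable dynamic switching power at the simulated operating point can be estimated from the standard relation P = C·V²·f. The sketch below is a back-of-the-envelope estimate only: it assumes one full rail-to-rail charge and discharge of the 50-pF load per clock cycle and ignores the internal nodes of the buffer. The function name `dynamic_power` is illustrative, and the result is not a figure from the HSPICE simulation.

```python
def dynamic_power(c_load, vdd, freq):
    """Dynamic switching power in watts: P = C * V^2 * f.

    Assumes the load is charged to vdd and fully discharged once per
    clock cycle; short-circuit and internal-node power are excluded.
    """
    return c_load * vdd ** 2 * freq


# Operating point quoted in the text: 50 pF load, 2.5 V supply, 200 MHz clock.
p_load = dynamic_power(50e-12, 2.5, 200e6)
print(f"{p_load * 1e3:.1f} mW")  # 62.5 mW
```

This load term sets a floor that no buffer topology can remove, so any saving from eliminating short-circuit current comes on top of it.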

IV. CONCLUSION

A new CMOS buffer with no short-circuit power consumption has been proposed. The output pull-up and pull-down transistors are driven by separate driving stages, which ensure that the pull-up and pull-down transistors do not turn on simultaneously. HSPICE simulation results show about a 15% improvement in the power-delay product compared to a conventional tapered CMOS buffer, and thus the proposed buffer is suitable for low-power operation.

REFERENCES

[1] N. Li, F. Haviland, and A. Tuszynski, “A CMOS tapered buffer,” IEEE J. Solid-State Circuits, vol. 25, pp. 1005–1008, Aug. 1990.

[2] K. Y. Khoo and A. N. Wilson Jr., “Low power CMOS clock buffer,” Proc. Int. Symp. Circuits and Systems, vol. 4, pp. 355–358, 1994.

[3] H.-Y. Huang and Y.-H. Chu, “Feedback-controlled split-path CMOS clock buffer,” Proc. Int. Symp. Circuits and Systems, vol. 4, pp. 300–303, 1996.

[4] K.-H. Cheng, W.-B. Yang, and H.-Y. Huang, “The charge-transfer feedback-controlled split-path CMOS buffer,” IEEE Trans. Circuits Syst. II, vol. 46, pp. 346–348, Mar. 1999.

RNS Arithmetic Multiplier for Medium and Large Moduli

Ahmad A. Hiasat

Abstract—In implementing Residue Number System (RNS) arithmetic multipliers, ROM-based structures are very efficient for small moduli. However, due to their exponential growth, ROM implementations are not suitable for medium and large moduli. This paper introduces an architecture for an RNS-based multiplier which combines the use of small-size ROMs and arithmetic components. The design is most suitable for medium and large moduli. Compared with other implementations, the VLSI layout implementation of this new approach is shown to be more efficient in terms of area and delay requirements.

Index Terms—Area and time complexity, modular multiplication, multi-operand modular adder, residue number system, VLSI.
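The exponential growth of ROM-based multipliers can be made concrete with a minimal sketch. It assumes the simplest possible organization, a single lookup table addressed by both n-bit operands and storing an n-bit result per entry. That organization and the name `rom_bits` are illustrative assumptions, not the specific structures compared in the paper.

```python
def rom_bits(modulus_bits):
    """Storage of a direct lookup-table modular multiplier, in bits.

    Assumes two n-bit operands form a 2n-bit address, giving 2**(2n)
    words, each holding an n-bit residue.
    """
    words = 2 ** (2 * modulus_bits)
    return words * modulus_bits


# Table size for a few modulus widths under this assumption.
for n in (4, 8, 12, 16):
    print(f"{n:2d}-bit modulus: {rom_bits(n)} bits")
```

Under this assumption each additional modulus bit multiplies the number of words by four, which is why a plain table is attractive only for small moduli.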

I. INTRODUCTION

The Residue Number System (RNS) has the advantage of carry-free arithmetic operations. Therefore, using residue arithmetic would, in principle, increase the speed of computations. Specifically, addition, subtraction, and multiplication can be carried out on each residue digit concurrently and independently. RNS has demonstrated high efficiency in implementing different types of digital filters, which depend mainly on the above-mentioned operations. It has been successfully applied in the design of fast number theoretic transforms, discrete Fourier transforms, and many other areas [1]. Therefore, designing an efficient modular multiplier has been an important task in realizing different RNS-based applications and processors. The modular multiplication is defined as evaluating

 

 

, where        

  . Defining 

  as     

 

  , then

 

is the least nonnegative remainder when dividing the product

Manuscript received July 1998; revised May 2000. This paper was recommended by Associate Editor W. Liu.

The author is with the Electronics Engineering Department, Princess Sumaya University, Amman 11941, Jordan.

Publisher Item Identifier S 1057-7130(00)07746-6.
