designing cmos circuits - labros · pdf filedesigning cmos circuits ... the above department...
TRANSCRIPT
DESIGNING CMOS CIRCUITSFOR LOW POWERBased on selected partner contributionsof the European Low Power Initiative forElectronic System Design of the EuropeanCommunity ESPRIT4 programme and other
Edited by
DIMITRIOS SOUDRISDemocritus University of Thrace, Xanthi, Greece
CHRISTIAN PIGUETCSEM, Neuchâtel, Switzerland
COSTAS GOUTISUniversity of Patras, Patras, Greece
Series Editors:
RENE VAN LEUKEN
ALEXANDER DE GRAAF
REINDER NOUTA
TUDelft/DIMES, Delft, Netherlands
Kluwer Academic PublishersBoston/Dordrecht/London
Contents
List of Figures vii
List of Tables ix
Contributing Authors xi
Foreword xvii
Introduction xxv
Part I LOW POWER DESIGN METHODS
1Motivation, Context and Objectives 3Dimitrios Soudris, Christian PiguetandCostas Goutis
2Sources of power dissipation in CMOS circuits 9Dimitrios SoudrisandAntonios Thanailakis
2.1 Introduction 92.2 The components of power dissipation in CMOS circuits 10
2.2.1 Dynamic Power dissipation 102.2.2 Short-circuit power dissipation 122.2.3 Leakage Power Dissipation 152.2.4 Static Power dissipation 18
2.3 Low Power Design with Multiple��� �
and��� �
19
References 20
3Logic level Power Optimization 23George TheodoridisandDimitrios Soudris
3.1 Introduction 233.2 Problem Formulation 243.3 Combinational Circuits Technology-Independent Optimization 26
vi DESIGNING CMOS CIRCUITS FOR LOW POWER
3.3.1 Two-Level Logic Circuits 263.3.2 Multi-Level Circuits 28
3.4 Sequential Circuits Technology-Independent Optimization 313.4.1 State Assignment 313.4.2 Logic Optimizations 33
3.5 Technology-Dependent Optimization 353.5.1 Path Balancing 353.5.2 Technology Decomposition 363.5.3 Technology Mapping 373.5.4 Post mapping Optimization 38
3.6 Conclusions 40
References 41
4Circuit-Level Low-Power Design 45Spiridon NikolaidisandAlexander Chatzigeorgiou
4.1 Introduction 454.2 Logic Style 46
4.2.1 Static Logic 464.2.2 Dynamic Logic 484.2.3 Pass-Transistor Logic 504.2.4 Single-Rail Pass-Transistor Logic 524.2.5 Other Logic Styles 544.2.6 Logic Styles: Discussion 57
4.3 Latches and Flip-Flops 584.3.1 Latches 584.3.2 Flip - Flops 61
4.4 Transistor Sizing and Ordering 634.4.1 Transistor ordering 634.4.2 Transistor sizing 64
4.5 Drivers for large loads 644.6 Conclusions 66
References 67
5Circuit Techniques forReducing Power Consumptionin Addersand Multipliers
71
Labros Bisdounis, Dimitrios GouvetasandOdysseas Koufopavlou5.1 Introduction 725.2 Power and Delay in CMOS Circuits 735.3 CMOS Circuit Design Styles 74
5.3.1 Conventional Static CMOS Logic - CSL 745.3.2 Complementary Pass-Transistor Logic - CPL 755.3.3 Double Pass-Transistor Logic - DPL 765.3.4 Static Differential Cascode Voltage Switch Logic - SDCVSL 765.3.5 Static Differential Split-level Logic - SDSL 775.3.6 Dual-Rail Domino Logic - DRDL 78
Contents vii
5.3.7 Dynamic Differential Cascode Voltage Switch Logic -DDCVSL 79
5.3.8 Enable/disabled CMOS Differential Logic- ECDL 805.4 Power, Delay and Area Comparisons of a 4-Bit Ripple Carry Adder 815.5 Adders 875.6 Multipliers 88
5.6.1 SPICE-like Power Measurements of Multipliers 895.6.2 High-Level Power Characterization of Multipliers 89
5.7 Conclusions 93
References 94
6Computer Arithmetic Techniques for Low-Power Systems 97Vassilis PaliourasandThanos Stouraitis
6.1 Introduction 976.2 Conventional Arithmetic and Low-power design 99
6.2.1 Basic Arithmetic Operations 996.2.2 Number Representations 100
6.3 The Logarithmic Number System 1006.3.1 LNS Basics 1016.3.2 LNS and Power Dissipation 105
6.4 The Residue Number System 1066.4.1 RNS Basics 1066.4.2 RNS and Power Dissipation 1086.4.3 RNS Signal Activity for Gaussian Input 1086.4.4 Algebraic extensions: The Quadratic RNS 109
6.5 Conclusions 115
References 115
7Reducing Power Consumption in Memories 117Alexander ChatzigeorgiouandSpiridon Nikolaidis
7.1 Introduction 1177.2 Static Random Access Memories 118
7.2.1 Sources of power dissipation in SRAMs 1197.2.2 Low-Power SRAM Circuit Techniques 121
7.3 Dynamic Random Access Memories 1287.3.1 Sources of power dissipation in DRAMs 1317.3.2 Low-Power DRAM Circuit Techniques 133
7.4 Conclusions 138
References 139
8Low-Power Clock, Interconnect and Layout Designs 141Christian Piguet
8.1 Introduction 1418.2 Low-Power Clock Design 142
viii DESIGNING CMOS CIRCUITS FOR LOW POWER
8.2.1 Clock Skew 1428.2.2 Latch-based Clocking Scheme 1458.2.3 Gated Clock with Latch-based Designs 1468.2.4 Results 1488.2.5 Asynchronous Design 148
8.3 Interconnect Delays 1498.3.1 The increased delays of wires 1508.3.2 New Materials for wires and dielectric 1528.3.3 Design methods taking into account interconnection delays 1538.3.4 Crosstalk 154
8.4 Low-Power Optimization for Layout Design 1568.4.1 Masks and Layout 1578.4.2 Industrial Layout Rules 1578.4.3 Layout Optimization for Low-Power 1588.4.4 Low-power Standard Cell Libraries 1608.4.5 Automatic Layout Synthesis 1638.4.6 Transistor Sizing 1638.4.7 Layout in Deep Submicron Technologies 164
References 165
9Logic Level Power Estimation 169George TheodoridisandCostas Goutis
9.1 Introduction 1699.2 Problem Formulation 170
9.2.1 Sources of Power Dissipation 1709.2.2 Problem Statement 170
9.3 Background 1729.3.1 Signal Correlations 1729.3.2 Structural Dependencies 1749.3.3 Sequential Correlations 1759.3.4 Gate Delay Model 176
9.4 Classification of Power Estimation Methodologies 1769.5 Simulation-based Power Estimation 178
9.5.1 Monte-Carlo Power Estimation 1789.5.2 Advanced Sampling Techniques 1809.5.3 Vector Compaction 182
9.6 Probabilistic methods 1869.6.1 Combinational Circuits 1879.6.2 Real-Delay Gate Power Estimation 1939.6.3 Sequential Circuits 196
9.7 Conclusions 198
References 199
Part II LOW POWER DESIGN STORIES
10Low-Power Design for Safety-Critical Applications 205
Contents ix
Athanassios Kakarountas, Kyriakos PapadomanolakisandCostas Goutis10.1 Introduction 20510.2 Basic background on testing 206
10.2.1 Safety properties 20710.2.2 Safe operation constraints vs. low-power techniques 212
10.3 Unsuitable design techniques for safety-critical applications 21310.3.1 Unsuitable design strategies 21410.3.2 Unsuitable low-power techniques 216
10.4 Low-power and safely-operating circuits 21810.4.1 Appropriate low-power techniques for safe operation 21810.4.2 Safe-operating circuits 21910.4.3 Transforming circuits into TSC circuits 22110.4.4 Evaluating the presented design techniques 231
10.5 Conclusions 232
References 233
11Design of a Low Power Ultrasound Beamformer ASIC 235Robert Schwann, Thomas Heselhaus, Oliver WeissandTobias G. Noll
11.1 Introduction 23611.2 The Key to Physically Oriented Design: Automated Datapath Gen-
eration 23911.3 Beamformer’s System Overview and Requirements 24211.4 Optimization and Benchmarks of Basic Building Blocks 243
11.4.1 Coarse Delay Unit: Optimization of Customized Memories 24311.4.2 Delay Generator 24911.4.3 Fine Delay Interpolation Filter 25411.4.4 Apodization Multiplier 26111.4.5 Accumulation Chain 263
11.5 Conclusion 265
References 266
12Epilogue 271Christian Piguet
Contributing Authors
Labros Bisdounis was born in Agrinio, Greece, in 1970. He received theDiploma and the Ph.D. degrees in Electrical Engineering from the Departmentof Electrical and Computer Engineering, University of Patras, Greece, in 1993and 1999, respectively. His Ph.D. thesis is in the area of timing and powermodeling of submicrometer CMOS circuits. During his collaboration withthe above department he participated as a researcher in several EU projectsrelated to VLSI design. Since 2000, he is with the Development ProgrammesDepartment of INTRACOM S.A., Athens, Greece, working on the design anddevelopment of VLSI circuits and systems. His main research interests is onvarious aspects of VLSI circuits and systems design such as: circuit timinganalysis, power dissipation modeling, design and analysis of submicrometerCMOS circuits, low-power and high-speed CMOS digital circuits and systemsdesign and System-on-Chip design for telecom applications. Dr. Bisdounis haspublished 20 papers in international journals and conferences on these areas,and he has received more than 70 references. He is a member of the TechnicalChamber of Greece.
Alexander Chatzigeorgioureceived the Diploma in Electrical Engineering andthe Ph.D. degree in Computer Science, from the Aristotle University of Thes-saloniki, Thessaloniki, Greece, in 1996 and 2000, respectively. From 1997 to1999 he was with Intracom S.A., Greece, as a telecommunications softwaredesigner. Since May 2002 he has been with the Department of Applied In-formatics at the University of Macedonia, Thessaloniki, Greece, as a Lecturerin Software Engineering and Object-Oriented Design. His research interestsinclude low-power hardware and software design, embedded systems architec-ture and timing and power modeling of digital CMOS circuits. He is a memberof the IEEE and the Technical Chamber of Greece.
Costas Goutiswas a Lecturer at the School of Physics and Mathematics at theUniversity of Athens, Greece, from 1970 to 1972. In 1973, he was the TechnicalManager in the Greek P.T.T., responsible for the installation and maintenance of
xx DESIGNING CMOS CIRCUITS FOR LOW POWER
the telephoneexchanges in a large provincial region. He was ResearchAssistantand Research Fellow in the Department of Electrical and Electronic Engineer-ing at the University of Strathclyde, U.K., from 1976 to 1979, and Lecturer inthe Department of Electrical and Electronic Engineering at the University ofNewcastle upon Tyne, U.K., from 1979 to 1985. Since 1985 he has been Asso-ciate Professor and Full Professor in the Department of Electrical and ComputerEngineering, University of Patras, Greece. His recent research interests focuson VLSI Circuit Design, Low Power VLSI Design, Systems Design, Analysisand Design of Systems for Signal Processing and Telecommunications. He haspublished more than 160 papers in international journals and conferences. Hehas been awarded a large number of Research Contracts from ESPRIT, RACE,IST and National Programs. Prof. Goutis has close collaboration with the in-dustry. His group has recently won an international low power design contestsponsored by Intel and IBM.
Dimitrios Gouvetas received the Diploma degree in Electrical Engineeringfrom the Department of Electrical and Computer Engineering, University ofPatras, Greece in 1997. His Diploma thesis is in the area of CMOS Technologyfor Low Power and High Speed Circuits. He is a member of the TechnicalChamber of Greece.
Thomas Heselhausreceived the Dipl.-Ing. degree in 1998 from Aachen Uni-versity of Technology. Since 1998 he has been working as a research assistant atthe Chair of Electrical Engineering and Computer Systems, Aachen Universityof Technology. His main topic are memory architectures and the modelling andoptimization of basic cells.
Athanassios Kakarountasis a postgraduate student at the VLSI Laboratoryof the University of Patras. He received his Diploma in Electrical and Elec-tronics Engineering at the University of Patras. The research field his Ph. D.thesis is based on is in the area of Fault detection and fault tolerance in VLSIdesigning. He has the working experience of two ESPRIT European projects :i) LPGD : Contribution in the formation of a low power methodology/ flow andits application to the implementation of DCS1800 - GSM/DECT ModulatorDemodulator for a DECT/GSM (DCS1800) dual mode mobile phone and ii)CoSafe : Design of a Low Power safety-critical pump for in-vein infusion incollaboration with Micrel Medical Devices Ltd.
Odysseas Koufopavloureceived his Diploma and the Ph.D. degrees in Electri-cal Engineering in 1983and 1990, both from Universityof Patras, Greece. From
Contributing Authors xxi
1990 to 1994 he was at the IBM Thomas J. Watson Research Center, YorktownHeights, NY, USA. He is currently an Associate Professor with the Departmentof Electrical and Computer Engineering, University of Patras. His research in-terests include VLSI, low power design, and high performance communicationsubsystems architecture and implementation. He served as conference Gen-eral Chair of the 1999 International Conference on Electronics, Circuits andSystems, and on organizing committees and technical program committees ofmany major conferences. He leads several projects, the Greek government, andmajor companies. Dr. Koufopavlou has published 83 journal and conferencepapers and received patents and inventions. He is a member of IEEE, IFIP 10.5WG and the Technical Chamber of Greece.
Spiridon Nikolaidis was born in Eleochori of Kavala, Greece in 1965. Hereceived the Diploma and Ph.D. degrees in Electrical Engineering from PatrasUniversity, Greece, in 1988 and 1994, respectively. Since September 1996he has been with the Department of Physics of the Aristotle University ofThessaloniki, Thessaloniki, Geece. He is now an assistant professor at thisDepartment in the field of digital design. His research interests include CMOSgate propagation delay and power-consumption modeling, high speed and lowpower CMOS circuit design techniques, power estimation of DSP architectures,and design of high speed and low power DSP architectures. He is author and co-author in more than 60 papers and articles in conferences. He also participatesin many European (ESPRIT, IST) and Greek government projects.
Tobias G. Noll received the Ing. (grad.) degree in Electrical Engineeringfrom the Fachhochschule Koblenz in 1974, the Dipl.-Ing. degree in ElectricalEngineering from the Technical University of Munich in 1982, and the Dr.-Ing.degree from the Ruhr-University of Bochum in 1989. From 1974 to 1976, hewas with the Max-Planck-Institute of Radio Astronomy, Bonn. Since 1976 hewas with the Corporate Research and Development Department of Siemens andsince 1987 he headed a group of laboratories concerned with CMOS circuits fordigital signal processing. In 1992, he joined the RWTH Aachen University ofTechnology where he is a Professor holding the Chair of Electrical Engineeringand Computer Systems. His activities focus on low power deep submicronCMOS architectures, circuitsand designmethodologies, as well as digital signalprocessing for communications and medicine electronics.
Vassilis Paliourasreceived the Diploma in electrical engineering in 1992 andthe Ph.D. degree in electrical engineering in 1999, from the Electrical andComputer Engineering Department, University of Patras, Greece. He worksas a researcher at the VLSI Design Laboratory, ECE Dept., while teaching
xxii DESIGNING CMOS CIRCUITS FOR LOW POWER
microprocessor-based system design at the Computer Engineering and Infor-matics Dept., both at the University of Patras, Greece. His research interestsinclude computer arithmetic algorithms and circuits, microprocessor architec-ture, and VLSI signal processing, areas where he has published more than 30conference and journal articles. Dr. Paliouras received the MEDCHIP VLSIDesign Award in 1997. He is also the recipient of the 2000 IEEE Circuits andSystems Society Guillemin-Cauer best paper Award. He is a Member of ACM,SIAM, and the Technical Chamber of Greece.
Kyriakos Papadomanolakisis a postgraduate student at the VLSI Laboratoryof the University of Patras. He received his Diploma in Electrical and Elec-tronics Engineering at the University of Patras. He is currently working on hisPh. D. thesis in the area of Hardware safety in VLSI designing at the VLSILaboratory of the University of Patras, at Rion. He has the working experienceof two ESPRIT European projects: i) ASPIS: Contribution in the designing ofa DECT/GSM (DCS1800) dual mode mobile phone, in INTRACOM S.A. atthe department of R&D and ii) CoSafe: Design of a Low Power safety-criticalpump for in-vein infusion in collaboration with Micrel Medical Devices Ltd.
Christian Piguet received the M. S. and Ph. D. degrees in electrical engi-neering from the EPFL, respectively in 1974 and 1981. He joined the CentreElectronique Horloger S.A., Neuchâtel, Switzerland, in 1974. He is now Headof the Ultra Low Power Circuits section at the CSEM S.A. He is presentlyinvolved in the design of low-power low-voltage integrated circuits. He is Pro-fessor at EPFL and also lectures at the University of Neuchâtel, Switzerland.He is author or co-author of more than 100 scientific papers and has contributedto numerous advanced engineering courses.
Robert Schwannreceived the Dipl.-Ing. degree in 1997 from Aachen Univer-sity of Technology. Since 1998 he has been working as a research assistant atthe Chair of Electrical Engineering and Computer Systems, Aachen Universityof Technology. His field of research are the signal processing and image qualityof medical ultrasound systems.
Dimitrios Soudris received his Diploma in Electrical Engineering from theUniversity of Patras, Greece, in 1987. He received the Ph.D. Degree from inElectrical Engineering, from the University of Patras in 1992. He is currentlyworking as Ass. Professor in Dept. of Electrical and Computer Engineering,Democritus University of Thrace, Greece. His research interests include lowpower design, parallel architectures, embedded systems design, and VLSI sig-
Contributing Authors xxiii
nal processing. He has published more than 80 papers in international journalsand conferences. He was leader and principal investigator in numerous researchprojects funded from the Greek Government and Industry as well as the Euro-pean Commission (ESPRIT II-III-IV and 5th IST). He has served as GeneralChair and Program Chair for the International Workshop on Power and TimingModelling, Optimisation, and Simulation (PATMOS). Recently, received anaward from INTEL and IBM for the project results of LPGD #25256 (ESPRITIV). He is a member of the IEEE, the VLSI Systems and Applications TechnicalCommittee of IEEE CAS and the ACM.
Thanos Stouraitis received a B.S. in Physics and an M.S. in Electronic Au-tomation from the University of Athens, Greece, in 1979 and 1981, respectively,an M.S. in Electrical Engineering from the University of Cincinnati in 1983 andthe Ph.D. degree from the Univ. of Florida in 1986. He was awarded the Out-standing Ph.D. Dissertation award of the University of Florida and a Certificateof Appreciation by the IEEE Society of Circuits and Systems in 1997. He isa Professor of Electrical and Computer Engineering at the University of Pa-tras, Greece. He has served on the faculty of the University of Florida andthe Ohio State University. He has published 2 books, several book chapters,about 30 journal and 70 conference papers in the areas of computer architecture,computer arithmetic, VLSI signal and image processing, and low-power pro-cessing. He is currently the chair of the IEEE Circuits and Systems Society’stechnical committee on VLSI Systems and Applications, while he serves onthe digital signal processing and the multimedia systems committees. He hasserved as the General Chair of the IEEE 3rd Int. Conference on Electronics,Circuits, and Systems (ICECS ’96), co-Program chair for EUSIPCO ’98, andregularly serves on the Program Committee of various circuits-and-systems andsignal-processing conferences.
Antonios Thanailakis was born in Greece on August 5, 1940. He receivedB.Sc. degrees in physics and electrical engineering from the University ofThessaloniki, Greece, 1964 and 1968, respectively, and the MSc . and Ph.D.Degrees in electrical engineering and electronics from UMIST, Manchester,U.K. in 1968 and 1971, respectively. He has been a Professor of Microelec-tronics in Dept. of Electrical and Computer Eng., Democritus Univ. of Thrace,Xanthi, Greece, since 1977. He has been active in electronic device and VLSIsystem design research since 1968. His current research activities include mi-croelectronic devices and VLSI systems design. He has published a great num-ber of scientific and technical papers, as well as five textbooks. He was leaderfor carrying out research and development projects funded by Greece, EU, or
xxiv DESIGNING CMOS CIRCUITS FOR LOW POWER
other organizations on various topics of Microelectronics and VLSI SystemsDesign (e.g. NATO, ESPRIT, ACTS, STRIDE).
George Theodoridisreceived the Diploma in Electrical Engineering for theUniversity of Patras in 1994, and the Ph.D. Degree in Electrical and ComputerEng. from the same institution in 2001. He is currently working as researcherat the VLSI Design laboratory of Electrical and Computer Eng. Department ofPatras University. His working interests include several aspects of developmentof methodologies and techniques for power estimation and optimization, recon-figurable computing, parallel architectures and embedded DSP systems. He haspublished more than 20 papers in international journals and conferences. Healso has participated in numerous research projects funding from the EuropeanUnion and Greek Government and Industry.
Oliver Weiss received the Dipl.-Ing. degree in 1995 from Aachen Universityof Technology. Since 1995 he has been working as a research assistant atthe Chair of Electrical Engineering and Computer Systems, Aachen Universityof Technology. His main research interest is the development of a datapathgenerator for the physically oriented design of optimized CMOS macros fordigital signal processing.
Foreword
This book is the fourth in a series on novel low power design architectures,methods and design practices. It results from of a large European project startedin 1997, whose goal is to promote the further development and the faster andwider industrial use of advanced design methods for reducing the power con-sumption of electronic systems.
Low power design became crucial with the wide spread of portable infor-mation and communication terminals, where a small battery has to last for along period. High performance electronics, in addition, suffers from a per-manent increase of the dissipated power per square millimeter of silicon, dueto the increasing clock-rates, which causes cooling and reliability problems orotherwise limits the performance.
The European Union’s Information Technologies Programme ’Esprit’ didtherefore launch a ’Pilot action for Low Power Design’, which eventually grewto 19 R&D projects and one coordination project, with an overall budget of 14million EURO. It is meanwhile known as European Low Power Initiative forElectronic System Design (ESD-LPD) and will be completed in the year 2002.It involves to develop or demonstrate new design methods for power reduction,while the coordination project takes care that the methods, experiences andresults are properly documented and publicised.
The initiative addresses low power design at various levels. This includessystem and algorithmic level, instruction set processor level, custom processorlevel, RT-level, gate level, circuit leveland layout level. It coversdatadominatedand control dominated as well as asynchronous architectures. 10 projects dealmainly with digital, 7 with analog and mixed-signal, and 2 with software relatedaspects. The principalapplicationareas are communication, medicalequipmentand e-commerce devices.
xxvi DESIGNING CMOS CIRCUITS FOR LOW POWER
The following list describes the objectives of the 20 projects. It is sorted bydecreasing funding budget.
CRAFT CMOS Radio Frequency Circuit Design for Wireless Application
Advanced CMOS RF circuit design including blocks such as LNA,down converter mixers & phase shifters, oscillator and frequencysynthesiser, integrated filters delta sigma conversion, power ampli-fier
Development of novel models for active and passive devices as wellas fine-tuning and validation based on first silicon fabricates
Analysis and specification of sophisticated architectures to meet inparticular low power single chip implementation
PAPRICA Power and Part Count Reduction Innovative Communication Ar-chitecture
Feasibility assessment of DQIF, through physical design and char-acterisation of the core blocks
Low-power RF design techniques in standard CMOS digital process
RF design tools and framework; PAPRICA Design Kit.
Demonstration of a practical implementation of a specific applica-tion
MELOPAS Methodology for Low Power Asic design
To develop a methodology to evaluate the power consumption of acomplex ASIC early on in the design flow
To develop a hardware/software co-simulation tool
To quickly achieve a drastic reduction on the power consumptionof electronic equipment
TARDIS Technical Coordination and Dissemination
To organise the communication between design experiments and toexploit their potential synergy
To guide the capturing of methods and experiences gained in thedesign experiments
To organise and promote the wider dissemination and use of thegathered design know-how and experience
xxvii
LUCS Low Power Ultrasound Chip Set.
Design methodology on low power ADC, memory and circuit de-sign
Prototype demonstration of a handheld medical ultrasound scanner
ALPINS Analog Low Power Design for Communications Systems
Low-voltage voice band smoothing filters and analog-to-digital anddigital-to-analog converters for an analog front-end circuit of aDECT system
High linear transconductor-capacitor (gm-C) filter for GSM AnalogInterface Circuit operating at supply voltages as low as 2.5V
Formal verification tools, which will be implemented in the indus-trial partners design environment. These tools support the completedesign process from system level down to transistor level
SALOMON System-level analog-digital trade-off analysis for low power
A general top-down design flow for mixed-signal telecom ASICs
High-levelmodelsofanaloganddigital blocks and power estimatorsfor these blocks
A prototype implementation of the design flow with particular soft-ware tools to demonstrate the general design flow
DESCALE Design Experiment on a Smart Card Application for Low Energy
The application of highly innovative handshake technology
Aiming at some 3 to 5 times less power and some 10 times smallerpeak currents compared to synchronously operated solutions
SUPREGE A low power SUPerREGEnerative transceiver for wireless datatransmission at short distances
Design trade-offs and optimisation of the micro power receiver /transmitteras a functionof various parameters (power consumption,area, bandwidth, sensitivity, etc)
Modulation / demodulation and interface with data transmissionsystems
Realisation of the integrated micro power receiver / transmitterbased on the super-regeneration principle
xxviii DESIGNING CMOS CIRCUITS FOR LOW POWER
PREST Power REduction for System Technologies
Survey of contemporary Low Power Design techniques and com-mercial power analysis software tools
Investigation of architectural and algorithmic design techniqueswith a power consumption comparison
Investigation of Asynchronous design techniques and Arithmeticstyles
Set-up and assessment of a low power design flow
Fabrication and characterisation of a Viterbi demonstrator to assessthe most promising power reduction techniques
DABLP Low Power Exploration for Mapping DAB Applications to Multi-Processors
A DAB channel decoder architecture with reduced power consump-tion
Refined and extended ATOMIUM methodology and supportingtools
COSAFE Low Power Hardware-Software Co-Design for Safety-Critical Ap-plications
The development of strategies for power efficient assignment ofsafety critical mechanisms to hardware or software
The designand implementationof a low-power, safety-criticalASIP,which realises the control unit of a portable infusion, pump system
AMIED Asynchronous Low-Power Methodology and Implementation of anEncryption/Decryption System
Implementation of the IDEA encryption/decryption method withdrastically reduced power consumption
Advanced low power design flow with emphasis on algorithm andarchitecture optimisations
Industrial demonstration of the asynchronous design methodologybased on commercial tools
xxix
LPGD A Low-Power Design Methodology/Flow and its Application to theImplementation of a DCS1800-GSM/DECT Modulator/Demodulator
To complete the development of a top-down, low power designmethodology/flow for DSP applications
To demonstrate the methods at the example of an integratedGFSK/GMSK Modulator-Demodulator (MODEM)for DCS1800-GSM/DECT applications
SOFLOPO Low Power Software Development for Embedded Applications
Develop techniquesand guidelines for mapping a specific algorithmcode onto appropriate instruction subsets
Integrate these techniques into software for the power-consciousARM-RISC and DSP code optimisation
I-MODE Low Power RF to Base band Interface for Multi-Mode PortablePhone
To raise the level of integration in a DECT/DCS1800 transceiver, byimplementing the necessary analog base band low-pass filters anddata converters in CMOS technology using low power techniques
COOL-LOGOS Power Reduction through the Use of Local don’t Care Con-ditions and Global Gate Resizing Techniques: An Experimental Evalua-tion.
To apply the developed low power design techniques to the existing24-bit DSP, which is already fabricated
To assess the merit of the new techniques using experimental siliconthrough comparisons of the projected power reduction (in simula-tion) and actually measured reduction of new DSP; assessment ofthe commercial impact
LOVO Low Output VOltage DC/DC converters for low power applications
Development of technical solutions for the power supplies of ad-vanced low power systems, comprising the following topics
New methods for synchronousrectification for very low output volt-age power converters
xxx DESIGNING CMOS CIRCUITS FOR LOW POWER
PCBIT Low Power ISDN Interface for Portable PC’s
Design of a PC-Card board that implements the PCBIT interface
Integrate levels 1 and 2 of the communication protocol in a singleASIC
Incorporate power management techniques in the ASIC design:
– system level: shutdown of idle modules in the circuit– gate level: precomputation, gated-clock FSMs
COLOPODS Design of a Cochlear Hearing Aid Low-Power DSP System
Selectionof a future oriented low-power technology enabling futurepower reduction through integration of analog modules
Design of a speech processor IC yielding a power reduction of 90%compared to the 3.3 Volt implementation
xxxi
The low power design projects have achieved the following results:
Projects, who have designed a prototype chip, can demonstrate a powerreduction of 10 to 30 percent.
New low power design libraries have been developed.
New proven low power RF architectures are now available.
New smaller and lighter mobile equipment is developed.
Instead of running a number of Esprit projects at the same time indepen-dently of each other, during this pilot action the projects have collaboratedstrongly. This is achieved mostly by the novelty of this action, which is thepresence and role of the coordinator: DIMES - the Delft Institute of Mi-croelectronics and Submicron-technology, located in Delft, the Netherlands(http://www.dimes.tudelft.nl). The task of the coordinator is to co-ordinate,facilitate, and organize:
The information exchange between projects.
The systematic documentation of methods and experiences.
The publication and the wider dissemination to the public.
The most important achievements, credited to the presence of the coordinatorare:
New personnel contacts have been made, and as a consequence the result-ing synergy between partners resulted in better and faster developments.
The organization of low power design workshops, special sessions atconferences, and a low power design web site,http://www.esdlpd.dimes.tudelft.nl. At this site all public reports of theprojects can be found and all kind of information about the initiativeitself.
The used design methodology, design methods and/or design experienceare disclosed, are well documented and available.
Based on the work of the projects, in cooperation with the projects, thepublication of a low power design book series is planned. Written bymembers of the projects this series of books on low power design willdisseminate novel design methodologies and design experiences, whichwere obtained during the runtime of the European Low Power Initiativefor Electronic System Design, to the general public.
xxxii DESIGNING CMOS CIRCUITS FOR LOW POWER
In conclusion, the major contribution of this project cluster is that, except thealreadymentioned technical achievements, the introductionof novel knowledgeon low power design methods into the mainstream development processes isaccelerated.
We would like to thank all project partners from all the different companiesand organizations who make the Low Power Initiative a success.
Rene van Leuken, Reinder Nouta, Alexander de Graaf, Delft, May 2002
Introduction
Modern electronic systems have reached a significant turning point in the lastdecade, from low performance products such as wristwatches and calculatorsto high performance products such as laptops and personal digital assistants.The introduction of these devices to the consumer market raised to the surfacea characteristic that had been previously omitted. This was low power dissipa-tion. Gradually, engineers invented novel techniques, which may be includedin efficient design methodologies, for designing and implementing efficient cir-cuits not only in terms of area and performance, as they were used to, but in theterm of low power consumption.
The material in this book is based on the background and the innovative re-sultsof thedifferent partners involved in the AMIED, LPGD, PREST, COSAFE,and LUCS projects of the European Low Power Initiative for Electronic Sys-tem Design under the successful coordination of DIMES, Delft. The partnershave been studying for many years low-power design field introducing novelconcepts and efficient techniques. Due to close collaboration of academic andindustrial research groups the presented material have been influenced by theplethora of disseminations e.g. public deliverables, technical meetings, work-shops, during the projects execution.
The book consists of two parts: The first part includes the low power de-sign techniques for power optimization and estimation, while the second oneprovides the results from the projects COSAFE and LUCS. Starting from thedescription of the power consumption sources, low power optimization and es-timation techniques for logic design level, circuit/transistor design level, andlayout design level are provided in eight chapters (i.e. Chapters 2-9). Thenext two chapters describe the novel low power techniques, which were usedduring the implementation of the safety-critical Application Specific Instruc-tion Processor designed in COSAFE project, and the implementation of thelow power 16-channel ultrasound beamformer application specific integratedcircuit (ASIC) designed for LUCS project.
xxxiv DESIGNING CMOS CIRCUITS FOR LOW POWER
A top-down approach with respect to the design level is adopted in the pre-sentation of the low power design techniques . However,it was not possible topresent in detail manner all the low power optimization and estimation tech-niques from the logic level to the layout level. Only the most important researchcontributions presented in a tutorial manner, are included. The book can alsobe used as a textbook for undergraduate and graduate students, VLSI designengineers,and professionals, who have had a basic knowledge of VLSI digitaldesign.
The authors of the chapters of this book together with the editors wouldlike to use this opportunity to thank the many people, i.e. colleagues and Ph.D.students, whose dedicationand industry during the projects execution lead to theintroduction of novel scientific results and realization of innovative integratedsystems.
The authors of Chapter 3 and 9 would like to thank Dr. S. Theoharis for hiscontribution in the software development of logic optimization and estimationtools, which were necessary for making power measurements.
The authors of Chapter 6 wish to acknowledge the discussions with Dr.Karagianni, who significantly influenced the particular chapter. Also the au-thors appreciate the financial support from the "C. Caratheodory’s" fund for theUniversity of Patras.
The authors of Chapter 10 would like to thank V. Spiliotopoulos for hissupport in the designing and realization of the COSAFE ASIP. Also, manythanks to V. Kokkinos for his contribution in the implementation of the manyfault-secure multiplier designs.
All authors of this book together with the editors would like to thank DIMES,Delft, for their continuous support during the running of the low power projectsfor the dissemination of the scientific results. This book is one of the activitiesof this dissemination task.
The authors of Chapter 4 would like to thank the people, who contributedto PREST, whose public deliverable reports used as an inspiration source forchapter’s preparation.
Last but not least, D. Soudris would like to thank his parents for being aconstant source of moral support and for firmly imbibing into him from a veryyoung age thatperseverantia omnia vincit- it is this perseverance that kept himgoing. This book is dedicated to them.
Dimitrios Soudris, Christian Piguet, Costas Goutis, May 2002
Chapter5
CIRCUIT TECHNIQ UES FORREDUCING POWER CONSUMPTIONIN ADDERS AND MULTIPLIERS
LabrosBisdounis
INTRACOMS.A.,Athens,Greece
Dimitrios Gouvetas
OdysseasKoufopavlou
University of Patras,Rio,Greece
Abstract An importantissuein thedesignof VLSI Circuitsis thechoiceof thebasiccircuitapproachandtopologyfor implementingvariouslogic andarithmeticfunctionssuchasaddersandmultipliers. In thischapter, severalstaticanddynamicCMOScircuit designstylesareevaluatedin termsof area,propagationdelayandpowerdissipation. The differentdesignstylesarecomparedby performingdetailedtransistor-level simulationson a benchmarkcircuit (ripple carry adder)usingHSPICE,andanalyzingthe resultsin a statisticalway. After the comparisonbetweenthe different designstyles,a numberof well known typesof adders(ripplecarry, carryskip,carrylookahead,carryselectetc.) arecomparedin termsof propagationdelay, numberof gatesand logic transition’s averagenumber.Furthermore,powermeasurementsandcomparisonsfor anumberof well-knownmultipliersareprovided. Basedon theresultsof theprovidedanalysissomeofthe tradeoffs that arepossibleduring the designphasein orderto improve thecircuit power-delayproductareidentified.
Keywords: circuit designtechniques,circuit macroblocks,adders,circuit styles
D R A F T July 12, 2002, 2:00pm D R A F T
72 DESIGNINGCMOSCIRCUITSFORLOW POWER
5.1 Intr oduction
Muchof theresearcheffortsof thepastyearsin theareaof digitalelectronicshasbeendirectedtowardsincreasingthe speedof digital systems.Recently,the requirementof portability andthe moderateimprovementin batteryper-formanceindicatethat thepower dissipationis oneof themostcritical designparameters[1]. The threemostwidely acceptedmetricsto measurethequal-ity of a circuit or to comparevariouscircuit stylesarearea,delayandpowerdissipation.Portability imposesa strict limitation on power dissipationwhilestill demandshigh computationalspeeds.Hence,in recentVLSI systemsthepower-delayproductbecomesthemostessentialmetricof performance.
The reductionof the power dissipationandthe improvementof the speedrequireoptimizationsat all levels of the designprocedure. In this chapter,the propercircuit style andmethodologyis considered.Since,most digitalcircuitry is composedof simpleand/orcomplex gates,we studythebestwayto implementaddersin orderto achieve low powerdissipationandhighspeed.Severalcircuit designtechniquesarecomparedin orderto find theirefficiencyin termsof speedand power dissipation. A review of the existing CMOScircuit designstylesis given,describingtheiradvantagesandtheir limitations.Furthermore,a four-bit ripple carryadderfor useasa benchmarkcircuit wasdesignedin a full-custom mannerby using the different designstyles,anddetailedtransistor-level simulationsusingHSPICE[2] wereperformed.Also,variousdesignsand implementationsof four multipliers areanalysedin thetermsof delayandpowerconsumption.Two waysof powermeasurementsareused.
ConventionalstaticCMOShasbeenatechniqueof choicein mostprocessordesign.Alternatively, staticpasstransistorcircuitshavealsobeensuggestedforlow-powerapplications[3]. Dynamiccircuits,whenclockedcarefully, canalsobeusedin low-power, high speedsystems[4]. However, severalotherdesigntechniquesneedto beappliedandevaluatedalongwith thesecircuit stylesinorderto improve thespeedandreducethepowerdissipationof VLSI systems.In thischapterwestudyeightdifferentCMOSlogic styles:
ConventionalStaticCMOS- CSL,
ComplementaryPass-transistor- CPL [5],
DoublePass-transistor- DPL [6],
StaticandDynamicDifferentialCascodeVoltageSwitch- DCVSL[7,8],
StaticDifferentialSplit-level - SDSL[9],
Dual-RailDomino- DRDL [10,11],and
Enable/disabledCMOSDifferential- ECDL [12].
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 73
The restof thechapteris structuredasfollows. In thenext sectiona briefintroductionof thepowerdissipationandthedelayinCMOScircuitsisgiven. Insection3, theCMOSadderlogic stylesandtheir characteristicsaredescribedin details. The different adderlogic stylesarecomparedin termsof speed,powerdissipationandsiliconarea,in section4. Also, thepower-delayproductof the designsis considered,dueto the importanceof this metric in modernVLSI applications.Comparisonresultsamongdifferentrealizationsof a16-bitadderin termsof area,delay, andpower arepresentedin Section6. Thenextsectionprovidesresultsfor four implementationsof multipliers. Finally, themainpointsaresummarizedin Section7 of conclusions.
5.2 Power and Delay in CMOS Cir cuits
Sincetheobjectiveistoinvestigatethetradeoffs thatarepossibleatthecircuitlevel in orderto reducepowerdissipationwhile maintainingtheoverallsystemthroughput,wemustfirst studytheparametersthataffect thepowerdissipationandthe speedof a circuit. It is well known that oneof the major advantageof CMOS circuits over singlepolarity MOS circuits, is that the staticpowerdissipationisverysmallandlimited to leakage.However, in somecasessuchasbiascircuitry andpseudo-nMOSlogic, staticpower is dissipated.Consideringthat in CMOS circuits the leakagecurrentbetweenthe diffusion regionsandthesubstrateis negligible, the two majorsourcesof power dissipationaretheswitchingandtheshort-circuitpower dissipation[1],=$>@?BA�C�D"E(FG G+HJILK M N E G G O (5.1)
where?BA
is thenodetransitionactivity factor,CD
is theloadcapacitance,E G G
is the supplyvoltage,H is the switching frequency. K M N is the currentwhichariseswhena direct pathfrom power supplyto groundis caused,for a shortperiodof time during low to high or high to low nodetransitions[13]. Theswitchingcomponentof power ariseswhenenergy is drawn from the powersupplyto chargeparasiticcapacitors.It is thedominantpowercomponentin awell designedcircuit andit canbeloweredby reducingoneor moreof
?BA,CD
,E G G and H , while retainingtherequiredspeedandfunctionality.Even thoughthe exact analysisof circuit delayis quite complex, a simple
first-orderderivationcanbeused[14,15]in orderto show its dependency of thecircuit parameters P G$Q CD7E G GR�S E G G5T EVU W�X Y O (5.2)
whereR
dependsonthetransistorsaspectratio( Z / [ X andotherdeviceparam-eters,
E�\�]is the transistorthresholdvoltage,and ^ is thevelocity saturation
index which variesbetween1 and2 ( ^ is equalto 1.4 for the1.5_ m process
D R A F T July 12, 2002, 2:00pm D R A F T
74 DESIGNINGCMOSCIRCUITSFORLOW POWER
technologywhich is usedin the experimentsof the next section). Sinceaquadraticimprovementin power dissipationmaybeobtainedby loweringthesupplyvoltage(equation(5.1)),many researchershave investigatedtheeffectsof lowering thesupplyvoltagein VLSI circuits. Unfortunately, reducingthesupplyvoltagereducespower, but thedelayincreases(equation(5.2))with theeffectbeingmoredrasticat voltagescloseto thethresholdvoltage[16]. Equa-tions (5.1) and(5.2) indicatethat by reducingthe nodeparasiticcapacitancein a CMOS circuit, the power dissipationis reducedandthe circuit speedisincreased.
5.3 CMOS Cir cuit DesignStyles
In thefollowing, thecircuit designstylesaredescribedusingthefull addercircuit, which is themostcommonlyusedcell in arithmeticunits. Also, theircharacteristicsin termsof power dissipationanddelayareinvestigated.
5.3.1 Conventional Static CMOS Logic - CSL
ConventionalStaticCMOSlogic is usedin mostchip designsin therecentVLSI applications. The schematicdiagramof a conventionalstatic CMOSfull addercell is illustratedin Figure5.1. The signalsnotedwith “-” arethecomplementarysignals. The pMOSFETnetwork of eachstageis the dualnetwork of the nMOSFETone. In order to obtaina reasonableconductingcurrentto drivecapacitive loadsthewidth of thetransistorsmustbeincreased.Thisresultsin increasedinputcapacitanceandthereforehighpowerdissipationandpropagationdelay.
-A
-C
-B
-A
-B
-A
-B - C
-A
-B
CARRY
-A
-B
A `B a
-C
A `-B
-A
B
C SUM
-C
-A
-B
A
B
C
A
-B
-A
B
Figure 5.1. ConventionalstaticCMOSfull adder
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 75
5.3.2 Complementary Pass-Transistor Logic - CPL
ThemainconceptbehindCPL [5] is theuseof only annMOSFETnetworkfor theimplementationof logic functions.Thisresultsin low inputcapacitanceandhighspeedoperation.Theschematicdiagramof theCPLfull addercircuitis shown in Figure5.2. Becausethe high voltagelevel of the pass-transistoroutputsis lower thanthe supplyvoltagelevel by the thresholdvoltageof thepasstransistors,thesignalshave to beamplifiedby usingCMOSinvertersattheoutputs.CPL circuitsconsumelesspower thanconventionalstaticcircuitsbecausethelogic swingof thepasstransistoroutputsis smallerthanthesupplyvoltagelevel. Theswitchingpowerdissipatedfrom chargingor dischargingthepasstransistoroutputsis givenby
b/cedgfVh h/fBi j�k l m/n l o h p�q�r(5.3)
wherefBi j�k l m"dgf h h5s f�t u l
. In thecaseof conventionalstaticCMOScircuitsthevoltageswingattheoutputnodesis equalto thesupplyvoltage,resultinginhigherpowerdissipation.To minimizethestaticcurrentdueto theincompleteturn-off of thepMOSFETin theoutputinverters,aweakpMOSFETfeedbackdevice canalsobeaddedin theCPL circuitsof Figure5.2, in orderto pull thepass-transistoroutputsto full supplyvoltagelevel. However, thiswill increasetheoutputnodecapacitance,leadingto higherswitchingpowerdissipationandhigherpropagationdelay.
B -B -A -C
A
-A
-C C
SUM
B
-B
A -A
CARRY
Figure 5.2. Complementarypass-transistorfull adder
D R A F T July 12, 2002, 2:00pm D R A F T
76 DESIGNINGCMOSCIRCUITSFORLOW POWER
5.3.3 DoublePass-Transistor Logic - DPL
DPL [6] is a modifiedversionof CPL.Thecircuit diagramof theDPL fulladderisgivenin Figure5.3. In DPLcircuitsfull-swingoperationisachievedbysimplyaddingpMOSFETtransistorsin parallelwith thenMOSFETtransistors.Hence,theproblemsof noisemargin andspeeddegradationat reducedsupplyvoltageswhicharecausedin CPLcircuitsdueto thereducedhighvoltagelevel,areavoided. However, the additionof pMOSFETsresultsin increasedinputcapacitances.
A
-A
B
-B C
-C SUM
C
-C CARRY
A
B
-B
-A
Figure5.3. Doublepass-transistorfull adder
5.3.4 Static Differ ential CascodeVoltageSwitch Logic -SDCVSL
StaticDCVSL[7], isadifferentialstyleof logic requiringbothtrueandcom-plementarysignalstoberoutedtogates.Figure5.4showsthecircuitdiagramofthestaticDCVSL full adder. Two complementarynMOSFETswitchingtreesareconstructedtoapairof cross-coupledpMOSFETtransistors.Dependingon
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 77
thedifferentialinputsoneof theoutputsis pulleddown by thecorrespondingnMOSFETnetwork. Thedifferentialoutputisthenlatchedbythecross-coupledpMOSFETtransistors.Sincetheinputsdriveonly thenMOSFETtransistorsoftheswitchingtrees,theinputcapacitanceis typically two or threetimessmallerthanthatof theconventionalstaticCMOSlogic.
B vC
A wB v
C -C
-A
-B
-B
-C
-CARRY CARRY A
B
C -C
B
A w-A
-B
-SUM SUM
Figure 5.4. Staticdifferentialcascodevoltageswitchfull adder
5.3.5 Static Differ ential Split-level Logic - SDSL
A variationof thedifferential logic describedabove is the StaticDSL [9].TheSDSLfull addercircuit diagramis illustratedin Figure5.5. Two nMOS-FET transistorswith their gatesconnectedto a referencevoltage ( x�y z {}|~ xB� � � � ����x�� � � , x�� � � : nMOSFETthresholdvoltage)areaddedto reducethelogic swingattheoutputnodes.Theoutputnodesareclampedatthehalf of thesupplyvoltagelevel. Thus,thecircuit operationbecomesfasterthanstandardDCVSL circuits. However, dueto theincompleteturn-off of thecross-coupledpMOSFETtransistors,SDSL circuits dissipatehigh staticpower dissipation.Also, theadditionof two extra nMOSFETtransistorspergateresultsin areaoverhead.
D R A F T July 12, 2002, 2:00pm D R A F T
78 DESIGNINGCMOSCIRCUITSFORLOW POWER
B �C
A �B C -C
-A
-B
-B
-C
- CARRY CARRY
A
B
C -C
B
A -A
-B
Vref �
-SUM SUM
Vref
Figure 5.5. Staticdifferentialsplit-level full adder
5.3.6 Dual-Rail Domino Logic - DRDL
Dual-Rail Domino Logic [10,11] is a precharged circuit techniquewhichis usedto improve the speedof CMOS circuits. Figure 5.6 shows a Dual-Rail Domino full addercell. A domino gateconsistsof a dynamicCMOScircuit followed by a staticCMOS buffer. The dynamiccircuit consistsof apMOSFETprecharge transistorandan nMOSFETevaluationtransistorwiththe clock signal (CLK) appliedto their gatenodes,andan nMOSFETlogicblock which implementsthe requiredlogic function. During the prechargephase(CLK = 0) the outputnodeof the dynamiccircuit is charged throughtheprechargedpMOSFETtransistorto thesupplyvoltagelevel. Theoutputofthe staticbuffer is discharged to ground. During the evaluationphase(CLK= 1) the evaluationnMOSFETtransistoris ON, anddependingon the logicperformedby the nMOSFETlogic block, the output of the dynamiccircuitis eitherdischargedor it will stayprecharged. Sincein dynamiclogic everyoutputnodemustbeprechargedeveryclockcycle,somenodesareprechargedonly to be immediatelydischargedagainasthe nodeis evaluated,leadingtohigherswitchingpower dissipation[1]. Onemajoradvantageof thedynamic,prechargeddesignstylesoverthestaticstylesis thatthey eliminatethespurioustransitionsandthecorrespondingpowerdissipation.Also, dynamiclogic doesnotsuffersfromshort-circuitcurrentswhichflow in staticcircuitswhenadirect
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 79
pathfrom power supplyto groundis caused.However, in dynamiccircuits,additionalpower is dissipatedby thedistributionnetwork andthedriversof theclocksignal.
C
A B �
B
A
CLK
CARRY
C
A
B -B
A
-B
-A
B
-C
CLK
SUM
CLK
CLK
Figure 5.6. Dual-rail dominofull adder
5.3.7 Dynamic Differ ential CascodeVoltageSwitch Logic- DDCVSL
DynamicDCVSL [8], is a combinationbetweenthedominologic andthestaticDCVSL. Thecircuit diagramof thedynamicDCVSL full adderis givenin Figure5.7. Theadvantageof this style over dominologic is theability togenerateany logic function. Dominologiccanonlygeneratenoninvertedformsof logic. For example,in thedesignof a ripple carryadder, two cellsmustbedesignedfor the carry propagation,onefor the true carry signalandanotherfor thecomplementaryone(in Figure5.6, thecell for the truecarrysignalisonly shown, but theonefor thecomplementarysignalis alsorequired).UsingDCVSL to designdynamiccircuitswill eliminatep-logic gatesbecauseof theinherentavailability of complementarysignals.Thep-logicgatesusuallycauselongdelaytimesandconsumeslargeareas.
D R A F T July 12, 2002, 2:00pm D R A F T
80 DESIGNINGCMOSCIRCUITSFORLOW POWER
B �C
A �B C -C
-A
-B
-B
-C
CARRY -CARRY
A
B �
C -C
B �A
-A
-B
-SUM SUM
CLK
CLK
CLK
CLK
Figure 5.7. Dynamicdifferentialcascodevoltageswitchfull adder
5.3.8 Enable/disabledCMOS Differ ential Logic- ECDL
ECDL [12] is a self-timeddifferential logic which is usedin the caseofimplementinglogic functionsusingiterative networks. It usesextra signalstoindicatethebeginningandendingof a functionevaluation,in orderto improvethecircuit speed.Thestructureof theECDL full adderis illustratedin Figure5.8. ThesignalsDonei � 1 andDonei aretheinputandoutputself-timingcontrolsignals. During the disabledstate,Donei � 1 hasa valueof logic one,whichdischargesboththetrueandthecomplementaryoutputsto logic zero. Duringtheenabledstate,Donei � 1 changesto logic zeroandthetopmostpMOSFETtransistor(Figure5.8) is ON to provide power to the invertersbelow. Then,dependingonthelogicof thedifferentialnMOSFETnetwork,apathexistsfromoneof theoutputnodestoground,holdingthatnodetogroundwhile leaving theotheroutputnodeto bedrivento logic one.Onemajoradvantageof theECDLcircuitsis thatthereis nominimumclockingfrequency requirement.However,ECDL circuitssuffer from extra power dissipationdueto the inverterswhichareneededto changethe polarity of the outputnodes. Also, their complexpull-upcircuitry leadsin extrasiliconarea.
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 81
B
C
A
B C -C
-A
-B
-B
-C
-CARRY CARRY
A
B �
C -C
B
A -A
-B
-SUM SUM M
Done i-1
Done i
Done i-1
Done i-1
Done i-1
Done i-1
Figure 5.8. Enable/DisableCMOSdifferentialfull adder
5.4 Power, Delay and Ar eaComparisonsof a 4-Bit RippleCarry Adder
Theexperimentalresultsdescribedin thissectionwereobtainedusingafour-bit ripple carry adder. A generalblock diagramof the adderis illustratedinFigure5.9.
FA �
0
A �0 B 0
C in
S 0
FA 1
B �1
S 1
FA 2
B 2
S 2
A 1 A �
2
FA 3
B �
3
S 3
A 3
C 1 C 2 C 3 C out
Figure 5.9. Block diagramof thefour-bit ripplecarryadder
D R A F T July 12, 2002, 2:00pm D R A F T
82 DESIGNINGCMOSCIRCUITSFORLOW POWER
The circuit wasdesignedin a full custommannerfor all the designstylesdescribedin the previous section,usinga 1.5� m CMOS processtechnology.Thechannelwidth of thetransistorswas4.8� m for thenMOSFETs,and9.6� mfor thepMOSFETs.Thedesignwasbasedon thefull addercellspresentedinFigures5.1 to 5.8.
Figure5.10shows thelayoutof theconventionalstaticfour-bit ripple carryadder, asanexampleof thedesignedcircuits.
Figure 5.10. Layoutof theconventionalstaticfour-bit ripplecarryadder
In Table5.1 theaddersilicon areaandthenumberof the transistorsfor eachdesignstylearegiven. Althoughnoextensiveattemptsweremadeto minimizearea,the numberspresentedarea goodindicationof the relative areasof theeightadderimplementations,whichaccountnotonly for thetransistors,but forthe interconnectionsaswell. For example,eventhoughDPL adderhasfewertransistorsthantheCSL one,it haslongerinterconnections,which is reflectedby its largearea.Dynamicdesignstylesandstyleswhich usescontrolsignals(suchasECDL) occupy extra areafor theroutingof theclock andthecontrolsignals. The smallestareais occupiedby the CPL circuit, which hasfewertransistorsandshorterinterconnectionsthantheotheradderimplementations.
After the designof the layouts, circuit equivalentswere extractedfor adetailedcircuit simulationusing HSPICE[2] to obtain the power anddelaymeasurements.In our experiments,a supplyvoltageof 5Volts is used. Allmeasurementswereobtainedwith eachinputsuppliedthroughadriverconsist-ing of two minimum-sizedinvertersin series,andeachoutputnodedriving aminimum-sizedinverterload.
Theestimationof powerdissipationis adifficult problembecauseof its datadependency, andhasreceived a lot of attention[17]. Somedirect simulativepowerestimationmethodshavebeenproposed[18,19],whichareexpensive in
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 83
Table5.1. Areaandnumberof transistorsof thefour-bit ripplecarryadderimplementations
DesignStyle Adder Ar ea( � 10�L� m� ) No. of Transistors
CSL 5.42 144CPL 4.46 88DPL 6.52 136
SDCVSL 5.19 114SDSL 6.39 130DRDL 6.48 146
DDCVSL 7.22 154ECDL 7.65 166
termsof time. Also, several power estimationmethodshave beenproposed,wherepossibilitiesareusedtosolvethepattern-dependenceproblem. However,in ordertoachievegoodaccuracy, thespatialandtemporalcorrelationsbetweeninternalnodesshouldbe modeled[20,21]. An alternative way is the useofstatisticalmethods[22,23,17],thatcombinestheaccuracy of simulation-basedtechniqueswith thespeedof probabilisticapproaches.
In thischapter, thestatisticalapproachproposedby Burchetal. [22] is usedin ordertoestimatethepowerdissipationof ourdesigns.Usingthepowermetersub-circuitproposedby Kang [18], HSPICEcanmeasurethe averagepowerconsumedby a circuit given a setof input transitionsanda time interval. Inthemethod,theinputsarerandomlygeneratedandstatisticalmeanestimationtechniquesareusedto determinethe final result. In our casefor eachadderdesignwe use200 independent,pseudorandominput transitionsamples,andthepowerconsumedfor eachsampleis monitoredby HSPICE.All
simulationswerecarriedout at 27˚C,with aninput frequency of 50MHz inorderto accommodatetheslowestadder. Thepower dissipationmeasuresdonotincludethepowerconsumedbythedriversandtheloads.In Figure5.11,theprobabilitydistributionsof thepowerdissipationperadditionderivedfrom themeasurements,for theeightadderimplementations,areshown. Sincethedatainputsareindependent,power canbeapproximatedto benormallydistributed[22]. Thisconclusioncanalsobeextractedfrom thecurvesof Figure5.11.
Hence,themeanpower dissipationis givenby������ � � � ¡ ¢g£ (5.4)
where��
is thesampleaverage, is thestandarddeviation,¢
is thenumberofsamples,and
� � � � is obtainedfrom the� ¤
distribution for a (1-¥ )% confidence
D R A F T July 12, 2002, 2:00pm D R A F T
84 DESIGNINGCMOSCIRCUITSFORLOW POWER
interval [24]. Themeanpower dissipationof theeightadderimplementationsusingthesimulationresultsandtheequation(4) is givenin Table5.2.
Thenumberof therequiredsamplesis extractedusingthestoppingcriterion[22] of theabove method ¦ § ¨ ©�ª«¬@ ®�¯�° ± (5.5)
where° is thedesiredpercentageerrorin thepower estimate.Theerrorin ourstatisticalpoweranalysisfor
®= 200and95%confidenceinterval (
¦ § ¨ ©= 1.96)
is lessthan7%. In Table5.2,thepercentageerrorfor eachadderdesignis alsogiven. For the four last designsthe error is quite small becauseof the highnormalityof theirdistributionswhich leadsto smallstandarddeviation.
Figure 5.11. Powerdissipationhistograms
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 85
Thedelayof eachdesignwasmeasureddirectly from theoutputwaveformsgeneratedby simulatingtheadderusingHSPICEfor theworstcaseinputs,thatis, inputswhichcausethecarryto ripplefromtheleastsignificantbit positiontomostsignificantbit position.Theworstcasedelaysof theeightadderdesignsare listed in the fourth column of Table 5.2. As mentionedin Section5.1,themostessentialmetric of performancein modernVLSI applicationsis thepower-delayproduct.By multiplying eachpowermeasurementwith theworstcasedelay, wecanfoundthemeanpower-delayproductof thedesignsusingamethodsimilar to thatusedfor themeanpower dissipation.Hence,themeanpower-delayproductis givenby²´³¶µ¸·º¹ » ¼ ½¿¾À Ág (5.6)
Figure 5.12. Power-delayproducthistograms
where²L³¶µ
is thesampleaveragepower-delayproduct. Themeanpower-delayproductvaluesof theeightadderdesignsarelistedin Table5.2,andtheprobabilitydistributionsof thepower-delayproductareshown in Figure5.12.
D R A F T July 12, 2002, 2:00pm D R A F T
86 DESIGNINGCMOSCIRCUITSFORLOW POWER
Table5.2. Powerdissipation,delayandpower-delayproductof thefour- bit ripplecarryadderimplementations
AdderDesignStyle
Mean PowerDissipation peraddition (mW)
Statist.Err or(%)
Worst CaseDelay (nsec)
Mean Power-DelayProduct peraddition (pJ)
CSL 0.422 Ã 0.0302 6.1 6.125 2.585 Ã 0.1850CPL 0.238 Ã 0.0208 4.8 4.042 0.962 Ã 0.0841DPL 0.305 Ã 0.0263 6.9 3.345 1.020 Ã 0.0879SDCVSL 0.432 Ã 0.0362 6.5 7.986 3.450 Ã 0.2891SDSL 2.383 Ã 0.0129 0.6 4.606 10.976Ã 0.0594DRDL 0.641 Ã 0.0091 1.4 2.909 1.865 Ã 0.0265DDCVSL 0.957 Ã 0.0074 0.8 3.453 3.304 Ã 0.0255ECDL 1.721 Ã 0.0096 0.6 2.892 4.977 Ã 0.0278
As we canseein theprobabilitydistributionsof Figure5.12,thecurvesofthe dynamicdesigns(DRDL andDDCVSL) areshiftedto the right, becauseof the power dissipateddueto the precharge cycles. The samephenomenonoccursin theECDLadderdueto thepowerdissipationof itsdisabledstate.Theshiftingto therightof theSDSLaddercurveiscausedbecauseof thehighstaticpower which is dissipateddueto the incompleteturn-off of thecross-coupledpMOSFETtransistors.Theotherstaticdesignstylesaremorepower efficientcomparedto thedynamiccircuits.
ThestaticDCVSL circuit consumesmorepowerthantheconventionalstaticcircuit dueto thedifferenceof thecharging anddischarging timesof its outputnodes.Theasymmetryin the riseandfall timesof thepotentialat theseout-put nodeswill prolongtheperiodof currentflow throughthe latchduringthetransientstate,thusincreasingthepower dissipation.
It canbe obtainedfrom the resultsof Table5.2, that the dynamiccircuitsexhibit anincreasein speedcomparedto theconventionalstaticcircuit. Com-paringthedynamiclogic styles,Dominologic hasbetterpower-delayproductcharacteristics(Figure5.12).Thecircuitoperationin theSDSLcircuitbecomesfasterthanthestandardSDCVSLcircuit, dueto thereducedlogic swingat theoutputnodes,but in thecostof high staticpower dissipation.ECDL circuit isthe fasterone,but consumeshigh switchingpower dueto the inverterswhichareneededto changethepolarity of theoutputs.
Thedesignstyleswhichusepass-transistorlogic (CPLandDPL)arethebestin termsof powerdissipation.CPLcircuit consumeslowerpowerthantheDPLone,becauseof its lowerparasiticcapacitance.Onthecontrary, DPL circuit is
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 87
fasterthantheCPL, becausetheadditionof pMOSFETtransistorsin parallelwith thenMOSFETtransistorsresultsin highercircuit drivability. Also, DPLavoidstheproblemsof noisemargin andspeeddegradationat reducedsupplyvoltageswhich arecausedin CPL circuits. As shown in Figure5.12andinTable5.2, the two stylesexhibit similar power-delayproductcharacteristics,andthey arethemostefficient for low-power andhigh-speedapplications.
The meanpower dissipationandthepropagationdelayvaluesof theeightadderimplementationsaresummarizedin Figure5.13. Thefastaddercircuitslie to theleft of thefigure, andthosewith low power consumptionlie towardthebottomof thefigure.
Delay (ns)
P Ä o w
e r Å D Æ i s
s i Ç p a
t i o n
È ( m W )
SDSL
ECDL
DDCVSL É
DRDL
DPL CPL
CSL SDCVSL
2 3 4 5 6 7 8
0
0.5
1
1.5
2
2.5
Figure 5.13. Power dissipationversusdelayof theadderimplementations
5.5 Adders
In staticCMOSthedynamicpowerdissipationof acircuit dependsprimaryon thenumberof transitionsperunit area.As a result,theaveragenumberoflogic transitionsperadditioncanserveasthebasisof comparingtheefficiencyof a varietyof adderdesigns.If two addersrequireroughly thesameamountof timeandroughlythesamenumberof gates,thecircuit whichrequiresfewer
D R A F T July 12, 2002, 2:00pm D R A F T
88 DESIGNINGCMOSCIRCUITSFORLOW POWER
logic transitionsis moredesirableasit will requirelessdynamicpower. This isonly afirst orderapproximationasthepoweralsodependsonswitchingspeed,gatesize,fan-out,outputloadinge.t.c.
Thefollowing typesof addersweresimulated:RippleCarry, ConstantBlockWidth Single-level Carry Skip, VariableBlock Width Multi-level Carry skip,CarryLookahead,CarrySelect,andConditionalSum. Table5.3 presentstheworstcasenumberof gatedelays,thenumberof gates,andtheaveragenumberof logic transitionsfor thesix 16-bit addertypes.All thegatesareassumedtohave thesamedelay, regardlessof thefan-inor fan-out.
Table5.3. WorstCaseDelay, Numberof Gates,andAverageNumberof Logic Transitionsfora 16-bitAdder
Adder Type Worst CaseDelay (ingatesunits)
Number ofGates
AverageNumberof logic Transi-tions
RippleCarry 36 144 90Constant BlockWidth Single-levelCarrySkip
23 156 102
Variable BlockWidth Multi-levelCarryskip
17 170 108
CarryLookahead 10 200 100CarrySelect 14 284 161ConditionalSum 12 368 218
5.6 Multipliers
Themajority of thereal life applications,suchasmicroprocessorsanddig-ital processingimplementations,requirethecomputationof themultiplicationoperation. Specifically, speed,areaandpower efficient implementationof amultiplier is averychallengingproblem.Here,four well-known multipliers: i)theArray Multiplier [25], ii) theSplit Array Multiplier [26], iii) WallaceTreeMultiplier [27] and iv) the Radix-4 Modified Booth RecodedWallaceTreeMultipliers [28], arestudiedin termsof power consumption.
Twokindsof measurementsandcomparisonsin thetermsof differentdesignparametersareperformedproviding to thedesigneraplethoraof alternativeim-plementations.Particularly, weprovideSPICE-likemeasurementswith respectto theaveragelogic transitionsaswell asthepower consumption.Thesecond
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 89
kind of measurementsareperformedfollowing a typical high-level analysisflow andis basedon astate-of-the-artCAD framework.
5.6.1 SPICE-likePower Measurementsof Multipliers
The multipliers weredescribedusingonly AND, OR, andINVERT gates.Thesimulationwasmadeusinga programcalledCazM[29], which is similarto SPICE.Eachmultiplier wasfed with 1.000pseudorandominputs. For thesake of completeness,the carry save arraymultiplier andthe Wallacetreeispresentedin thefollowing section,in orderto briefly describethearchitecturesof themostcommonmultipliers. The former is a representative paradigmofarraymultipliers, while the Wallacetree is an efficient way to addmultiplepartialproductstogether.
A gatelevel simulationwith 10.000pseudorandominputs,enabledthegath-eringof averagenumberof gate-outputtransitionsfor eachmultiplier. Duringeachinput, the numberof gatesthat switch outputstatesis recorded,andanaveragenumberof gate-outputtransitionsper multiplicationarecomputedattheendof thesimulation.Table5.4presentstheresults.
Table5.4. AverageNumberof Gate-OutputTransistions
AverageNumber of Gate-Output Transitions
Multiplier Type 8-bit 16-bit 32-bitArray 570 7224 99906Split array 569 4874 52221Wallace 549 3793 20055Modified booth 964 3993 19542
The averagepower dissipationper multiplication is shown in Table 5.5.Theseresultswereobtainedby simulatingthemultiplicationof 1000pseudo-randominputswith a clock periodof 100 ns. The resultsvary significantly.TheWallacemultiplier, which presentsthelower power dissipation,is neitherthesmallestnor theslowestone.
5.6.2 High-Level Power Characterization of Multipliers
Thesecondpowerestimationprocedureis illustratedin Figure5.14.Thefirststepis thelogicsynthesisof theparameterizedandstructuralVHDL descriptionof the arithmeticmodules. Here, a 0.6-micronprocess,AMS standard-celllibrary hasbeenused. For power characterization,only the dynamicpowerdissipation,which formsthedominantcomponentof the total power, is taken
D R A F T July 12, 2002, 2:00pm D R A F T
90 DESIGNINGCMOSCIRCUITSFORLOW POWER
Table5.5. AvaragePower Dissipationfrom CAzM
Multiplier Type Power (mW) Logic Transitions
Array 43.5 7224Split array 38.0 4874Wallace 32.0 3793Modified booth 41.3 3993
into account[30]. Specifically, theactivity pernode,resultingvia logic-levelsimulationthat takesplacein a secondstep,is combinedwith thecapacitancepernode,to computethepower for acertaininputvector, accordingto
Ê(Ë Ì7Í ÎÐÏÒÑÓ Ô Õ/Ö/×Ø Ù Ú Û ÜÝ ÞÛ Û+ß7àÔ
(5.7)
where × Ø Ù Ú Û Ü is thecapacitanceat nodeá , Ý Û Û is thepower supplyvoltage, ß isthefrequency and àÔis theactivity factorat nodeá . Theterm f â EÔ of Eq. 5.7
is actuallythenumberof transitionsfrom logic ‘1’ to logic ‘0’ pertimeunit forthenodeá , which is equalto theratioof numberof nodetransitionsfrom logic‘1’ to logic ‘0’, dividedby thetotalnumberof input vectors:
ß â àÔ Ï ß Ö ã"ä Ïæå(ç
Î è é+ê Ö ã(ä Üå(ë Í ì ç Ë Î ê (5.8)
FromEq. 5.7and5.8,thepower is:
Ê(Ë Ì7Í ÎÐÏ Ý Þí�íå(ë Í ì ç Ë Î ê ÑÓ Ô Õ+Ö × Ø Ù Ú Û Ü å(ç Î è é+ê Ö ã(ä Ü (5.9)
Following this procedure,the power estimationerrorsare in the rangeof10-25%[31], comparedwith SPICEtransistor-level simulator. However, theaccuracy of the estimatessuffices for the purposeof comparingalternativemodulearchitectures,sinceits relative evaluationis of importanceandnot theabsoluteaccuracy. The8-bit wide input modulesweresimulatedwith 50.000randomvectors,the16-bit moduleswith 100.000vectors,the32-bit moduleswith 150.000randomvectors,andthe64-bitmoduleswith 200.000vectors.Itshouldbestressedhere,that theenergyfigures,givenlaterin thecharacterizationsectionsof the arithmeticcomponents,correspondto the averageenergy peroperation.Thedifferenceof thetwo characterizationproceduresin thenumberof testvectorsis significant.Inthissection,powermeasuresfor thesynthesizedmultipliers,namelythecarrysave array, theBoothencodedWallacetreeand
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 91
Synthesis
Parameters Set
Logic Level îDescription ïLogic Level
Simulation
Switching Activity per node ð
Delay ï
Estimation
Power Estimation ñ
Parametrical VHDL
Description ï
Capacitance per Node ò
Target ó
Technology
Target Technology
Area ð
Estimation
Figure 5.14. High level CharacterizationFlow
D R A F T July 12, 2002, 2:00pm D R A F T
92 DESIGNINGCMOSCIRCUITSFORLOW POWER
thenon-BoothencodedWallacetree,will bepresented,whileanalysisof resultsandcomparisonswith previouswork aremade.
Table5.6presentsthepower estimationof thesynthesizedmultipliers. Themeasurementsof power are normalizedby frequency, reflectingthe fact ofsimulation,with differentoperatingfrequencies.In this way, a representativepowermeasureis given,for everykind of multiplierandfor everybit width. Asit is shown in Table5.6,themostpower-efficientmultiplier for smallbit-widths(lessthan32bits) is thecarrysavearray, but with asmalldifference,comparedto theWallacetreewith non-Boothencoding.In the64-bit implementations,theWallacetreewith non-Boothencodingmultiplier is themostpowerefficientchoice.Thisfactisexplainedby theglitchesarisingfromtherippleof carriesofthearraymultiplier for largebit-widths. Finally, for all bit-widths,theWallacetreewith Boothencodingmultiplier hastheworstpower dissipation.
Table5.6. Powerdissipationestimates
Power (mW)
Multiplier Type/Multiplier Width
8-bit 16-bit 32-bit 64-bit
CarrySaveArray 0.3084 2.2484 3.057023 20.96759WallaceTree(BoothEncoded)
0.7868 3.1204 5.384 23.7164
WallaceTree(Non BoothEncoded)
0.5488 2.5992 4.2212 18.8588
Forcomparisonpurposes,Table5.7showstheresultspresentedin [29], con-sidering16-bit implementationsof arrayandWallacetreemultipliers. Thesedesignsweredescribedonly by AND, OR,andINVERT gates.Theimplemen-tation technologywasa 2-level metal2-ô m process.It canbe seenthat thearraymultiplier consumesmoreenergy thantheWallacetreemultiplier, whichis contradictoryto correspondingvaluesshown in Table5.6. Only in the64-bitcase,thearraymultiplier consumesmoreenergy thantheWallacetreemulti-plier. Thereasonfor this is thespurioustransitionsthatoccurby theripplingof carriesfor thearraymultiplier of the64-bit implementation,which cannotcompensatetheinterconnectarea/capacitanceswitchedby thearraymultiplier.For theremainingcases,thefactorsof thegreaterinterconnectandcell areafortheWallacemultiplier dominatethepower performanceof thesetwo kindsofmultipliers.
D R A F T July 12, 2002, 2:00pm D R A F T
Circuit TechniquesforReducingPowerConsumptioninAddersandMultipliers 93
Table5.7. 16-bitMultiplier AveragePowerDissipation[29]
Multiplier Power (mW) Logic transitions
CarrySaveArray 43.5 7224WallaceTree 32 3793
TheWallacemultiplierwith Boothencodingdissipatesthemost power, whileit is not thelargest.It is thefastestmultiplier for bit-widthslargerthan16 bitsandcanbeassumedthatBoothencodingis a ratherpower-hungryoperation.
Finally, Table5.8 depictsthe Power õ Delay product of the multipliers at1MHz frequency. More specifically, the carry save arraymultiplier exhibitsthe worst productfor all bit-widths, except the 8-bit, due to the large delayandthesubstantialpower consumptionAlthoughthenon-BoothWallacetreemultiplier is the largest, it shows the bestPowerõ Delay product for everybit-width. Wherespeedis of greatinterest,especiallyif large bit widths arerequired,andchip areais not a problem,thenon-BoothencodedWallacetreemultiplier is thebestcandidatefor selection.
Table5.8. Power-Delayproductof Multipliers at1MHz
Power*Delay (mW*ns)Multiplier Type/Multiplier Width
8-bit 16-bit 32-bit 64-bit
CarrySave Array 4,13256 53,51192 141,9987 2090,049WallaceTree(BoothEncoded)
4,980444 25,96173 58,09336 312,345
WallaceTree(NonBoothEncoded)
3,172064 19,8059 44,95578 251,9536
5.7 Conclusions
In this chapter, themostcommonkindsof addersandmultipliershavebeencharacterizedin termsof power, usingeitheratraditionallow-level designflowparadigm,which is rathertediousandincompatiblewith moderndesignflows,but providesthe mostaccurateresults,or a high-level designflow paradigm,which is commonlyused.
D R A F T July 12, 2002, 2:00pm D R A F T
94 DESIGNINGCMOSCIRCUITSFORLOW POWER
A four-bit ripplecarryadderwasused,asthebenchmarkcircuit. All thecir-cuitshavebeendesignedinafull-custommanner, andsimulatedusingHSPICE.A statisticalapproachwasusedin orderto analyzethe simulationresults. Ithasbeenshown thatthecircuitswhichusepass-transistorlogic (CPLandDPL)exhibit betterpower andthepower-delayproductcharacteristicscomparedtootherdesignstyles.
Thearraymultiplier is power-efficient for smallbit widths. Its power con-sumptiongrows in proportionto thecubeof theword size. TheWallacemul-tiplier is lessregular, but is morepower efficient, while its power dissipationgrows with thesquareof thewordsize.
The speedof the synthesizedoptimizedcarry look-aheadis traded-off fortheworstenergy powerconsumptionamongall theinvestigatedadders,whichhave beensynthesized.As opposedto thespeedoptimizedarchitectureof thecarrylook-aheadadder, anon-optimizedarchitectureispower-efficient,thoughmuchslower. Power-efficient is theripple carryadder, too.
References
[1] A. Chandrakasan,R. Brodersen,Low PowerDigital Design,Kluwer Aca-demicPublishers,1995.
[2] Meta-Software,HSPICEUser’s Manual- Version96.1,1996.
[3] K. Yano,Y. Sasaki,K. Rikino, K. Seki,“Top-Down Pass-TransistorLogicDesign”.IEEEJournalof Solid-StateCircuits,vol.31,pp.792-803.1996
[4] MIPSTechnologies,“R4200MicroprocessorProductInformation”,MIPSTechnologiesInc., 1994
[5] K. Yano, T. Yamanaka,T. Nishida, M Saito, K. Shimohigashi,A.Shimizu, “A 3.8-nsCMOS 16ö 16-b Multiplier Using ComplementaryPass-TransistorLogic”, IEEE Journalof Solid-StateCircuits, vol.25, pp.388-395,1990.
[6] M. Suzuki,N. Ohkubo,T. Shinbo,T. Yamanaka,A. Shimizu,K. Sasaki,Y.Nakagome,, 1993,“A 1.5-ns32-bCMOSALU in DoublePass-TransistorLogic”, IEEEJournalof Solid-StateCircuits,vol.28,pp.1145-1151,1993.
[7] L. Heller, W. Griffin, J.Davis, N. Thoma,“CascodeVoltageSwitchLogic:A DifferentialCMOS Logic Family”, Proceedingsof IEEE InternationalSolid-StateCircuit Conference,pp.16-17,1984.
[8] K. Chu,D. Pulfrey, “DesignProceduresfor DifferentialCascodeVoltageSwitch Circuits”, IEEE Journalof Solid-StateCircuits,vol.21,pp. 1082-1087,1986.
[9] L. Pfennings,W. Mol, J. Bastiaens,J. Van Dirk, “Dif ferentialSplit-levelCMOS Logic for SubnanosecondSpeeds”,IEEE Journalof Solid-StateCircuits,vol.20,pp.1050-1055,1985.
D R A F T July 12, 2002, 2:00pm D R A F T
REFERENCES 95
[10] R. Krambeck, C. Lee, H Law, “High-SpeedCompactCircuits withCMOS”, IEEEJournalof Solid-StateCircuits,vol.17,pp.614-619,1982.
[11] V. Oklobdzija,R. Montoye, “Design-PerformanceTrade-offs in CMOS-DominoLogic”, IEEEJournalof Solid-StateCircuits,vol.21,pp.304-309,1986.
[12] S.Lu, S.,“Implementationof IterativeNetworkswith CMOSDifferentialLogic”, IEEEJournalof Solid-StateCircuits,vol.23,pp.1013-1017,1988.
[13] L. Bisdounis,O. Koufopavlou, S. Nikolaidis, “AccurateEvaluationofCMOSShort-CircuitPowerDissipationfor Short-ChannelDevices”,Pro-ceedingsof IEEEInternationalSymposiumonLow PowerElectronicsandDesign,pp.189-192,1996.
[14] A. Bellaouar, M. Elmasry, Low-PowerDigital VLSI Design:CircuitsandSystems,Kluwer AcademicPublishers,1995.
[15] T. Sakurai,A. Newton, “Alpha-Power Law MOSFETModel andits Ap-plicationsto CMOSInverterDelayandOtherFormulas”,IEEE JournalofSolid-StateCircuits,vol.25,pp.584-594,1990.
[16] S.Sun,P. Tsui, “Limitation of CMOSSupply-VoltageScalingby MOS-FETThreshold-Voltage”,IEEEJournalof Solid-StateCircuits,vol.30,pp.947-949,1995
[17] F. Najm, “A Survey of Power EstimationTechniquesin VLSI Circuits”,IEEE Transactionson VLSI Systems,vol.2, pp.446-455,1994
[18] S. Kang,“AccurateSimulationof Power Dissipationin VLSI Circuits”,IEEE Journalof Solid-StateCircuits,vol.21,pp.889-891,1986
[19] G.Yacoub,W. Ku,“An EnhancedTechniquefor SimulatingShort-CircuitPower Dissipation”IEEE Journalof Solid-StateCircuits,vol.24,pp.844-847,1989.
[20] S. Devadas,K. Keutzer, J. White, J., “Estimationof Power Dissipationin CMOSCombinationalCircuitsusingBooleanFunctionManipulation”,IEEE Transactionson Computer-Aided Designof IntegratedCircuitsandSystems,vol.11,pp.373-383,1992.
[21] P. Schneider, U. Schlichtmann,B. Wurth,“FastPowerEstimationof LargeCircuits”, IEEE DesignandTestof ComputersMagazine,vol.13, 70-78,1996.
[22] R.Burch,F Najm,P. Yang,T. Trick, “A MonteCarloApproachfor PowerEstimation”,IEEE TransactionsonVLSI Systems,vol.1, 63-71,1993.
[23] M. Xakellis, F. Najm,“StatisticalEstimationof theSwitchingActivity inDigital Circuits”, Proceedingsof ACM/IEEE DesignAutomationConfer-ence,728-733,1994.
D R A F T July 12, 2002, 2:00pm D R A F T
96 DESIGNINGCMOSCIRCUITSFORLOW POWER
[24] I. Miller, J.Freund,R. Johnson,ProbabilityandStatisticsfor Engineers,PrenticeHall, 1990.
[25] A.D. Pezaris,“A 40ns17-bit by 17-bit arraymultiplier” in IEEE Trans.On Computers,vol. C-20,1971,pp.442-447.
[26] J. Iwamura,K. Siganuma,M. Kimura, andS. Taguchi,"A CMOS/SOSmultiplier," in Proceedingsof ISSCC,1984,pp.92-93.
[27] C.S. Wallace,“A suggestionfor a fast multiplier,” in IEEE Trans.OnElectronicComputers,vol. EC-13,1964,pp.14-17.
[28] O.L. MacSorley, “High-speedarithmeticin binaryadders,” IRE Proceed-ingsvol.EC-11,June1962,pp.67-91.
[29] T. K. Callaway and E. E. Swartzlander, "The Power ConsumptionofCMOSAddersandMultipliers," Low-Power CMOSDesign,IEEE Press,pp.218-224,(invitedpaper).
[30] J.Rabaey,Digital IntegratedCircuits:A DesignPerspective,PrenticeHall,1996.
[31] Power CompilerUserGuide,SynopsysOnlineDocumentation(SOLD),v1999.10
D R A F T July 12, 2002, 2:00pm D R A F T