asics the course book
8/9/2019 ASICs the Course Book
1/505
ASICs... the course
Michael John Sebastian Smith
This course is based on ASICs... the book
Application-Specific Integrated Circuits
Michael J. S. Smith
VLSI Design Series
1,040 pages
ISBN 0-201-50022-1
LOC TK7874.6.S63
Addison Wesley Longman, http://www.awl.com
Additional material (figures, resources, source code) is located at ASICs... the website
http://spectra.eng.hawaii.edu/~msmith/ASICs/HTML/ASICs.htm
Some material in this work is reprinted from IEEE Std 1149.1-1990, “IEEE Standard Test Access Port and Boundary-Scan Architecture,” Copyright © 1990; IEEE Std 1076/INT-1991 “IEEE Standards Interpretations: IEEE Std 1076-1987, IEEE Standard VHDL Language Reference Manual,” Copyright © 1991; IEEE Std 1076-1993 “IEEE Standard VHDL Language Reference Manual,” Copyright © 1993; IEEE Std 1164-1993 “IEEE Standard Multivalue Logic System for VHDL Model Interoperability (Std_logic_1164),” Copyright © 1993; IEEE Std 1149.1b-1994 “Supplement to IEEE Std 1149.1-1990, IEEE Standard Test Access Port and Boundary-Scan Architecture,” Copyright © 1994; IEEE Std 1076.4-1995 “IEEE Standard for VITAL Application-Specific Integrated Circuit (ASIC) Modeling Specification,” Copyright © 1995; IEEE 1364-1995 “IEEE Standard Description Language Based on the Verilog® Hardware Description Language,” Copyright © 1995; and IEEE Std 1076.3-1997 “IEEE Standard for VHDL Synthesis Packages,” Copyright © 1997; by the Institute of Electrical and Electronics Engineers, Inc. The IEEE disclaims any responsibility or liability resulting from the placement and use in the described manner. Information is reprinted with the permission of the IEEE. Figures describing Xilinx FPGAs are courtesy of Xilinx, Inc. © Xilinx, Inc. 1996, 1997, 1998. All rights reserved. Figures describing Altera CPLDs are courtesy of Altera Corporation. Altera is a trademark and service mark of Altera Corporation in the United States and other countries. Altera products are the intellectual property of Altera Corporation and are protected by copyright laws and one or more U.S. and foreign patents and patent applications. Figures describing Actel FPGAs are courtesy of Actel Corporation.
The programs and applications presented in this work have been included for their instructional value. They have been tested with care but are not guaranteed for any particular purpose. The author does not offer any warranties, representations, or accept any liabilities with respect to the programs or applications.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this work, and the author was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Figures copyright © 1997 by Addison Wesley Longman, Inc. Text copyright © 1997, 1998 by Michael John Sebastian Smith.
ASICs...THE COURSE (1 WEEK)
1 INTRODUCTION TO ASICs
An ASIC (“a-sick”) is an application-specific integrated circuit
A gate equivalent is a NAND gate F = (A • B)' (IBM uses a NOR gate), or four transistors
History of integration: small-scale integration (SSI, ~10 gates per chip, 60’s), medium-scale integration (MSI, ~100–1000 gates per chip, 70’s), large-scale integration (LSI, ~1000–10,000 gates per chip, 80’s), very large-scale integration (VLSI, ~10,000–100,000 gates per chip, 90’s), ultralarge scale integration (ULSI, ~1M–10M gates per chip)
History of technology: bipolar technology and transistor–transistor logic (TTL) preceded metal-oxide-silicon (MOS) technology because it was difficult to make metal-gate n-channel MOS (nMOS or NMOS); the introduction of complementary MOS (CMOS, never cMOS) greatly reduced power
The feature size is the smallest shape you can make on a chip and is measured in λ or lambda
Origin of ASICs: the standard parts, initially used to design microelectronic systems, were gradually replaced with a combination of glue logic, custom ICs, dynamic random-access memory (DRAM) and static RAM (SRAM)
History of ASICs: The IEEE Custom Integrated Circuits Conference (CICC) and IEEE International ASIC Conference document the development of ASICs
Application-specific standard products (ASSPs) are a cross between standard parts and ASICs
1.1 Types of ASICs
ICs are made on a wafer. Circuits are built up with successive mask layers. The number of masks used to define the interconnect and other layers is different between full-custom ICs and programmable ASICs
Key concepts: The difference between full-custom and semicustom ASICs • The difference between standard-cell, gate-array, and programmable ASICs • ASIC design flow • Design economics • ASIC cell library
1.1.1 Full-Custom ASICs
All mask layers are customized in a full-custom ASIC.
It only makes sense to design a full-custom IC if there are no libraries available.
Full-custom offers the highest performance and lowest part cost (smallest die size) with the disadvantages of increased design time, complexity, design expense, and highest risk.
Microprocessors were exclusively full-custom, but designers are increasingly turning to semicustom ASIC techniques in this area too.
Other examples of full-custom ICs or ASICs are requirements for high-voltage (automobile),analog/digital (communications), or sensors and actuators.
1.1.2 Standard-Cell–Based ASICs
In datapath (DP) logic we may use a datapath compiler and a datapath library. Cells such as arithmetic and logical units (ALUs) are pitch-matched to each other to improve timing and density.
A silicon chip or integrated circuit (IC) is more properly called a die
A cell-based ASIC (CBIC, pronounced “sea-bick”)
• Standard cells
• Possibly megacells, megafunctions, full-custom blocks, system-level macros (SLMs), fixed blocks, cores, or Functional Standard Blocks (FSBs)
• All mask layers are customized: transistors and interconnect
• Custom blocks can be embedded
• Manufacturing lead time is about eight weeks.
[Figure: A cell-based ASIC. (a) The silicon die, about 0.1 inch across. (b) Expanded view, about 0.02 in (500 µm) across, showing the standard-cell area and the fixed blocks.]
1.1.3 Gate-Array–Based ASICs
A gate array, masked gate array, MGA, or prediffused array uses macros (books) to reduce turnaround time and comprises a base array made from a base cell or primitive cell. There are three types:
• Channeled gate arrays
• Channelless gate arrays
• Structured gate arrays
[Figure: Looking down on the layout of a standard cell from a standard-cell library. Labels: n-well, p-well, ndiff, pdiff, poly, contact, via, m1, metal2, VDD, GND, inputs A1 and B1, output Z, cell bounding box (BB), cell abutment box (AB), and a 10λ dimension.]
Routing a CBIC (cell-based IC)
• A “wall” of standard cells forms a flexible block
• metal2 may be used in a feedthrough cell to cross over cell rows that use metal1 for wiring
• Other wiring cells: spacer cells, row-end cells, and power cells
A note on the use of hyphens and dashes in the spelling (orthography) of compound nouns: Be
careful to distinguish between a “high-school girl” (a girl of high-school age) and a “high school
girl” (is she on drugs or perhaps very tall?).
We write “channeled gate array,” but “channeled gate-array architecture” because the gate
array is channeled; it is not “channeled-gate array architecture” (which is an array of chan-
neled-gates) or “channeled gate array architecture” (which is ambiguous).
We write gate-array–based ASICs (with an en dash between array and based) to mean (gate array)-based ASICs.
[Figure: Expanded view of part of flexible block 1, showing rows of standard cells (cells A.11, A.14, A.23, A.132), terminals, a feedthrough, spacer cells, row-end cells, and a metal2 power cell, with metal1 and metal2 VDD and VSS wiring to the power pads. Dimensions shown: 250 λ and 50 λ.]
1.1.4 Channeled Gate Array
A channeled gate array
• Only the interconnect is customized
• The interconnect uses predefined spaces between rows of base cells
• Manufacturing lead time is between two days and two weeks
[Figure: A channeled gate array: an array of base cells (not all shown).]
1.1.5 Channelless Gate Array
A channelless gate array (channel-free gate array, sea-of-gates array, or SOG array)
• Only some (the top few) mask layers are customized: the interconnect
• Manufacturing lead time is between two days and two weeks.
[Figure: A channelless gate array: an array of base cells (not all shown).]
1.1.6 Structured Gate Array
An embedded gate array or structured gate array (masterslice or masterimage)
• Only the interconnect is customized
• Custom blocks (the same for each design) can be embedded
• Manufacturing lead time is between two days and two weeks.
[Figure: A structured gate array: an embedded block and an array of base cells (not all shown).]
1.1.7 Programmable Logic Devices
A programmable logic device (PLD)
• No customized mask layers or logic cells
• Fast design turnaround
• A single large block of programmable interconnect
• A matrix of logic macrocells that usually consist of programmable array logic followed by a flip-flop or latch
[Figure: A PLD: macrocells and programmable interconnect.]
Examples and types of PLDs: read-only memory (ROM) • programmable ROM or PROM • electrically programmable ROM, or EPROM • An erasable PLD (EPLD) • electrically erasable PROM, or EEPROM • UV-erasable PROM, or UVPROM • mask-programmable ROM • A mask-programmed PLD usually uses bipolar technology
Logic arrays may be either a Programmable Array Logic (PAL®, a registered trademark of AMD) or a programmable logic array (PLA); both have an AND plane and an OR plane
1.1.8 Field-Programmable Gate Arrays
A field-programmable gate array (FPGA) or complex PLD
• None of the mask layers are customized
• A method for programming the basic logic cells and the interconnect
• The core is a regular array of programmable basic logic cells that can implement combinational as well as sequential logic (flip-flops)
• A matrix of programmable interconnect surrounds the basic logic cells
• Programmable I/O cells surround the core
• Design turnaround is a few hours
[Figure: An FPGA: programmable basic logic cells and programmable interconnect.]
1.2 Design Flow
A design flow is a sequence of steps to design an ASIC
1. Design entry. Using a hardware description language (HDL) or schematic entry.
2. Logic synthesis. Produces a netlist: logic cells and their connections.
3. System partitioning. Divide a large system into ASIC-sized pieces.
4. Prelayout simulation. Check to see if the design functions correctly.
5. Floorplanning. Arrange the blocks of the netlist on the chip.
6. Placement. Decide the locations of cells in a block.
7. Routing. Make the connections between cells and blocks.
8. Extraction. Determine the resistance and capacitance of the interconnect.
9. Postlayout simulation. Check to see the design still works with the added loads of the interconnect.
1.3 Case Study
SPARCstation 1: Better performance at lower cost • Compact size, reduced power, and quiet operation • Reduced number of parts, easier assembly, and improved reliability
[Figure: ASIC design flow. Steps 1–4 are logical design, and steps 5–9 are physical design. Boxes from start to finish: design entry (VHDL/Verilog), logic synthesis (netlist), system partitioning, prelayout simulation, floorplanning, placement, routing, circuit extraction, postlayout simulation (back-annotated netlist). Other labels: chip, block, logic cells.]

The ASICs in the Sun Microsystems SPARCstation 1
    SPARCstation 1 ASIC                        Gates (k-gates)
1   SPARC integer unit (IU)                    20
2   SPARC floating-point unit (FPU)            50
3   Cache controller                           9
4   Memory-management unit (MMU)               5
5   Data buffer                                3
6   Direct memory access (DMA) controller      9
7   Video controller/data buffer               4
8   RAM controller                             1
9   Clock generator                            1
1.4 Economics of ASICs
We’ll compare the most popular types of ASICs: an FPGA, an MGA, and a CBIC. The figures in the following sections are approximate and used to illustrate the different components of cost.
1.4.1 Comparison Between ASIC Technologies
Example of an ASIC part cost: A 0.5 µm, 20k-gate array might cost 0.01–0.02 cents/gate (for more than 10,000 parts) or $2–$4 per part, but an equivalent FPGA might be $20.
When does it make sense to use a more expensive part? This is what we shall examine next.
The CAD tools used in the design of the Sun Microsystems SPARCstation 1
Design level        Function                Tool
ASIC design         ASIC physical design    LSI Logic
                    ASIC logic synthesis    Internal tools and UC Berkeley tools
                    ASIC simulation         LSI Logic
Board design        Schematic capture       Valid Logic
                    PCB layout              Valid Logic Allegro
                    Timing verification     Quad Design Motive and internal tools
Mechanical design   Case and enclosure      Autocad
                    Thermal analysis        Pacific Numerix
                    Structural analysis     Cosmos
Management          Scheduling              Suntrac
                    Documentation           Interleaf and FrameMaker
1.4.2 Product Cost
In a product cost there are fixed costs and variable costs (the number of products sold is the sales volume):
total product cost = fixed product cost + variable product cost × products sold
In a product made from parts the total cost for any part is
total part cost = fixed part cost + variable cost per part × volume of parts
For example, suppose we have the following (imaginary) costs:
• FPGA: $21,800 (fixed) $39 (variable)
• MGA: $86,000 (fixed) $10 (variable)
• CBIC: $146,000 (fixed) $8 (variable)
Then we can calculate the following break-even volumes:
• FPGA/MGA ≈ 2000 parts
• FPGA/CBIC ≈ 4000 parts
• MGA/CBIC ≈ 20,000 parts
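The break-even volumes follow from setting two total part costs equal. The sketch below is one way to check them; the function and dictionary names are illustrative, not from the course. With the rounded $10 and $8 variable costs, the MGA/CBIC crossover computes to 30,000 parts rather than the ≈ 20,000 quoted, so the quoted figure presumably rests on unrounded part costs (an inference; the text does not say).

```python
# Fixed cost ($) and variable cost per part ($) from the example above.
COSTS = {
    "FPGA": (21_800, 39),
    "MGA": (86_000, 10),
    "CBIC": (146_000, 8),
}


def total_part_cost(fixed, variable, volume):
    """total part cost = fixed part cost + variable cost per part x volume."""
    return fixed + variable * volume


def break_even_volume(a, b):
    """Volume at which technologies a and b have the same total part cost."""
    fixed_a, var_a = COSTS[a]
    fixed_b, var_b = COSTS[b]
    return (fixed_b - fixed_a) / (var_a - var_b)


for a, b in [("FPGA", "MGA"), ("FPGA", "CBIC"), ("MGA", "CBIC")]:
    print(f"{a}/{b}: about {break_even_volume(a, b):,.0f} parts")
```

The first two results (about 2,214 and 4,006 parts) agree with the rounded values in the list.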
[Figure: Break-even graph. Cost of parts ($10,000 to $1,000,000, log scale) versus number of parts or volume (10 to 100,000, log scale) for FPGA, MGA, and CBIC, with the FPGA/MGA, FPGA/CBIC, and MGA/CBIC break-even points marked.]
1.4.3 ASIC Fixed Costs
Spreadsheet, “Fixed Costs”
Examples of fixed costs: training cost for a new electronic design automation (EDA) system • hardware and software cost • productivity • production test and design for test • programming costs for an FPGA • nonrecurring-engineering (NRE) • test vectors and test-program development cost • pass (turn or spin) • profit model represents the profit flow during the product lifetime • product velocity • second source
                     FPGA       MGA        CBIC
Training:            $800       $2,000     $2,000
  Days               2          5          5
  Cost/day           $400       $400       $400
Hardware             $10,000    $10,000    $10,000
Software             $1,000     $20,000    $40,000
Design:              $8,000     $20,000    $20,000
  Size (gates)       10,000     10,000     10,000
  Gates/day          500        200        200
  Days               20         50         50
  Cost/day           $400       $400       $400
Design for test:                $2,000     $2,000
  Days                          5          5
  Cost/day                      $400       $400
NRE:                            $30,000    $70,000
  Masks                         $10,000    $50,000
  Simulation                    $10,000    $10,000
  Test program                  $10,000    $10,000
Second source:       $2,000     $2,000     $2,000
  Days               5          5          5
  Cost/day           $400       $400       $400
Total fixed costs    $21,800    $86,000    $146,000
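The totals in the fixed-cost spreadsheet can be reproduced with a few lines. This is a minimal sketch: the function name and argument layout are illustrative, and it assumes (as the spreadsheet rows suggest) that every labor item is charged at the same $400 per day.

```python
DAY_RATE = 400  # $/day, the Cost/day figure used throughout the spreadsheet


def total_fixed_cost(training_days, hardware, software, design_gates,
                     gates_per_day, dft_days=0, nre=0, second_source_days=5):
    """Sum labor (days x day rate) plus hardware, software, and NRE."""
    design_days = design_gates // gates_per_day
    labor_days = training_days + design_days + dft_days + second_source_days
    return labor_days * DAY_RATE + hardware + software + nre


fpga = total_fixed_cost(2, 10_000, 1_000, 10_000, 500)
mga = total_fixed_cost(5, 10_000, 20_000, 10_000, 200, dft_days=5, nre=30_000)
cbic = total_fixed_cost(5, 10_000, 40_000, 10_000, 200, dft_days=5, nre=70_000)
print(fpga, mga, cbic)
```

The three results match the Total fixed costs row ($21,800, $86,000, $146,000).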
[Figure: Profit model. Sales per quarter, s ($10M, $20M) versus time (Q1, Q2, Q3, Q4, Q1, Q2), showing product introduction, delay to market d, peak sales, end of product life, times t1, t2, t3, sales levels s1 and s2, and lost sales.]
1.4.4 ASIC Variable Costs
Spreadsheet, “Variable Costs”
Factors affecting variable costs: wafer size • wafer cost • Moore’s Law (Gordon Moore of Intel) • gate density • gate utilization • die size • die per wafer • defect density • yield • die cost • profit margin (depends on fab or fabless) • price per gate • part cost
                 FPGA      MGA       CBIC      Units
Wafer size       6         6         6         inches
Wafer cost       1,400     1,300     1,500     $
Design           10,000    10,000    10,000    gates
Density          10,000    20,000    25,000    gates/sq.cm
Utilization      60        85        100       %
Die size         1.67      0.59      0.40      sq.cm
Die/wafer        88        248       365
Defect density   1.10      0.90      1.00      defects/sq.cm
Yield            65        72        80        %
Die cost         25        7         5         $
Profit margin    60        45        50        %
Price/gate       0.39      0.10      0.08      cents
Part cost        $39       $10       $8
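Two rows of the variable-cost spreadsheet can be checked from the rows above them. The relations used here are assumptions inferred from the numbers, not formulas given in the text: die size = design gates / (density × utilization), and die cost = wafer cost / (die per wafer × yield). Die/wafer and yield are taken from the table as given.

```python
def die_size_sqcm(gates, density_per_sqcm, utilization_pct):
    """Assumed relation: area needed for the design at the usable gate density."""
    return gates / (density_per_sqcm * utilization_pct / 100)


def die_cost_dollars(wafer_cost, die_per_wafer, yield_pct):
    """Assumed relation: wafer cost shared among the good die only."""
    return wafer_cost / (die_per_wafer * yield_pct / 100)


# (wafer cost, gates, density, utilization %, die/wafer, yield %) from the table
TABLE = {
    "FPGA": (1400, 10_000, 10_000, 60, 88, 65),
    "MGA": (1300, 10_000, 20_000, 85, 248, 72),
    "CBIC": (1500, 10_000, 25_000, 100, 365, 80),
}

for name, (wafer, gates, density, util, dpw, yld) in TABLE.items():
    size = die_size_sqcm(gates, density, util)
    cost = die_cost_dollars(wafer, dpw, yld)
    print(f"{name}: die size {size:.2f} sq.cm, die cost ${cost:.2f}")
```

Under these assumptions the die sizes reproduce the table to two decimals and the die costs land within a dollar of the rounded table values.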
[Figure: Example price per gate figures. Price (cents/gate, 0.01 to 1.00, log scale) versus year (1984 to 1996) for CBIC 2 µm, CBIC 1.5 µm, CBIC 1 µm, CBIC 0.6 µm, FPGA 1 µm, and FPGA 0.6 µm, with a trend line of –32%/year.]
1.5 ASIC Cell Libraries
You can:
(1) use a design kit from the ASIC vendor
(2) buy an ASIC-vendor library from a library vendor
(3) build your own cell library
(1) is usually a phantom library: the cells are empty boxes, or phantoms; you hand off your design to the ASIC vendor and they perform phantom instantiation (Synopsys CBA)
(2) involves a buy-or-build decision. You need a qualified cell library (qualified by the ASIC foundry). If you own the masks (the tooling) you have a customer-owned tooling (COT, pronounced “see-oh-tee”) solution (which is becoming very popular)
(3) involves a complex library development process: cell layout • behavioral model • Verilog/VHDL model • timing model • test strategy • characterization • circuit extraction • process control monitors (PCMs) or drop-ins • cell schematic • cell icon • layout versus schematic (LVS) check • logic synthesis • retargeting • wire-load model • routing model • phantom
2 CMOS LOGIC
CMOS logic • a two-input NAND gate • a two-input NOR gate • Good '1's • Good '0's
[Figure: (a) A two-input CMOS NAND gate, F = NAND(A, B), and (b) a two-input CMOS NOR gate, F = NOR(A, B). For each combination of inputs A and B the figure shows which p-channel and n-channel transistors are on or off and the resulting output F.]
2.1 CMOS Transistors
• Channel charge = $Q$ (imagine taking a picture and counting the electrons)
• $t_f$ is time of flight or transit time
• $\mu_n$ is the electron mobility ($\mu_p$ is the hole mobility)
• $E$ is the electric field (units V m$^{-1}$)
[Figure: An n-channel transistor • channel • source • drain • depletion region • gate • bulk. Labels: n-type source and drain in a p-type bulk, gate oxide thickness $T_{ox}$, channel length $L$ and width $W$, $V_{GS}$, $V_{DS}$, GND or VSS, electric field $E_x$, electrons, mobile channel charge, fixed depletion charge.]
current (amperes) = charge (coulombs) per unit time (second)
The drain-to-source current $I_{DSn} = Q / t_f$
The (vector) velocity of the electrons $v = -\mu_n E$
$t_f = \frac{L}{v_x} = \frac{L^2}{\mu_n V_{DS}}$
• The linear region (triode region) extends until $V_{DS} = V_{GS} - V_{tn}$
• $V_{DS} = V_{GS} - V_{tn} = V_{DS(sat)}$ (saturation voltage)
• $V_{DS} > V_{GS} - V_{tn}$ (the saturation region, or pentode region, of operation)
• saturation current, $I_{DSn(sat)}$
$Q = C(V_{GC} - V_{tn}) = C[(V_{GS} - V_{tn}) - 0.5 V_{DS}] = W L C_{ox}[(V_{GS} - V_{tn}) - 0.5 V_{DS}]$
$I_{DSn} = Q / t_f = (W/L)\mu_n C_{ox}[(V_{GS} - V_{tn}) - 0.5 V_{DS}]V_{DS} = (W/L)k'_n[(V_{GS} - V_{tn}) - 0.5 V_{DS}]V_{DS}$
$k'_n = \mu_n C_{ox}$ is the process transconductance parameter (or intrinsic transconductance)
$\beta_n = k'_n (W/L)$ is the transistor gain factor (or just gain factor)
$I_{DSn(sat)} = (\beta_n / 2)(V_{GS} - V_{tn})^2$ ; $V_{GS} > V_{tn}$
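The linear-region and saturation-region expressions above can be combined into one piecewise function. This is a sketch of the long-channel equations only. Returning zero current below threshold is a simplifying assumption not stated in the text, and W/L = 2 is a hypothetical ratio; k'n = 2E-04 A/V² and Vtn = 0.65 V match the KP and VTO values in the course's n-channel SPICE model.

```python
def ids_n(v_gs, v_ds, v_tn, beta_n):
    """Long-channel n-channel drain current in amperes.

    beta_n = k'n * (W/L) is the transistor gain factor.
    """
    if v_gs <= v_tn:
        return 0.0  # assumption: treat the transistor as off below threshold
    v_ds_sat = v_gs - v_tn
    if v_ds < v_ds_sat:
        # linear (triode) region
        return beta_n * ((v_gs - v_tn) - 0.5 * v_ds) * v_ds
    # saturation region
    return 0.5 * beta_n * (v_gs - v_tn) ** 2


k_n = 2e-4           # A/V^2
beta = k_n * 2.0     # hypothetical W/L = 2
print(ids_n(3.0, 1.0, 0.65, beta))  # linear region
print(ids_n(3.0, 3.0, 0.65, beta))  # saturation region
```

The two branches agree at V_DS = V_GS – Vtn, so the modeled current is continuous across the region boundary.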
2.1.1 P-Channel Transistors
• $V_{tp}$ is negative
• $V_{DS}$ and $V_{GS}$ are normally negative (and –3V
2.1.2 Velocity Saturation
• $v_{maxn} = 10^5$ m s$^{-1}$
• velocity saturation
• $t_f = L_{eff} / v_{maxn}$
• mobility degradation
$I_{DSn(sat)} = W v_{maxn} C_{ox}(V_{GS} - V_{tn})$ ; $V_{DS} > V_{DS(sat)}$ (velocity saturated).
2.1.3 SPICE Models
• KP (in µA V$^{-2}$) = $k'_n$ ($k'_p$)
• VTO and TOX = $V_{tn}$ ($V_{tp}$) and $T_{ox}$
• UO (in cm$^2$ V$^{-1}$ s$^{-1}$) = $\mu_n$ (and $\mu_p$)
SPICE parameters
.MODEL CMOSN NMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=1 VTO=0.65 DELTA=0.7
+ LD=5E-08 KP=2E-04 UO=550 THETA=0.27 RSH=2 GAMMA=0.6 NSUB=1.4E+17 NFS=6E+11
+ VMAX=2E+05 ETA=3.7E-02 KAPPA=2.9E-02 CGDO=3.0E-10 CGSO=3.0E-10 CGBO=4.0E-10
+ CJ=5.6E-04 MJ=0.56 CJSW=5E-11 MJSW=0.52 PB=1
.MODEL CMOSP PMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=-1 VTO=-0.92 DELTA=0.29
+ LD=3.5E-08 KP=4.9E-05 UO=135 THETA=0.18 RSH=2 GAMMA=0.47 NSUB=8.5E+16 NFS=6.5E+11
+ VMAX=2.5E+05 ETA=2.45E-02 KAPPA=7.96 CGDO=2.4E-10 CGSO=2.4E-10 CGBO=3.8E-10
+ CJ=9.3E-04 MJ=0.47 CJSW=2.9E-10 MJSW=0.505 PB=1
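As a consistency check on the SPICE parameters, KP should be close to the product of the mobility UO and the gate-oxide capacitance per unit area, since k'n = µn·Cox. The sketch below assumes the standard parallel-plate relation Cox = εox / Tox with a relative permittivity of 3.9 for silicon dioxide; neither value is given in the course text, and the function names are illustrative.

```python
EPS_0 = 8.854e-12   # F/m, permittivity of free space
EPS_R_SIO2 = 3.9    # assumed relative permittivity of silicon dioxide


def cox(tox_m):
    """Gate-oxide capacitance per unit area (F/m^2) for oxide thickness in meters."""
    return EPS_R_SIO2 * EPS_0 / tox_m


def kp_from_mobility(uo_cm2_per_vs, tox_m):
    """Estimate KP (A/V^2) as mobility x Cox, converting UO from cm^2/Vs to m^2/Vs."""
    return uo_cm2_per_vs * 1e-4 * cox(tox_m)


kp_n = kp_from_mobility(550, 10e-9)   # UO and TOX of the n-channel model
kp_p = kp_from_mobility(135, 10e-9)   # UO and TOX of the p-channel model
print(f"n-channel: {kp_n:.3e} A/V^2 (model KP=2E-04)")
print(f"p-channel: {kp_p:.3e} A/V^2 (model KP=4.9E-05)")
```

Both estimates land within about 10% of the KP values in the model cards, which is as close as one should expect from a hand calculation with an assumed permittivity.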
2.2 The CMOS Process
Key words: boule • wafer • boat • silicon dioxide • resist • mask • chemical etch • isotropic plasma etch • anisotropic • ion implantation • implant energy and dose • polysilicon • chemical vapor deposition (CVD) • sputtering • photolithography • submicron and deep-submicron process • n-well process • p-well process • twin-tub (or twin-well) • triple-well • substrate contacts (well contacts or tub ties) • active (CAA) • gate oxide • field • field implant or channel-stop implant • field oxide (FOX) • bloat • dopant • self-aligned process • positive resist • negative resist • drain engineering • LDD process • lightly doped drain • LDD diffusion or LDD implant • stipple-pattern
[Figure: The CMOS manufacturing process, steps 1–12. Labels: grow crystal, saw, wafer, furnace, 1 hour, grow oxide, resist spin, resist, mask, etch, oxide, As+.]
Mask/layer name       Derivation from drawn layers   Alternative names for mask/layer              Mask label
n-well                =nwell                         bulk, substrate, tub, n-tub, moat             CWN
p-well                =pwell                         bulk, substrate, tub, p-tub, moat             CWP
active                =pdiff+ndiff                   thin oxide, thinox, island, gate oxide        CAA
polysilicon           =poly                          poly, gate                                    CPG
n-diffusion implant   =grow(ndiff)                   ndiff, n-select, nplus, n+                    CSN
p-diffusion implant   =grow(pdiff)                   pdiff, p-select, pplus, p+                    CSP
contact               =contact                       contact cut, poly contact, diffusion contact  CCP and CCA
metal1                =m1                            first-level metal                             CMF
metal2                =m2                            second-level metal                            CMS
via2                  =via2                          metal2/metal3 via, m2/m3 via                  CVS
metal3                =m3                            third-level metal                             CMT
glass                 =glass                         passivation, overglass, pad                   COG
[Figure: The mask layers of a standard cell: (a) nwell (b) pwell (c) ndiff (d) pdiff (e) poly (f) contact (g) m1 (h) via (i) m2 (j) cell (k) phantom.]
[Figure: Drawn layers and stipple patterns for nwell, pwell, ndiff, pdiff, poly, contact, m1, via1, m2, via2, m3, and glass (some layers may be drawn solid).]
[Figure: The transistor layers. (a) Drawn layers pdiff, nwell, and poly in the x–y plane. (b) Three-dimensional view showing p-diffusion, polysilicon, field oxide, gate oxide, n-well (or substrate), field implant, source/drain diffusion, and LDD diffusion, with 2λ dimensions marked.]
2.2.1 Sheet Resistance
Sheet resistance (1 µm)
Layer         Sheet resistance   Units
n-well        1.15 ± 0.25        kΩ/square
poly          3.5 ± 2.0          Ω/square
n-diffusion   75 ± 20            Ω/square
p-diffusion   140 ± 40           Ω/square
m1/2          70 ± 6             mΩ/square
m3            30 ± 3             mΩ/square

Sheet resistance (0.35 µm)
Layer         Sheet resistance   Units
n-well        1 ± 0.4            kΩ/square
poly          10 ± 4.0           Ω/square
n-diffusion   3.5 ± 2.0          Ω/square
p-diffusion   2.5 ± 1.5          Ω/square
m1/2/3        60 ± 6             mΩ/square
metal4        30 ± 3             mΩ/square

Key words: diffusion • Ω/square (ohms per square) • sheet resistance • silicide • self-aligned silicide (salicide) • LI, white metal, local interconnect, metal0, or m0 • m1 or metal1 • diffusion contacts • polysilicon contacts • barrier metal • contact plugs (via plugs) • chemical–mechanical polishing (CMP) • intermetal oxide (IMO) • interlevel dielectric (ILD) • metal vias, cuts, or vias • stacked vias and stacked contacts • two-level metal (2LM) • 3LM (m3 or metal3) • via1 • via2 • metal pitch • electromigration • contact resistance and via resistance
[Figure: The interconnect layers. (a) Drawn layers contact, m1, via1, m2, via2, and m3. (b) Three-dimensional view of the stack, labeled W plug (4000 Å), AlCu (3000 Å), Pt barrier (200 Å), and TiW, with a 2λ dimension marked.]
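Sheet resistance turns into wire resistance by counting squares: a wire of length L and width W contains L/W squares, so R = R_sheet × L/W. That relation is standard but is not spelled out in the text above. In the sketch below the 100 µm by 1 µm wire is a hypothetical example, and the sheet resistances are the nominal 1 µm-process values from the table.

```python
def wire_resistance(sheet_ohms_per_square, length_um, width_um):
    """Resistance in ohms of a uniform wire: sheet resistance x number of squares."""
    squares = length_um / width_um
    return sheet_ohms_per_square * squares


# Hypothetical wire, 100 um long and 1 um wide, in the 1 um process.
r_poly = wire_resistance(3.5, 100, 1)      # poly, 3.5 ohms/square
r_m1 = wire_resistance(0.070, 100, 1)      # m1, 70 milliohms/square
print(f"poly: {r_poly:.1f} ohms, m1: {r_m1:.1f} ohms")
```

The fifty-to-one ratio between the two results illustrates why long routes are run in metal rather than polysilicon.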
2.3 CMOS Design Rules
[Figure: Scalable CMOS design rules. Dimensions are given in λ (overglass rules in microns), each tagged with its rule number, for the rule groups 1. well, 2. active, 3. poly, 4. select, 5. poly contact, 6. active contact, 7. metal1, 8. via1, 9. metal2, 10. overglass, 14. via2, and 15. metal3.]
2.4 Combinational Logic Cells
2.4.1 Pushing Bubbles
2.4.2 Drive Strength
We ratio a cell to adjust its drive strength and make βn = βp to create equal rise and fall times
[Figure: Naming of complex CMOS combinational logic cells. (a) An AOI221 (AND-OR-INVERT) cell with inputs A, B, C, D, E and output Z. (b) An OAI321 (OR-AND-INVERT) cell with inputs A, B, C, D, E, F and output Z.]

The AOI family of cells with three index numbers or less
Cell type (1)   Cells                     Number of unique cells
Xa1             X21, X31                  2
Xa11            X211, X311                2
Xab             X22, X33, X32             3
Xab1            X221, X331, X321          3
Xabc            X222, X333, X332, X322    4
Total                                     14
(1) Xabc: X = {AOI, AO, OAI, OA}; a, b, c = {2, 3}; {} means “choose one.”
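The naming scheme can be made concrete with a small interpreter for cell names. This is an illustrative sketch (the function names are not from the course): each index digit is the number of inputs in one first-level group, the first letter gives the first-level gate, and a trailing I means the output is inverted. A second helper enumerates the index patterns to confirm the count of 14 unique cells per family.

```python
from itertools import combinations_with_replacement


def cell_function(name):
    """Build the logic function for a name such as 'AOI221' or 'OAI321'."""
    kind = name.rstrip("0123456789")
    groups = [int(d) for d in name[len(kind):]]
    # AOI/AO: AND the inputs in each group, then OR; OAI/OA: the reverse.
    inner, outer = (all, any) if kind.startswith("A") else (any, all)
    invert = kind.endswith("I")

    def f(inputs):
        assert len(inputs) == sum(groups)
        it = iter(inputs)
        terms = [inner([next(it) for _ in range(g)]) for g in groups]
        result = outer(terms)
        return (not result) if invert else result

    return f


def family_indices():
    """Index strings with up to three numbers: Xa1, Xa11, Xab, Xab1, Xabc."""
    names = []
    for n_big in (1, 2, 3):
        for big in combinations_with_replacement((3, 2), n_big):
            for n_ones in range(0, 3 - n_big + 1):
                if n_big == 1 and n_ones == 0:
                    continue  # a single group is not a member of the family
                names.append("".join(map(str, big)) + "1" * n_ones)
    return names


aoi221 = cell_function("AOI221")          # Z = NOT(A.B + C.D + E)
print(aoi221([1, 1, 0, 0, 0]), aoi221([0, 1, 1, 0, 0]))
print(len(family_indices()))
```

The group digits map directly onto the transistor networks: in an AOI cell each AND group is a series stack of n-channel transistors, and the groups are placed in parallel.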
2.4.3 Transmission Gates
[Figure: Constructing a CMOS logic cell, an AOI221 • pushing bubbles • de Morgan’s theorem • network duals. Steps 1–4: push bubbles to the inputs; build the networks with OR = parallel and AND = series; adjust sizes. Transistor sizes shown include 6/1, 2/1, and 1/1, with 6/(1+1+1) = 2/1.]
[Figure: CMOS transmission gate (TG, TX gate, pass gate, coupler). (a) Input A, output Z, and controls S and S'. (b) The gate with S = 0 and S = 1, passing a strong '1' and a strong '0'. (c) Charge sharing between C_BIG and C_SMALL, where V_BIG → V_F and V_SMALL → V_F.]
Charge sharing: suppose $C_{BIG} = 0.2$ pF and $C_{SMALL} = 0.02$ pF, $V_{BIG} = 0$ V and $V_{SMALL} = 5$ V; then
$V_F = \frac{(0.2 \times 10^{-12})(0) + (0.02 \times 10^{-12})(5)}{(0.2 \times 10^{-12}) + (0.02 \times 10^{-12})} = 0.45$ V
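The charge-sharing result in this section is conservation of charge across the two capacitors once the transmission gate connects them. A minimal sketch, with an illustrative function name:

```python
def charge_share(c_big, v_big, c_small, v_small):
    """Final voltage after two charged capacitors are connected together."""
    total_charge = c_big * v_big + c_small * v_small
    return total_charge / (c_big + c_small)


v_f = charge_share(0.2e-12, 0.0, 0.02e-12, 5.0)
print(f"V_F = {v_f:.2f} V")
```

Because the big capacitor dominates, the final voltage stays near V_BIG; the small node loses almost all of its 5 V.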
2.5 Sequential Logic Cells
Two choices for sequential logic: multiphase clocks or synchronous design. We choose the latter.
2.5.1 Latch
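CMOS latch • enable • transparent • static • sequential logic cell • storage • initial value
[Figure: CMOS latch. (a)–(c) The latch built from cells I1–I5 with input D, output Q, clocks CLK, CLKN, and CLKP, and the storage loop. Timing diagrams of D, CLK, and Q against time t mark the interval in which the latch is transparent. Symbol labels: 1D, C1.]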
2.5.2 Flip-Flop
CMOS flip-flop • master latch • slave latch • active clock edge • negative-edge–triggered flip-flop • setup time (tSU) • hold time (tH) • clock-to-Q propagation delay (tPD) • decision window
[Figure: CMOS flip-flop. (a) Master and slave latches built from cells I1–I9 with input D, internal node M, outputs Q and QN, and clocks CLK, CLKN, and CLKP. (b) Load master while the slave stores. (c) Load slave while the master stores. (d) Timing of D, CLK, M, and Q showing the alternating load master and load slave phases, tSU, tH, tPD measured at the 50% points, and the decision window. Symbol labels: 1D, C1.]
2.6 Datapath Logic Cells
• parity function ('1' for an odd number of '1's)
• majority function ('1' if the majority of the inputs are '1')
full adder (FA): SUM = A ⊕ B ⊕ CIN = SUM(A, B, CIN) = PARITY(A, B, CIN), COUT = A · B + A · CIN + B · CIN = MAJ(A, B, CIN).
S[i] = SUM(A[i], B[i], CIN)
COUT = MAJ(A[i], B[i], CIN)
[Figure: A datapath adder. (a) A full-adder cell ADD with inputs A, B, CIN and outputs SUM (S) and COUT. (b) A 4-bit adder with inputs A[3:0], B[3:0], and CIN[0], outputs S[3:0] and COUT[3], and the carry COUT[2] passed between bit slices. (c) The datapath layout with control signals in m1 and data signals in m2. (d) The datapath cell, with VSS.]
• Ripple-carry adder (RCA)
• Data signals • control signals • datapath • datapath cell or datapath element
• Datapath advantages: predictable and equal delay for each bit • built-in interconnect
• Disadvantages of a datapath: overhead • harder design • software is more complex
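The full-adder equations translate directly into a bit-level model, and chaining the cells from LSB to MSB gives the ripple-carry adder. The sketch below is illustrative (helper names such as `to_bits` are not from the course) and represents words as lists of bits, least significant bit first.

```python
def parity(a, b, c):
    """'1' for an odd number of '1's."""
    return a ^ b ^ c


def majority(a, b, c):
    """'1' if the majority of the inputs are '1'."""
    return (a & b) | (a & c) | (b & c)


def full_adder(a, b, cin):
    """Return (SUM, COUT) for one bit position."""
    return parity(a, b, cin), majority(a, b, cin)


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first); return (sum_bits, cout)."""
    sum_bits, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


s, cout = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), cout)  # 11 + 6 = 17 = 1 + 16 * cout
```

The carry variable threads through every iteration, which is the software analogue of the carry rippling through each bit slice in turn.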
2.6.1 Datapath Elements
(Table continued; columns appear to be, in order: unsigned, signed magnitude, ones’ complement, two’s complement)
Operation                                    Unsigned                                    Signed magnitude          Ones’ complement   Two’s complement
subtraction result:                          OR = BOUT[MSB]; BOUT is borrow out          as in addition            as in addition     as in addition
  OV = overflow, OR = out of range
negation: Z = –A (negate)                    NA                                          Z = A; SG(Z) = NOT(SG(A)) Z = NOT(A)         Z = NOT(A) + 1

2.6.2 Adders
Generate, G[i] and propagate, P[i]
method 1                              method 2
G[i] = A[i] · B[i]                    G[i] = A[i] · B[i]
P[i] = A[i] ⊕ B[i]                    P[i] = A[i] + B[i]
C[i] = G[i] + P[i] · C[i–1]           C[i] = G[i] + P[i] · C[i–1]
S[i] = P[i] ⊕ C[i–1]                  S[i] = A[i] ⊕ B[i] ⊕ C[i–1]
Carry signal:
either C[i] = A[i] · B[i] + P[i] · C[i–1]
or C[i] = (A[i] + B[i]) · (P[i]' + C[i–1]), where P[i]' = NOT(P[i])
Carry chain using two-input NAND gates, one per cell:
even stages                              odd stages
C1[i]' = P[i] · C3[i–1] · C4[i–1]        C3[i]' = P[i] · C1[i–1] · C2[i–1]
C2[i] = A[i] + B[i]                      C4[i]' = A[i] · B[i]
C[i] = C1[i] · C2[i]                     C[i] = C3[i]' + C4[i]'
Carry-save adder (CSA) cell CSA(A1[i], A2[i], A3[i], CIN, S1[i], S2[i], COUT) has three outputs:
S1[i] = CIN,
S2[i] = A1[i] ⊕ A2[i] ⊕ A3[i] = PARITY(A1[i], A2[i], A3[i])
COUT = A1[i] · A2[i] + [(A1[i] + A2[i]) · A3[i]] = MAJ(A1[i], A2[i], A3[i])
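The two generate/propagate formulations give the same carry, which can be confirmed by exhausting the eight input combinations. The sketch below (function names are illustrative) also checks the alternative product-of-sums carry form; working through the cases shows that form holds when P[i] is the XOR of method 1, not the OR of method 2.

```python
from itertools import product


def carry_method1(a, b, c_prev):
    g, p = a & b, a ^ b      # P[i] = A[i] XOR B[i]
    return g | (p & c_prev)


def carry_method2(a, b, c_prev):
    g, p = a & b, a | b      # P[i] = A[i] OR B[i]
    return g | (p & c_prev)


def carry_alternative(a, b, c_prev):
    p = a ^ b                # uses the method 1 propagate signal
    return (a | b) & ((1 - p) | c_prev)


def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)


for a, b, c in product((0, 1), repeat=3):
    expected = majority(a, b, c)
    assert carry_method1(a, b, c) == expected
    assert carry_method2(a, b, c) == expected
    assert carry_alternative(a, b, c) == expected
print("all three carry expressions agree with MAJ(A, B, C)")
```

Method 2 works because when both A[i] and B[i] are '1' the generate term already forces the carry, so the extra case covered by OR is harmless.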
Carry-propagate adder (CPA)
carry-bypass adders (CBA):
C[7] = (G[7] + P[7] · C[6]) · BYPASS' + C[3] · BYPASS
carry-skip adder:
CSKIP[i] = (G[i] + P[i] · C[i–1]) · SKIP' + C[i–2] · SKIP
[Figure: The carry-save adder (CSA) • pipeline • latency • bit slice. (a) A CSA cell with inputs A1, A2, A3, CIN and outputs S1, S2, COUT. (b) An n-bit CSA with inputs A1[MSB:0], A2[MSB:0], A3[MSB:0], CIN[0] and outputs S1[MSB:0], S2[MSB:0], COUT[MSB], COUT[MSB–1], and OV. (c)–(d) Four-input adders for A1–A4 producing S[MSB:0], without and with pipeline registers clocked by CLK. (e)–(g) Implementations using CSA1, CSA2, and an RCA, including a pipelined version and the bit-slice arrangement from LSB to MSB.]
Carry-lookahead adder (CLA, for example the Brent–Kung adder):
C[1] = G[1] + P[1] · C[0] = G[1] + P[1] · (G[0] + P[0] · C[–1]) = G[1] + P[1] · G[0] (with C[–1] = 0)
C[2] = G[2] + P[2] · G[1] + P[2] · P[1] · G[0]
C[3] = G[3] + P[3] · G[2] + P[3] · P[2] · G[1] + P[3] · P[2] · P[1] · G[0]
Carry-select adder duplicates two small adders for the cases CIN = '0' and CIN = '1' and then uses a MUX to select the case that we need
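As a check on the expanded carry equations, the sketch below (an illustration in Python, with hypothetical function names, assuming a carry in of C[–1] = 0 and P[i] = A[i] + B[i]) compares them with a bit-by-bit ripple carry for every pair of 4-bit operands.

```python
from itertools import product


def lookahead_carries(a_bits, b_bits):
    """Carries C[0..3] from the expanded lookahead equations.

    a_bits and b_bits are lists of four bits, least-significant bit first.
    """
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate
    c0 = g[0]
    c1 = g[1] | p[1] & g[0]
    c2 = g[2] | p[2] & g[1] | p[2] & p[1] & g[0]
    c3 = g[3] | p[3] & g[2] | p[3] & p[2] & g[1] | p[3] & p[2] & p[1] & g[0]
    return [c0, c1, c2, c3]


def ripple_carries(a_bits, b_bits):
    """Reference: propagate the carry one bit at a time."""
    carry, carries = 0, []
    for a, b in zip(a_bits, b_bits):
        carry = (a & b) | ((a | b) & carry)
        carries.append(carry)
    return carries


# Compare the two for all 256 pairs of 4-bit operands.
carries_match = all(
    lookahead_carries(list(a), list(b)) == ripple_carries(list(a), list(b))
    for a in product((0, 1), repeat=4)
    for b in product((0, 1), repeat=4)
)
print(carries_match)
```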
The Brent–Kung carry-lookahead adder
[Figure, panels (a)–(g): carry-lookahead generator (CLG) cells combine the G[i]/P[i] pairs of adjacent bits in levels L1–L4 to produce the carries C[1], C[2], and C[3]; each wire is a bundle of a group-generate and a group-propagate signal. The adder works in three steps: create generate and propagate signals, create carry signals, create sum signals.]
The conditional-sum adder
[Figure, panels (a)–(c): half-adder cells (H) produce A[i] ⊕ B[i], (A[i] ⊕ B[i])', A[i] · B[i], and A[i] + B[i]; multiplexer cells (Q) select between the sum or carry computed for carry in = 0 and the sum or carry computed for carry in = 1.]
Ci_j_k = carry in to the i th bit assuming the carry in to the j th bit is k (k = 0 or 1)
Si_j_k = sum at the i th bit assuming the carry in to the j th bit is k (k = 0 or 1)
2.6.3 A Simple Example
An 8-bit conditional-sum adder
module m8bitCSum (C0, a, b, s, C8); // Verilog conditional-sum adder for an FPGA
input C0; input [7:0] a, b; output [7:0] s; output C8; // C0 and C8 are single-bit carries
wire A7,A6,A5,A4,A3,A2,A1,A0,B7,B6,B5,B4,B3,B2,B1,B0,S7,S6,S5,S4,S3,S2,S1,S0;
wire C2, C4_2_0, C4_2_1, S5_4_0, S5_4_1, C6, C6_4_0, C6_4_1;
assign {A7,A6,A5,A4,A3,A2,A1,A0} = a; assign {B7,B6,B5,B4,B3,B2,B1,B0} = b;
assign s = { S7,S6,S5,S4,S3,S2,S1,S0 };
// & = AND, ^ = XOR, | = OR, ! = NOT
assign S0 = A0^B0^C0 ; // start of level 1
assign S1 = A1^B1^(A0&B0|(A0|B0)&C0) ;
assign C2 = A1&B1|(A1|B1)&(A0&B0|(A0|B0)&C0) ;
assign C4_2_0 = A3&B3|(A3|B3)&(A2&B2) ; assign C4_2_1 = A3&B3|(A3|B3)&(A2|B2) ;
assign S5_4_0 = A5^B5^(A4&B4) ; assign S5_4_1 = A5^B5^(A4|B4) ;
assign C6_4_0 = A5&B5|(A5|B5)&(A4&B4) ; assign C6_4_1 = A5&B5|(A5|B5)&(A4|B4) ;
assign S2 = A2^B2^C2 ; // start of level 2
assign S3 = A3^B3^(A2&B2|(A2|B2)&C2) ;
assign S4 = A4^B4^(C4_2_0|C4_2_1&C2) ;
assign S5 = S5_4_0&!(C4_2_0|C4_2_1&C2)|S5_4_1&(C4_2_0|C4_2_1&C2) ;
assign C6 = C6_4_0|C6_4_1&(C4_2_0|C4_2_1&C2) ;
assign S6 = A6^B6^C6 ; // start of level 3
assign S7 = A7^B7^(A6&B6|(A6|B6)&C6) ;
assign C8 = A7&B7|(A7|B7)&(A6&B6|(A6|B6)&C6) ;
endmodule

2.6.4 Multipliers
• Mental arithmetic: 15 (multiplicand) × 19 (multiplier) = 15 × (20 – 1) = 15 × 2 1̄, where 1̄ denotes the digit –1
• Suppose we want to multiply by B = 00010111 (decimal 16 + 4 + 2 + 1 = 23)
• Use the canonical signed-digit vector (CSD vector) D = 0 0 1 0 1̄ 0 0 1̄ (decimal 32 – 8 – 1 = 23)
• B has a weight of 4, but D has a weight of 3, and so saves hardware
To recode (or encode) any binary number, B, as a CSD vector, D: D_i = B_i + C_i – 2C_(i+1), where C_(i+1) is the carry from the sum of B_(i+1) + B_i + C_i (we start with C_0 = 0).
If B = 011 (B_2 = 0, B_1 = 1, B_0 = 1; decimal 3), then:
D_0 = B_0 + C_0 – 2C_1 = 1 + 0 – 2 = –1 (written 1̄),
D_1 = B_1 + C_1 – 2C_2 = 1 + 1 – 2 = 0,
D_2 = B_2 + C_2 – 2C_3 = 0 + 1 – 0 = 1,
so that D = 1 0 1̄ (decimal 4 – 1 = 3).
We can use a radix other than 2, for example Booth encoding (radix-4):
B = 101001 (decimal 9 – 32 = –23), E = 1̄ 2̄ 1 (decimal –16 – 8 + 1 = –23)
B = 01011 (eleven), E = 1 1̄ 1̄ (16 – 4 – 1)
B = 101, E = 1̄ 1 (decimal –4 + 1 = –3)

Datapath adders
[Figure: (a) normalized delay (2-input NAND = 1) and (b) area/kλ², each plotted against the number of bits (8, 16, 32, 64) for ripple-carry, carry-select, and carry-save adders.]
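The CSD recoding rule D_i = B_i + C_i – 2C_(i+1) is short enough to try out directly. The sketch below is an illustrative Python version under stated assumptions: B is an unsigned number, its bits are given least-significant first, and the function names are hypothetical.

```python
def to_bits(value, width):
    """Bits of an unsigned integer, least-significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def csd_recode(b_bits):
    """Recode bits (LSB first) into signed digits in {-1, 0, 1} (LSB first).

    Implements D[i] = B[i] + C[i] - 2*C[i+1], where C[i+1] is the carry
    from B[i+1] + B[i] + C[i] and C[0] = 0. One extra digit is produced
    so that a carry out of the top bit is not lost.
    """
    padded = list(b_bits) + [0, 0]  # B is unsigned: the bits above the MSB are 0
    carry, digits = 0, []
    for i in range(len(b_bits) + 1):
        next_carry = 1 if padded[i + 1] + padded[i] + carry >= 2 else 0
        digits.append(padded[i] + carry - 2 * next_carry)
        carry = next_carry
    return digits


def csd_value(digits):
    """Value of a signed-digit vector given LSB first."""
    return sum(d << i if d >= 0 else -((-d) << i) for i, d in enumerate(digits))


def weight(digits):
    """Number of nonzero digits."""
    return sum(1 for d in digits if d != 0)


d3 = csd_recode([1, 1, 0])          # B = 011, the example in the text
d23 = csd_recode(to_bits(23, 8))    # B = 00010111, the multiplier example
print(d3, d23, weight(d23))
```

Each nonzero digit costs one add or subtract in a multiplier, which is why the weight is the quantity of interest.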
Tree-based multiplication: at each stage we have the following three choices:
(1) sum three outputs using a full adder
(2) sum two outputs using a half adder
(3) pass the outputs to the next stage
[Figure, panels (a)–(c): the full adder (FA) with inputs A, B, and CIN and outputs Sum and COUT; reduction of the partial-product column P5 (S50, S41, S32, S23, S14, S05) using a carry-save chain and using a Wallace tree; a dot diagram in which each dot represents an output of one stage and an input to the next, with full adders, half adders, and a redundant carry marked.]
A Wallace-tree multiplier works forward from the multiplier inputs
• Full adder is a 3:2 compressor or (3, 2) counter
• Half adder is a (2, 2) counter
[Figure: a 6-bit Wallace-tree multiplier reducing the partial products S00 to S55 to the product bits P0 to P11 using full adders numbered 1–30.]
The Dadda multiplier works backward from the final product
• Each stage has a maximum of 2, 3, 4, 6, 9, 13, 19, ... outputs (each successive stage is 3/2 times larger, rounded down to an integer)
The number of stages and thus delay (in units of an FA delay, excluding the CPA) for an n-bit tree-based multiplier using (3, 2) counters is log_1.5 n = log_10 n / log_10 1.5 = log_10 n / 0.176
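The stage sizes and the stage-count estimate can be reproduced with a few lines of Python. This is a sketch with hypothetical function names; it generates the sequence of maximum stage heights and evaluates the logarithmic estimate, nothing more.

```python
import math


def dadda_heights(limit):
    """Maximum stage heights 2, 3, 4, 6, 9, ... up to and including `limit`.

    Each height is 3/2 times the previous one, rounded down to an integer.
    """
    heights = [2]
    while heights[-1] * 3 // 2 <= limit:
        heights.append(heights[-1] * 3 // 2)
    return heights


def estimated_stages(n_bits):
    """The estimate log_1.5(n) = log10(n) / log10(1.5) from the text."""
    return math.log10(n_bits) / math.log10(1.5)


heights = dadda_heights(19)
print(heights)
print(round(estimated_stages(6), 2), round(estimated_stages(32), 2))
```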
[Figure: a 6-bit Dadda multiplier reducing the partial products S00 to S55 to the product bits P0 to P11 using adders numbered 1–30.]
2.6.5 Other Arithmetic Systems
In the signed-digit numbers below, 1̄ denotes the digit –1.
• 101 (decimal) is 1100101 (in binary and CSD vector) or 1 1̄ 1 0 0 1 1 1̄
• 188 (decimal) is 10111100 (in binary), 1 1̄ 1 0 0 0 1̄ 0 0, 1 0 1̄ 0 0 1̄ 1 0 0, or 1 0 1̄ 0 0 0 1̄ 0 0 (CSD vector)
• 1 0 1̄ is represented as 010010 (using sign magnitude), which is rather wasteful

                     binary        decimal    redundant binary        CSD vector
addend               1010111       87         1 0 1̄ 0 1̄ 0 0 1̄         1 0 1̄ 0 1̄ 0 0 1̄
augend             + 1100101     + 101      + 1 1̄ 1 0 0 1 1 1̄       + 0 1 1 0 0 1 0 1
intermediate sum                              0 1 0 0 1̄ 1̄ 1 0         1̄ 1̄ 0 0 1̄ 1 0 0
intermediate carry                          1 1̄ 0 0 0 1 0 1̄         1 1 0 0 0 0 0 0
sum                = 10111100    = 188      1 1̄ 1 0 0 0 1̄ 0 0       1 0 1̄ 0 0 1̄ 1 0 0

Redundant binary addition • redundant binary encoding avoids carry propagation
A[i] B[i]        A[i–1] B[i–1]                       Intermediate sum    Intermediate carry
1̄ 1̄              x x                                 0                   1̄
1̄ 0 or 0 1̄       A[i–1] = 0/1 and B[i–1] = 0/1       1̄                   0
                 A[i–1] = 1̄ or B[i–1] = 1̄            1                   1̄
1̄ 1              x x                                 0                   0
1 1̄              x x                                 0                   0
0 0              x x                                 0                   0
0 1 or 1 0       A[i–1] = 0/1 and B[i–1] = 0/1       1̄                   1
                 A[i–1] = 1̄ or B[i–1] = 1̄            1                   0
1 1              x x                                 0                   1

Residue number system
• 11 (decimal) is represented as [1, 2] residue (5, 3)
• 11R5 = 11 mod 5 = 1 and 11R3 = 11 mod 3 = 2
• The size of this system is 3 × 5 = 15
• We can now add, subtract, or multiply without using any carry
    4     [4, 1]        12     [2, 0]         3     [3, 0]
+   7   + [2, 1]      –  4   – [4, 1]      ×  4   × [4, 1]
=  11   = [1, 2]      =  8   = [3, 2]      = 12   = [2, 0]

The 5, 3 residue number system
n   residue 5   residue 3      n   residue 5   residue 3      n    residue 5   residue 3
0   0           0              5   0           2              10   0           1
1   1           1              6   1           0              11   1           2
2   2           2              7   2           1              12   2           0
3   3           0              8   3           2              13   3           1
4   4           1              9   4           0              14   4           2
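The three worked examples can be reproduced with a short Python sketch. The helper names are illustrative, and the conversion back from residues is done by a simple search over the 15 values rather than by any particular hardware method.

```python
import operator

MODULI = (5, 3)  # the (5, 3) residue system; its size is 5 * 3 = 15


def to_residue(n):
    """Residues of n with respect to each modulus."""
    return [n % m for m in MODULI]


def residue_op(x, y, op):
    """Apply op digit by digit; no carry passes between the residues."""
    return [op(a, b) % m for a, b, m in zip(x, y, MODULI)]


def from_residue(residues):
    """Recover n in the range 0..14 by searching the table."""
    return next(n for n in range(15) if to_residue(n) == list(residues))


total = residue_op(to_residue(4), to_residue(7), operator.add)        # 4 + 7
difference = residue_op(to_residue(12), to_residue(4), operator.sub)  # 12 - 4
product = residue_op(to_residue(3), to_residue(4), operator.mul)      # 3 * 4
print(total, difference, product)
```

The results are only meaningful while the true answer stays within the range 0 to 14.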
2.6.6 Other Datapath Operators
Full subtracter
DIFF = A ⊕ NOT(B) ⊕ NOT(BIN) = SUM(A, NOT(B), NOT(BIN))
NOT(BOUT) = A · NOT(B) + A · NOT(BIN) + NOT(B) · NOT(BIN) = MAJ(A, NOT(B), NOT(BIN))
Keywords: adder/subtracter • barrel shifter • normalizer • denormalizer • leading-one detector • priority encoder • exponent correcter • accumulator • multiplier–accumulator (MAC) • incrementer • decrementer • incrementer/decrementer • all-zeros detector • all-ones detector • register file • first-in first-out register (FIFO) • last-in first-out register (LIFO)
Symbols for datapath elements
[Figure, panels (a)–(h): datapath symbols with bus-wide ports, labeled D[MSB:0], Q[MSB:0], CLK, and PRE; A[MSB:0], B[MSB:0], S, and Z[MSB:0]; Σ; +/–; +/–1; =1; and =0.]
2.7 I/O Cells
Keywords: Tri-State® (a registered trademark of National Semiconductor) • drivers • contention • bus keeper or bus-hold cell (TI calls this Bus-Friendly logic) • slew rate • power-supply bounce • simultaneously switching outputs (SSOs) • quiet-I/O • bidirectional I/O • open-drain • level shifter • electrostatic discharge, or ESD • electrical overstress (EOS) • ESD implant • human-body model (HBM) • machine model (MM) • charge-device model (CDM, also called device charge–discharge) • latch-up • undershoot • overshoot • guard rings
A three-state bidirectional output buffer
[Figure: schematic showing the I/O pad, VDD, the output enable (OE), DATAout from the core logic, DATAin to the core logic, the output transistors M1 and M2, and the cells ND1, NR1, I1, and I2.]

2.8 Cell Compilers
Keywords: silicon compilers • RAM compiler • multiplier compiler • single-port RAM • dual-port RAMs • multiport RAMs • asynchronous • synchronous • model compiler • netlist compiler • correct by construction

2.9 Summary
• The use of transistors as switches
• The difference between a flip-flop and a latch
• The meaning of setup time and hold time
• Pipelines and latency
• The difference between datapath, standard-cell, and gate-array logic cells
• Strong and weak logic levels
• Pushing bubbles
• Ratio of logic
• Resistance per square of layers and their relative values in CMOS
• Design rules and λ

2.10 Problems
Suggested homework: 2.1, 2.2, 2.38, 2.39 (from ASICs... the book)
ASICs... THE COURSE (1 WEEK)
ASIC LIBRARY DESIGN
ASIC design uses predefined and precharacterized cells from a library, so we need to design or buy a cell library. A knowledge of ASIC library design is not necessary but makes it easier to use library cells effectively.
Key concepts: Tau, logical effort, and the prediction of delay • Sizes of cells, and their drive strengths • Cell importance • The difference between gate-array macros, standard cells, and datapath cells

3.1 Transistors as Resistors
$0.35\,V_{DD} = V_{DD} \exp\left(\frac{-t_{PDf}}{R_{pd}(C_{out} + C_p)}\right)$
An output trip point of 0.35 is convenient because ln(1/0.35) = 1.05 ≈ 1 and thus t_PDf = R_pd (C_out + C_p) ln(1/0.35) ≈ R_pd (C_out + C_p)
For output trip points of 0.1/0.9 we multiply by –ln(0.1) = 2.3, because exp(–2.3) = 0.100
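The multiplying factor for any trip point follows from solving the exponential for t_PDf. The Python sketch below illustrates this; the resistance and capacitance values in the usage line are made-up numbers for illustration, not values from the course.

```python
import math


def delay_factor(trip_point):
    """Factor ln(1/trip_point) that multiplies the RC time constant."""
    return math.log(1.0 / trip_point)


def t_pdf(r_pd, c_out, c_p, trip_point=0.35):
    """Falling propagation delay R_pd * (C_out + C_p) * ln(1/trip_point)."""
    return r_pd * (c_out + c_p) * delay_factor(trip_point)


print(round(delay_factor(0.35), 3))  # close to 1, so t_PDf is about R_pd (C_out + C_p)
print(round(delay_factor(0.1), 3))   # about 2.3 for a 0.1 trip point

# Hypothetical example: R_pd = 1 kOhm, C_out = 0.1 pF, C_p = 0.02 pF.
example_delay = t_pdf(1e3, 0.1e-12, 0.02e-12)
```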
A linear model for CMOS logic delay
• Ideal switches = no delay
• Resistance and capacitance cause delay
• Load capacitance, C_out • parasitic output capacitance, C_p • input capacitance, C
• Linearize the switch resistance • Pull-up resistance, R_pu • pull-down resistance, R_pd
• Measure and compare the input, v(in1), and output, v(out1)
• Input trip point of 0.5 • output trip points are 0.35 (falling) and 0.65 (rising)
• The linear prop–ramp model: falling propagation delay, t_PDf ≈ R_pd (C_p + C_out)
[Figure, panels (a)–(c): (a) a CMOS inverter (m1, m2) with input in1 driving C_out at out1; (b) the waveforms v(in1) and v(out1), with t_PDf measured between the 0.5 V_DD input trip point and the 0.35 V_DD output trip point, and the output modeled from t' = 0 as V_DD exp[–t'/(R_pd (C_p + C_out))]; (c) the linear model with R_pu, R_pd, C_p, C, and C_out.]
CMOS inverter characteristics
• Equilibrium switching
• Non-equilibrium switching
• Nonlinear switching resistance
• Switching current
[Figure, panels (a)–(c): v(out1)/V against v(in1)/V, and max(I_DSn, –I_DSp)/mA against v(in1)/V, showing the equilibrium path (I_DSn = –I_DSp) and the nonequilibrium path.]
NAME     m1          m2
MODEL    CMOSN       CMOSP
ID       7.49E-11    -7.49E-11
VGS      0.00E+00    -3.00E+00
VDS      3.00E+00    -4.40E-08
VBS      0.00E+00    0.00E+00
VTH      4.14E-01    -8.96E-01
VDSAT    3.51E-02    -1.78E+00
GM       1.75E-09    2.52E-11
GDS      1.24E-10    1.72E-03
GMB      6.02E-10    7.02E-12
CBD      2.06E-15    1.71E-14
CBS      4.45E-15    1.71E-14
CGSOV    1.80E-15    2.88E-15
CGDOV    1.80E-15    2.88E-15
CGBOV    2.00E-16    2.01E-16
CGS      0.00E+00    1.10E-14
CGD      0.00E+00    1.10E-14
CGB      3.88E-15    0.00E+00

• ID (I_DS), VGS, VDS, VBS, VTH (V_t), and VDSAT (V_DS(sat)) are DC parameters
• GM, GDS, and GMB are small-signal conductances (corresponding to ∂I_DS/∂V_GS, ∂I_DS/∂V_DS, and ∂I_DS/∂V_BS, respectively)
Calculations of parasitic capacitances for an n-channel MOS transistor.
Each entry gives the PSpice parameter, its equation, and its value (see note 1) for V_GS = 0 V, V_DS = 3 V, V_SB = 0 V.

CBD
  CBD = CBDJ + CBDSW
    CBD = 1.855 × 10^–15 + 2.04 × 10^–16 = 2.06 × 10^–15 F
  CBDJ = A_D CJ (1 + V_DB/φ_B)^–mJ (φ_B = PB)
    CBDJ = (4.032 × 10^–15)(1 + (3/1))^–0.56 = 1.86 × 10^–15 F
  CBDSW = P_D CJSW (1 + V_DB/φ_B)^–mJSW (P_D may or may not include channel edge)
    CBDSW = (4.2 × 10^–16)(1 + (3/1))^–0.52 = 2.04 × 10^–16 F
CBS
  CBS = CBSJ + CBSSW
    CBS = 4.032 × 10^–15 + 4.2 × 10^–16 = 4.45 × 10^–15 F
  CBSJ = A_S CJ (1 + V_SB/φ_B)^–mJ
    A_S CJ = (7.2 × 10^–12)(5.6 × 10^–4) = 4.03 × 10^–15 F
  CBSSW = P_S CJSW (1 + V_SB/φ_B)^–mJSW
    P_S CJSW = (8.4 × 10^–6)(5 × 10^–11) = 4.2 × 10^–16 F
CGSOV
  CGSOV = W_EFF CGSO; W_EFF = W – 2W_D
    CGSOV = (6 × 10^–6)(3 × 10^–10) = 1.8 × 10^–15 F
CGDOV
  CGDOV = W_EFF CGDO
    CGDOV = (6 × 10^–6)(3 × 10^–10) = 1.8 × 10^–15 F
CGBOV
  CGBOV = L_EFF CGBO; L_EFF = L – 2L_D
    CGBOV = (0.5 × 10^–6)(4 × 10^–10) = 2 × 10^–16 F
CGS
  CGS/CO = 0 (off), 0.5 (lin.), 0.66 (sat.)
  CO (oxide capacitance) = W_EFF L_EFF ε_ox/T_ox
    CO = (6 × 10^–6)(0.5 × 10^–6)(0.00345) = 1.03 × 10^–14 F
    CGS = 0.0 F
CGD
  CGD/CO = 0 (off), 0.5 (lin.), 0 (sat.)
    CGD = 0.0 F
CGB
  CGB = 0 (on), = CO in series with C_S (off)
    CGB = 3.88 × 10^–15 F, C_S = depletion capacitance

Note 1. Input:
.MODEL CMOSN NMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=1 VTO=0.65 DELTA=0.7
+ LD=5E-08 KP=2E-04 UO=550 THETA=0.27 RSH=2 GAMMA=0.6 NSUB=1.4E+17 NFS=6E+11
+ VMAX=2E+05 ETA=3.7E-02 KAPPA=2.9E-02 CGDO=3.0E-10 CGSO=3.0E-10 CGBO=4.0E-10
+ CJ=5.6E-04 MJ=0.56 CJSW=5E-11 MJSW=0.52 PB=1
m1 out1 in1 0 0 cmosn W=6U L=0.6U AS=7.2P AD=7.2P PS=8.4U PD=8.4U
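The hand calculations can be repeated from the model parameters in the note. The following is a minimal Python sketch with hypothetical variable names; the oxide capacitance per unit area (0.00345 F/m²) is the value used in the table, and W is used unreduced for the overlap terms, as the table does.

```python
def junction_cap(size, c_per_unit, v_reverse, pb, grading):
    """Area or sidewall junction capacitance:
    size * C * (1 + V/PB) ** (-grading)."""
    return size * c_per_unit * (1.0 + v_reverse / pb) ** (-grading)


# Parameters from the .MODEL statement and the m1 instance line (SI units).
AD = AS = 7.2e-12           # drain and source areas, m^2
PD = PS = 8.4e-6            # drain and source perimeters, m
CJ, MJ = 5.6e-4, 0.56       # area junction capacitance and grading
CJSW, MJSW = 5e-11, 0.52    # sidewall junction capacitance and grading
PB = 1.0                    # junction potential, V
W, L, LD = 6e-6, 0.6e-6, 5e-8
CGSO, CGBO = 3.0e-10, 4.0e-10
COX_PER_AREA = 0.00345      # eps_ox / T_ox as used in the table, F/m^2

V_DB, V_SB = 3.0, 0.0       # bias point: V_DS = 3 V, V_SB = 0 V

cbd = junction_cap(AD, CJ, V_DB, PB, MJ) + junction_cap(PD, CJSW, V_DB, PB, MJSW)
cbs = junction_cap(AS, CJ, V_SB, PB, MJ) + junction_cap(PS, CJSW, V_SB, PB, MJSW)
cgsov = W * CGSO                   # overlap capacitance, width not reduced
l_eff = L - 2 * LD                 # 0.5 um
cgbov = l_eff * CGBO
co = W * l_eff * COX_PER_AREA      # gate-oxide capacitance

for name, value in [("CBD", cbd), ("CBS", cbs), ("CGSOV", cgsov), ("CGBOV", cgbov), ("CO", co)]:
    print(f"{name} = {value:.3e} F")
```

The results agree with the CBD, CBS, CGSOV, and CGBOV values that PSpice reports for m1.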
3.2.1 Junction Capacitance
• Junction capacitances, C_BD and C_BS, consist of two parts: junction area and sidewall
• Both C_BD and C_BS have different physical characteristics with parameters: CJ and MJ for the junction, CJSW and MJSW for the sidewall, and PB is common
• C_BD and C_BS depend on the voltage across the junction (V_DB and V_SB)
• The sidewalls facing the channel (C_BSJ,GATE and C_BDJ,GATE) are different from the sidewalls that face the field
• It is a mistake to exclude the gate edge assuming it is in the rest of the model: it is not
• In HSPICE there is a separate mechanism to account for the channel edge capacitance (using parameters ACM and CJGATE)

3.2.2 Overlap Capacitance
• The overlap capacitance calculations for C_GSOV and C_GDOV account for lateral diffusion
• SPICE parameter LD=5E-08 or L_D = 0.05 µm
• Not all SPICE versions use the equivalent parameter for width reduction, WD, in calculating C_GDOV
• Not all SPICE versions subtract W_D to form W_EFF

3.2.3 Gate Capacitance
• The gate capacitance depends on the operating region
• The gate–source capacitance C_GS varies from zero (off) to 0.5 C_O in the linear region to (2/3) C_O in the saturation region
• The gate–drain capacitance C_GD varies from zero (off) to 0.5 C_O (linear region) and back to zero (saturation region)
• The gate–bulk capacitance C_GB is two capacitors in series: the fixed gate-oxide capacitance, C_O, and the variable depletion capacitance, C_S
• As the transistor turns on, the channel shields the bulk from the gate, and C_GB falls to zero
• Even with V_GS = 0 V, the depletion width under the gate is finite and thus C_GB is less than C_O
The variation of n-channel transistor parasitic capacitance
• PSpice v5.4 (LEVEL=3)
• Created by varying the input voltage, v(in1), of an inverter
• Data points are joined by straight lines
• Note that CGSOV = CGDOV
[Figure: capacitance/fF (0 to 6) against inverter input voltage, v(in1)/V (0 to 3), for CBD, CBS, CGSOV, CGDOV, CGBOV, CGS, CGD, and CGB, with the off, saturation, and linear regions marked.]
3.2.4 Input Slew Rate
Measuring the input capacitance of an inverter
(a) Input capacitance is measured by monitoring the input current to the inverter, i(Vin)
(b) Very fast (non-equilibrium) switching: input current of 40fA = input capacitance of 40fF
(c) Very slow (equilibrium) switching: input capacitance is now equal for both transitions
Parasitic capacitance measurement
(a) All devices in this circuit include parasitic capacitance
(b) This circuit uses linear capacitors to model the parasitic capacitance of m9/10.
• The load formed by the inverter (m5 and m6) is modeled by a 0.0335 pF capacitor (c2)
• The parasitic capacitance due to the overlap of the gates of m3 and m4 with their source, drain, and bulk terminals is modeled by a 0.01 pF capacitor (c3)
• The effect of the parasitic capacitance at the drain terminals of m3 and m4 is modeled by a 0.025 pF capacitor (c4)
(c) Comparison of (a) and (b). The delay (1.22 – 1.135 = 0.085 ns) is equal to t_PDf for the inverter m3/4
(d) An exact match would have both waveforms equal at the 0.35 trip point (1.05 V).
3.3 Logical Effort
We extend the prop–ramp model with a “catch all” term, t_q, that includes:
• delay due to internal parasitic capacitance
• the time for the input to reach the switching threshold of the cell
• the dependence of the delay on the slew rate of the input waveform

$t_{PD} = R(C_{out} + C_p) + t_q$
We can scale any logic cell by a scaling factor s: $t_{PD} = (R/s)(C_{out} + sC_p) + st_q$

• R and C will change as we scale a logic cell, but the RC product stays the same
• Logical effort is independent of the size of a logic cell
• We can find logical effort by scaling a logic cell to have the same drive as a 1X minimum-size inverter
• Then the logical effort, g, is the ratio of the input capacitance, C_in, of the 1X logic cell to C_inv

$t_{PD} = RC\,\frac{C_{out}}{C_{in}} + RC_p + st_q$
Normalizing the delay: $d = \frac{(RC)(C_{out}/C_{in}) + RC_p + st_q}{\tau} = f + p + q$
The time constant tau, τ = R_inv C_inv, is a basic property of any CMOS technology
The delay equation is the sum of three terms, d = f + p + q or
delay = effort delay + parasitic delay + nonideal delay
The effort delay f is the product of logical effort, g, and electrical effort, h: f = gh
Thus, delay = logical effort × electrical effort + parasitic delay + nonideal delay
The h depends only on the load capacitance C_out connected to the output of the logic cell and the input capacitance of the logic cell, C_in; thus
electrical effort h = C_out/C_in
parasitic delay p = RC_p/τ (the parasitic delay of a minimum-size inverter is: p_inv = C_p/C_inv)
nonideal delay q = st_q/τ

Cell effort, parasitic delay, and nonideal delay (in units of τ) for single-stage CMOS cells
Cell            Cell effort (logic ratio = 2)   Cell effort (logic ratio = r)   Parasitic delay/τ        Nonideal delay/τ
inverter        1 (by definition)               1 (by definition)               p_inv (by definition)    q_inv (by definition)
n-input NAND    (n + 2)/3                       (n + r)/(r + 1)                 n p_inv                  n q_inv
n-input NOR     (2n + 1)/3                      (nr + 1)/(r + 1)                n p_inv                  n q_inv

Logical effort • For a two-input NAND cell, the logical effort, g = 4/3
(a) Find the input capacitance, C_inv, looking into the input of a minimum-size inverter in terms of the gate capacitance of a minimum-size device
(b) Size a logic cell to have the same drive strength as a minimum-size inverter (assuming a logic ratio of 2). The input capacitance looking into one of the logic-cell terminals is then C_in
(c) The logical effort of a cell is C_in/C_inv
[Figure, panels (a)–(c): (a) a 1X inverter with a 2/1 and a 1/1 transistor, so that C_inv = 2 + 1 = 3 units of gate capacitance; (b) a two-input NAND cell with 2/1 transistors sized for the same drive strength, so that C_in = 2 + 2 = 4; (c) g = C_in/C_inv = 4/3.]
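The cell-effort formulas in the table are easy to evaluate exactly. The sketch below uses Python fractions; the function names are illustrative and the default logic ratio of 2 matches the first column of the table.

```python
from fractions import Fraction


def nand_effort(n_inputs, logic_ratio=2):
    """Logical effort of an n-input NAND: (n + r)/(r + 1)."""
    return Fraction(n_inputs + logic_ratio, logic_ratio + 1)


def nor_effort(n_inputs, logic_ratio=2):
    """Logical effort of an n-input NOR: (n*r + 1)/(r + 1)."""
    return Fraction(n_inputs * logic_ratio + 1, logic_ratio + 1)


print(nand_effort(2))  # two-input NAND, logic ratio 2
print(nor_effort(3))   # three-input NOR, logic ratio 2
print(nand_effort(1), nor_effort(1))  # a one-input cell reduces to the inverter
```

With a logic ratio of 2 the general forms reduce to (n + 2)/3 and (2n + 1)/3, which is a quick consistency check on the two columns.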
3.3.1 Predicting Delay
• Example: predict the delay of a three-input NOR logic cell
• 2X drive
• driving a net with a fanout of four
• 0.3 pF total load capacitance (input capacitance of cells we are driving plus the interconnect)
• p = 3p_inv and q = 3q_inv for this cell
• the input gate capacitance of a 1X drive, three-input NOR logic cell is equal to gC_inv
• for a 2X logic cell, C_in = 2gC_inv

The delay of the NOR logic cell, in units of τ, is thus
$gh = g\,\frac{C_{out}}{C_{in}} = \frac{g \cdot (0.3\ \mathrm{pF})}{2 g C_{inv}} = \frac{0.3\ \mathrm{pF}}{(2)(0.036\ \mathrm{pF})}$ (notice g cancels out in this equation)
$d = gh + p + q = \frac{0.3 \times 10^{-12}}{(2)(0.036 \times 10^{-12})} + (3)(1) + (3)(1.7) = 4.1666667 + 3 + 5.1 = 12.266667\,\tau$
equivalent to an absolute delay, t_PD ≈ 12.3 × 0.06 ns = 0.74 ns
The delay for a 2X drive, three-input NOR logic cell is t_PD = (0.03 + 0.72 C_out + 0.60) ns
With C_out = 0.3 pF, t_PD = 0.03 + (0.72)·(0.3) + 0.60 = 0.846 ns compared to our prediction of 0.74 ns
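The prediction can be reproduced numerically. This Python sketch uses the values that appear in the worked example (C_inv = 0.036 pF, p_inv = 1, q_inv = 1.7, τ = 0.06 ns); the function and constant names are illustrative.

```python
TAU_NS = 0.06      # tau in ns, as used in the example
C_INV_PF = 0.036   # input capacitance of a minimum-size inverter, pF
P_INV = 1.0        # parasitic delay of the inverter, in units of tau
Q_INV = 1.7        # nonideal delay of the inverter, in units of tau


def predicted_delay_tau(c_out_pf, drive, n_inputs):
    """d = gh + p + q in units of tau for an n-input cell.

    g cancels: gh = g * C_out / (drive * g * C_inv) = C_out / (drive * C_inv).
    """
    effort_delay = c_out_pf / (drive * C_INV_PF)
    return effort_delay + n_inputs * P_INV + n_inputs * Q_INV


d = predicted_delay_tau(c_out_pf=0.3, drive=2, n_inputs=3)
t_pd_ns = d * TAU_NS
quoted_ns = 0.03 + 0.72 * 0.3 + 0.60   # the linear delay formula quoted in the text

print(round(d, 3), round(t_pd_ns, 2), round(quoted_ns, 3))
```

The prediction comes out about 13 percent below the 0.846 ns figure, which gives a feel for the accuracy of the method in this example.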
3.3.2 Logical Area and Logical Efficiency
An OAI221 logic cell
• Logical-effort vector g = (7/3, 7/3, 5/3)
• The logical area is 33 logical squares
An AOI221 logic cell
• g = (8/3, 8/3, 7/3)
• Logical area is 39 logical squares
• Less logically efficient than OAI221
[Figure: transistor-level schematics of the two cells with inputs A to E and output Z. The OAI221 uses four 4/1 transistors, one 2/1 transistor, and five 3/1 transistors; the AOI221 uses five 6/1 transistors, four 2/1 transistors, and one 1/1 transistor.]
3.3.3 Logical Paths
path delay $D = \sum_{i \in \mathrm{path}} g_i h_i + \sum_{i \in \mathrm{path}} (p_i + q_i)$

3.3.4 Multistage Cells
Logical paths • Comparison of multistage and single-stage implementations
(a) An AOI221 logic cell constructed as a multistage cell, d_1 = 20 + C_L
d_1 = (g_0 h_0 + p_0 + q_0) + (g_2 h_2 + p_2 + q_2) + (g_3 h_3 + p_3 + q_3) + (g_4 h_4 + p_4 + q_4)
= (1 × 1.4 + 1 + 1.7) + (1.4 × 1 + 2 + 3.4) + (1.4 × 0.7 + 2 + 3.4) + (1 × C_L + 1 + 1.7) = 20 + C_L
(b) A single-stage AOI221 logic cell, d_1 = 18.8 + C_L
d_1 = (1 × 2.6 + 1 + 1.7) + (1 × C_L + 5 + 8.5) = 18.8 + C_L
(b) is slightly faster than (a)
[Figure, panels (a) and (b): (a) the multistage cell with stage parameters g_0 = 1, p_0 = 1, q_0 = 1.7; g_1 = (2, 1.6), p_1 = 3, q_1 = 5.4; g_2 = 1.4, p_2 = 2, q_2 = 3.4; g_3 = 1.4, p_3 = 2, q_3 = 3.4; g_4 = 1, p_4 = 1, q_4 = 1.7; and h_0 = 1.4, h_2 = 1.0, h_3 = 0.7, h_4 = C_L; (b) the single-stage AOI221 with g_0 = 1, p_0 = 1, q_0 = 1.7 and g_1 = (2.6, 2.6, 2.2), p_1 = 5, q_1 = 8.5, driving C_L.]
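The two path delays are sums of per-stage terms gh + p + q, so they can be tabulated for any load. The Python sketch below copies the stage values from the two d_1 expressions; the function names are illustrative, and the last stage of each path uses the effort term 1 × C_L exactly as written in the text.

```python
def stage_delay(g, h, p, q):
    """Delay of one stage in units of tau: gh + p + q."""
    return g * h + p + q


def multistage_delay(c_load):
    """Path (a): four stages, the last one driving the load."""
    return (stage_delay(1.0, 1.4, 1, 1.7)
            + stage_delay(1.4, 1.0, 2, 3.4)
            + stage_delay(1.4, 0.7, 2, 3.4)
            + stage_delay(1.0, c_load, 1, 1.7))


def single_stage_delay(c_load):
    """Path (b): an inverter followed by the single-stage AOI221."""
    return stage_delay(1.0, 2.6, 1, 1.7) + stage_delay(1.0, c_load, 5, 8.5)


for c_load in (0, 1, 4):
    print(c_load, round(multistage_delay(c_load), 2), round(single_stage_delay(c_load), 2))
```

Because both expressions have the same coefficient of C_L, the difference of about 1.2 τ is the same for every load.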
3.3.5 Optimum Delay
path logical effort $G = \prod_{i \in \mathrm{path}} g_i$
path electrical effort $H = \prod_{i \in \mathrm{path}} h_i = \frac{C_{out}}{C_{in}}$
C_out is the load and C_in is the first input capacitance on the path
path effort F = GH
optimum effort delay $\hat{f}_i = g_i h_i = F^{1/N}$
optimum path delay $\hat{D} = N F^{1/N} + P + Q = N (GH)^{1/N} + P + Q$
$P + Q = \sum_{i \in \mathrm{path}} (p_i + q_i)$

3.3.6 Optimum Number of Stages
• Chain of N inverters each with equal stage effort, f = gh
• Total path delay is Nf = Ngh = Nh, since g = 1 for an inverter

Stage effort
h      h/(ln h)
1.5    3.7
2      2.9
2.7    2.7
3      2.7
4      2.9
5      3.1
10     4.3
[Figure: Delay of N inverter stages driving a path effort of H = C_out/C_in. The plot shows delay/(ln H) = h/(ln h) against the stage electrical effort, h = H^(1/N), for h from 1 to 10.]
• To drive a path electrical effort H, h^N = H, or N ln h = ln H
• Delay, Nh = h ln H/ln h
• Since ln H is fixed, we can only vary h/ln(h)
• h/ln(h) is a shallow function with a minimum at h = e ≈ 2.718
• Total delay is Ne = e ln H
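The stage-effort table and the optimum at h = e can be checked with a few lines of Python. This is a sketch with hypothetical function names; it ignores the parasitic and nonideal delays, as the bullets above do, and rounds the number of stages to the nearest integer.

```python
import math


def normalized_delay(h):
    """Delay divided by ln H for stage electrical effort h: h / ln(h)."""
    return h / math.log(h)


def inverter_chain(path_effort):
    """Pick N close to ln(H) and return (N, stage effort h, total delay N*h)."""
    n_stages = max(1, round(math.log(path_effort)))
    h = path_effort ** (1.0 / n_stages)
    return n_stages, h, n_stages * h


# Reproduce the stage-effort table.
table = {h: round(normalized_delay(h), 1) for h in (1.5, 2, 2.7, 3, 4, 5, 10)}
print(table)

# Example: drive a load 1000 times the input capacitance (an assumed value).
n_stages, h, delay = inverter_chain(1000)
print(n_stages, round(h, 2), round(delay, 2), round(math.e * math.log(1000), 2))
```

The shallowness of h/ln(h) is visible in the table: stage efforts anywhere between about 2 and 4 cost less than 10 percent extra delay.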
3.4 Library-Cell Design
• A big problem in library design is dealing with design rules
• Sometimes we can waive design rules
• Symbolic layout, sticks or logs can decrease the library design time (9 months for Virtual Silicon–currently the most sophisticated standard-cell library)
• Mapping symbolic layout uses 10–20 percent more area (5–10 percent with compaction)
• Allowing 45° layout decreases silicon area (some companies do not allow 45° layout)
3.5 Library Architecture
Cell library statistics
• 80 percent of an ASIC uses less than 20 percent of the cell library
• Cell importance
• A D flip-flop (with a cell importance of 3.5) contributes 3.5 times as much area on a typical ASIC as does an inverter (with a cell importance of 1)
[Figure, panels (a)–(e): normalized cell use, normalized cell area, cell area × cell use, and normalized cell importance (cell importance = cell area × cell use), each plotted against cell number ordered by cell use or by cell importance.]
3.6 Gate-Array Design
Key words: gate-array base cell (or base cell) • gate-array base (or base) • horizontal tracks • vertical track • gate isolation • isolator transistor • oxide isolation • oxide-isolated gate array
The construction of a gate-isolated gate array
(a) The one-track-wide base cell containing one p-channel and one n-channel transistor
(b) The center base cell is isolating the base cells on either side from each other
(c) The base cell is 21 tracks high (high for a modern cell library)
[Figure, panels (a)–(c): layout showing the n-well, p-well, n-diff, p-diff, poly, m1, m2, and contact layers; continuous p-diff and n-diff strips; VDD and VSS; n-well and p-well contacts; a bent gate; the contact for the isolator; horizontal tracks numbered 1–21.]
An oxide-isolated gate-array base cell
• Two base cells, each contains eight transistors and two well contacts
• The p-channel and n-channel transistors are each 4 tracks high
• The cell is 12 tracks high (8–12 is typical for a modern library)
• The base cell is 7 tracks wide
[Figure: layout of the base cell showing the n-well, p-well, n-diff, p-diff, poly, m1, m2, and contact layers; n-well and p-well contacts; VDD and GND; the break in diffusion; horizontal tracks 1–12 (with tracks 3 and 10 in parentheses) and vertical tracks 1–7.]
An oxide-isolated gate-array base cell
• 14 tracks high and 4 tracks wide
• VDD (tracks 3 and 4) and GND (tracks 11 and 12) are each 2 tracks wide
• 10 horizontal routing tracks (tracks 1, 2, 5–10, 13, 14), an unusually large number for modern cells
• p-channel and n-channel polysilicon bent gates are tied together in the center of the cell
• The well contacts leave room for a poly cross-under in each base cell.
[Figure: layout of the base cell showing the n-well, p-well, n-diff, p-diff, poly, m1, m2, and contact layers; the poly cross-under; the n-well contact; VDD and VSS; horizontal tracks 1–14 and vertical tracks 1–4.]
Flip-flop macro in a gate-isolated gate-array library
• Only the first-level metallization and contact pattern, the personalization, is shown, but this is enough information to derive the schematic
• This is an older topology for 2LM (cells for 3LM are shorter in height)
[Figure: macro layout with connectors D, Q, QN, CLK, and CLR, the VDD and VSS rails, and the contacts for the isolators.]
The SiARC/Synopsys cell-based array (CBA) basic cell
• This is CBA I for 2LM (CBA II is intended for 3LM and salicide processes)
[Figure: layout of the basic cell showing the n-well, p-well, n-diff, p-diff, poly, m1, and m2 layers, with tracks numbered 1–11.]
A simple gate-array base cell
[Figure: two abutting base cells (base cell 1 and base cell 2) showing the n-well, p-well, ndiff, pdiff, poly, and contact layers; BB = cell bounding box. Dimensions labeled a to q and aa to yy are derived from design rules, for example 2.75 = (0.5 × P.4 + C.3), 1.25 = 0.5 × P.4, and 1.5 = C.3.]
3.7 Standard-Cell Design
A D flip-flop standard cell
• Performance-optimized library • Area-optimized library
• Wide power buses and transistors for a performance-optimized cell
• Double-entry cell intended for a 2LM process and channel routing
• Five connectors run vertically through the cell on m2
• The extra short vertical metal line is an internal crossover
• bounding box (BB) • abutment box (AB) • physical connector • abut
A D flip-flop from a 1.0 µm standard-cell library
[Figure: cell layout between the VDD and GND rails.]
D flip-flop
(Top) n-diffusion, p-diffusion, poly, contact (n-well and p-well are not shown)
(Bottom) m1, contact, m2, and via layers
[Figure: two views of the cell layout. The top view shows the poly, ndiff, pdiff, and contact layers with transistors labeled t1 to t34; the bottom view shows the via, m1, m2, and contact layers with VDD and VSS. Nodes are labeled 0–10 and a–e in both views.]
3.8 Datapath-Cell Design
A datapath D flip-flop cell
[Figure: cell layout with two VDD and two VSS rails.]
3.9 Summary
Key concepts:
• Tau, logical effort, and the prediction of delay
• Sizes of cells, and their drive strengths
• Cell importance
• The difference between gate-array macros, standard cells, and datapath cells
ASICs... THE COURSE (1 WEEK)
PROGRAMMABLE ASICs
Key concepts: programmable logic devices (PLDs) • field-programmable gate arrays (FPGAs) • programming technology • basic logic cells • I/O logic cells • programmable interconnect • software to design and program the FPGA

4.1 The Antifuse
Actel antifuse
antifuse • programming current (about 5 mA) • PLICE™ • oxide–nitride–oxide (ONO) dielectric • Activator • in-system programming (ISP) • gang programmers • one-time programmable (OTP) FPGAs
[Figure: antifuse layout showing the n+ antifuse diffusion and the antifuse polysilicon, with a 2λ dimension.]
4.1.1 Metal–Metal Antifuse
Metal–metal antifuse
QuickLogic metal–metal antifuse (ViaLink™) • alloy of tungsten, titanium, and silicon • bulk resistance of about 500 mΩ cm
[Figure, panels (a) and (b): cross sections of the antifuse showing m1, m2, and m3, SiO2, a via, a tungsten plug, amorphous Si, and the link, with 2λ and 4λ dimensions.]
Resistance values for the QuickLogic metal–metal antifuse
[Figure: percentage (0 to 100) against antifuse resistance/Ω.]
4.2 Static RAM
Xilinx SRAM (static RAM) configuration cell
• use in reconfigurable hardware
• use of programmable read-only memory or PROM to hold configuration
[Figure: SRAM configuration cell. Labels: DATA, READ or WRITE, Q, Q', configuration control.]
4.3 EPROM and EEPROM Technology
An EPROM transistor
(a) With a high (>12 V) programming voltage, VPP, applied to the drain, electrons gain enough energy to “jump” onto the floating gate (gate1)
(b) Electrons stuck on gate1 raise the threshold voltage so that the transistor is always off for normal operating voltages
(c) UV light provides enough energy for the electrons stuck on gate1 to “jump” back to the bulk, allowing the transistor to operate normally
Facts and keywords: Altera MAX 5000 EPLDs and Xilinx EPLDs both use UV-erasable electrically programmable read-only memory (EPROM) • hot-electron injection or avalanche injection • floating-gate avalanche MOS (FAMOS)
[Figure: EPROM transistor in states (a), (b), and (c). Labels: source, drain, gate1, gate2, bulk, +VPP, GND, electrons, +VGS > Vtn, +VDS, no channel, UV light (hν).]
4.4 Practical Issues
4.4.1 FPGAs in Use
• inventory
• risk inventory or safety supply
• just-in-time (JIT)
• printed-circuit boards (PCBs)
• pin locking or I/O locking
4.5 Specifications
• qualification kit
• down-binning
4.6 PREP Benchmarks
• Programmable Electronics Performance Company (PREP)
• http://www.prep.org
Hardware security key • computer-aided engineering (CAE) tools • PC vs. workstation • ease of use • cost of ownership
4.7 FPGA Economics
Xilinx part-naming convention
Example: XC4010-10 PG156C
• XC4010: device type
• -10: speed
• PG: package
• 156: number of pins
• C: temperature range
Not all parts are available in all packages
Some parts are packaged with fewer leads than I/Os
Programmable ASIC part codes
Item | Code | Description | Code | Description
Manufacturer’s code | A | Actel | ATT | AT&T (Lucent)
 | XC | Xilinx | isp | Lattice Logic
 | EPM | Altera MAX | M5 | AMD MACH 5 is on the device
 | EPF | Altera FLEX | QL | QuickLogic
 | CY7C | Cypress | |
Package type | PL or PC | plastic J-leaded chip carrier, PLCC | VQ | very thin quad flatpack, VQFP
 | PQ | plastic quad flatpack, PQFP | TQ | thin plastic flatpack, TQFP
 | CQ or CB | ceramic quad flatpack, CQFP | PP | plastic pin-grid array, PPGA
 | PG | ceramic pin-grid array, PGA | WB, PB | ball-grid array, BGA
Application | C | commercial | B | MIL-STD-883
 | I | industrial | E | extended
 | M | military | |
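As a rough illustration of how the part codes compose, the sketch below decodes the XC4010-10 PG156C example using the package and application codes from the table. The regular expression and the function name decode_xilinx_part are assumptions made for this example, not a vendor-defined format; real part numbers have more variants than this pattern accepts.

```python
import re

# Package and application codes copied from the part-code table.
PACKAGES = {
    "PL": "plastic J-leaded chip carrier, PLCC",
    "PC": "plastic J-leaded chip carrier, PLCC",
    "PQ": "plastic quad flatpack, PQFP",
    "CQ": "ceramic quad flatpack, CQFP",
    "CB": "ceramic quad flatpack, CQFP",
    "PG": "ceramic pin-grid array, PGA",
    "VQ": "very thin quad flatpack, VQFP",
    "TQ": "thin plastic flatpack, TQFP",
    "PP": "plastic pin-grid array, PPGA",
    "WB": "ball-grid array, BGA",
    "PB": "ball-grid array, BGA",
}
APPLICATIONS = {
    "C": "commercial",
    "I": "industrial",
    "M": "military",
    "B": "MIL-STD-883",
    "E": "extended",
}

# Hypothetical pattern: device type, "-", speed, optional space,
# two-letter package code, pin count, one-letter application code.
PART_RE = re.compile(r"^(XC\d+[A-Z]*)-(\d+)\s*([A-Z]{2})(\d+)([A-Z])$")


def decode_xilinx_part(code):
    """Split a part code such as 'XC4010-10 PG156C' into its fields."""
    match = PART_RE.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized part code: {code!r}")
    device, speed, package, pins, application = match.groups()
    return {
        "device": device,
        "speed": speed,
        "package": PACKAGES.get(package, "unknown"),
        "pins": int(pins),
        "temperature_range": APPLICATIONS.get(application, "unknown"),
    }


fields = decode_xilinx_part("XC4010-10 PG156C")
for name, value in fields.items():
    print(f"{name}: {value}")
```

Unknown package or application codes are reported as "unknown" rather than rejected, since the table notes that not all parts are available in all packages.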
4.7.2 Pricing Examples
base prices and adjustment factors • “sticker price”
• Marshall, at http://marshall.com, carries Xilinx
• Hamilton-Avnet, at http://www.hh.avnet.com, carries Xilinx
• Wyle, at http://www.wyle.com, carries Actel and Altera
Example Actel part-price calculation
Example: A1020A-2-PQ100 in (100–999) quantity, purchased 1H92.
Factor | Example | Value
Base price | A1020A | $43.30
Quantity | 100–999 | 84%
Time | 1H92 | 100%
Qualification type | Industrial (I) | 120%
Speed bin (1) | 2 | 140%
Package | PQ100 | 125%
Estimated price (1H92) | | $76.38
Actual Actel price (1H92) | | $75.60
(1) The speed bin is a manufacturer’s code (usually a number) that follows the family part number and indicates the maximum operating speed of the device
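The estimated price appears to be the base price multiplied by each adjustment factor in turn. A minimal sketch of that arithmetic, using only the values from the example (the names BASE_PRICE, FACTORS, and estimate_price are made up for illustration):

```python
# Base price and adjustment factors for A1020A-2-PQ100, 100-999 quantity, 1H92.
BASE_PRICE = 43.30
FACTORS = {
    "quantity 100-999": 0.84,
    "time 1H92": 1.00,
    "qualification Industrial (I)": 1.20,
    "speed bin 2": 1.40,
    "package PQ100": 1.25,
}


def estimate_price(base_price, factors):
    """Multiply the base price by every adjustment factor."""
    price = base_price
    for factor in factors.values():
        price *= factor
    return price


estimate = estimate_price(BASE_PRICE, FACTORS)
print(f"Estimated price (1H92): ${estimate:.2f}")  # prints $76.38
```

The result matches the estimated price of $76.38 in the example, slightly above the actual Actel price of $75.60.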
4.8 Summary
All FPGAs have the following key elements:
• The programming technology
• The basic logic cells
• The I/O logic cells
• Programmable interconnect
• Software to design and program the FPGA
Programmable ASIC technologies
 | Actel | Xilinx LCA (1) | Altera EPLD | Xilinx EPLD
Programming technology | Poly–diffusion antifuse, PLICE | Erasable SRAM. ISP. | UV-erasable EPROM (MAX 5k), EEPROM (MAX 7/9k) | UV-erasable EPROM
Size of programming element | Small but requires contacts to metal | Two inverters plus pass and switch devices. Largest. | One n-channel EPROM device. Medium. | One n-channel EPROM device. Medium.
Process | Special: CMOS plus three extra masks. | Standard CMOS | Standard EPROM and EEPROM | Standard EPROM
Programming method | Special hardware | PC card, PROM, or serial port | ISP (MAX 9k) or EPROM programmer | EPROM programmer

 | QuickLogic | Crosspoint | Atmel | Altera FLEX
Programming technology | Metal–metal antifuse, ViaLink | Metal–polysilicon antifuse | Erasable SRAM. ISP. | Erasable SRAM. ISP.
Size of programming element | Smallest | Small | Two inverters plus pass and switch devices. Largest. | Two inverters plus pass and switch devices. Largest.
Process | Special, CMOS plus ViaLink | Special, CMOS plus antifuse | Standard CMOS | Standard CMOS
Programming method | Special hardware | Special hardware | PC card, PROM, or serial port | PC card, PROM, or serial port
(1) Lucent (formerly AT&T) FPGAs have almost identical properties to the Xilinx LCA family
4.9 Problems
ASICs...THE COURSE (1 WEEK)
PROGRAMMABLE ASIC LOGIC CELLS
5.1 Actel ACT
5.1.1 ACT 1 Logic Module
Key concepts: basic logic cell • multiplexer-based cell • look-up table (LUT) • programmable array logic (PAL) • influence of programming technology • timing • worst-case design
The Actel ACT architecture
(a) Organization of the basic logic cells
(b) The ACT 1 Logic Module (LM, the Actel basic logic cell). The ACT 1 family uses just one type of LM. ACT 2 and ACT 3 FPGA families both use two different types of LM
(c) An example LM implementation using pass transistors (without any buffering)
(d) An example logic macro. Connect logic signals to some or all of the LM inputs, the remaining inputs to VDD or GND
[Figure: (a) Actel ACT array of Logic Modules. (b), (c) Logic Module with labels A0, A1, SA, B0, B1, SB, S0, S1, S3, M1, M2, M3, O1, F1, F2, and output F. (d) Example macro with inputs A, B, C, D, '1', and '0' implementing F = (A·B) + (B'·C) + D.]
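The Logic Module and the example macro can be checked with a small behavioral model. This is a sketch under stated assumptions: it reads the figure labels as two 2:1 multiplexers (M1 selected by SA, M2 selected by SB) producing F1 and F2, an OR gate O1 producing S3 from S0 and S1, and a third multiplexer M3 that selects F1 or F2. The pin assignment used for the macro is one wiring that realizes F = (A·B) + (B'·C) + D; it is not necessarily the wiring drawn in the figure.

```python
from itertools import product


def mux(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def act1_lm(a0, a1, sa, b0, b1, sb, s0, s1):
    """Assumed behavioral model of the ACT 1 Logic Module."""
    f1 = mux(sa, a0, a1)    # M1
    f2 = mux(sb, b0, b1)    # M2
    s3 = s0 | s1            # O1
    return mux(s3, f1, f2)  # M3


def macro(a, b, c, d):
    """One possible wiring (an assumption) for F = (A.B) + (B'.C) + D.

    With B on S0: B = 0 selects F1 = C + D, B = 1 selects F2 = A + D.
    Unused inputs are tied to '1' or '0'.
    """
    return act1_lm(a0=c, a1=1, sa=d, b0=a, b1=1, sb=d, s0=b, s1=0)


def target(a, b, c, d):
    """The macro function written directly as a Boolean expression."""
    return (a & b) | ((1 - b) & c) | d


def macro_matches_target():
    """Exhaustively compare the wired LM with the target function."""
    return all(macro(*v) == target(*v) for v in product((0, 1), repeat=4))


print(macro_matches_target())
```

The wiring uses each of D and '1' twice and A, B, C, and '0' once, which agrees with the set of input labels on panel (d).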
5.1.2 Shannon’s Expansion Theorem
• We can use the Shannon expansion theorem to expand F = A·F(A='1') + A'·F(A='0')
Example: F = A'·B + A·B·C' + A'·B'·C = A·(B·C') + A'·(B + B'·C)
• F(A='1') = B·C' is the cofactor of F with respect to (wrt) A or F_A
• If we expand F wrt B, F =A