sram implementation
8/3/2019 Sram Implementation
CHAPTER-1
INTRODUCTION TO VLSI
1.1 INTRODUCTION
Very-large-scale integration (VLSI) is the process of creating integrated circuits by
combining thousands of transistors into a single chip. VLSI began in the 1970s when
complex semiconductor and communication technologies were being developed.
The first semiconductor chips held two transistors. Subsequent advances added more
and more transistors, and, as a consequence, more individual functions or systems were
integrated over time. The first integrated circuits held only a few devices, perhaps as many as
ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or
more logic gates on a single device. This level is now known retrospectively as small-scale
integration (SSI). Improvements in technique led to devices with hundreds of logic gates,
known as medium-scale integration (MSI). Further improvements led to large-scale
integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has
moved far past this mark, and today's microprocessors have many millions of gates and
billions of individual transistors.
At one time, there was an effort to name and calibrate various levels of large-scale
integration above VLSI. Terms like ultra-large-scale integration (ULSI) were used. But the
huge number of gates and transistors available on common devices has rendered such fine
distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in
widespread use.
As of early 2008, billion-transistor processors are commercially available. This is
expected to become more commonplace as semiconductor fabrication moves from the current
generation of 65 nm processes to the next 45 nm generations. Current designs, as opposed to
the earliest devices, use extensive design automation and automated logic synthesis to lay
out the transistors, enabling higher levels of complexity in the resulting logic functionality.
Certain high-performance logic blocks like the SRAM (Static Random Access Memory) cell,
however, are still designed by hand to ensure the highest efficiency.
1.2 VLSI DESIGN FLOW
The system prototyping methodology is a natural outgrowth of recent developments in
software and hardware facilities intended to make it simple for designers with an idea for a
particular application to turn that idea into a working system based on very-large-scale
integrated chips. Today, VLSI CMOS technologies deliver individual integrated circuits
(ICs) containing millions of gates, sufficient to implement substantial systems-on-chip or
major subsystems on a chip. System-on-chip design may involve expertise from many
fields of electronics, such as signal processing, communication and device physics.
It is unreasonable to expect the architect of a speech recognition system, for example, to be
an expert in device physics as well as in signal processing. The Mead-Conway methodology
for integrated-circuit design makes VLSI technology available to such application
designers.
The structured design methodology of Mead and Conway is an approach to VLSI system
design that attacks the problems of complex chip designs. The structured design
methodology is similar in concept to structured programming: the design proceeds in a top-
down manner in which the problem is decomposed and refined. The structured design
methodology has two major parts: hierarchy and regularity. Hierarchical techniques have
long been used to design complex systems. Hierarchies are used to partition designs and
common parts of a design can be factored out and specified only once. By introducing
regularity into a system, the design problem is reduced in complexity as subunits are
replicated many times and connections between units are simplified.
Regularity means that the hierarchical decomposition of a large system should result in
not only simple, but also similar blocks, as much as possible. A good example of regularity
is the design of array structures consisting of identical cells such as a parallel multiplication
array. Regularity can exist at all levels of abstraction: If the designer has a small library of
well-defined and well-characterized basic building blocks, a number of different functions
can be constructed by using this principle. Regularity usually reduces the number of different
modules that need to be designed and verified, at all levels of abstraction.
1.3 DESIGN DESCRIPTION
VLSI design style mainly uses three domains of design description, viz. the
behavioral, the description of the function of the design; the structural, the description of the
form of the implementation; and the physical, the description of the physical implementation
of the design. There are many possible representations of a circuit in each description, and
judicious choice of representations is important in tool design.
A simplified view of the design flow is shown in Fig 1.1. Regardless of the actual size of
the project, the basic principles of structured design will improve the prospects of success.
Fig 1.1 VLSI design flow (representations, top to bottom: Behavioral; Logic (Gate Level); Circuit; Layout; then Fabrication and Testing)
At the beginning of a design it is important to specify the system requirements
without unduly restricting the design. The objective is to describe the purpose of the design,
including all aspects such as the functions to be realized, timing constraints, and power
dissipation requirements.
Fig 1.1 (continued) Design stages as labeled in the figure: System Specification, Functional (Architecture) Design, Functional, Logic Design, Logic Verification, Circuit Design, Circuit Verification, Physical Design, Layout Verification, Circuit Modeling
Functional design specifies the functional relationships among subunits or registers.
In general, a description of the IC in either the functional or the block diagram domain
consists both of the input-output description, and the way that this behavior is to be realized
in terms of subordinate modules. In turn each of these modules is described both in terms of
input-output behaviors and as an interconnection of other modules. These hierarchical ideas
apply to all the domains. The internal description of a module may be given in another
domain. If a module has no internal description then the design is incomplete. Ultimately
this hierarchy stops when the internal description is in terms of mask geometry, which is
primitive. Hierarchy and modularity are used in block diagrams or computer programs. In
these domains hierarchy suppresses unnecessary details, simplifies system design through a
divide-and-conquer strategy and leads to more easily understood designs that are more
readily debugged and documented.
In summary, when we want to design a digital system, we first need
to specify the system performance; this is called the system specification. The system
must then be broken down into subunits or registers, which gives a functional design that
specifies the functional relationships among those subunits or registers. Architecture usually
refers to the functional design and system specification, and often includes part of the subsequent
logic design.
The next step is the logic design of the networks that constitute the subunits or registers.
When a system architecture or logic network has been designed, its performance and errors are
checked by CAD programs; this is called logic simulation. The task of logic design is
to decide the overall structure of blocks and their interconnection pattern, to specify the structure of
the data path, and to control the sequences of the data path. The logic simulator performs logic verification,
taking into account the propagation delays of interconnection signals and the element delays.
The simulator also checks whether the network contains hazards. Logic design and
simulation are key issues in VLSI CAD. The flow of the logic design process is determined by
the level at which the design can begin: system level, behavioral level or functional level.
Logic design consists of a series of design steps leading from a higher level to a circuit
description at the logic level.
For electronic circuit design and simulation, CAD programs perform complex
numerical analysis of the nonlinear differential equations that characterize
electronic circuits. Since the calculation must finish within a reasonable time
while keeping the required accuracy, many advanced numerical analysis techniques are used. The
CAD programs usually yield the analysis of transient behavior, direct-current performance,
stationary alternating-current performance, temperature, signal distortion, noise interference,
sensitivity and parameter optimization of the electronic circuits.
The Layout system is used to convert block/cell placement data into actual locations,
and to construct a routing maze containing all spacing rules. The format used for relative cell
placement data is the same for automatic as for manual placements in order to simplify their
interchange. In fact, the output of the automatic placement program can be modified by hand
before input into the chip building step as manual placement data.
The layout of random-logic networks is the most time-consuming stage in
the entire sequence of LSI/VLSI chip design. After finishing the layout, designers
usually use CAD programs to check whether the layout conforms to the layout rules. As the
integration size of LSI/VLSI chips becomes larger, design verification and testing at each
design stage become vitally important, because errors that sneak in from earlier design
stages are harder to find and more expensive to fix: once one is found, the
earlier design stages must be redone. As the integration size increases, the test time increases very rapidly,
so it is crucial to find a way to test in as short a time as possible, although good solutions are
very difficult to find. Complete test and design verification with software or
hardware (i.e., computers specialized in testing) is usually done to find design mistakes.
The last domain in which the design of an IC can exist includes the mask set and, of
course, the final fabricated chip followed by prototype testing.
CHAPTER-2
STATIC RANDOM ACCESS MEMORY
2.1 INTRODUCTION
Memory refers to the physical devices used to store programs or data on a
temporary or permanent basis for use in a computer or other digital electronic device.
The term primary memory is used for fast physical storage systems
(e.g. RAM), as distinct from secondary memory, which refers to physical devices
for program and data storage which are slow to access but offer higher memory capacity.
Primary memory stored on secondary memory is called "virtual memory".
The term "storage" is often used for traditional secondary
memory such as tape, magnetic disks and optical discs (CD-ROM and DVD-ROM). The term
"memory" is often associated with addressable semiconductor memory, i.e. integrated
circuits consisting of silicon-based transistors, used for example as primary memory but also
for other purposes in computers and other digital electronic devices.
There are two main types of semiconductor memory: volatile and non-volatile.
Examples of non-volatile memory are flash memory and ROM, PROM, EPROM and
EEPROM memory. Examples of volatile memory are primary memory (dynamic RAM) and
fast CPU cache memory (static RAM), which is fast but energy-consuming and offers lower
memory capacity per unit area than DRAM.
Semiconductor memory is organized into memory cells or bistable flip-flops, each
storing one binary bit (0 or 1). The memory cells are grouped into words of fixed word length,
for example 1, 2, 4, 8, 16, 32, 64 or 128 bits. Each word can be accessed by a binary address
of N bits, making it possible to store 2 raised to the power N words in the memory. This implies
that processor registers are normally not considered memory, since they only store one
word and do not include an addressing mechanism.
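The relationship between address width and capacity can be illustrated with a short sketch. The function name and the example sizes below are illustrative assumptions, not part of the report.

```python
def memory_capacity(address_bits: int, word_length_bits: int) -> tuple[int, int]:
    """Return (number of addressable words, total capacity in bits)."""
    if address_bits < 0 or word_length_bits <= 0:
        raise ValueError("address_bits must be >= 0 and word_length_bits must be > 0")
    words = 2 ** address_bits            # an N-bit address selects one of 2^N words
    return words, words * word_length_bits


# Example: a 10-bit address with 8-bit words (sizes chosen only for illustration).
words, total_bits = memory_capacity(10, 8)
print(f"{words} words, {total_bits} bits in total")
```

Each additional address bit doubles the number of addressable words, which is why address width, rather than word length, dominates the capacity.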
2.2 OPERATION OF SRAM
SRAM operates in three modes:
1. Standby (idle) mode
2. Read mode
3. Write mode
In idle mode, the SRAM is disabled. In read mode, data is read from a selected address
location. In write mode, data is written to a particular address location.
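The three modes can be sketched as a small behavioral model. This is only an illustration of the mode semantics described above, not the transistor-level design; the class and method names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    STANDBY = "standby"
    READ = "read"
    WRITE = "write"


class SramModel:
    """Behavioral model of an SRAM array (hypothetical helper, not a circuit)."""

    def __init__(self, address_bits: int, word_length_bits: int):
        self.word_mask = (1 << word_length_bits) - 1
        self.words = [0] * (2 ** address_bits)

    def access(self, mode: Mode, address: int = 0, data: int = 0):
        if mode is Mode.STANDBY:
            return None                      # disabled: contents are simply retained
        if not 0 <= address < len(self.words):
            raise IndexError("address out of range")
        if mode is Mode.WRITE:
            self.words[address] = data & self.word_mask
            return None
        return self.words[address]           # Mode.READ


sram = SramModel(address_bits=4, word_length_bits=8)
sram.access(Mode.WRITE, address=3, data=0xA5)
sram.access(Mode.STANDBY)
print(hex(sram.access(Mode.READ, address=3)))
```

Standby leaves the stored words untouched, which mirrors the static nature of SRAM: data is held as long as the array is powered.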
2.3 APPLICATIONS
SRAM is used in many embedded applications.
Many categories of industrial and scientific subsystems and automotive electronics use
static RAM. Some amount (kilobytes or less) is also embedded in practically all
modern appliances, toys, etc., that implement an electronic user interface.
Several megabytes are used in complex products such as digital cameras, cell phones and
synthesizers.
CHAPTER-3
MTCMOS TECHNIQUE
3.1 INTRODUCTION
This is one of the most common approaches to reducing leakage currents: two
different types of transistors are fabricated on the chip, a high-Vth type to lower sub-threshold
leakage current and a low-Vth type for logic where switching speed matters. Based on such
multi-threshold technologies, several
multiple-threshold circuit design techniques have been developed.
Multi-threshold voltage CMOS (MTCMOS) reduces leakage by inserting high-threshold
devices in series with low-Vth circuitry. Fig 3.1(a) shows the schematic of an MTCMOS circuit.
A sleep control scheme is introduced for efficient power management. In the active mode,
Sleep is set low and the high-Vth sleep control transistors (MP and MN) are turned on. Since their
on-resistances are small, the virtual supply voltages (Virtual Vdd and Virtual GND) almost
function as real power lines. In the standby mode, Sleep is set high, MN and MP are turned
off, and the leakage current is low. In fact, only one type of high-Vth transistor is enough for
leakage control. Fig 3.1(b) and (c) show the PMOS insertion and NMOS insertion schemes,
respectively. The NMOS insertion scheme is preferable: since the NMOS on-resistance is
smaller at the same width, the NMOS can be sized smaller than the corresponding PMOS.
MTCMOS can be easily implemented based on existing circuits.
Fig 3.1 MTCMOS Technique
In the active mode, the sleep control signal SL is set low and the control transistors
are turned on. Since the on-resistance of the high-Vt sleep transistors is low, the virtual VDD and VSS
lines act like power supply lines. In the standby mode, SL is set high and the high-threshold
sleep control transistors are turned off, resulting in low leakage current.
Short-channel transistors require lower power supply levels to reduce power
consumption. This forces a reduction in the threshold voltage, which causes a substantial
increase in weak-inversion current. Power gating, also known as MTCMOS,
has traditionally been the most effective
way to lower the leakage. Power gating uses a PMOS transistor or an NMOS transistor to
disconnect the circuit's supply voltage from the logic when the logic is inactive. This
technique can reduce leakage by more than two orders of magnitude with negligible speed
degradation.
MTCMOS power gating works to reduce leakage currents by disconnecting the power
supply from specific portions of a circuit when those portions are not needed.
Multi-threshold CMOS (MTCMOS) has been described as a method to reduce
standby leakage current in a circuit by using a high-threshold MOS device to decouple
the logic from the supply or ground during long idle periods, or sleep states.
In an MTCMOS circuit, the logic block is constructed using low-threshold devices, and
either the power supply is gated by a high-threshold header switch or the ground
terminal is gated by a high-threshold footer switch.
During active operation of the MTCMOS circuit described by Fig 3.2, the power
interrupt switch is turned on by the SLEEPN (or SLEEPP) signal, and the current dissipated by
the logic is drawn through the interrupt switch. This causes a reduction in the drive voltage seen
by the logic, reducing logic performance. Several techniques can compensate for the reduction in logic
performance. Larger power supply voltages can be used, at the expense of increased active
power for similar performance. Larger device widths for the power interrupt switch can be
used to minimize the performance impact, at the expense of increased area and power for
entering and exiting sleep mode. Adjusting device implants to allow moderately
high threshold values is another technique that can be used to increase performance, at the
expense of increased device leakage during idle mode.
Fig 3.2: MTCMOS Logic
The MTCMOS scheme is a good technique for reducing both gate and
sub-threshold leakage, but it slows circuits down considerably as VDD is scaled below 0.6 V.
3.2 GATE OXIDE LEAKAGE CURRENT
A principal source of Igate is the tunneling of electrons through the gate
oxide. The probability of electron tunneling is a strong function of the applied electric field
and of the barrier thickness, which is simply Tox, so a small change in Tox has a
tremendous impact on Igate. For example, in MOS devices with SiO2 gate oxides, a
difference in Tox of only 2 Å can result in an order-of-magnitude increase in Igate, so that
reducing Tox from 18 Å to 12 Å increases Igate by a factor of approximately 1000. The most
effective way to control Igate is through the use of new materials, high-k dielectrics, but such
materials are not expected to be manufacturable until approximately 2007 at the earliest. The
issue of power dissipation due to gate leakage arises in two contexts. In the standby mode,
when a circuit is not undergoing any active operations, leakage may be controlled through
various means, prominent among which are the use of multiple-threshold CMOS sleep
transistors, the assignment of circuit inputs to send the circuit into a low-leakage state, and
body biasing. In the active mode, i.e., in normal operation, neither sleep
transistors nor state assignment is viable. Recent studies show that at the 90 nm node, leakage
can contribute over 40% of the total power.
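The arithmetic behind the factor of roughly 1000 can be checked with a short sketch. It assumes the rule of thumb quoted above (one decade of Igate per 2 Å of Tox) holds uniformly over the range, which is a simplification; the function name is hypothetical.

```python
def gate_leakage_ratio(tox_from_angstrom: float, tox_to_angstrom: float,
                       angstrom_per_decade: float = 2.0) -> float:
    """Factor by which Igate grows when Tox shrinks, assuming 10x per 2 Angstrom."""
    decades = (tox_from_angstrom - tox_to_angstrom) / angstrom_per_decade
    return 10.0 ** decades


# 18 A -> 12 A is a 6 A reduction, i.e. three decades of gate leakage.
print(gate_leakage_ratio(18.0, 12.0))
```

A 6 Å reduction corresponds to three steps of 2 Å, hence three orders of magnitude.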
Leakage power in modern CMOS VLSI circuits has become a component comparable
to dynamic power dissipation. Typically, the sub-threshold leakage current dominates the
device off-state leakage because of the low-Vth transistors employed in logic cell blocks to
maintain the circuit switching speed in spite of decreasing VDD levels. The Multi-Threshold
CMOS (MTCMOS) technique can significantly reduce the sub-threshold leakage currents
during the circuit sleep (standby) mode by adding high-Vth power switches (sleep transistors)
to low-Vth logic cell blocks. This is because the stacked high-Vth sleep transistor connected to
the bottom of the pull-down network of all logic cells in the circuit acts as a high-resistance
element during the sleep mode, which limits the leakage current from the Vdd to the ground lines. At
the same time, because of the stack effect, the sub-threshold leakage of the low-Vth transistors
in the logic block itself goes down. Ideally, this leakage reduction is achieved with small
performance degradation because, during the active mode of the circuit, the sleep transistor is
fully on (i.e., it operates in the linear mode), and thus all low-Vth logic cells in the MTCMOS
logic block can switch very fast. Unfortunately, the situation is different in real designs. More
precisely, during the active mode of circuit operation, the high-Vth sleep transistor acts as
a small linear resistance placed at the bottom of the transistor stack to ground, causing the
propagation delay of the cells in the logic block to increase. In addition, the virtual ground
network itself acts as a distributed RC network, which causes the voltage of the virtual
ground node to rise even further, thereby degrading the switching speed of the logic cells
even more. The former effect is a function of the size of the sleep transistor, whereas the latter
effect is a function of the physical distance of the logic cell from the sleep transistor.
Fig 3.3: (a) MTCMOS circuit structure (b) The circuit model with the virtual ground
interconnect and the sleep transistor modeled as resistors Ri and Rs, respectively
Fig 3.3(a) depicts a logic block LB, in which a group of low-Vth logic cells are first
connected to the virtual ground node and then through a high-Vth sleep transistor, S, to the
actual ground, GND. Fig 3.3(b) models the virtual ground interconnection and the high-Vth
sleep transistor, which behaves like a linear resistor in the active mode of circuit
operation, as resistors Ri and Rs, respectively. The virtual ground is at voltage Vx above the
actual ground, i.e., Vx = I·(Rs + Ri), where I is the current flowing through the virtual ground
sub-network and the sleep transistor. The voltage drop across Rs + Ri reduces the gate
overdrive voltage of MTCMOS logic cells (i.e., their Vgs value) from Vdd to Vdd − Vx. An optimal
algorithm for placing sleep transistors in standard cell-based layout design has been proposed, which
minimizes the performance degradation of MTCMOS circuits due to the interconnect
resistance of the virtual ground network.
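A minimal numeric sketch of the relation Vx = I·(Rs + Ri) follows. The current and resistance values are assumptions chosen only to make the arithmetic concrete, and the function names are hypothetical.

```python
def virtual_ground_voltage(current_a: float, r_sleep_ohm: float,
                           r_interconnect_ohm: float) -> float:
    """Vx = I * (Rs + Ri): rise of the virtual ground above the actual ground."""
    return current_a * (r_sleep_ohm + r_interconnect_ohm)


def effective_gate_drive(vdd_v: float, vx_v: float) -> float:
    """Gate drive (Vgs) left for the logic cells once the virtual ground rises by Vx."""
    return vdd_v - vx_v


# Assumed example values: 1 mA of logic current, Rs = 40 ohm, Ri = 10 ohm, Vdd = 1.8 V.
vx = virtual_ground_voltage(1e-3, 40.0, 10.0)
print(f"Vx = {vx:.3f} V, effective Vgs = {effective_gate_drive(1.8, vx):.3f} V")
```

With these assumed numbers the virtual ground sits about 0.05 V above ground. A wider sleep transistor lowers Rs, and placing the cell closer to the sleep transistor lowers Ri.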
Technology scaling causes sub-threshold leakage currents to increase exponentially.
As technology scales into the deep-submicron (DSM) regime, standby sub-threshold leakage
power increases exponentially with the reduction of the supply voltage (VDD) and the
threshold voltage (Vth). For many event-driven applications, such as mobile devices where
circuits spend most of their time in an idle state with no computation, standby leakage power
is especially detrimental to overall power dissipation. Multi-Threshold CMOS (MTCMOS)
is an effective circuit-level methodology that provides high performance in the active mode
and saves leakage power during the standby mode. The basic principle of the MTCMOS
technique is to use low Vth transistors to design the logic gates where the switching speed is
essential, while the high Vth transistors are used to effectively isolate the logic gates in
standby state and limit the leakage dissipation. In the active mode, the sleep transistor works
as a resistor.
A downside of using the Multi-Threshold CMOS (MTCMOS) technique for leakage
reduction is the energy consumption during transitions between sleep and active modes.
Previously, a charge recycling (CR) MTCMOS architecture was proposed to reduce the large
amount of energy consumed during mode transitions in power-gated
circuits. Considering the RC parasitics of the virtual ground and VDD lines, proper sizing and
placement of charge recycling transistors is key to achieving the maximum power saving.
Power gating technique provides low leakage and high performance operation by
using low Vt transistors for logic cells and high Vt devices as sleep transistors for
disconnecting logic cells from power supply and/or ground. This Multi-threshold CMOS
technology reduces the leakage in the sleep mode. One of the key concerns in MTCMOS is
the wake up time latency of the circuit, which is defined as the time required to turn on the
circuit after receiving the wake up signal. Reducing the wake up time latency is an important
issue since it can affect the overall performance of the VLSI circuit. Another important issue
in power gating is minimizing the energy wasted during mode transition, i.e., while switching
from active to sleep mode and vice versa. Both virtual ground and virtual VDD nodes
experience a voltage change during mode transition. Since there is a considerable number of
cells connected to the virtual ground and virtual supply nodes, the total switching capacitance
at these nodes is large, and as a result the switching power consumption during mode
transition can be significant.
Sleep transistor sizing is an important issue in designing MTCMOS circuits.
A charge recycling technique has recently been proposed to reduce the energy
consumption during mode transition of MTCMOS circuits. It has been shown that by
applying this technique, up to 46% of the switching energy due to mode transition can be
saved.
The MTCMOS circuit scheme is a very efficient low-power and high performance
circuit technique that employs high Vth transistors to switch on and off the power supplies to
the low Vth logic blocks.
3.3 DIMENSIONING OF MTCMOS CIRCUITS
The MTCMOS technique is a well-known way to combine high switching speed with
low standby current, by using low-Vt transistors for the logic part and high-Vt transistors for
the so-called sleep transistors. However, a practical analytic formula for correctly
dimensioning the sleep transistor for a demanded performance has not been provided.
MTCMOS circuits can be simplified by using NMOS sleep transistors, as shown in Fig 3.4.
These transistors will be in their linear mode when the circuit is active. The logic transistors,
however, will work in the saturation region. Since the current through logic and sleep
transistor must be identical, the following equation describes the resulting ground shift VH
due to the sleep transistor.
Fig 3.4: Modified MTCMOS design with NMOS sleep control transistor
The factor q (q>1) is used to describe the specified delay time factor of the MTCMOS
circuit in comparison to a standard CMOS configuration. With the help of Equation (1) it is
possible to calculate the necessary width WH of a sleep transistor, with WL as the
accumulated width of all low-Vt logic transistors that are controlled by the sleep transistor.
A drawback of the common MTCMOS technique is the floating of nodes in the circuits. To
prevent data from being lost, circuitry must be added to each flip flop.
3.4 MTCMOS APPROACHES HAVE THREE SHORTCOMINGS
First, process modifications for supporting the high-VTH of the sleep MOSFET are
required. Second, when a circuit goes into the sleep state, it takes a non-negligible amount of
time to wake up and re-activate, because the large sleep transistor must be switched on and
must first discharge the virtual ground capacitance, which is slow. Third, gates inside the sleep region
may be interfaced with gates outside it. This means that the outputs of inactive gates (gates inside
the sleep region) can float at intermediate voltages, causing large short-circuit currents in the
active gates they drive.
In recent years, technology scaling has increased the role of leakage power in the
overall power consumption of circuits. Supply voltage reduction is a widely accepted
methodology for reducing dynamic power, but it has an adverse effect on circuit
performance. To maintain high performance, the threshold voltage Vt must also be scaled
down, which causes an exponential increase in the sub-threshold leakage currents. This
problem is more severe in deep-submicron technologies. In applications which involve large
standby times, this high sub-threshold leakage can be detrimental to the overall power
consumption of the circuit. Multi-threshold CMOS has emerged as an effective technique for
reducing sub-threshold currents in the standby mode while maintaining circuit performance.
MTCMOS technology essentially places a sleep transistor on gates and puts them in sleep
mode when the circuit is non-operational. State-of-the-art techniques in leakage optimization
using MTCMOS essentially assign a sleep transistor to each gate and size them such that all
gates have a fixed slowdown. This is followed by a clustering approach that clusters gates
with mutually exclusive switching patterns, which reduces the overall area penalty of the
MTCMOS transistors. There are several problems with this approach. First, the traditional
approach sizes the sleep transistors such that all gates have the same slowdown. It does not
investigate the possibility of slowing down non-critical gates more than critical gates for
better improvements in leakage. Second, it has been shown that clustering MTCMOS gates
has adverse effects on signal integrity due to ground bounce. In this work we address
these issues by developing a fine-grained methodology for MTCMOS-based leakage
optimization. First, we assign sleep transistors selectively to gates such that the overall slack
can be effectively utilized. Moreover, we do not perform clustering, hence signal integrity
issues are not critical in our approach.
As shown in figure 3.5(a), low Vt logic modules or gates are connected to the virtual
supply rails through high Vt sleep transistors which behave similar to a linear resistor in
active mode as shown in figure 3.5(b). The high threshold sleep transistor is controlled using
the Sleep signal and limits the leakage current to a low value in the standby mode.
The load dependent delay di of a gate i in the absence of a sleep transistor can be
expressed as
where CL is the load capacitance at the gate output, VtL is the low threshold voltage (= 350 mV),
Vdd = 1.8 V and α is the velocity saturation index (α ≈ 1.3 in 0.18-µm CMOS technology). In
the presence of a sleep transistor, the propagation delay of a gate can be expressed as
where Vx is the potential of the virtual rails as shown in figure 3.5(b) and K is the proportionality
constant. Let us suppose IsleepON is the current flowing in the gate during the active mode of
operation. During this mode, the sleep transistor is in the linear region of operation. Using the
basic device equations for a transistor in the linear region, the drain-to-source current in the sleep
transistor (which is the same as IsleepON) is given by
The sub-threshold leakage current Ileak in the sleep mode is determined by the sleep
transistor and is given by
where µn is the N-mobility, Cox is the oxide capacitance, Vth is the high threshold
voltage (= 500 mV), VT is the thermal voltage (= 26 mV) and n is the sub-threshold swing
parameter.
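As a hedged sketch only, the symbols defined in this passage are consistent with the following textbook forms, assuming the alpha-power-law delay model and standard long-channel device equations. These are assumed forms and are not necessarily the exact expressions intended by the report. In order, they give the gate delay without a sleep transistor, the delay with one, the linear-region on-current of the sleep transistor, and the sub-threshold leakage with the sleep transistor off (Vgs = 0, Vds much larger than VT, constant prefactors absorbed into the proportionality):

```latex
\begin{align*}
d_i &\propto \frac{C_L V_{dd}}{(V_{dd} - V_{tL})^{\alpha}} \\
d_i^{\mathrm{sleep}} &= \frac{K\, C_L V_{dd}}{(V_{dd} - V_x - V_{tL})^{\alpha}} \\
I_{\mathrm{sleepON}} &= \mu_n C_{ox} \left(\frac{W}{L}\right)_{\mathrm{sleep}}
  \left[ (V_{dd} - V_{th})\, V_x - \frac{V_x^{2}}{2} \right] \\
I_{\mathrm{leak}} &\propto \mu_n C_{ox} \left(\frac{W}{L}\right)_{\mathrm{sleep}}
  V_T^{2}\, e^{-V_{th} / (n V_T)}
\end{align*}
```

In these forms a larger (W/L)sleep lowers Vx for a given on-current, which shortens the delay, while the leakage grows in direct proportion to (W/L)sleep.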
Equation 2 establishes a relation between the delay of a gate, d_i^sleep, and Vx. By replacing
Vx in equation 4 in terms of d_i^sleep (using equation 2), we get a dependence between (W/L)sleep
and d_i^sleep (assuming the ON current is constant for each gate). Thus, a range of (W/L)sleep
for the sleep transistor corresponds to a range of gate delays. Finally, (W/L)sleep
in equation 5 can be replaced in terms of d_i^sleep, hence establishing a relationship between
gate delay and gate leakage. The final relation between leakage and delay can be expressed as
This relationship exists only for those gates that have a sleep transistor assigned to
them. Note that the moment a sleep transistor is assigned, some delay penalty is incurred. The
range of delay that a gate can have is decided by the range of acceptable (W/L)sleep. The
objective of sleep transistor sizing is to decide the best values of (W/L)sleep for all sleep
transistors such that the global delay constraint is satisfied and the total leakage is minimized.
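The delay/leakage trade-off that sizing must balance can be sketched numerically. The sketch assumes the alpha-power-law delay model and textbook linear-region and sub-threshold equations. Vdd, VtL, Vth, α and VT are taken from the text, while µnCox, the on-current and n are hypothetical values picked only for illustration.

```python
import math

VDD = 1.8          # supply voltage (V), from the text
VT_LOW = 0.35      # low threshold voltage VtL (V), from the text
VT_HIGH = 0.5      # high threshold voltage Vth of the sleep transistor (V), from the text
ALPHA = 1.3        # velocity saturation index, from the text
V_THERMAL = 0.026  # thermal voltage VT (V), from the text

# Assumed, illustrative values (not from the text):
MU_COX = 300e-6    # process transconductance un*Cox (A/V^2)
I_ON = 100e-6      # active-mode current through the sleep transistor (A)
N_SWING = 1.5      # sub-threshold swing parameter n


def virtual_rail_voltage(w_over_l: float) -> float:
    """Solve I_ON = unCox*(W/L)*((VDD - VT_HIGH)*Vx - Vx^2/2) for the small root Vx."""
    overdrive = VDD - VT_HIGH
    disc = overdrive ** 2 - 2.0 * I_ON / (MU_COX * w_over_l)
    if disc < 0:
        raise ValueError("sleep transistor too small to carry I_ON in the linear region")
    return overdrive - math.sqrt(disc)


def delay_penalty(w_over_l: float) -> float:
    """Gate delay with the sleep transistor divided by the delay without it."""
    vx = virtual_rail_voltage(w_over_l)
    return ((VDD - VT_LOW) / (VDD - vx - VT_LOW)) ** ALPHA


def relative_leakage(w_over_l: float) -> float:
    """Sub-threshold leakage of the off sleep transistor, up to a constant prefactor."""
    return MU_COX * w_over_l * V_THERMAL ** 2 * math.exp(-VT_HIGH / (N_SWING * V_THERMAL))


for ratio in (1, 2, 5, 10):
    print(f"W/L={ratio:>2}: delay x{delay_penalty(ratio):.3f}, "
          f"leakage {relative_leakage(ratio):.3e} (arbitrary units)")
```

Under these assumptions, widening the sleep transistor reduces the delay penalty toward 1 while leakage grows linearly with (W/L)sleep, so gates with timing slack can take a narrower sleep transistor than critical ones.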
Fig 3.5 : Sleep Transistors in MTCMOS circuits
3.5 TYPES OF TRANSISTORS
1. Low Vth Transistors (lvt)
The Low Vth transistor type is the fastest available flavor in the STM 90nm general
purpose technology, and is used for applications where speed is of primary importance.
The disadvantage of this type of transistor is that, due to the low threshold voltage (Vth), the
static power is very high.
2. Standard Vth Transistors (svt)
The Standard Vth transistor type is an all-purpose flavor where delay and static
power have been traded off to match typical design requirements. The procedure used to
characterize this technology variation is exactly the same as the one used for lvt.
3. High Vth Transistors (hvt)
The High Vth transistor type is a flavor especially optimized for extremely low static
power consumption. Typical applications for this technology variation are circuits that are idle most of
the time and/or where speed/performance is not of utmost importance. The procedure used
to characterize this technology variation is exactly the same as the one used for lvt.
3.6 MTCMOS ADVANTAGE
- Performance can be improved and the leakage current minimized.
- Sub-threshold leakage current is reduced by the sleep transistor while performance loss is
controlled.
3.7 MTCMOS DISADVANTAGE
MTCMOS has a serious problem: the data stored in the latches and flip-flops of the logic
blocks cannot be preserved when the power supply is turned off (sleep mode). Therefore,
extra circuits and complex timing design must be provided for holding the stored data. These
impose large penalties on the performance, power and area of the system.
3.8 MTCMOS DESIGN WITH PMOS SLEEP CONTROL TRANSISTOR
Placing a high Vth PMOS transistor between Vdd and the logic block results in the
MTCMOS design with PMOS sleep control transistor, as shown in Fig 3.6.
Fig 3.6: Modified MTCMOS design with PMOS sleep control transistor
CHAPTER-4
IMPLEMENTATION OF SRAM
The project involves the implementation of SRAM using the MTCMOS technique in the
Cadence Virtuoso Analog Design Environment.
4.1 DESIGN FLOW
Fig 4.1: Design Flow of SRAM
4.2 CMOS LOGIC
In CMOS (complementary MOS) logic, only two complementary MOSFET transistors are used to
create the circuit: the n-channel transistor, also known as NMOS, and the p-channel transistor,
also known as PMOS. The logic symbols for the NMOS and PMOS transistors are shown in
Fig 4.2 and Fig 4.4 (a), respectively. In designing CMOS circuits, we are interested only in
the three connections (source, drain, and gate) of the transistor. The substrate for the
NMOS is always connected to ground, while the substrate for the PMOS is always connected
to VCC. Notice that the only difference between these two logic symbols is that one has a
circle at the gate input, while the other does not. Using the convention that the circle denotes
active-low (i.e., a 0 activates the signal), the PMOS gate input is active-low and the NMOS
gate input is active-high.
The operation of the NMOS transistor is as follows:
When the input at the gate is a 1, the NMOS transistor is turned on or enabled, and the
source input that is supplying the 0 can pass through to the drain output through the
connecting n-channel. However, if the source has a 1, the 1 will not pass through cleanly to the
drain even if the transistor is turned on, because the n-channel stops conducting once the drain
rises to within a threshold voltage of the gate. Instead, only a weak 1 will pass through to the
drain. On the other hand, when the gate is a 0 (or any value other than a 1), the transistor is
turned off, and the connection between the source and the drain is disconnected. In this case,
the drain will always have a high-impedance Z value independent of the source value. The
don't-care entry in the Input Signal column means that it doesn't matter what the input value
is; the output will be Z. The high-impedance value, denoted by Z, means no value or no output.
This is like having an insulator with an infinite resistance or a break in a wire: whatever the
input is, it will not pass over to the output.
Fig 4.2: NMOS symbol
Fig 4.3: Truth table
The PMOS transistor works exactly the opposite of the NMOS transistor. The
operation of the PMOS transistor is as follows.
When the input at the gate is a 0, the PMOS transistor is turned on or enabled, and the
source input that is supplying the 1 can pass through to the drain output through the
connecting p-channel. However, if the source has a 0, the 0 will not pass through cleanly to the
drain even if the transistor is turned on, because the p-channel stops conducting once the drain
falls to within a threshold voltage of the gate. Instead, only a weak 0 will pass through to the
drain. On the other hand, when the gate is a 1 (or any value other than a 0), the transistor is
turned off, and the connection between the source and the drain is disconnected. In this case,
the drain will always have a high-impedance Z value independent of the source value.
Fig 4.4: PMOS transistor (a) symbol (b) truth table
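The switch behaviour described above for the two transistor types can be summarised in a small behavioural sketch. It is only a logic-level model: the string labels for the drain values and the function names `nmos` and `pmos` are illustrative assumptions, and no electrical quantities are modelled.

```python
# Switch-level sketch of the NMOS and PMOS behaviour described in the text.
# The value labels and function names are illustrative assumptions.
STRONG0, STRONG1, WEAK0, WEAK1, Z = "0", "1", "weak 0", "weak 1", "Z"


def nmos(gate: int, source: int) -> str:
    """Drain value of an NMOS transistor for logic inputs 0 or 1."""
    if gate != 1:
        return Z  # transistor off: drain is high impedance
    # Transistor on: a 0 passes well, a 1 passes only weakly.
    return STRONG0 if source == 0 else WEAK1


def pmos(gate: int, source: int) -> str:
    """Drain value of a PMOS transistor for logic inputs 0 or 1."""
    if gate != 0:
        return Z  # transistor off: drain is high impedance
    # Transistor on: a 1 passes well, a 0 passes only weakly.
    return STRONG1 if source == 1 else WEAK0


if __name__ == "__main__":
    for g in (0, 1):
        for s in (0, 1):
            print(f"gate={g} source={s}  NMOS drain={nmos(g, s):7} PMOS drain={pmos(g, s)}")
```

Running the loop prints one row per gate/source combination, which mirrors the layout of the two truth tables.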
4.3 INVERTER
When the gate input is a 1, the bottom NMOS transistor is turned on while the top
PMOS transistor is turned off. With this configuration, a 0 from ground will pass through the
bottom NMOS transistor to the output while the top PMOS transistor will output a
high-impedance Z value. A Z combined with a 0 is still a 0, because a high-impedance node
contributes no value.
Alternatively, when the gate input is a 0, the bottom NMOS transistor is turned off
while the top PMOS transistor is turned on. In this case, a 1 from VCC will pass through the
top PMOS transistor to the output while the bottom NMOS transistor will output a Z. The
resulting output value is a 1.
Fig 4.5: Inverter (a) circuit (b) truth table
Fig 4.6: Switch model for inverter (a) low input; (b) high input
4.4 NAND GATE
If either input is LOW, the output Z has a low-impedance connection to VDD through
the corresponding "on" p-channel transistor, and the path to ground is blocked by the
corresponding "off" n-channel transistor. If both inputs are HIGH, the path to VDD is
blocked, and Z has a low-impedance connection to ground.
Fig 4.7: NAND gate (a) circuit (b) truth table (c) symbol
Fig 4.8 : Switch model for 2 input NAND gate (a) both inputs low;
(b) one input high; (c) both inputs high
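The switch models of the inverter and the 2-input NAND gate can be checked with a short sketch that treats each gate as a pull-up network to the supply and a pull-down network to ground. The function names and the helper `resolve` are hypothetical names chosen for this illustration; the sketch assumes ideal switches and ignores the weak-value effects of single transistors.

```python
from itertools import product


def resolve(pull_up: bool, pull_down: bool) -> int:
    """Output of a static CMOS gate given which network conducts."""
    if pull_up == pull_down:
        # Both on would short the supply; both off would leave the output floating.
        raise ValueError("exactly one network must conduct")
    return 1 if pull_up else 0


def inverter(a: int) -> int:
    pull_up = a == 0    # top PMOS conducts when the input is 0
    pull_down = a == 1  # bottom NMOS conducts when the input is 1
    return resolve(pull_up, pull_down)


def nand2(a: int, b: int) -> int:
    pull_up = a == 0 or b == 0     # two PMOS transistors in parallel to the supply
    pull_down = a == 1 and b == 1  # two NMOS transistors in series to ground
    return resolve(pull_up, pull_down)


if __name__ == "__main__":
    for a in (0, 1):
        print(f"NOT {a} = {inverter(a)}")
    for a, b in product((0, 1), repeat=2):
        print(f"{a} NAND {b} = {nand2(a, b)}")
```

Because the two networks are complementary, `resolve` never raises for valid 0/1 inputs, which is the property that keeps the output from floating or shorting.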
4.5 D LATCH WITH ENABLE
When the E input is asserted, the Q output follows the D input. In this situation, the latch
is said to be "open" and the path from D input to Q output is "transparent"; the circuit is
often called a transparent latch for this reason. When the E input is negated, the latch
"closes"; the Q output retains its last value and no longer changes in response to D, as long as
E remains negated.
Fig 4.9: D-Latch with enable
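A minimal behavioural sketch of the latch follows. The class name `DLatch`, the method name `update`, and the initial stored value of 0 are assumptions made for the illustration; a real latch powers up in an undefined state.

```python
class DLatch:
    """Behavioural model of a D latch with an active-high enable E."""

    def __init__(self, q: int = 0) -> None:
        self.q = q  # stored value; the initial 0 is an assumption

    def update(self, d: int, e: int) -> int:
        if e == 1:
            self.q = d  # latch open: Q follows D
        return self.q   # latch closed: Q keeps its last value


if __name__ == "__main__":
    latch = DLatch()
    for d, e in [(1, 1), (0, 0), (0, 1), (1, 0)]:
        print(f"D={d} E={e} -> Q={latch.update(d, e)}")
```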
4.6 TRI-STATE BUFFER
A tri-state buffer, as the name suggests, has three states: 0, 1, and a third state
denoted by Z. The value Z represents a high-impedance state, which acts like a switch that is
opened or a wire that is cut. Tri-state buffers are used to connect several devices to the same
bus. A bus is one or more wires for transferring signals. If two or more devices are connected
directly to a bus without using tri-state buffers, signals will get corrupted on the bus because
the devices are always outputting either a 0 or a 1. However, with a tri-state buffer in
between, devices that are not using the bus can disable the tri-state buffer so that it acts as if
those devices are physically disconnected from the bus. At any one time, only one active
device will have its tri-state buffers enabled, and thus, use the bus.
The active-high enable line E turns the buffer on or off. When E is de-asserted with a
0, the tri-state buffer is disabled, and the output y is in its high-impedance Z state. When E is
asserted with a 1, the buffer is enabled, and the output y follows the input d.
The truth table is derived as follows.
- When E = 0, it does not matter what the input d is; we want both transistors to be
disabled so that the output y has the Z value. The PMOS transistor is disabled
when its gate input A = 1, whereas the NMOS transistor is disabled when its gate input
B = 0.
- When E = 1 and d = 0, the output y is 0. To get a 0 on y, we need to enable the
bottom NMOS transistor and disable the top PMOS transistor so that a 0 will pass
through the NMOS transistor to y.
- When E = 1 and d = 1, the output y is 1. Here we need to do the reverse by
enabling the top PMOS transistor and disabling the bottom NMOS transistor.
Fig 4.10: Tri-state buffer: (a) truth table; (b) logic symbol; (c) circuit; (d) truth table for
the control portion of the tri-state buffer circuit
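The three cases of the derivation can be written out as a short sketch. The expressions used for the control signals A and B are derived here from those three cases and are not quoted from the figure, and the function names are hypothetical.

```python
from typing import Tuple


def tristate_control(e: int, d: int) -> Tuple[int, int]:
    """Gate signals (A for the PMOS, B for the NMOS) for enable e and data d."""
    a = 0 if (e == 1 and d == 1) else 1  # PMOS is on only when A = 0
    b = 1 if (e == 1 and d == 0) else 0  # NMOS is on only when B = 1
    return a, b


def tristate_buffer(e: int, d: int) -> str:
    """Output y of the buffer: '0', '1', or 'Z' for high impedance."""
    a, b = tristate_control(e, d)
    if a == 0:
        return "1"  # top PMOS passes a 1 from the supply
    if b == 1:
        return "0"  # bottom NMOS passes a 0 from ground
    return "Z"      # both transistors off


if __name__ == "__main__":
    for e in (0, 1):
        for d in (0, 1):
            print(f"E={e} d={d} -> A,B={tristate_control(e, d)} y={tristate_buffer(e, d)}")
```

The two conditions are mutually exclusive, so the PMOS and NMOS transistors are never enabled at the same time.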
4.7 MEMORY CELL
Each bit in a static RAM chip is stored in a memory cell similar to the circuit shown
in Fig 4.11 (a). The main component in the cell is a D latch with enable. A tri-state buffer is
connected to the output of the D latch so that it can be selectively read from. The Cell enable
signal is used to enable the memory cell for both reading and writing. For reading, the Cell
enable signal is used to enable the tri-state buffer. For writing, the Cell enable together with
the Write enable signals are used to enable the D latch so that the data on the Input line is
latched into the cell.
Fig 4.11: Memory cell (a) circuit; (b) logic symbol.
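The cell can be sketched behaviourally as a stored bit, a write condition, and a gated output. The class name `MemoryCell`, the method name `step`, and the initial stored value of 0 are assumptions made for this illustration.

```python
from typing import Union


class MemoryCell:
    """One-bit cell: a D latch whose output passes through a tri-state buffer."""

    def __init__(self, q: int = 0) -> None:
        self.q = q  # stored bit; the initial 0 is an assumption

    def step(self, data_in: int, cell_enable: int, write_enable: int) -> Union[int, str]:
        # The latch is open only when Cell enable and Write enable are both 1.
        if cell_enable == 1 and write_enable == 1:
            self.q = data_in
        # The tri-state buffer drives the output only while Cell enable is 1.
        return self.q if cell_enable == 1 else "Z"


if __name__ == "__main__":
    cell = MemoryCell()
    print(cell.step(data_in=1, cell_enable=1, write_enable=1))  # write a 1
    print(cell.step(data_in=0, cell_enable=0, write_enable=1))  # cell not selected
    print(cell.step(data_in=0, cell_enable=1, write_enable=0))  # read back
```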
4.8 MEMORY TIMING DIAGRAM
The write operation begins with a valid address on the address lines and valid data on
the data lines, followed immediately by the CE line being asserted. As soon as the WR line is
asserted, the data present on the data lines is written into the memory location that is
addressed by the address lines.
A memory read operation also begins with setting a valid address on the address lines,
followed by CE going high. The WR line is then pulled low, and shortly after, valid data from
the addressed memory location is available on the data lines.
Fig 4.12: Memory Timing Diagram (a) read operation (b) write operation.
4.9 STATIC RANDOM ACCESS MEMORY
In SRAM, there is a set of data lines, Di, and a set of address lines, Ai. The data lines
serve for both input and output of the data to the location that is specified by the address
lines. The number of data lines depends on how many bits are used for storing data in
each memory location. The number of address lines depends on how many locations are
in the memory chip.
In addition to the data and address lines, there are usually two control lines: chip
enable (CE), and write enable (WR). In order for a microprocessor to access memory, either
with the read operation or the write operation, the active-high CE line must first be asserted.
Asserting the CE line enables the entire memory chip. The active-high WR line selects which
of the two memory operations is to be performed. Setting WR to a 0 selects the read
operation, and data from the memory is retrieved. Setting WR to a 1 selects the write
operation, and data from the microprocessor is written into the memory. Instead of having
just the WR line for selecting the two operations, read and write, some memory chips have
both a read enable and a write enable line. In this case, only one line can be asserted at any
one time. The memory location in which the read and write operations are to take place, of
course, is selected by the value of the address lines. The operation of the memory chip is
shown in Fig 4.13 (b).
Fig 4.13: A 2^n x m RAM chip: (a) logic symbol; (b) operation table.
Notice in Fig 4.13 (a) that the RAM chip does not require a clock signal. Neither the read
nor the write memory operation is synchronized to the global system clock. Instead, the
data operations are synchronized to the two control lines, CE and WR.
To create an 8x8 static RAM chip, we need 64 memory cells forming an 8x8 grid, as
shown in Fig 4.14.
Each row forms a single storage location, and the number of memory cells in a row
determines the bit width of each location. So all of the memory cells in a row are enabled
with the same address. A decoder is used to decode the address lines A0, A1 and A2. In
this example, a 3-to-8 decoder is used to decode the eight address locations. The CE
signal is for enabling the chip, specifically to enable the read and write functions through the
two AND gates.
The data comes in from the external data bus, Di, through the input buffer and to the
Input line of each memory cell. The purpose of using an input buffer for each data line is so
that the external signal coming in needs to drive only one device (the buffer) rather than
several devices (i.e., all of the memory cells in the same column). Which row
of memory cells actually gets written to depends on the given address. The read operation
requires CE to be asserted and WR to be de-asserted. This asserts the internal RE signal,
which in turn enables the eight output tri-state buffers at the bottom of the circuit diagram.
Again, the location that is read from is selected by the address lines.
Fig 4.14: An 8x8 SRAM chip circuit.
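The organisation described above (a 3-to-8 decoder selecting one row of eight cells, with CE and WR choosing the operation) can be sketched at the behavioural level. The class name `Sram8x8`, the method names, the use of `None` to stand for a high-impedance data bus, and the all-zero initial contents are assumptions made for the illustration.

```python
from typing import List, Optional


class Sram8x8:
    """Behavioural model of an 8-location, 8-bit-wide static RAM."""

    ROWS = 8
    WIDTH = 8

    def __init__(self) -> None:
        # All-zero initial contents are an assumption of this sketch.
        self.rows = [[0] * self.WIDTH for _ in range(self.ROWS)]

    @classmethod
    def decode(cls, address: int) -> List[int]:
        """3-to-8 decoder: one-hot row select for addresses 0 to 7."""
        if not 0 <= address < cls.ROWS:
            raise ValueError("address must be in the range 0 to 7")
        return [1 if row == address else 0 for row in range(cls.ROWS)]

    def access(self, address: int, ce: int, wr: int,
               data_in: Optional[List[int]] = None) -> Optional[List[int]]:
        if ce != 1:
            return None  # chip disabled: nothing is written, bus not driven
        row = self.decode(address).index(1)
        if wr == 1:
            if data_in is None or len(data_in) != self.WIDTH:
                raise ValueError("a write needs exactly 8 data bits")
            self.rows[row] = list(data_in)  # CE = 1, WR = 1: write
            return None
        return list(self.rows[row])         # CE = 1, WR = 0: read


if __name__ == "__main__":
    ram = Sram8x8()
    ram.access(address=5, ce=1, wr=1, data_in=[1, 0, 1, 0, 1, 0, 1, 0])
    print(ram.access(address=5, ce=1, wr=0))
```

With three address lines the decoder covers 2^3 = 8 locations, which is the relation between the number of address lines and the number of locations stated earlier in this section.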
CHAPTER-5
SIMULATION RESULTS
5.1 OUTPUT WAVEFORMS OF SRAM WITHOUT MTCMOS
Fig 5.1: Output waveforms of SRAM without MTCMOS
5.2 OUTPUT WAVEFORMS OF SRAM WITH MTCMOS
Fig 5.2: Output waveforms of SRAM with MTCMOS
5.3 COMPARISON OF POWER CONSUMED WITH AND WITHOUT MTCMOS
S.NO  NAME OF THE COMPONENT  WITHOUT MTCMOS  WITH MTCMOS
1     INVERTER               117.5 nW        71.53 nW
2     NAND GATE              248.9 nW        98.12 nW
3     D-LATCH                1.507 uW        0.6268 uW
4     TRI-STATE BUFFER       1.216 uW        0.801 uW
5     3x8 DECODER            9.47 uW         2.113 uW
6     MEMORY CELL            1.559 uW        0.368 uW
7     8x8 SRAM               97.92 uW        13.11 uW
Table 5.1 Comparison of power consumed with and without MTCMOS
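The relative saving for each component follows directly from the two columns of Table 5.1. The sketch below computes it; the dictionary name `POWER_W` and the function name `saving_percent` are hypothetical, and the power values are copied from the table and converted to watts.

```python
# Power figures from Table 5.1 in watts: (without MTCMOS, with MTCMOS).
POWER_W = {
    "INVERTER": (117.5e-9, 71.53e-9),
    "NAND GATE": (248.9e-9, 98.12e-9),
    "D-LATCH": (1.507e-6, 0.6268e-6),
    "TRI-STATE BUFFER": (1.216e-6, 0.801e-6),
    "3x8 DECODER": (9.47e-6, 2.113e-6),
    "MEMORY CELL": (1.559e-6, 0.368e-6),
    "8x8 SRAM": (97.92e-6, 13.11e-6),
}


def saving_percent(without: float, with_mtcmos: float) -> float:
    """Power reduction relative to the design without MTCMOS, in percent."""
    return 100.0 * (without - with_mtcmos) / without


savings = {name: saving_percent(*pair) for name, pair in POWER_W.items()}

if __name__ == "__main__":
    for name, value in savings.items():
        print(f"{name:17} {value:5.1f} %")
```

For the complete 8x8 SRAM the reduction from 97.92 uW to 13.11 uW corresponds to a saving of roughly 87 percent.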
APPENDIX A
A.1 Cadence: Virtuoso Analog Design Environment
Cadence is an Electronic Design Automation (EDA) environment which allows
different applications and tools to integrate into a single framework, making it possible to
support all the stages of IC design and verification from a single environment. These tools are
completely general, supporting different fabrication technologies.
Fig A.1 :Cadence design flow
A.2 Various Design steps
Firstly a schematic view of the circuit is created using the Cadence Composer
Schematic Editor. Alternatively, a text netlist input can be employed. Then, the circuit is
simulated using the Cadence Affirma analog simulation environment. Different simulators
can be employed, some sold with the Cadence software (e.g., Spectre) some from other
vendors (e.g., HSPICE) if they are installed and licensed.
1. Invoking Cadence tool
The Command Interpreter Window (CIW) can be invoked by typing icfb90. The tool is
available on vlsi34, vlsi35, vlsi36 and vlsi27. The following window appears on the screen
when the command is invoked.
Fig A.2 Log Window
2. Create Library
In order to create the library go to Tools >Library Manager on the Tools menu of the CIW.
Fig A.3 Library window
Now to create a new library go to File >New >Library from the File menu of the Library
Manager.
Fig A.4 Library Creation window
3. Create Schematic
Start by clicking on the library (created by you) in the Library Manager window, then
go to File >New >Cell View and fill in Inverter (in this case) as the cell name,
schematic as the view name, and Composer Schematic as the tool, then press OK.
Fig A.5 File Creation window
An empty window appears, as shown in the next figure.
Fig A.6 Schematic Window
Now place the instances. Add the I/O pins. Add the wires.
Now Check and Save your design (either with the top left button or with Design >Check
and Save).
Look at the CIW window and make sure there are no errors or warnings. If there are any,
go back and fix them. Assuming there are no errors, the design is ready for
simulation.
4. Simulation
In the Virtuoso Schematic window go to Tools >Analog Environment. The design should
be set to the right Library, Cell and View.
Fig A.7 Simulation window
5. Choosing the Analyses
In the Affirma Analog Circuit Design Environment window, click Analyses >Choose in the
pull-down menu to open the analyses window. Several analysis modes can be set up.
6. Transient Analysis
In the Analysis section, select the transient analysis, set the Stop Time, and click the
APPLY button before clicking the OK button.
Fig A.8 Analysis Window
7. Saving and Plotting Simulation Data
Select Outputs >To Be Plotted >Select On Schematic to select the nodes to be
plotted. Click on a wire in the schematic window to select a voltage node, and click
on a terminal to select a current. Select the input and output wires in the circuit.
Observe the simulation window as the wires get added.
8. Run the Simulation
Click on the Run Simulation icon.
When it completes, the plots are shown automatically.
APPENDIX B
SCHEMATIC VIEWS
B.1 INVERTER CIRCUIT
Fig B.1 : Inverter without MTCMOS
Fig B.2 :Inverter with MTCMOS
B.2 SRAM 8X8 CHIP CIRCUIT
Fig B.3 :SRAM without MTCMOS
Fig B.4 :SRAM with MTCMOS
B.3 POWER CALCULATION WINDOW
Fig B.5 : Power Calculation window without MTCMOS
Fig B.6 :Power Calculation window with MTCMOS