cordic algorithm
TRANSCRIPT
ABSTRACT
CHAPTER-1
CORDIC THEORY: AN ALGORITHM FOR VECTOR ROTATION
All of the trigonometric functions can be computed or derived from functions using
vector rotations, as will be discussed in the following sections. Vector rotation can also
be used for polar to rectangular and rectangular to polar conversions, for vector
magnitude, and as a building block in certain transforms such as the DFT and DCT. The
CORDIC algorithm provides an iterative method of performing vector rotations by
arbitrary angles using only shifts and adds. The algorithm, credited to Volder, is derived
from the general (Givens) rotation transform
x’ xcosf-ysinf y’ ycosf xsinf
which rotates a vector in a Cartesian plane by the angle f. These can be rearranged so that:
x’ cosf x- ytanf
y’ cosfy xtanf
So far, nothing is simplified. However, if the rotation angles are restricted so that tan(f-
2-i, the multiplication by the tangent term is reduced to simple shift operation. Arbitrary
angles of rotation are obtainable by performing a series of successively smaller elementary
rotations. If the decision at each iteration, i, is which direction to rotate rather than
whether or not to rotate, then the cos(di) term becomes a constant (because cos(di) =
cos(-di)). The iterative rotation can now be expressed as:
X i+1= Ki xi- yid 2 –i]
Y i+1= Ki yi- xid 2 –i]
WhereKi costan -12 -i 1 1 2di + or -1
Removing the scale constant from the iterative equations yields a shift-add algorithm for
vector rotation. The product of the Ki’s can be applied elsewhere in the system or treated as
part of a system processing gain. That product approaches 0.6073 as the number of iterations
goes to infinity. Therefore, the rotation algorithm has a gain, An of approximately 1.647. The
exact gain depends on the number of iterations, and obeys the relation
An = n
The angle of a composite rotation is uniquely defined by the sequence of the directions of the
elementary rotations. That sequence can be represented by a decision vector. The set of all
possible decision vectors is an angular measurement system based on binary arctangents.
Conversions between this angular system and any other can be accomplished using a look-
up. A better conversion method uses an additional adder-subtractor that accumulates the
elementary rotation angles at each iteration. The elementary angles can be expressed in any
convenient angular unit. Those angular values are supplied by a small lookup table (one entry
per iteration) or are hardwired, depending on the implementation.
arctangent base, this extra element is not needed.
The CORDIC rotator is normally operated in one of two modes. The first, called rotation by
Volder, rotates the input vector by a specified angle (given as an argument). The second
mode, called vectoring, rotates the input vector to the x axis while recording the angle required to
make that rotation. In rotation mode, the angle accumulator is initialized with the desired
rotation angle. The rotation decision at each iteration is made to diminish the magnitude of
the residual angle in the angle accumulator. The decision at each iteration is therefore
based on the sign of the residual angle after each step. Naturally, if the input angle is
already expressed in the binary arctangent base, the angle accumulator may be
eliminated.
xi1 xiyidi 2-i
yyixid 2-i
zi1 zidi tan-i 2-i di= -1 if zi < 0, +1 otherwise
which provides the following result:
x A x cosz y sinz n n 0 0 0 0
yA y cosz x sinz n n 0 0 0 0
z n=0
In the vectoring mode, the CORDIC rotator rotates the input vector through whatever angle is necessary to
align the result vector with the x axis. The result of the vectoring operation is a rotation angle and the scaled
magnitude of the original vector (the x component of the result). The vectoring function works by seeking
to minimize the y component of the residual vector at each rotation. The sign of the residual y component is
used to determine which direction to rotate next. If the angle accumulator is initialized with zero, it will
contain the traversed angle at the end of the iterations. In vectoring mode, the CORDIC equations are:
x1 xiyidi 2-i
y i+1yixid 2-i
zi1 zidi tan-i 2-i
The CORDIC rotation and vectoring algorithms as stated are limited to rotation angles between -/2 and /2.
This limitation is due to the use of 20 for the tangent in the first iteration. For composite rotation angles larger than
/2, an additional rotation is required. Volder[4] describes an initial rotation /2. This gives the correction
iteration:
x’dy
y’dx
z’zd2
where d = +1 if y<0, -1 otherwise.
There is no growth for this initial rotation. Alternatively, an initial rotation of either or 0 can be made, avoiding
the reassignment of the x and y components to the rotator elements. Again, there is no growth due to the
initial rotation:
x’dx
y’dy
z’ = z if d= 1, or z - if d= -1 d = -1 if x<0, +1 otherwise.
Both reduction forms assume a modulo 2 representation of the input angle. The style of first reduction is
more consistent with the succeeding rotations, while the second reduction may be more convenient when
wiring is restricted, as is often the case with FPGAs.
The CORDIC rotator described is usable to compute several trigonometric functions directly and
others indirectly. Judicious choice of initial values and modes permits direct computation of sine, cosine,
arctangent, vector magnitude and transformations between polar and Cartesian coordinates.
Sine and Cosine
The rotational mode CORDIC operation can simultaneously compute the sine and cosine of the
input angle. Setting the y component of the input vector to zero reduces the rotation mode result to
xn Anx0cosz 0
yn Anx0sinz 0
By setting x0 equal to 1/ An, the rotation produces the unscaled sine and cosine of the angle
rgument, z0. Very often, the sine and cosine values modulate a magnitude value. Using
other techniques (e.g., a look up table) requires a pair of multipliers to obtain the
modulation. The CORDIC technique performs the multiply as part of the rotation operation,
and therefore eliminates the need for a pair of explicit multipliers. The output of the
CORDIC rotator is scaled by the rotator gain. If the gain is not acceptable, a single
multiply by the reciprocal of the gain constant placed before the CORDIC rotator will yield
unscaled results. It is worth noting that the hardware complexity of the CORDIC
rotator is approximately equivalent to that of a single multiplier with the same word
size.
Polar to Rectangular Transformation
A logical extension to the sine and cosine computer is a polar to Cartesian coordinate
transformer. The transformation from polar to Cartesian space is defined by:
x=rcos
y = rsin
As pointed out above, the multiplication by the magnitude comes for free using the
CORDIC rotator. The transformation is accomplished by selecting the rotation mode with
x0=polar magnitude, z0=polar phase, and y0=0. The vector result represents the polar input
transformed to Cartesian space. The transform has a gain equal to the rotator gain, which
needs to be accounted for somewhere in the system. If the gain is unacceptable, the polar
magnitude may be multiplied by the reciprocal of the rotator gain before it is presented to the
CORDIC rotator
General vector rotation
The rotation mode CORDIC rotator is also useful for performing general vector
rotations, as are often encountered in motion correction and control systems. For general
rotation, the 2 dimensional input vector is presented to the rotator inputs. The rotator rotates the
vector through the desired angle. The output is scaled by the CORDIC rotator gain, which
must be accounted for elsewhere in the system. If the scaling is unacceptable, a pair of
constant multipliers is required to compensate for the gain. CORDIC rotators may be
cascaded in a tree architecture for general rotation in n-dimensions. Some optimization of
multidimensional rotation is possible to permit computational savings over the general
n-dimensioned case, as reported by Hsiao et al.
ArctangentThe arctangent, =Atan(y/x), is directly computed using the vectoring mode
CORDIC rotator if the angle accumulator is initialized with zero. The argument must
be provided as a ratio expressed as a vector (x, y). Presenting represent infinity (by setting x=0).
Since the arctangent result is taken from the angle accumulator, the CORDIC rotator growth does not affect
the result.
Vector MagnitudeThe vectoring mode CORDIC rotator produces the magnitude of the input vector as a
byproduct of computing the arctangent. After the vectoring mode rotation, the vector is
aligned with the x axis. The magnitude of the rotated vector. The magnitude result is scaled
by the processor gain, which needs to be accounted for elsewhere in the system. This
implementation of vector magnitude has a hardware complexity of roughly one
multiplier of the same width. The CORDIC implementation represents a significant
hardware savings over an equivalent Pythagorean processor. The accuracy of the
magnitude result improves by 2 bits for each iteration performed.
Cartesian to Polar transformation
The Cartesian to Polar transformation consists of finding the magnitude (r=sqrt(x2+y2)) and phase angle
(=atan[y/x]) of the input vector, (x, y). The reader will immediately recognize that both functions are provided
simultaneously by the vectoring mode CORDIC rotator. The magnitude of the result will be scaled by the
CORDIC rotator gain, and should be accounted for elsewhere in the system. If the
gain is unacceptable, it can be corrected by multiplying the resulting magnitude by the reciprocal of the gain
constant.
Inverse CORDIC functions
In most cases, if a function can be generated by a CORDIC style computer, its inverse can also
be computed. Unless the inverse is calculable by changing the mode of the rotator, its
computation normally involves comparing the output to a target value. The CORDIC inverse is
illustrated by the Arcsine function.
Arcsine and Arccosine
The Arcsine can be computed by starting with a unit vector on the positive x axis, then
rotating it so that its y component is equal to the input argument. The arcsine is
then the angle subtended to cause the y component of the rotated vector to match the
argument. The decisionfunction in this case is the result of a comparison between
the input value and the y component of the rotated vector at each iteration:
x1 xiyidi 2-i
y i+1yixid 2-i
zi1 zidi tan-i 2-i
The arcsine function as stated above returns correct angles for inputs -1 < c/Anx0 < 1, although the accuracy suffers
as the input approaches 1 (the error increases rapidly for inputs larger than about 0.98). This loss of accuracy
is due to the gain of the rotator. For angles near the y axis, the rotator gain causes the rotated vector to be
shorter than the reference (input), so the decisions are made improperly.
The gain problems can be corrected using a “double iteration algorithm”[9] at the cost of an increase
in complexity.The Arccosine computation is similar, except the difference between the x component and the
input is used as the decision function. Without modification, the arccosine algorithm works only for inputs
less than 1/An, making the double iteration algorithm a necessity. The Arccosine could also be computed by
using the arcsine function and subtracting /2 from the result, followed by an angular reduction if the result is in
the fourth quadrant.
IMPLEMENTATION IN AN FPGAThere are a number of ways to implement a CORDIC processor. The ideal architecture
depends on the speed versus area tradeoffs in the intended application. First we will
examine an iterative architecture that is a direct translation from the CORDIC
equations. From there, we will look at a minimum hardware solution and a maximum
performance solution.
Iterative CORDIC Processors
An iterative CORDIC architecture can be obtained simply by duplicating each of the three
difference equations in hardware as shown in Figure 1. The decision function, di, is driven by
the sign of the y or z register depending on whether it is operated in rotation or vectoring
mode. In operation, the initial values are loaded via multiplexers into the x, y and z registers.
Then on each of the next n clock cycles, the values from the registers are passed through the
shifters and adder-subtractors and the results placed back inthe registers. The shifters are modified
on each iteration to cause the desired shift for the iteration. Likewise, the ROM address is
incremented on each iteration so that theappropriate elementary angle value is presented
to the z adder-subtractor. On the last iteration, the results are read directly from the adder-
subtractors. Obviously, a simple state machine is required keep track of the current iteration,
and to select the degree of shift and ROM address for each iteration.The design depicted in
Figure 1 uses word-wide data paths (called bit-parallel design). The bit-parallel variable shift
shifters do not map well to FPGA architectures because of the high fan-in required. If
implemented, those shifters will typically require several layers of logic (i.e., the signal will
need to pass through a number of FPGA cells). The result is a slow design that uses a large
number of logic cells.
A considerably more compact design is possible using bit serial arithmetic. The simplified
interconnect and logic in a bit serial design allows it to work at a much higher clock rate than
the equivalent bit parallel design. Of course, the design also needs to clocked w times for each
iteration (w is the width of the data word). The bit serial design consists of three bit serial adder-
subtractors, three shift registers and a serial Read Only Memory (ROM). Each shift register has a
length equal to the word width. There is also some gating or multiplexers to select taps
off the shift registers for the right shifted cross terms (shifting is accomplished using bit
delays in bit serial systems). In this design, w clocks are required for each of the n
iterations, where w is precision of the adders. In operation, the load multiplexers on the left are
opened for w clock periods to initialize the x, y and z registers (these registers could also be
parallel loaded to initialize). Once loaded, the data is shifted right through the serial adder-
subtractors and returned to the left end of the register. Each iteration requires w clocks to
return the result to the register. At the beginning of each iteration, the control state machine
reads the sign of the y (or z) register and sets the add/subtract controls accordingly.
The appropriate tap off the register for the cross terms is also selected at the beginning
of each iteration. During the nth iteration, the results can be read from the outputs of the
serial adders while the next initialization data is shifted into the registers
The simplicity of the bit serial design is apparent. Even in this case, the wiring of the
shift tap multiplexers can present problems in some FPGAs (this is one place where tri-state
long lines can come in handy). Even so, the interconnect is minimal and the logic between
registers is simple. This combination permits bit clock ratesnear the maximum toggle
frequency of the FPGA. The possibility of using extreme bit clock frequencies makes up for
the large number of clock cycles required to complete each rotation Now, if the design isin a
Xilinx 4000E series part, the shift registers can be implemented in the CLB RAM[2]. The
RAM emulates a shift register by incrementing the read/write address after each
access. The dual port capability of the CLB RAM provides the capability to read
two locations in the 16x1 RAM simultaneously [9]. By properly sequencing the second
address, the effect of the shift tap multiplexer is achieved without a physical
multiplexer. The result is the shift register and multiplexerfor word lengths up to 16 bits are
implemented in a single CLB (plus 8 CLBs for the 2 address sequencers and iteration
counter, which are shared by the three shifters). The serial ROM also uses the CLB for data
storage. One CLB is required for every two iterations. The 16 bit, 8 iteration CORDIC
processor shown in Figure 3 uses only21 CLBs, and will run at bit rates up to about 90
MHz (mainly limited by the RAM write cycle). This translates to about a 1.5S processing
time, which is only about three and a half times longer than the best one could expect from the
much larger bit parallel iterative solution
On-Line CORDIC Processo
The CORDIC processors discussed so far are iterative, which means the processor has to
perform iterations at ntimes the data rate. The iteration process can unrolled so that each of n
processing elements always performs the same iteration. An unrolled CORDIC processor is shown
in Figure 4. Unrolling the processor results in two significant implifications. First the shifters
are each a fixed shift,which means that they can be implemented in the wiring.
Second, the lookup values for the angle accumulator are distributed as constants to each
adder in the angleaccumulator chain. Those constants can be hardwired instead of
requiring storage space. The entire CORDIC processor is reduced to an array of
interconnected adder-subtractors. The need for registers is also eliminated, making the
unrolled processor strictly combinatorial. The delay through the resulting circuit would be
substantial, but the processing time is reduced from that required by the iterative circuit (if by
nothing else than the set-up and hold times of the register). Most times, especially in an FPGA, it
does not make sense to use such a large combinatorial circuit. The unrolled processor is
easily pipelined by inserting registers between the adder-subtractors. In the case of most
FPGA architectures there are already registers present in each logic cell, so the addition of the
pipeline registers has no hardware cost The unrolled processor can also be converted to a bit
serial design. Each adder subtractor is replaced by a serial adder-subtractor, separated by w bit
shift registers. The shift registers are necessary to extract the sign of the y or z element
before the first bits (lsbs) reach the next adder-subtractors. The right shifted cross terms are
taken from fixed taps in the shift registers. Some method of sign extension for the shifted
terms is required too
INTRODUCTION TO VERILOG HDL
What is HDL
A typical Hardware Description Language (HDL) supports a mixed-level description in
which gate and netlist constructs are used with functional descriptions. This mixed-level
capability enables you to describe system architectures at a high level of abstraction, then
incrementally refine a design’s detailed gate-level implementation.
HDL descriptions offer the following advantages:
• We can verify design functionality early in the design process. A design written as an HDL
description can be simulated immediately. Design simulation at this high level — at the
gate-level before implementation — allows you to evaluate architectural and design decisions.
• An HDL description is more easily read and understood than a netlist or schematic description.
HDL descriptions provide technology-independent documentation of a design and its
functionality. Because the initial HDL design description is technology independent, you
can use it again to generate the design in a different technology, without having to
translate it from the original technology.
• Large designs are easier to handle with HDL tools than schematic tools.
Verilog Overview :
Introduction
Verilog is a HARDWARE DESCRIPTION LANGUAGE (HDL). A hardware description
Language is a language used to describe a digital system, for example, a microprocessor or
a memory or a simple flip-flop. This just means that, by using a HDL one can describe any
hardware (digital ) at any level.
Verilog provides both behavioral and structural language structures. These structures allow
expressing design objects at high and low levels of abstraction. Designing hardware with a
language such as Verilog allows using software concepts such as parallel processing and
object-oriented programming. Verilog has a syntax similar to C and Pascal.
Design Styles
Verilog like any other hardware description language permits the designers to create a design in
either Bottom-up or Top-down methodology.
Bottom-Up Design
The traditional method of electronic design is bottom-up. Each design is performed at the
gate-level using the standard gates. With increasing complexity of new designs this
approach is nearly impossible to maintain. New systems consist of ASIC or
microprocessors with a complexity of thousands of transistors. These traditional bottom-up
designs have to give way to new structural, hierarchical design methods. Without these new
design practices it would be impossible to handle the new complexity.
Top-Down Design
The desired design-style of all designers is the top-down design. A real top-down design
allows early testing, easy change of different technologies, a structured system design and
offers many other advantages. But it is very difficult to follow a pure top-down design. Due
to this fact most designs are mix of both the methods, implementing some key elements of
both design styles.
Complex circuits are commonly designed using the top down methodology. Various
specification levels are required at each stage of the design process.
Abstraction Levels of Verilog
Verilog supports a design at many different levels of abstraction. Three of them are very
important:
Behavioral level
Register-Transfer Level
Gate Level
Behavioral level
This level describes a system by concurrent algorithms (Behavioral). Each algorithm itself is
sequential, that means it consists of a set of instructions that are executed one after the other.
Functions, Tasks and Always blocks are the main elements. There is no regard to the structural
realization of the design.
Register-Transfer Level
Designs using the Register-Transfer Level specify the characteristics of a circuit by operations
and the transfer of data between the registers. An explicit clock is used. RTL design contains
exact timing possibility; operations are scheduled to occur at certain times. Modern definition
of a RTL code is "Any code that is synthesizable is called RTL code".
Gate Level
Within the logic level the characteristics of a system are described by logical links and
their timing properties. All signals are discrete signals. They can only have definite logical
values (`0', `1', `X', `Z`). The usable operations are predefined logic primitives (AND, OR,
NOT etc gates). Using gate level modeling might not be a good idea for any level of logic
design. Gate level code is generated by tools like synthesis tools and this Netlist is used for gate
level simulation and for backend.
vlsi design flow
Introduction
Design is the most significant human endeavor: It is the channel through which creativity is
realized. Design determines our every activity as well as the results of those activities; thus it
includes planning, problem solving, and producing. Typically, the term "design" is applied
to the planning and production of artifacts such as jewelry, houses, cars, and cities. Design is
also found in problem-solving tasks such as mathematical proofs and games. Finally,
design is found in pure planning activities such as making a law or throwing a party.
More specific to the matter at hand is the design of manufacturable artifacts. This
activity uses all facets of design because, in addition to the specification of a producible
object, it requires the planning of that object's manufacture, and much problem solving
along the way. Design of objects usually begins with a rough sketch that is refined by
adding precise dimensions. The final plan must not only specify exact sizes, but also include a
scheme for ordering the steps of production. Additional considerations depend on the
production environment; for example, whether one or ten million will be made, and how
precisely the manufacturing environment can be controlled.
A semiconductor process technology is a method by which working circuits can be
manufactured from designed specifications. There are many such technologies, each of
which creates a different environment or style of design.
INTRODUCTION OF VLSI
Very-large-scale integration (VLSI) is the process of creating integrated
circuits by combining thousands of transistor-based circuits into a single chip. VLSI
began in the 1970s when complex semiconductor and communication
technologies were being developed. The microprocessor is a VLSI device. The
term is no longer as common as it once was, as chips have increased in complexity
into the hundreds of millions of transistors.
Overview
The first semiconductor chips held one transistor each. Subsequent
advances added more and more transistors, and, as a consequence, more
individual functions or systems were integrated over time. The first integrated
circuits held only a few devices, perhaps as many as ten diodes, transistors,
resistors and capacitors, making it possible to fabricate one or more logic gates on
a single device. Now known retrospectively as "small-scale integration" (SSI),
improvements in technique led to devices with hundreds of logic gates, known as
large-scale integration (LSI), i.e. systems with at least a thousand logic gates.
Current technology has moved far past this mark and today's microprocessors
have many millions of gates and hundreds of millions of individual transistors.
At one time, there was an effort to name and calibrate various levels
of large-scale integration above VLSI. Terms like Ultra-large-scale Integration
(ULSI) were used. But the huge number of gates and transistors available on
common devices has rendered such fine distinctions moot. Terms suggesting
greater than VLSI levels of integration are no longer in widespread use. Even VLSI
is now somewhat quaint, given the common assumption that all microprocessors
are VLSI or better.
As of early 2008, billion-transistor processors are commercially
available, an example of which is Intel's Montecito Itanium chip. This is expected
to become more commonplace as semiconductor fabrication moves from the
current generation of 65 nm processes to the next 45 nm generations (while
experiencing new challenges such as increased variation across process corners).
Another notable example is NVIDIA’s 280 series GPU.
This microprocessor is unique in the fact that its 1.4 Billion transistor
count, capable of a teraflop of performance, is almost entirely dedicated to logic
(Itanium's transistor count is largely due to the 24MB L3 cache). Current designs,
as opposed to the earliest devices, use extensive design automation and
automated logic synthesis to lay out the transistors, enabling higher levels of
complexity in the resulting logic functionality. Certain high-performance logic
blocks like the SRAM cell, however, are still designed by hand to ensure the
highest efficiency (sometimes by bending or breaking established design rules to
obtain the last bit of performance by trading stability).
What is VLSI?
VLSI stands for "Very Large Scale Integration". This is the
field which involves packing more and more logic devices into smaller and
smaller areas.
VLSI
1. Simply we say Integrated circuit is many transistors on one chip.
2. Design/manufacturing of extremely small, complex circuitry using
modified semiconductor material
3. Integrated circuit (IC) may contain millions of transistors, each a few mm
in size
4. Applications wide ranging: most electronic logic devices
History of Scale Integration
late 40s Transistor invented at Bell Labs
late 50s First IC (JK-FF by Jack Kilby at TI)
early 60s Small Scale Integration (SSI)
10s of transistors on a chip
late 60s Medium Scale Integration (MSI)
100s of transistors on a chip
early 70s Large Scale Integration (LSI)
1000s of transistor on a chip
early 80s VLSI 10,000s of transistors on a
chip (later 100,000s & now 1,000,000s)
Ultra LSI is sometimes used for 1,000,000s
SSI - Small-Scale Integration (0-102)
MSI - Medium-Scale Integration (102-103)
LSI - Large-Scale Integration (103-105)
VLSI - Very Large-Scale Integration (105-107)
ULSI - Ultra Large-Scale Integration (>=107)
Advantages of ICs over discrete components
While we will concentrate on integrated circuits , the
properties of integrated circuits-what we can and cannot efficiently put in an
integrated circuit-largely determine the architecture of the entire system.
Integrated circuits improve system characteristics in several critical ways. ICs have
three key advantages over digital circuits built from discrete components:
Size. Integrated circuits are much smaller-both transistors and wires
are shrunk to micrometer sizes, compared to the millimeter or
centimeter scales of discrete components. Small size leads to
advantages in speed and power consumption, since smaller
components have smaller parasitic resistances, capacitances, and
inductances.
Speed. Signals can be switched between logic 0 and logic 1 much
quicker within a chip than they can between chips. Communication
within a chip can occur hundreds of times faster than communication
between chips on a printed circuit board. The high speed of circuits
on-chip is due to their small size-smaller components and wires have
smaller parasitic capacitances to slow down the signal.
Power consumption. Logic operations within a chip also take much
less power. Once again, lower power consumption is largely due to
the small size of circuits on the chip-smaller parasitic capacitances
and resistances require less power to drive them.
VLSI and systems
These advantages of integrated circuits translate into advantages at the system
level:
Smaller physical size. Smallness is often an advantage in itself-
consider portable televisions or handheld cellular telephones.
Lower power consumption. Replacing a handful of standard parts
with a single chip reduces total power consumption. Reducing
power consumption has a ripple effect on the rest of the system:
a smaller, cheaper power supply can be used; since less power
consumption means less heat, a fan may no longer be necessary;
a simpler cabinet with less shielding for electromagnetic shielding
may be feasible, too.
Reduced cost. Reducing the number of components, the power
supply requirements, cabinet costs, and so on, will inevitably
reduce system cost. The ripple effect of integration is such that
the cost of a system built from custom ICs can be less, even
though the individual ICs cost more than the standard parts they
replace.
Understanding why integrated circuit technology has such profound influence on
the design of digital systems requires understanding both the technology of IC
manufacturing and the economics of ICs and digital systems.
Applications
Electronic system in cars.
Digital electronics control VCRs
Transaction processing system, ATM
Personal computers and Workstations
Medical electronic systems.
Etc….
Applications of VLSI
Electronic systems now perform a wide variety of tasks in daily life.
Electronic systems in some cases have replaced mechanisms that operated
mechanically, hydraulically, or by other means; electronics are usually smaller,
more flexible, and easier to service. In other cases electronic systems have
created totally new applications. Electronic systems perform a variety of tasks,
some of them visible, some more hidden:
Personal entertainment systems such as portable MP3 players
and DVD players perform sophisticated algorithms with
remarkably little energy.
Electronic systems in cars operate stereo systems and displays;
they also control fuel injection systems, adjust suspensions to
varying terrain, and perform the control functions required for
anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-
definition data rates, on-the-fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated
electronics, despite their dedicated function.
Personal computers and workstations provide word-processing,
financial analysis, and games. Computers include both central
processing units (CPUs) and special-purpose hardware for disk
access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform
complex processing algorithms to warn about unusual conditions.
The availability of these complex systems, far from overwhelming
consumers, only creates demand for even more complex systems.
The growing sophistication of applications continually pushes the design and
manufacturing of integrated circuits and electronic systems to new levels of
complexity. And perhaps the most amazing characteristic of this collection of
systems is its variety-as systems become more complex, we build not a few
general-purpose computers but an ever wider range of special-purpose systems.
Our ability to do so is a testament to our growing mastery of both integrated
circuit manufacturing and design, but the increasing demands of customers
continue to test the limits of design and manufacturing
ASIC
An Application-Specific Integrated Circuit (ASIC) is an
integrated circuit (IC) customized for a particular use, rather than intended for
general-purpose use. For example, a chip designed solely to run a cell phone is an
ASIC. Intermediate between ASICs and industry standard integrated circuits, like
the 7400 or the 4000 series, are application specific standard products (ASSPs).
As feature sizes have shrunk and design tools improved over
the years, the maximum complexity (and hence functionality) possible in an ASIC
has grown from 5,000 gates to over 100 million. Modern ASICs often include
entire 32-bit processors, memory blocks including ROM, RAM, EEPROM, Flash and
other large building blocks. Such an ASIC is often termed a SoC (system-on-a-
chip). Designers of digital ASICs use a hardware description language (HDL), such
as Verilog or VHDL, to describe the functionality of ASICs.
Field-programmable gate arrays (FPGA) are the modern-day technology for
building a breadboard or prototype from standard parts; programmable logic
blocks and programmable interconnects allow the same FPGA to be used in many
different applications. For smaller designs and/or lower production volumes,
FPGAs may be more cost effective than an ASIC design even in production.
1. An application-specific integrated circuit (ASIC) is an integrated circuit
(IC) customized for a particular use, rather than intended for general-
purpose use.
2. A Structured ASIC falls between an FPGA and a Standard Cell-based
ASIC
3. Structured ASIC’s are used mainly for mid-volume level designs
4. The design task for structured ASIC’s is to map the circuit into a fixed
arrangement of known cells
Migrating Projects from Previous ISE Software Releases
When you open a project file from a previous release, the ISE® software prompts you to migrate
your project. If you click Backup and Migrate or Migrate Only, the software automatically
converts your project file to the current release. If you click Cancel, the software does not
convert your project and, instead, opens Project Navigator with no project loaded.
Note: After you convert your project, you cannot open it in previous versions of the ISE
software, such as the ISE 11 software. However, you can optionally create a backup of the
original project as part of project migration, as described below.
To Migrate a Project
1. In the ISE 12 Project Navigator, select File > Open Project.
2. In the Open Project dialog box, select the .xise file to migrate.
Note You may need to change the extension in the Files of type field to display .npl
(ISE 5 and ISE 6 software) or .ise (ISE 7 through ISE 10 software) project files.
3. In the dialog box that appears, select Backup and Migrate or Migrate Only.
4. The ISE software automatically converts your project to an ISE 12 project.
Note If you chose to Backup and Migrate, a backup of the original project is created at
project_name_ise12migration.zip.
5. Implement the design using the new version of the software.
Note Implementation status is not maintained after migration.
Properties
For information on properties that have changed in the ISE 12 software, see ISE 11 to ISE 12
Properties Conversion.
IP Modules
If your design includes IP modules that were created using CORE Generator™ software or
Xilinx® Platform Studio (XPS) and you need to modify these modules, you may be required to
update the core. However, if the core netlist is present and you do not need to modify the core,
updates are not required and the existing netlist is used during implementation.
Obsolete Source File Types
The ISE 12 software supports all of the source types that were supported in the ISE 11
software.
If you are working with projects from previous releases, state diagram source files (.dia), ABEL
source files (.abl), and test bench waveform source files (.tbw) are no longer supported. For state
diagram and ABEL source files, the software finds an associated HDL file and adds it to the
project, if possible. For test bench waveform files, the software automatically converts the TBW
file to an HDL test bench and adds it to the project. To convert a TBW file after project
migration, see Converting a TBW File to an HDL Test Bench.
Migrating Projects from Previous ISE Software Releases
When you open a project file from a previous release, the ISE® software prompts you to migrate
your project. If you click Backup and Migrate or Migrate Only, the software automatically
converts your project file to the current release. If you click Cancel, the software does not
convert your project and, instead, opens Project Navigator with no project loaded.
Note After you convert your project, you cannot open it in previous versions of the ISE
software, such as the ISE 11 software. However, you can optionally create a backup of the
original project as part of project migration, as described below.
To Migrate a Project
1. In the ISE 12 Project Navigator, select File > Open Project.
2. In the Open Project dialog box, select the .xise file to migrate.
Note You may need to change the extension in the Files of type field to display .npl
(ISE 5 and ISE 6 software) or .ise (ISE 7 through ISE 10 software) project files.
3. In the dialog box that appears, select Backup and Migrate or Migrate Only.
4. The ISE software automatically converts your project to an ISE 12 project.
Note If you chose to Backup and Migrate, a backup of the original project is created at
project_name_ise12migration.zip.
5. Implement the design using the new version of the software.
Note Implementation status is not maintained after migration.
Properties
For information on properties that have changed in the ISE 12 software, see ISE 11 to ISE 12
Properties Conversion.
IP Modules
If your design includes IP modules that were created using CORE Generator™ software or
Xilinx® Platform Studio (XPS) and you need to modify these modules, you may be required to
update the core. However, if the core netlist is present and you do not need to modify the core,
updates are not required and the existing netlist is used during implementation.
Obsolete Source File Types
The ISE 12 software supports all of the source types that were supported in the ISE 11
software.
If you are working with projects from previous releases, state diagram source files (.dia), ABEL
source files (.abl), and test bench waveform source files (.tbw) are no longer supported. For state
diagram and ABEL source files, the software finds an associated HDL file and adds it to the
project, if possible. For test bench waveform files, the software automatically converts the TBW
file to an HDL test bench and adds it to the project. To convert a TBW file after project
migration, see Converting a TBW File to an HDL Test Bench.
Using ISE Example Projects
To help familiarize you with the ISE® software and with FPGA and CPLD designs, a set of
example designs is provided with Project Navigator. The examples show different design
techniques and source types, such as VHDL, Verilog, schematic, or EDIF, and include different
constraints and IP.
To Open an Example
1. Select File > Open Example.
2. In the Open Example dialog box, select the Sample Project Name.
Note To help you choose an example project, the Project Description field describes
each project. In addition, you can scroll to the right to see additional fields, which
provide details about the project.
3. In the Destination Directory field, enter a directory name or browse to the
directory.
4. Click OK.
The example project is extracted to the directory you specified in the Destination Directory
field and is automatically opened in Project Navigator. You can then run processes on the
example project and save any changes.
Note If you modified an example project and want to overwrite it with the original example
project, select File > Open Example, select the Sample Project Name, and specify the same
Destination Directory you originally used. In the dialog box that appears, select Overwrite the
existing project and click OK.
Creating a Project
Project Navigator allows you to manage your FPGA and CPLD designs using an ISE® project,
which contains all the source files and settings specific to your design. First, you must create a
project and then, add source files, and set process properties. After you create a project, you can
run processes to implement, constrain, and analyze your design. Project Navigator provides a
wizard to help you create a project as follows.
Note If you prefer, you can create a project using the New Project dialog box instead of the
New Project Wizard. To use the New Project dialog box, deselect the Use New Project wizard
option in the ISE General page of the Preferences dialog box.
To Create a Project
1. Select File > New Project to launch the New Project Wizard.
2. In the Create New Project page, set the name, location, and project type, and
click Next.
3. For EDIF or NGC/NGO projects only: In the Import EDIF/NGC Project page,
select the input and constraint file for the project, and click Next.
4. In the Project Settings page, set the device and project properties, and click
Next.
5. In the Project Summary page, review the information, and click Finish to
create the project.
Project Navigator creates the project file (project_name.xise) in the directory you specified.
After you add source files to the project, the files appear in the Hierarchy pane of the Design
panel. Project Navigator manages your project based on the design properties (top-level
module type, device type, synthesis tool, and language) you selected when you created the
project. It organizes all the parts of your design and keeps track of the processes necessary to
move the design from design entry through implementation to programming the targeted
Xilinx® device.
Note For information on changing design properties, see Changing Design Properties.
You can now perform any of the following:
Create new source files for your project.
Add existing source files to your project.
Run processes on your source files.
Modify process properties.
Creating a Copy of a Project
You can create a copy of a project to experiment with different source options and
implementations. Depending on your needs, the design source files for the copied project and
their location can vary as follows:
Design source files are left in their existing location, and the copied project
points to these files.
Design source files, including generated files, are copied and placed in a
specified directory.
Design source files, excluding generated files, are copied and placed in a
specified directory.
Copied projects are the same as other projects in both form and function. For example, you can
do the following with copied projects:
Open the copied project using the File > Open Project menu command.
View, modify, and implement the copied project.
Use the Project Browser to view key summary data for the copied project and
then, open the copied project for further analysis and implementation, as described in
Using the Project Browser.
Note Alternatively, you can create an archive of your project, which puts all of the project
contents into a ZIP file. Archived projects must be unzipped before being opened in Project
Navigator. For information on archiving, see Creating a Project Archive.
To Create a Copy of a Project
1. Select File > Copy Project.
2. In the Copy Project dialog box, enter the Name for the copy.
Note The name for the copy can be the same as the name for the project, as long as you
specify a different location.
3. Enter a directory Location to store the copied project.
4. Optionally, enter a Working directory.
By default, this is blank, and the working directory is the same as the project directory.
However, you can specify a working directory if you want to keep your ISE® project
file (.xise extension) separate from your working area.
5. Optionally, enter a Description for the copy.
The description can be useful in identifying key traits of the project for reference later.
6. In the Source options area, do the following:
o Select one of the following options:
o Keep sources in their current locations - to leave the design
source files in their existing location.
If you select this option, the copied project points to the files in their
existing location. If you edit the files in the copied project, the changes
also appear in the original project, because the source files are shared
between the two projects.
o Copy sources to the new location - to make a copy of all the
design source files and place them in the specified Location directory.
If you select this option, the copied project points to the files in the specified
directory. If you edit the files in the copied project, the changes do not appear in
the original project, because the source files are not shared between the two
projects.
o Optionally, select Copy files from Macro Search Path directories to
copy files from the directories you specify in the Macro Search Path property in
the Translate Properties dialog box. All files from the specified directories are
copied, not just the files used by the design.
Note If you added a netlist source file directly to the project as described in
Working with Netlist-Based IP, the file is automatically copied as part of Copy
Project because it is a project source file. Adding netlist source files to the project
is the preferred method for incorporating netlist modules into your design,
because the files are managed automatically by Project Navigator.
o Optionally, click Copy Additional Files to copy files that were not
included in the original project. In the Copy Additional Files dialog box, use the
Add Files and Remove Files buttons to update the list of additional files to copy.
Additional files are copied to the copied project location after all other files are
copied.
7. To exclude generated files from the copy, such as implementation results and
reports, select Exclude generated files from the copy.
When you select this option, the copied project opens in a state in which processes have
not yet been run.
8. To automatically open the copy after creating it, select Open the copied project.
Note By default, this option is disabled. If you leave this option disabled, the original
project remains open after the copy is made.
Click OK.
Creating a Project Archive
A project archive is a single, compressed ZIP file with a .zip extension. By default, it contains all
project files, source files, and generated files, including the following:
User-added sources and associated files
Remote sources
Verilog `include files
Files in the macro search path
Generated files
Non-project files
To Archive a Project
1. Select Project > Archive.
2. In the Project Archive dialog box, specify a file name and directory for the ZIP
file.
3. Optionally, select Exclude generated files from the archive to exclude
generated files and non-project files from the archive.
4. Click OK.
A ZIP file is created in the specified directory. To open the archived project, you must first
unzip the ZIP file, and then, you can open the project.
Note Sources that reside outside of the project directory are copied into a remote_sources
subdirectory in the project archive. When the archive is unzipped and opened, you must either
specify the location of these files in the remote_sources subdirectory for the unzipped project, or
manually copy the sources into their original location.
CONCLUSION
The CORDIC algorithms presented in this paper are well known in the research and super-computing circles.
It is, however, my experience that the majority of today’s hardware DSP designs are being done by
engineers with little or no background in hardware efficient DSP
algorithms. The new DSP designers must become familiar with these algorithms and the techniques for
implementing them in FPGAs in order to remain competitive. The CORDIC algorithm is a powerful tool in
the DSP toolbox. This paper shows that tool is available for use in FPGA
based computing machines, which are the likely basis for the next generation DSP systems.
REFERENCES
[1] Ahmed, H. M., Delosme, J.M., and Morf, M., "Highly Concurrent Computing Structure for
Matrix Arithmetic and Signal Processing," IEEE Comput. Mag., Vol. 15, 1982, pp. 65-82.
[2] Alfke, P., “Efficient Shift Registers, LFSR Counters, and Long Pseudo Random Sequence
Generators,” Xilinx application note, August, 1995.
[3] Andraka, R. J., “Building a High Performance Bit-Serial Processor in an FPGA,”
Proceedings of Design SuperCon '96, Jan 1996. pp5.1 - 5.21
[4] Deprettere, E., Dewilde, P., and Udo, R., "Pipelined CORDIC Architecture for Fast VLSI
Filtering and Array rocessing," Proc. ICASSP'84, 1984, pp. 41.A.6.1-41.A.6.4
[5] Despain, A.M., "Fourier Transform Computations Using CORDIC Iterations," IEEE
Transactions on Computers, Vol.23, 1974, pp. 993-1001.
[6] Duh, W.J., and Wu, J.L., "Implementing the Discrete Cosine Transform by Using
CORDIC Techniques,"Proceedings the International Symposium on VLSI Technology,
Systems and Applications, Taipei, Taiwan, 1989, pp. 281-285
[7] Duprat, J. and Muller, J.M., "The CORDIC Algorithm: New Results for Fast VLSI
Implementation," IEEE Transactions on Computers, Vol. 42, pp. 168-178, 1993
8] Hsiao, S.F. and Delosme, J.M., "The CORDIC Householder Algorithm," Proceedings of the
10th Symposium on Computer Arithmetic, pp. 256-263, 1991. [9] Hu, Y.H., and Naganathan, S.,
"A Novel Implementation of Chirp Z-Transformation Using a CORDIC Processor," IEEE
Transactions on ASSP, Vol. 38, pp. 352-354, 1990.
[10] Hu, Y.H., and Naganathan, S., "An Angle Recoding Method for CORDIC Algorithm
Implementation", IEEE Transactions on Computers, Vol. 42, pp. 99-102, January
1993