kv nptel digital design module 2nptel.ac.in/courses/117108040/downloads/digital system...
Digital System Design with PLDs and FPGAs
Advanced Digital System Design
Kuruvilla Varghese
DESE
Indian Institute of Science
Synchronous Sequential Circuit
• Structure
• Design
• Timing analysis
Synchronous Counter
• Mod 6 Counter (0, 1, 2, 3, 4, 5, 0, …)
Present state   Next state
Q2 Q1 Q0        D2 D1 D0
0  0  0         0  0  1
0  0  1         0  1  0
…               …
1  0  1         0  0  0
[Figure: next state logic driving the D inputs of the state flip-flops (CK, Q, AR); NS = next state, PS = present state; Clock and Reset inputs]
Di = Fi (Q2, Q1, Q0)
NS = Funct (PS)
Mod 6 Counter (0, 5, 3, 2, 1, 4, 0 …) ?
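A minimal VHDL sketch of the Mod 6 counter above, assuming an active-high asynchronous reset. The names mod6_counter, ps and ns are illustrative and do not come from the slides. The concurrent assignment to ns plays the role of the next state logic, NS = Funct (PS); a non-sequential count such as 0, 5, 3, 2, 1, 4 would change only that assignment.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mod6_counter is            -- hypothetical entity name
  port (
    clk : in  std_logic;
    rst : in  std_logic;                     -- asynchronous reset (AR), assumed active high
    q   : out std_logic_vector(2 downto 0)   -- present state Q2 Q1 Q0
  );
end entity mod6_counter;

architecture rtl of mod6_counter is
  signal ps, ns : unsigned(2 downto 0);      -- present state, next state
begin
  -- Next state logic: NS = Funct (PS)
  ns <= "000" when ps = 5 else ps + 1;

  -- State flip-flops
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= (others => '0');
    elsif rising_edge(clk) then
      ps <= ns;
    end if;
  end process;

  q <= std_logic_vector(ps);
end architecture rtl;
```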
Output Waveform
• There are 3 tco values, one each for Q2, Q1 and Q0; the worst case is taken
[Waveform: CLK and PS, with PS taking the values 0, 1, 2, 3 a delay tco after each active clock edge]
Detailed and Next Level
[Figure: detailed view with three D flip-flops (outputs Q2, Q1, Q0) sharing CLK and RST, shown beside the block-level view of next state logic and state register (NS, PS, D, CK, Q, AR)]
Mod-6 Counter with UP-DN/
[Figure: next state logic with an UP-DN/ input, driving the state register (D, CK, Q, AR; Clock, Reset); NS, PS]
Mod-6 Counter with input UP-DN/
Inputs   Present state   Next state
UP-DN/   Q2 Q1 Q0        D2 D1 D0
0        0  0  0         1  0  1
0        0  0  1         0  0  0
…        …               …
0        1  0  1         1  0  0
1        0  0  0         0  0  1
1        0  0  1         0  1  0
…        …               …
1        1  0  1         0  0  0
Di = Fi (Q2, Q1, Q0, UP-DN/)
NS = Funct (PS, Inputs)
• Inputs
• Count-by-2
• Reset
• Skip-3
• Load, din 2:0
• Synchronous Inputs
• Transferred to output on Clock
• Goes to the Input (D) of FF
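The up/down table above can be sketched in VHDL as follows. This is a hedged illustration: the entity name, the port name up_dn (standing for UP-DN/, with '1' meaning count up) and the active-high asynchronous reset are assumptions. The point it shows is that a synchronous input only enters the next state logic, NS = Funct (PS, Inputs), and takes effect on the clock edge.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mod6_updown is             -- hypothetical entity name
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                     -- asynchronous reset, assumed active high
    up_dn : in  std_logic;                     -- UP-DN/: '1' = count up, '0' = count down
    q     : out std_logic_vector(2 downto 0)
  );
end entity mod6_updown;

architecture rtl of mod6_updown is
  signal ps, ns : unsigned(2 downto 0);
begin
  -- Next state logic: Di = Fi (Q2, Q1, Q0, UP-DN/)
  process (ps, up_dn)
  begin
    if up_dn = '1' then
      if ps = 5 then ns <= "000"; else ns <= ps + 1; end if;
    else
      if ps = 0 then ns <= "101"; else ns <= ps - 1; end if;
    end if;
  end process;

  -- State flip-flops: the input is transferred to the output on the clock
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= (others => '0');
    elsif rising_edge(clk) then
      ps <= ns;
    end if;
  end process;

  q <= std_logic_vector(ps);
end architecture rtl;
```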
Asynchronous
• Can we remove the flip-flops and use buffers instead?
[Figure: next state logic with state register (D, CK, Q, AR; CLK, RST; NS, PS); example state codes 001, 011, 010]
Asynchronous
• Yes
• Unbalanced Path delays
• Races
• Difficult to design / control
• Fast
Maximum Frequency
[Figure: next state logic and state register (D, CK, Q, AR) with NS, PS, Clock and Reset]
Min Clock period / Max frequency
Tclk(min) > tco(max) + tcomb(max) + ts(max)
fmax < 1 / Tclk(min)
slack = Tclk(min) – (tco(max) + tcomb(max) + ts(max))
[Waveform: CLK, PS and NS over one clock period Tclk, showing tco, tcomb, ts and th]
Avoid Hold time Violation
tco(min) + tcomb(min) > th(max)
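A worked instance of the two conditions, using made-up delay values chosen only for illustration (they are not from the slides): tco(max) = 2 ns, tcomb(max) = 5 ns, ts(max) = 1 ns, tco(min) = 1 ns, tcomb(min) = 0.5 ns, th(max) = 0.5 ns.

```latex
\begin{align*}
T_{clk(min)} &> t_{co(max)} + t_{comb(max)} + t_{s(max)} = 2 + 5 + 1 = 8\ \text{ns} \\
f_{max} &< \frac{1}{8\ \text{ns}} = 125\ \text{MHz} \\
t_{co(min)} + t_{comb(min)} &= 1 + 0.5 = 1.5\ \text{ns} > t_{h(max)} = 0.5\ \text{ns}
\end{align*}
```

With these assumed numbers the hold condition is met, and a 10 ns clock would leave a slack of 2 ns.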
Max Frequency / Hold time Violation
• Earlier we have seen the basic timing parameter of a combinational circuit, i.e. tpd. Similarly, the basic timing parameters of flip-flops are ts, th and tco/tcq. All the other timing relations/parameters of digital circuits are built on these.
• For sequential circuits, the basic timing parameters are the minimum clock period / maximum frequency and the condition for avoiding hold time violation
• Our analysis assumes that the clock reaches all the flip-flops at the same instant (i.e. there is no clock skew). We will analyze the case with clock skew later.
Max Frequency / Hold time Violation
• When the minimum clock period condition is violated, it can be met by increasing the clock period
• When there is a hold time violation, you need to increase the combinational delay.
• Since in a flip-flop tco is greater than th, a hold time violation cannot happen under our default assumption of no clock skew.
• But it can happen when there is clock skew, and is most probable where the combinational delay is small or zero, as in a shift register
Number of Paths
• In the case of the Mod-6 counter there are 3 flip-flops. The total number of possible register to register paths is 9
• i.e. from each Qi to each Dj for i, j: 0, 1, 2
• In general, the timing of any register to register path follows the same pattern; it need not be a synchronous counter
• e.g. a source register holding some data which goes to a combinational circuit for some computation, with the output (result) of the combinational circuit registered in a destination register
Register to Register Path
Tclk(min) = tco(max) + tcomb(max) + ts(max) + slack
tco(min) + tcomb(min) > th(max)
[Figure: source register and destination register (D, Q, CK) with combinational logic (Comb) between them, both clocked by CLK]
Setup, Hold Times with skew
[Figure: flip-flop with a 2 ns delay in the data path between D' and D; at D, ts = 2 ns and th = 1 ns; at D', ts' = 4 ns and th' = -1 ns; waveforms of CLK, D and D']
Setup, Hold Times with skew
• Most often, the setup and hold times of flip-flops or registers need to be analyzed with respect to a pin or the output of another register.
• When there is a delay t in the path to the D input, the setup time with respect to the new reference point D' is increased by t and the hold time is decreased by t.
• In this case, the hold time can take a negative value. A hold time of –t means that at point D’, the data can be removed or changed a time t before the active clock edge.
• Note: Setup time is defined as the time before the clock edge by which the data has to be set up. So, for setup time a positive value points backward from the clock edge, and a negative value points forward from the clock edge. For hold time the reverse applies.
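The rule for a delay t in the data path, written out with the values from the figure (t = 2 ns, ts = 2 ns, th = 1 ns):

```latex
\begin{align*}
t_s' &= t_s + t = 2 + 2 = 4\ \text{ns} \\
t_h' &= t_h - t = 1 - 2 = -1\ \text{ns}
\end{align*}
```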
Setup, Hold Times with skew
[Figure: flip-flop with a 3 ns delay in the clock path between CLK' and CLK; at CLK, ts = 2 ns and th = 1 ns; with respect to CLK', th' = 4 ns and ts' = -1 ns; waveforms of CLK, D and CLK']
Setup, Hold Times with skew
• When there is a delay t in the path to CLK input, the setup
time with respect to new reference point CLK’ is
decreased by t and hold time is increased by t.
• In this case, setup time can take a negative value. A setup
time of –t means that at point CLK’, the data can be setup
t time after the active clock edge.
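The rule for a delay t in the clock path, written out with the values from the figure (t = 3 ns, ts = 2 ns, th = 1 ns):

```latex
\begin{align*}
t_s' &= t_s - t = 2 - 3 = -1\ \text{ns} \\
t_h' &= t_h + t = 1 + 3 = 4\ \text{ns}
\end{align*}
```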
Flat Design: 60 Seconds Timer
[Figure: POR and Clock Divider feeding a BCD counter and a Mod 6 counter, each followed by a BCD to 7-segment decoder driving a 7-segment LED]
Design Issues
• Accuracy
– Clock Frequency
• Area
– Clock frequency, Divider
– BCD, Mod-6 or Mod-60 counter ?
• Timing
– Max frequency – Divider
• Electrical Specifications
– 7 Segment LED driving
CPU Specifications
• Example 8-bit microprocessor
8-bit ALU, data bus D7 – D0
Four 8-bit registers
16 instructions (4-bit opcode, 2 bits each for source and destination register)
64 KB address space, address lines A15 – A0
16-bit program counter, 16-bit stack pointer
No separate I/O space
Controller: hard-wired (not micro-coded)
De-multiplexed address and data bus
1 interrupt
CPU Design
• Partition: Functional blocks with interfaces
(signals)
• Top-down Design
• At each level
Specifications / Functional description
Timing specifications
Electrical specifications
Top-down Design
• Complex designs cannot be done in one shot; one needs to partition the design into logical blocks, each of which may have to be further partitioned, till one ends up with basic blocks like muxes, adders, incrementers, registers, decoders etc.
• This calls for domain knowledge for proper partitioning, for identifying the interfaces, and for deciding the timing details at the interfaces
CPU Level 0
• Functional: Multi-cycle Execution, Instruction set, …
• Timing: Bus Cycle, Interrupt, Clock, Reset, …
• Electrical: Bus driving
[Figure: CPU block with signals CLK, RST, INTR, RD/, WR/, A15:0 and D7:0]
CPU Level 1
[Figure: CPU Level 1 block diagram. Blocks: INST REG, INST DEC, CONTROLLER (CLK, RST, INTR), REG A, REG B, REG C, REG D, TR1, TR2, ALU, PC and SP, on Data Bus D7:0, with the address output A15:0. Control signals from the controller:
IR_L, TR1_L, TR2_L
AL_S, AL_E
RA_L, RB_L, RC_L, RD_L
RA_E, RB_E, RC_E, RD_E
PC_IS, PC_L0, PC_L1, PC_OS, PC_E
SP_IS, SP_L0, SP_L1, SP_OS, SP_E, AD_S]
CPU Level 1
• Data Path
– Registers, Combinational Circuit
• Controller
– Finite State Machine (FSM)
– Registers, Combinational Circuit
Datapath
• The datapath is where data movement and computation happen. It usually comprises registers to hold the input/intermediate/final data values and combinational circuits implementing all the computations
• In the case of the CPU, the register file, the ALU with registers at its inputs, the Program Counter block and the Stack Pointer block form the datapath
• The controller provides the timing or control signals: to enable various outputs of registers, give latch signals to registers, specify the operation of combinational circuits, select the various paths through which data moves (through muxes) etc.
• The controller does no computation; it merely provides the timing signals
• This helps us to partition individual blocks: at the point of partitioning we need not bother about the sequence of operations or its timing, and need to concentrate only on the functionality in terms of computation, data movement etc.
Controller
• The controller does no computation; it merely provides the timing signals
• A single controller can provide the control signals for all the blocks which work synchronously
• A separate controller is required if there is another block whose operation is not synchronous to this block
• Also, even if all the blocks work synchronously to each other, multiple controllers or a hierarchy of controllers may be used to manage complexity
CPU Level 2: Registers
• Registers use flip-flops. The main control we require is to enable the register to latch the input data at the proper instant.
• Normally this is done on an active clock edge qualified by a level control signal from the controller (i.e. input data is latched into the register upon the active clock edge while the control signal, e.g. RA_L, is high).
• Such a scheme allows continuous latching of input data if the control signal is kept high.
CPU Level 2: Registers
[Figure: CPU Level 1 block diagram, repeated]
CPU Level 2: Registers
[Figure: Register A with signals D7:0, RA_E, RA_L and CLK; RA_L and CLK are combined in an AND gate to drive CK of the flip-flops (D, Q); tri-state gates enabled by RA_E]
8-bit Register (8, Edge triggered flip-flops)
8 Tri-state gates
1, 2-input AND gate
Note: There is a timing issue in this scheme
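CPU Level 2: Registers
[Figure: Register A with signals D7:0, RA_E, RA_L and CLK; CLK drives CK directly, and a 2 to 1 multiplexer (inputs 0 and 1) controlled by RA_L feeds the D input; tri-state gates enabled by RA_E]
8-bit Register (8, Edge triggered flip-flops)
8 Tri-state gates
1, 8-bit 2 to 1 Multiplexer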
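A hedged VHDL sketch of the multiplexer-based register scheme listed above. The entity name reg_a and the use of a single inout port for the shared bus D7:0 are assumptions made for illustration; the clock reaches the flip-flops ungated, and RA_L only selects between holding the old value and loading from the bus.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_a is                   -- hypothetical entity name
  port (
    clk  : in    std_logic;
    ra_l : in    std_logic;                     -- latch (load) control from the controller
    ra_e : in    std_logic;                     -- output enable from the controller
    d    : inout std_logic_vector(7 downto 0)   -- shared data bus D7:0 (assumed bidirectional)
  );
end entity reg_a;

architecture rtl of reg_a is
  signal q : std_logic_vector(7 downto 0);      -- 8 edge triggered flip-flops
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if ra_l = '1' then
        q <= d;        -- multiplexer selects the bus
      end if;          -- otherwise q is re-circulated (multiplexer selects q)
    end if;
  end process;

  -- 8 tri-state gates onto the bus
  d <= q when ra_e = '1' else (others => 'Z');
end architecture rtl;
```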
Program Counter
• PC is incremented at every clock cycle, hence we need a 16-bit incrementer
• The PC output drives the address bus along with the stack pointer, so a 16-bit 2 to 1 multiplexer is required
• On instructions like jump, the address on the data bus has to be loaded into the PC. The data bus being 8-bit, this has to be done in 2 steps, which calls for the PC register to be two 8-bit registers with separate latch signals
• At reset and on interrupt, the PC has to be loaded with specific addresses; this calls for path selection at the input
Program Counter
• On instructions like call, the PC value has to go to the data bus (to memory), hence an 8-bit 2 to 1 mux is required at the output
• We don’t need to worry about the sequence of operations or the timing of the signals at this point; we need to identify the data movement and the various operations to be able to do the next level design
• Sequencing and timing will be done while designing the controller
CPU Level 2: Program Counter
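[Figure: Program counter. Labels: PC-RST, PC-INT and D7:0 at the input selection (PC-IS); registers PC(0), PC(1) and PCS(0) clocked by CLK with latch signals PC_L0 and PC_L; +1 incrementer; output to D7:0 through PC-OS and PC-E; output to A15:0 selected by AD-S against the value from SP]
CPU Level 2: Program Counter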
• 3, 8-bit Registers
• 2, 8-bit 4 to 1 Multiplexers
• 1, 16 bit Incrementer
• 1, 16-bit 2 to 1 Multiplexers
• 1, 8-bit 2 to 1 Multiplexers
• 8 Tri-state gates
Design
• Flat Design
• Top-down design
• Bottom-up Design
• Functionality
• Timing
• Electrical Characteristics
• Power Dissipation
Controller
[Figure: CPU Level 1 block diagram, repeated]
Controller
• Instruction: Add A, B (A <= A + B)
– Move A to TR1
• Enable A output,
• Give latch signal to Register TR1, Disable A Output
– Move B to TR2
– Select ALU operation (Part of instruction can select
directly)
– Wait
– Move ALU output to Register A
Controller
• This identifies the macro steps, and the micro steps within each macro step
• A single pulse on signal RA_E can enable and then disable the output of Register A
• The same pulse can be used to latch the data into register TR1 (TR1_L)
Controller Timing
[Waveform: CLK; RA_E, TR1_L; RB_E, TR2_L; AL_E, RA_L, pulsed one after another]
Controller
• Can’t be a combinational circuit
– A combinational circuit cannot produce a sequence at its output for a single input such as an instruction
• Sequential circuit
– What type of sequential circuit can generate such pulses?
Counter
• Can we use a counter to generate the timing pulses?
• Mod 3 counter?
[Figure: next state logic (Inputs, PS in; NS out), state register (D, CK, Q, AR; Clock, Reset), and output logic producing Outputs]
Output Logic
RA_E = f1 (Q1, Q0)
Similarly other outputs
Output = F (PS)
Present State   Outputs
Q1 Q0           RA_E TR1_L RB_E TR2_L AL_E RA_L
0  0            1    1     0    0     0    0
0  1            0    0     1    1     0    0
1  0            0    0     0    0     1    1
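The Mod 3 counter with decoded outputs can be sketched in VHDL as below. This is only an illustration under assumptions: the entity name add_ctrl, the free-running count (no instruction input) and the active-high asynchronous reset are not from the slides. The output assignments follow the output table row by row, Output = F (PS).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add_ctrl is                -- hypothetical entity name
  port (
    clk, rst     : in  std_logic;
    ra_e, tr1_l  : out std_logic;
    rb_e, tr2_l  : out std_logic;
    al_e, ra_l   : out std_logic
  );
end entity add_ctrl;

architecture rtl of add_ctrl is
  signal ps : unsigned(1 downto 0);           -- present state Q1 Q0
begin
  -- Mod 3 counter: 00 -> 01 -> 10 -> 00
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= "00";
    elsif rising_edge(clk) then
      if ps = 2 then ps <= "00"; else ps <= ps + 1; end if;
    end if;
  end process;

  -- Output logic decoded from the present state only
  ra_e  <= '1' when ps = "00" else '0';
  tr1_l <= '1' when ps = "00" else '0';
  rb_e  <= '1' when ps = "01" else '0';
  tr2_l <= '1' when ps = "01" else '0';
  al_e  <= '1' when ps = "10" else '0';
  ra_l  <= '1' when ps = "10" else '0';
end architecture rtl;
```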
Controller Timing
[Waveform: CLK; RA_E, TR1_L; RB_E, TR2_L; AL_E, RA_L]
Generic ?
• Can we generate whatever timing signals are required?
[Waveform: CLK; RA_E, TR1_L; RB_E, TR2_L; AL_E, RA_L]
• Yes
RA_E = Q1/ . Q0/ + Q1 . Q0/
RB_E = Q1 EXOR Q0
State Assignment
• Do the counter's state changes need to be in counting order?
[Figure: next state logic, state register (D, CK, Q, AR; Clock, Reset) and output logic; Inputs, NS, PS, Outputs]
Generic ?
• Do the counter's state changes need to be in counting order?
– We care only about inputs and outputs
– State assignment
– Could be of some use
• What does it mean to design an FSM?
– Inputs, state transitions, outputs
– Next state table, output table
FSM Idea
• Finite State Machine (FSM)
– A Counter goes through various states.
– States are decoded to generate various pulses to control
the data path. (e.g. select a mux, clock a latch, clock
enable to a register etc.)
Moore / Mealy
• Moore and Mealy outputs
– Some outputs are decoded from the present state alone, and they are called Moore outputs
– Some outputs are decoded from the present state and the inputs, and they are called Mealy outputs
FSM: 3 Blocks view
[Figure: three-block FSM: next state logic (Inputs, PS in; NS out), state register (D, CK, Q, AR; Clock, Reset), output logic (Outputs)]
NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)
FSM / Controller: 2 Blocks view
[Figure: two-block FSM: a single logic block (Inputs, PS in; NS, Outputs out) and the state register (D, CK, Q, AR; Clock, Reset)]
2 blocks
• If you look at the diagram of the FSM with 3 blocks, you can see that both the next state logic and the output logic use the present state and the inputs to generate their outputs
• Hence we can view the FSM as 2 blocks, where one block is both logic blocks combined
• Such a view is useful for timing analysis and HDL coding
Maximum Frequency
[Figure: two-block FSM view: logic block (Inputs, PS in; NS, Outputs out) and state register (D, CK, Q, AR; Clock, Reset)]
Tclk(min) > max (tcq(max) + tNSL(max) + ts(max), tcq(max) + tOL(max))
fmax < 1 / Tclk(min)
slack = Tclk – (tcq(max) + tNSL(max) + ts(max))
tcq(min) + tNSL(min) > th(max)
[Waveform: CLK, PS, NS and Outputs, showing tcq, tNSL, ts, th and tcq + tOL]
Timing
• Timing analysis is the same as that of the synchronous counter, but for the maximum frequency of operation there are 2 categories of paths to consider: register to register and register to output
• In our analysis we have considered the register to output path only up to the output, but in real life one has to consider the end point, i.e. if the output goes to another register input through a combinational circuit, then the whole path up to the destination needs to be considered
• The hold violation condition is the same as in the case of the synchronous counter
Controller Design
• Control Algorithm
– When one tries to implement a control algorithm using an FSM, one needs to decide the sequence of steps for the various input combinations, and the outputs at each step. This can be a textual description followed by a waveform
– The aim of the design is to come up with the truth tables of the NSL and OL. It is not easy to go directly from a textual description and/or waveform to truth tables
– Hence, we use a graphical tool called the state diagram (like a flow chart) to visualize the sequence of states, their transitions based on inputs, and the various outputs produced at each state.
State Diagram
• State Diagram
– States: Oval / Circle
– Transitions: Arrows
– Outputs: Output signal in a block associated with states
• Designing FSM
– Designing NSL, OL
– State Assignment
FSM: 3 Blocks view
[Figure: three-block FSM view, repeated]
NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)
State Diagram: States, Transition
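[Figure: state diagram notation. State: a circle labelled S0. Unconditional transition: arrow from S0 to S1 with no label. Conditional transitions: from S0 to S1 on en and to S2 on en/; or from S0 to S1 on en, with en/ as the other condition]
State Diagram: Moore, Mealy Outputs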
[Figure: Moore outputs example: states S0 and S1, transitions en and en/, state outputs rd/ = 1, latch = 0 and rd/ = 0, latch = 1. Mealy outputs example: states S0 and S1, transitions en and en/, outputs rd/ = 0, latch = en and rd/ = 1, latch = 0]
State Diagram: Example
[Figure: state diagram with states S0, S1, S2, entered on power_on; transitions labelled start/, start, max/, max; state outputs:
prst = 0, shadct = 0, mcmuld = 0, sel = 0
prst = 1, shadct = 0, mcmuld = 1, sel = 0
prst = 0, shadct = 1, mcmuld = 0, sel = r0]
Next State Table
• Next State Table can be easily written by looking at inputs and state transitions
Inputs       Present State   Next State
start max    Q1 Q0           D1 D0
0     X      0  0            0  0
1     X      0  0            0  1
X     X      0  1            1  0
X     0      1  0            1  0
X     1      1  0            1  1
Output Table
• Output table can be written looking at the outputs for each state
Present State   Outputs
Q1 Q0           prst shadct mcmuld sel
0  0            0    0      0      0
0  1            1    0      1      0
1  0            0    1      0      r0
State Diagram and Logic
• The state diagram has all the information for the next state table and the output table
• If a state diagram is designed and coded, the tools can generate the tables and optimize them for implementation
ADC Controller
• Scenario
– Data Acquisition System
– ADC interfaced to Host processor
– Per sample interrupt costly for the host processor
– A temporary storage of samples
– A controller to control ADC and storage
– When storage is near full, interrupt the host
ADC Controller
[Figure: ADC (aclk, ain, data, soc, eoc) and host interface (data, hrd/, intr, start), with a temporary storage block (still to be decided) and a controller between them]
Temporary storage
• DPRAM
– Random access is not required
– Only one way data flow
– Complex for the application, costly
• FIFO
– Simple to use (No address bus)
– Enough for the application
Block Schematic
[Figure: data path: ADC (aclk, ain, data, soc, eoc), FIFO (din, fwr/, dout, frd/, ¾ full) and host interface (data, hrd/, intr, start). Controller signals: clk, rst, start, eoc, soc, fwr/]
Assumptions
• FIFO 3/4th full can be used for the host interrupt
• The host processor can read the full FIFO in a short time through a burst read.
• Each time, the host reads a fixed number of samples (<= 75%); the last read may leave some data in the FIFO. Ignore it; the empty flag may have to be used to read the FIFO completely.
• ADC ‘soc’ requires a narrow pulse
• fwr/ timing is to be met
Timing Diagram
[Waveform: start, soc, eoc, fwr/]
fwr/ Timing
• How to generate ‘fwr/’ timing ?
• Controller goes through many states ?
– Too many states
– Modification is difficult
• Counter
– Counter counts to the required value
– Counter output is decoded to give it to FSM
– FSM controls the reset (sometimes enable also) of the counter
Complete Block Schematic
[Figure: block schematic as before (ADC, FIFO, host interface), with a counter added. Controller signals: clk, rst, start, eoc, wtim, soc, fwr/, crst. Counter labels: crst, cclk = tim, wtim]
Complete Timing Diagram
[Waveform: start, soc, eoc, fwr/, crst, wtim, with the conversion time marked]
Control Algorithm
• Upon reset, come to the init state. Wait for start. Initialize outputs (soc = 0, crst = 1, fwr/ = 1)
• Upon start, go to the next state. Make soc = 1
• Transit to the next state. Make soc = 0. Wait for eoc = 1
• Upon eoc = 1, transit to the next state. Start the counter (crst = 0), make fwr/ = 0. Wait for wtim = 1.
• Upon wtim = 1, go to the init state
State Diagram
[Figure: state diagram. rst enters S0.
S0: soc = 0, fwr/ = 1, crst = 1; stays on start/, goes to S1 on start
S1: soc = 1, fwr/ = 1, crst = 1; goes to S2
S2: soc = 0, fwr/ = 1, crst = 1; stays on eoc/, goes to S3 on eoc
S3: soc = 0, fwr/ = 0, crst = 0; stays on wtim/, returns to S0 on wtim]
State Assignment, Flip-Flops etc
• Four states, Binary encoding
• Number of flip-flops: 2
• State Assignment: Sequential
• S0: 00, S1: 01, S2: 10, S3: 11
• Type of Flip-Flops: D Flip flops
Finite State Machine (FSM)
[Figure: three-block FSM view, repeated]
NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)
Next State Table
Inputs            Present State   Next State
start eoc wtim    Q1 Q0           D1 D0
0     X   X       0  0            0  0
1     X   X       0  0            0  1
X     X   X       0  1            1  0
X     0   X       1  0            1  0
X     1   X       1  0            1  1
X     X   0       1  1            1  1
X     X   1       1  1            0  0
D1 = f1 (Q1, Q0, start, eoc, wtim)   Equations
D0 = f2 (Q1, Q0, start, eoc, wtim)   Minimization
Output Table
Present State   Outputs
Q1 Q0           soc crst fwr/
0  0            0   1    1
0  1            1   1    1
1  0            0   1    1
1  1            0   0    0
soc = f1 (Q1, Q0)    Equations
crst = f2 (Q1, Q0)   Minimization
fwr/ = f3 (Q1, Q0)
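The next state table and output table of the ADC controller can also be written directly as a state machine in VHDL, leaving the tables and minimization to the tool. The sketch below is a hedged illustration: the entity name adc_ctrl, the port name fwr_n (standing for fwr/) and the active-high asynchronous reset are assumptions, and the enumerated state type leaves the encoding to the synthesis tool rather than fixing S0: 00 to S3: 11.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity adc_ctrl is                -- hypothetical entity name
  port (
    clk, rst         : in  std_logic;
    start, eoc, wtim : in  std_logic;
    soc, crst, fwr_n : out std_logic          -- fwr_n stands for fwr/
  );
end entity adc_ctrl;

architecture rtl of adc_ctrl is
  type state_t is (S0, S1, S2, S3);
  signal ps, ns : state_t;                    -- present state, next state
begin
  -- State flip-flops
  process (clk, rst)
  begin
    if rst = '1' then
      ps <= S0;
    elsif rising_edge(clk) then
      ps <= ns;
    end if;
  end process;

  -- Next state logic: NS = f (PS, Inputs)
  process (ps, start, eoc, wtim)
  begin
    ns <= ps;                                 -- default: stay in the same state
    case ps is
      when S0 => if start = '1' then ns <= S1; end if;
      when S1 => ns <= S2;
      when S2 => if eoc = '1' then ns <= S3; end if;
      when S3 => if wtim = '1' then ns <= S0; end if;
    end case;
  end process;

  -- Output logic: Moore outputs = f (PS)
  soc   <= '1' when ps = S1 else '0';
  crst  <= '0' when ps = S3 else '1';
  fwr_n <= '0' when ps = S3 else '1';
end architecture rtl;
```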
Methodology
1. Specifications
2. Block schematic (Blocks, Signals)
– Data path, Controller(s)
3. System Timing Diagram
4. Sub-system Identification
5. Update Timing Diagram
6. Data path design (Various Levels)
7. Controller Algorithm
8. State Diagram
Methodology
9. State Diagram Optimization
10. State Assignment
11. Selection of Flip-flops
12. Next State Table, Equations, Minimization
13. Output Table, Equations, Minimization
14. Selection of Device Technology
15. Implementation
16. Test and Debug
17. Documentation
Methodology
• Steps 1-8: Designer
• Steps 9-13: Tool + Directives
• Steps 14-17: Tools + Designer
Power on Reset
• Use Asynchronous Reset,
• If not available, use synchronous Reset
[Figure: three-block FSM view showing two reset options, labelled Async Reset (at the AR input of the state flip-flops) and Sync Reset]
FSM: Clock frequency
• Maximum Clock Frequency
– Delays of the blocks
• Max Clock frequency (Min Clock period)
Tclk(min) > Max ((tcq(max) + tNSL(max) + ts(max)), (tcq(max) + tOL(max)))
FSM: Minimum Clock frequency
[Waveform: CLK, IN1, IN2, IN3, CLK']
FSM: Minimum Clock frequency
• The minimum clock frequency should be greater than twice the maximum input frequency
• Sampling the inputs
• Inputs may not be periodic waveforms
• Pulse width shouldn’t be the criterion, since a pulse can be stretched; how fast to respond to the event should be the criterion.
Stretching the pulse
[Waveform: IN, CLK2, CLK1, IN']
Timing Pulse Accuracy
• To detect a pulse with a certain accuracy, the clock period should be less than the allowed error
[Waveform: Timing Pulse, CLK1, CLK2]
Stretching the pulse
[Waveform: IN, CLK2, CLK1, IN', repeated]
Pulse Stretcher
[Figure: pulse stretcher built from three D flip-flops with asynchronous reset (AR); signals det, rst, clk, sdet; waveforms of det, clk, sdet]
Pulse Stretcher
• The pulse acts as the clock to a catching flip-flop
• The next 2 flip-flops are clocked by the FSM clock to ensure a pulse of at least one clock period duration.
• Not very practical if the pulse has to be stretched by a long time
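One possible arrangement of such a pulse stretcher, sketched in VHDL. The wiring here is an assumption, since the figure's connections are not recoverable from the text: the catching flip-flop has D tied to '1', is clocked by det, and is cleared asynchronously once the stretched pulse has appeared on sdet. The entity name and internal signal names are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pulse_stretcher is         -- hypothetical entity name
  port (
    clk  : in  std_logic;         -- FSM clock
    rst  : in  std_logic;         -- asynchronous reset, assumed active high
    det  : in  std_logic;         -- narrow input pulse
    sdet : out std_logic          -- stretched pulse, synchronous to clk
  );
end entity pulse_stretcher;

architecture rtl of pulse_stretcher is
  signal catch, s1, s2 : std_logic;
  signal clr           : std_logic;
begin
  -- Assumed feedback: clear the catching flip-flop once sdet is high
  clr <= rst or s2;

  -- Catching flip-flop: the pulse itself acts as the clock
  process (det, clr)
  begin
    if clr = '1' then
      catch <= '0';
    elsif rising_edge(det) then
      catch <= '1';
    end if;
  end process;

  -- Two flip-flops clocked by the FSM clock
  process (clk, rst)
  begin
    if rst = '1' then
      s1 <= '0';
      s2 <= '0';
    elsif rising_edge(clk) then
      s1 <= catch;
      s2 <= s1;
    end if;
  end process;

  sdet <= s2;
end architecture rtl;
```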
Pulse detection
• Pulse to level converter
• Level to Pulse converter
Pulse to toggle
[Figure: pulse-to-toggle circuit: a D flip-flop (D, Q, CK, AR) with pulse signal P and level signal I; waveforms of P and I]
Level to pulse
[Figure: level-to-pulse circuit: three D flip-flops in cascade clocked by clk2, with input I and stage outputs I1, I2, I3; pulse outputs I2 xor I3, I2 . I3/ and I2/ . I3]
Pulse Transfer
[Figure: pulse transfer circuit: a D flip-flop with signals P and I, followed by three D flip-flops clocked by clk2 (I1, I2, I3) and an XOR gate]
Register to Register Path
Tclk(min) = tco(max) + tcomb(max) + ts(max) + slack
tco(min) + tcomb(min) > th(max)
[Figure: register to register path, repeated: two registers (D, Q, CK) with Comb between them, both clocked by CLK]
Second Register clocked by CLK/?
• A somewhat naive question that is often asked
• We are minimizing the clock period by looking at the critical path delay
• If we play with positive and negative clock edges, we have to make sure alternate registers are clocked by CLK/.
• How can this hold when there is feedback from a positive edge triggered register to the control of a positive edge triggered register?
• What if the data from a positive edge triggered register and a negative edge triggered register go to the same register? How can this register be clocked?
Second Register clocked by CLK/?
• What about signals from various parts of the data path to a controller?
• How can this system of clocking be applied to an FSM?
• In the case of a mixed scenario, what if the critical path appears between registers clocked by opposite clock edges?
Tclk(min) / 2 = tco(max) + tcomb(max) + ts(max) + slack
Tclk(min) = 2 * (tco(max) + tcomb(max) + ts(max) + slack)
Second Register clocked by CLK/?
• And introducing the delay of an inverter would mean introducing skew into some part of the clock distribution network, creating various timing problems
• In the most trivial case of an array of registers sandwiched by combinational circuits, it does not matter whether you use the same clock edge or alternate clock edges
• In the former case the clock frequency would be twice that of the latter.
Moore / Mealy Output
[Figure: Moore output: rst enters S0 (soc = 0); start/ and start transitions; S1 (soc = 1); S2 (soc = 0). Mealy output: rst enters S0 (soc = start); start/ and start transitions; S2 (soc = 0)]
Moore / Mealy Output
[Waveform: clk and start; Moore states S0, S0, S1, S2 with the resulting soc; Mealy states S0, S0, S2 with the resulting soc]
Mealy Output
• Arrives earlier than the Moore output
• Needs fewer states
• Output timing depends on input timing; glitches are possible
• Hence, ideal when the FSM and the other blocks are synchronous
FSM: Mealy Output
[Figure: FSM and two synchronous sub-systems sharing clk; signals i1, i2, O1 (Mealy output) and O2 (Mealy output)]
O1 and O2 can be Mealy as function of states and i1 and/or i2
CPU Level 2: Registers
[Figure: CPU Level 1 block diagram, repeated]
Control of Sequential Circuits
[Figure: FSM / controller driving the enable input en (RA_L) of a register / counter / sequential circuit, with clock clk]
Clock Gating
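[Figure: Register A with gated clock: CLK' is derived from CLK and RA_L and drives CK; signals D7:0 and RA_E. Waveform: CLK, RA_L, CLK', with two edges of CLK' marked 1 and 2]
Clock Gating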
• Two active clock edges
• In some cases, where the control signal registers data, edge 1 may not meet the minimum clock period constraint (i.e. it may not accommodate tco(max) + tcomb(max) + ts(max)), and edge 2 may be late, causing a hold time violation.
• In cases where the control signal is used to increment a counter, the counter may get incremented twice instead of once.
Re-circulating Buffer
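[Figure: Register A with re-circulating multiplexer (inputs 0 and 1, controlled by RA_L) at the D input, CLK directly on CK, signals D7:0 and RA_E. Waveform: CLK, RA_L]
Register write on the clock edge
Re-circulating Buffer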
• Any number of control signals
• Different data paths
– e.g. Parallel data, shifted data etc.
• Priority of control signals
Counter with enable
• ‘en’ comes from the FSM; when en = 1 the counter counts, otherwise it retains its value.
[Figure: register (d, q, clk, rst) producing count, with a 2 to 1 multiplexer controlled by en selecting q + 1 (input 1) or q (input 0) as d]
VHDL Code
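-- q is assumed to be declared with a type that supports "+" (e.g. unsigned)
process (clk, rst)
begin
  if (rst = '1') then q <= (others => '0');   -- asynchronous reset
  elsif (clk'event and clk = '1') then
    if (en = '1') then q <= q + 1;            -- count only while en is high
    end if;                                   -- otherwise q retains its value
  end if;
end process;
Counter with enable and load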
[Figure: register (d, q, clk, rst) producing count, with two 2 to 1 multiplexers: one controlled by en selecting between q + 1 and q, and one controlled by load selecting din]
VHDL Code
-- q and din are assumed to be declared with a type that supports "+" (e.g. unsigned)
process (clk, rst)
begin
  if (rst = '1') then q <= (others => '0');   -- asynchronous reset
  elsif (clk'event and clk = '1') then
    if (load = '1') then q <= din;            -- load has priority over en
    elsif (en = '1') then q <= q + 1;
    end if;
  end if;
end process;
Re-circulating Buffer
[Figure: re-circulating buffer, repeated; register write on the clock edge]
Clock Gating
[Figure: clock gating circuit and waveform (CLK, RA_L, CLK', edges 1 and 2), repeated]
Clock Gating for Low Power
[Figure: clock gating circuit for low power with two D flip-flops; signals CLK, D7:0, RA_E, RA_L, CLK1, CLK2; waveforms of CLK, RA_L, CLK1, CLK2]
Clock Gating for Low Power
• Here, the requirement is to have a clock pulse with active clock
edge matching with the trailing edge of the original control signal
• This can be achieved, if the original control signal is delayed by
half clock period and this is gated with original clock. But delaying
by adding combinational logic delays would not be precise and also
would not allow flexibility in changing the clock frequency.
• Hence, the original control signal is resynchronized with negative
(opposite) clock edge and this resynchronized signal is gated with
clock to generate the control signal
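The resynchronize-then-gate idea described above might look like this in VHDL. This is a sketch under assumptions: an active-high RA_L, a rising-edge register clock, and the names gated_clock, ra_l_neg and gclk, none of which come from the slides. Because ra_l_neg changes only on the falling edge, it is stable while clk is high, so the AND gate passes whole clock pulses.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity gated_clock is             -- hypothetical entity name
  port (
    clk  : in  std_logic;
    ra_l : in  std_logic;         -- control signal from the controller
    gclk : out std_logic          -- gated clock for the register
  );
end entity gated_clock;

architecture rtl of gated_clock is
  signal ra_l_neg : std_logic := '0';
begin
  -- Resynchronize the control signal on the negative (opposite) clock edge
  process (clk)
  begin
    if falling_edge(clk) then
      ra_l_neg <= ra_l;
    end if;
  end process;

  -- Gate the clock with the resynchronized control signal
  gclk <= clk and ra_l_neg;
end architecture rtl;
```

In an FPGA flow a dedicated clock-enable or clock-gating primitive is usually preferred over a plain AND gate; the sketch only illustrates the timing idea.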
Finite State Machine (FSM)
[Figures: three-block and two-block FSM views, repeated]
State Diagram Optimization
• States Si and Sj are equivalent if
– For the same input conditions, both states transit to the same next states (i.e. the number of transitions and the conditions for each transition should match)
– For the same input conditions, both states produce the same outputs (for Moore outputs, input conditions do not matter)
State Diagram Optimization
• If Si and Sj are equivalent, one is redundant
• The rule is applicable to more than two states
• The first condition can be detected by examining the rows of the next state table (identical rows except for the present state).
• The second condition can be detected by examining the rows of the output table (identical rows except for the present state; for Moore outputs ignore inputs).
State Diagram Optimization
• In the next state table, look for the same next states. Then, out of these, select the states for which the input conditions are the same.
• Or, look for the same next states with the same input conditions in one shot
• Now, for these states, check whether the outputs (for Mealy, both input conditions and outputs) are the same
• Select the states where the outputs (with inputs for Mealy) are the same
• These states are equivalent
Output Races (Glitches)
[Figure: state change from 001 (en = 1) to 010 (wr = 1); the transitory state is 011 when tcq0 > tcq1 and 000 when tcq0 < tcq1; waveforms of clk, en, wr]
Note: Glitch could occur either on ‘en’ or ‘wr’
Output Races (Glitches)
• When more than one flip-flop changes output during a state change, the circuit could momentarily pass through transitory states owing to variation in tcq. If these states produce outputs different from those of the source and destination states, glitches could occur on these outputs.
• If these outputs are used for synchronous control, the glitches may not affect the circuit. But if they are used by an asynchronous circuit, such as a RAM write, they can cause problems.
Output Races: Solution
• Do the state assignment such that no more than one flip-flop changes output in any transition (Gray code)
• This may not always be possible
• Do the state assignment such that the transitory states are ones that don’t produce any outputs.
• Register the outputs. (Outputs are registered on the next clock edge, well after the state change.) This costs a latency of one clock.
Output Registering
Kuruvilla Varghese
Reg
Outputs
Next
State
Logic
D
CK Q
AR
Output
Logic
NSPS
Outputs
Inputs
Clock
Reset
D Q
CK AR
State FFsOutput FFs
clk
en
(valid output) ld
ld®
en®
Output Registering
[Figure: two-block view. A single logic block feeds the flip-flops; NS is registered to give PS and the outputs are registered to give Reg Outputs. Inputs, clock and reset as before.]
Selection of Flip-Flops
PS  NS    D    J  K    T
0   0     0    0  X    0
0   1     1    1  X    1
1   0     0    X  1    1
1   1     1    X  0    0
In the case of JK flip-flops, the NSL has twice the number of outputs compared to D or T, but because of the don’t cares it may result in less logic for next state decoding. CPLDs and FPGAs have flip-flops that can be used as D or T flip-flops.
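The excitation table can also be read as equations. The block below is only a restatement of the table rows, with PS and NS as in the table and \oplus denoting XOR:

```latex
\begin{aligned}
D &= NS \\
T &= PS \oplus NS \\
J &= NS \quad (\text{when } PS = 0;\ \text{don't care when } PS = 1) \\
K &= \overline{NS} \quad (\text{when } PS = 1;\ \text{don't care when } PS = 0)
\end{aligned}
```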
State Assignment
• Number of states = s
• Number of flip-flops = $n = \lceil \log_2 s \rceil$
• Number of possible ways to do the state assignment? $P(2^n, s)$
• e.g. s = 17, n = 5 (Minimize Area)
$P(2^n, s) = 32! / (32 - 17)! = 32 \times 31 \times \dots \times 17 \times 16 \approx 2.0 \times 10^{23}$
• NSL Minimization
• OL Minimization
NSL Optimization
• Our aim is to do the state assignment such that the NSL is minimized (minimum area)
• As we have seen, searching through all possible assignments to find the one that results in the minimum area is practically impossible in terms of computation time
• Hence, we look for a heuristic solution, where we follow some sensible rules that would result in a near-optimal solution
NSL Optimization
• Where to look for insight?
– Next state table
– We think of minterm grouping (as in a Karnaugh map)
• The aim would be to look for the same next states (since these state bits are the outputs of the NSL, and that is what we want to minimize)
• For these bits we would like the minterms to be adjacent
• Minterms consist of input conditions and present state
• So let us look for the same next state under the same input conditions, and make the present states adjacent, so that bits are eliminated while grouping
• This can be done in groups of powers of 2
NSL Minimization
Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2 D1 D0
1  1  0     0  1  0         1  0  1
1  1  0     0  1  1         1  0  1
NSL Minimization
• Under the same input conditions, if states Si and Sj transit to the same next state Sk, make the state assignment such that Si and Sj are logically adjacent.
• Applicable to more than two states (in powers of 2)
• Note: States Si and Sj may transit to different next states under different input conditions. But if both transit to the same next state under even one set of input conditions, that is good enough to make them adjacent, as the equations for that next state get minimized.
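As a worked instance of this rule, take the two rows of the NSL Minimization table: the adjacent present states 010 and 011 both go to next state 101 when I1 I2 I3 = 110. Their contribution to D2 (and identically to D0) combines into a single product term, with Q0 eliminated:

```latex
I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1 \overline{Q_0}
\;+\;
I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1 Q_0
\;=\;
I_1 I_2 \overline{I_3}\;\overline{Q_2}\, Q_1
```

Had the two states been given non-adjacent codes, the two minterms would not merge and both product terms would remain.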
OL Minimization
Inputs      Present State   Outputs
I1 I2 I3    Q2 Q1 Q0        O2 O1 O0
1  1  0     0  1  0         1  0  1
1  1  0     0  1  1         1  0  1
OL Minimization
• (Under the same input conditions), if states Si and Sj produce the same outputs, make the state assignment such that Si and Sj are logically adjacent.
• Applicable to more than two states (in powers of 2)
• Note: States Si and Sj may produce different outputs (under different input conditions). But if both produce even one output that is the same (under the same input conditions), that is good enough to make them adjacent, as the equations for that output get minimized.
Fault Tolerance: Unused States
• Number of states = s
• Number of flip-flops = $n = \lceil \log_2 s \rceil$
• Unused states = $2^n - s$
• s = 5, n = 3
• Unused states = $2^3 - 5 = 3$
Unused States
[Figure: the eight 3-bit state codes 000 to 111, divided into the codes used by the state diagram and the unused states.]
Unused States
• What happens if the FSM gets into these states?
– It could get stuck there
– It could loop through some or all of the unused states
– It could get back to a valid/used state.
• What if these states produce some outputs?
• Under what conditions does each of the above happen?
NSL code: Unused states as Don’t cares
process (pr_state, i1, i2, ...)
begin
  case pr_state is
    when S0 => ...
    when S1 => ...
    when others => nx_state <= "---";  -- unused states: don't care
  end case;
end process;
Unused States
Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2    D1    D0
1  x  x     0  0  0         0     0     1
x  x  1     1  0  0         0     0     0
x  x  x     1  0  1         X(1)  X(0)  X(1)
101 → 101
Unused States
Inputs      Present State   Next State
I1 I2 I3    Q2 Q1 Q0        D2    D1    D0
1  x  x     0  0  0         0     0     1
x  x  1     1  0  0         0     0     0
x  x  x     1  0  1         X(1)  X(1)  X(0)
x  x  x     1  1  0         X(1)  X(1)  X(1)
x  x  x     1  1  1         X(1)  X(0)  X(1)
101 → 110 → 111 → 101
Unused States: Fault Tolerance
[Figure: state diagram over the codes 000 to 111, including the unused states.]
NSL Block coding: Fault tolerance
process (pr_state, i1, i2, ...)
begin
  case pr_state is
    when S0 => ...
    when S1 => ...
    when others => nx_state <= S0;  -- unused states return to S0
  end case;
end process;
Unused States
• Introduce transitions from unused states to a Safe
state.
• Safe state could be Init state.
• The safe state depends on the application, could be
something other than Init state.
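A minimal VHDL sketch of such a recovery path is given below. It assumes a five-state machine with an explicit 3-bit binary encoding, so that the three unused codes exist as real signal values; the entity name safe_fsm, the input i1 and the transition sequence are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 5-state FSM; codes 101, 110 and 111 are unused.
entity safe_fsm is
  port (clk, rst, i1 : in  std_logic;
        state_out    : out std_logic_vector(2 downto 0));
end entity;

architecture rtl of safe_fsm is
  subtype state_t is std_logic_vector(2 downto 0);
  constant S0 : state_t := "000";  -- Init state, used here as the safe state
  constant S1 : state_t := "001";
  constant S2 : state_t := "010";
  constant S3 : state_t := "011";
  constant S4 : state_t := "100";
  signal pr_state, nx_state : state_t;
begin
  nsl: process (pr_state, i1)
  begin
    case pr_state is
      when S0 =>
        if i1 = '1' then nx_state <= S1; else nx_state <= S0; end if;
      when S1 => nx_state <= S2;
      when S2 => nx_state <= S3;
      when S3 => nx_state <= S4;
      when S4 => nx_state <= S0;
      -- Unused codes (and any non-binary value in simulation)
      -- are sent back to the safe state.
      when others => nx_state <= S0;
    end case;
  end process;

  ffs: process (clk, rst)
  begin
    if rst = '1' then
      pr_state <= S0;
    elsif rising_edge(clk) then
      pr_state <= nx_state;
    end if;
  end process;

  state_out <= pr_state;
end architecture;
```

If the application needs a safe state other than Init, only the constant named in the others branch changes.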
Bring back to last state?
• One can think of redundancy for the present state, but if there is a mismatch, deciding which copy is correct may require majority voting
• Another way is to use error correcting codes (Hamming codes) for the states and use maximum likelihood decoding to decide on the correct state in case of a wrong transition
Finite State Machine (FSM)
[Figure: FSM three-block view: next state logic producing NS, state flip-flops (clock, asynchronous reset) producing PS, and output logic producing the outputs; inputs feed the logic.]
FSM: Output Delay
• Output delay: tcq + tOL
• How to reduce the output delay?
– Decode the outputs from the next state; they then have to be registered so that they coincide with the state change (i.e. next state to present state)
– Encode the outputs in the state bits (state variables)
Output decoding from Next State
[Figure: FSM in which the output logic decodes NS (the output of the next state logic) and its outputs are registered by additional flip-flops clocked together with the state flip-flops.]
Output delay = tcq
Critical Path delay = tcq + tNSL + tOL + ts (*)
Output decoding from Next State
• In the normal case, we would have chosen clock period as tcq
+ tNSL + ts with some margin
[Figure: normal FSM, three-block view: next state logic, state flip-flops, and output logic decoding PS.]
Output decoding from Next State
• In the earlier (normal) case, measured from the previous clock edge, the total delay to the outputs would be tcq + tNSL + ts + tcq + tOL
• In the case of decoding the outputs from the next state, the delay would be tcq + tNSL + tOL + ts + tcq
• But, since the NSL and OL are together, the synthesis tool can optimize (minimize) them as a combined block, which may result in less delay than tNSL + tOL
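A minimal VHDL sketch of decoding an output from the next state is shown below. The entity ns_out_fsm, its three states and the single output done are assumptions chosen for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical FSM: done is decoded from nx_state and then registered.
entity ns_out_fsm is
  port (clk, rst, start : in  std_logic;
        done            : out std_logic);
end entity;

architecture rtl of ns_out_fsm is
  type statetype is (s0, s1, s2);
  signal pr_state, nx_state : statetype;
  signal done_ns : std_logic;
begin
  -- Next state logic
  nsl: process (pr_state, start)
  begin
    case pr_state is
      when s0 =>
        if start = '1' then nx_state <= s1; else nx_state <= s0; end if;
      when s1 => nx_state <= s2;
      when s2 => nx_state <= s0;
    end case;
  end process;

  -- Output logic decodes the NEXT state, not the present state
  done_ns <= '1' when nx_state = s2 else '0';

  -- State flip-flops and output flip-flop share the same clock edge
  ffs: process (clk, rst)
  begin
    if rst = '1' then
      pr_state <= s0;
      done <= '0';
    elsif rising_edge(clk) then
      pr_state <= nx_state;
      done <= done_ns;  -- valid tcq after the edge, while pr_state = s2
    end if;
  end process;
end architecture;
```

Here done is high exactly while pr_state is s2, but it is available tcq after the clock edge instead of tcq + tOL.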
Encoding Output in state bits
[Figure: FSM with next state logic and state flip-flops only; the outputs are taken directly from PS.]
Output delay = tcq
Encoding Output in state bits
States   Outputs
         WR/  EN
S0       0    1
S1       1    0
S2       1    1
S3       0    0
         Q1   Q0
Since the output patterns are unique and equal in number to the states, the state variables can be used as outputs
Encoding Output in state bits
States   Outputs
         WR/  EN
S0       0    1
S1       1    0
S2       1    1
S3       1    0
         Q1   Q0
For states S1 and S3 the outputs are the same, and hence one extra bit is needed for the state variables.
Encoding Output in state bits
States   Outputs     Extra bit
         WR/  EN
S0       0    1      0
S1       1    0      0
S2       1    1      0
S3       1    0      1
         Q2   Q1     Q0
Adding the extra bit makes each pattern unique, and the state variables can be used as outputs.
Encoding Output in state bits
States   Outputs                     Extra bits
         Adr_1  Adr_0  WR/  EN
S0       0      0      1    0        0   0
S1       0      1      0    1        0   0
S2       0      0      1    0        0   1
S3       0      1      0    1        0   1
S4       0      0      1    0        1   0
S5       1      0      0    1        0   0
S6       1      1      0    1        0   0
         Q5     Q4     Q3   Q2       Q1  Q0
Encoding Output in state bits
• Identify the states with the same output values,
• From these, identify the output pattern that repeats the maximum number of times,
• Add additional bits to make these output patterns distinct.
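The four-state table with one extra bit can be sketched in VHDL as below. The state codes are taken from that table (Q2 Q1 Q0 = WR/, EN, extra bit); the entity name enc_out_fsm, the port names and the simple S0, S1, S2, S3 cycle are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical FSM whose outputs are state bits; there is no output logic.
entity enc_out_fsm is
  port (clk, rst : in  std_logic;
        wr_n, en : out std_logic);
end entity;

architecture rtl of enc_out_fsm is
  subtype state_t is std_logic_vector(2 downto 0);  -- Q2 Q1 Q0
  constant S0 : state_t := "010";  -- WR/ = 0, EN = 1, extra = 0
  constant S1 : state_t := "100";  -- WR/ = 1, EN = 0, extra = 0
  constant S2 : state_t := "110";  -- WR/ = 1, EN = 1, extra = 0
  constant S3 : state_t := "101";  -- WR/ = 1, EN = 0, extra = 1
  signal pr_state, nx_state : state_t;
begin
  nsl: process (pr_state)
  begin
    case pr_state is
      when S0 => nx_state <= S1;
      when S1 => nx_state <= S2;
      when S2 => nx_state <= S3;
      when S3 => nx_state <= S0;
      when others => nx_state <= S0;  -- unused codes return to S0
    end case;
  end process;

  ffs: process (clk, rst)
  begin
    if rst = '1' then
      pr_state <= S0;
    elsif rising_edge(clk) then
      pr_state <= nx_state;
    end if;
  end process;

  -- Outputs are wired straight from the state flip-flops: delay = tcq
  wr_n <= pr_state(2);
  en   <= pr_state(1);
end architecture;
```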
Metastability in edge triggered Flip-Flop
[Figure: D flip-flop and timing diagram of CLK, D and Q showing ts, th and tco.]
ts: Setup time: minimum time the input must be valid before the active clock edge
th: Hold time: minimum time the input must be valid after the active clock edge
tco: Propagation delay for the input to appear at the output from the active clock edge
Metastability
• If the setup or hold time is violated, the flip-flop can sample the input wrongly, i.e. the output could be ‘1’ or ‘0’ (it may remain at the previous value or transit to the new value).
• The output may take a long time to resolve if the input changes close to the clock edge
• In the worst case, the output can get stuck between valid logic levels and remain in that state for an indeterminate amount of time
Datapath / Sequential Circuits
• We build data paths and controllers (FSMs) using flip-flops
(registers) and combinational circuits
• We make sure that in register to register paths, setup time is
met by choosing proper clock period
• We also analyze the conditions for hold time violation and
avoid the violation
• In all these, skew of the clock is also considered for worst
case design.
Datapath
[Figure: register-to-register path: a D flip-flop, combinational logic (Comb) and a second D flip-flop, both clocked by clk.]
Min Clock period / Max frequency: Tclk(min) > tco(max) + tcomb(max) + ts(max)
Avoid Hold time violation: tco(min) + tcomb(min) > th(max)
Sequential Circuit: FSM
[Figure: FSM two-block view: a logic block and flip-flops, with PS fed back to the logic; inputs, outputs, clock and reset.]
Min Clock period / Max frequency: Tclk(min) > tco(max) + tcomb(max) + ts(max)
Avoid Hold time violation: tco(min) + tcomb(min) > th(max)
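As a worked example of these two checks, assume (purely for illustration; none of these numbers come from a data sheet) tco(max) = 2 ns, tcomb(max) = 5 ns, ts(max) = 1 ns, tco(min) = 1 ns, tcomb(min) = 0.5 ns and a hold time requirement of 0.5 ns:

```latex
\begin{aligned}
T_{clk(min)} &> t_{co(max)} + t_{comb(max)} + t_{s(max)} = 2 + 5 + 1 = 8\ \text{ns} \\
f_{max} &< \frac{1}{8\ \text{ns}} = 125\ \text{MHz} \\
t_{co(min)} + t_{comb(min)} &= 1 + 0.5 = 1.5\ \text{ns} > t_h = 0.5\ \text{ns}
\end{aligned}
```

With these assumed values the hold condition is met with 1 ns of margin, and any clock slower than 125 MHz meets setup.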
Metastability in Sequential Circuits
• We take care of setup time and hold time violations in register-to-register paths in sequential circuits
• When can metastability happen in datapath / sequential circuits?
[Figure: register-to-register datapath (D flip-flop, Comb, D flip-flop) clocked by clk.]
Asynchronous Inputs
[Figure: the datapath and the FSM, each with external inputs entering the combinational logic ahead of the flip-flops.]
Asynchronous Inputs
• Asynchronous inputs to a sequential circuit can cause metastability in flip-flops.
• Asynchronous inputs:
– Outputs from flip-flops working on a different clock
– Outputs generated by some process not synchronized to the sequential circuit clock
• How to solve the problem?
• Synchronize the input to the sequential circuit clock, using a synchronizer
Single Stage Synchronizer
[Figure: the asynchronous input ainp passes through a synchronizing D flip-flop clocked by clk before entering the sequential circuit (Comb and D flip-flop); timing diagram of CLK, D and Q showing tco.]
tclk > tco + tcomb + tsetup
Single Stage Synchronizer
• The asynchronous input is synchronized to the active clock edge and appears after the delay tcq; if the clock period is chosen properly, it will meet the setup time at the next clock edge.
• There is a latency of one clock cycle for the system flip-flops to sample the synchronized input.
• The synchronizing flip-flop samples the data at a single point, whereas the system FFs would sample it at different points owing to differences in path delays.
• The synchronizing flip-flop can go into metastability, but the problem is isolated to the synchronizing flip-flop.
Single Stage Synchronizer
• But, if it comes out of metastability by the next clock edge (with margin), it is fine.
• Also, we assume the input remains valid for one more clock cycle, so that it is correctly sampled and captured by the synchronizing flip-flop at the next clock.
Single Stage Synchronizer
• The probability of a flip-flop remaining in metastability decreases exponentially with time.
[Figure: single stage synchronizer: ainp, synchronizing D flip-flop, then the sequential circuit (Comb and D flip-flop), all clocked by clk.]
Metastability Resolution Time
• Metastability resolution time (tr) for a single stage synchronizer
tr = tclk – tcomb – ts
• How to increase tr?
• Can we make tcomb zero?
• Double stage synchronizer
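The benefit of a larger tr is usually quantified with the standard synchronizer MTBF model, quoted here as background and not taken from the slides. The symbols \tau and T_0 (device-dependent constants) and f_{clk}, f_{data} (clock rate and rate of change of the asynchronous input) are assumptions introduced for this illustration:

```latex
\text{MTBF} = \frac{e^{\,t_r / \tau}}{T_0 \, f_{clk} \, f_{data}}
```

Because tr appears in the exponent, even a modest increase in resolution time lengthens the mean time between synchronization failures by a large factor.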
Double Stage Synchronizer
• tr = tclk – ts
• Latency of two clock periods
[Figure: ainp passes through two D flip-flops in series (the synchronizer), both clocked by clk, before the sequential circuit (Comb and D flip-flop).]
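A double stage synchronizer can be sketched in VHDL as below. The entity name sync2 and the signal names meta and sinp are hypothetical; ainp follows the figure.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Two flip-flops in series with no logic between them, so the first
-- stage has almost a full clock period (tclk - ts) to resolve.
entity sync2 is
  port (clk, rst : in  std_logic;
        ainp     : in  std_logic;   -- asynchronous input
        sinp     : out std_logic);  -- synchronized output
end entity;

architecture rtl of sync2 is
  signal meta : std_logic;  -- first stage output; may go metastable
begin
  process (clk, rst)
  begin
    if rst = '1' then
      meta <= '0';
      sinp <= '0';
    elsif rising_edge(clk) then
      meta <= ainp;  -- first stage samples the asynchronous input
      sinp <= meta;  -- second stage samples one clock later
    end if;
  end process;
end architecture;
```

Only sinp should be used by the sequential circuit; meta must not fan out to any other logic.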
Further increasing tr
• What if tr is to be increased further?
• We then need to increase tclk, but then the system throughput would come down
• So, let us keep the system clock at the same frequency and reduce the clock frequency of the synchronizer, by dividing the system clock
• Multiple cycle synchronizer
Multiple Cycle Synchronizer
• tr = n . tclk – ts
• Latency = 2 . n . tclk
• n = 2 or 3
[Figure: double stage synchronizer whose flip-flops are clocked from a Mod-n counter that divides the system clock; the sequential circuit (Comb and D flip-flop) runs on clk.]
Multi-cycle Synchronizer with de-skew FF
• The double stage synchronizer output will have an additional skew equal to the Mod-n counter output delay, resulting in a larger clock period.
Tclk – tskew > tcq + tcomb + ts
• This is de-skewed by a de-skew flip-flop
Tclk – tskew > tcq + ts
[Figure: multi-cycle synchronizer (two D flip-flops clocked from the Mod-n counter) followed by a de-skew FF clocked by clk, feeding the sequential circuit (Comb and D flip-flop).]
Cascaded Synchronizer
• The probability of a metastable output reduces multiplicatively at each stage of a cascaded synchronizer.
[Figure: ainp passes through three D flip-flops in series, all clocked by clk, before the sequential circuit (Comb and D flip-flop).]
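Asynchronous Inputs to FSM
• If EN is an asynchronous input, state transitions can go to an unintended state
• Solution: Gray code assignment
[Figure: state diagrams over the codes 00, 01, 10 and 11 with transitions on EN and EN/, comparing state assignments for a branch on the asynchronous input EN.]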
Asynchronous Inputs to FSM
• Don’t branch to more than one state on an asynchronous input.
• Always use a Go / No-Go structure with Gray code assignment
• Even better, synchronize all asynchronous inputs
[Figure: state diagrams over the codes 00, 01, 10 and 11 with transitions on EN and EN/.]
Reset Recovery Time (tREC, tRR)
• Reset recovery time is the minimum amount of time required between the de-assertion of reset and the next rising clock edge for the ‘D’ input of the flip-flop to be sampled properly.
[Figure: D flip-flop with clock and asynchronous reset (AR) inputs.]
Reset Recovery Time (tREC, tRR)
• One way to meet tRR is to synchronize the asynchronous reset input to the flip-flop; the reset behaviour would then be the same as that of a synchronous reset, as the reset happens at the clock edge.
• To retain the asynchronous reset behaviour, only the trailing edge of the asynchronous reset should be synchronized.
Asynchronous Reset
[Figure: two D flip-flops in series clocked by clk, with inputs rst and POR; the output goes to the asynchronous resets of the FFs.]
Asynchronous Reset
[Figure: two D flip-flops in series clocked by clk, with inputs rst/ (active low) and POR; the output goes to the asynchronous resets of the FFs.]
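A common way to synchronize only the trailing edge of an asynchronous reset is sketched below in VHDL. This is a generic reset synchronizer, assuming an active-high rst and ignoring the POR input of the figure; the entity name rst_sync and the signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Reset asserts asynchronously and de-asserts synchronously to clk.
entity rst_sync is
  port (clk     : in  std_logic;
        rst     : in  std_logic;   -- asynchronous reset, active high
        rst_out : out std_logic);  -- to the asynchronous resets of the FFs
end entity;

architecture rtl of rst_sync is
  signal ff1 : std_logic;
begin
  process (clk, rst)
  begin
    if rst = '1' then
      -- Leading edge: both stages are set immediately, without a clock
      ff1     <= '1';
      rst_out <= '1';
    elsif rising_edge(clk) then
      -- Trailing edge: the release propagates through two clocked stages
      ff1     <= '0';
      rst_out <= ff1;
    end if;
  end process;
end architecture;
```

Since rst_out falls just after a clock edge, the downstream flip-flops get close to a full clock period of recovery time before their next active edge.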
Digital System Design with PLDs and FPGAs
Case Study
Multiplier: Algorithm
Multiplicand                1 0 1 1  x
Multiplier                  1 1 0 1
                    ------------------
Partial products            1 0 1 1
                          0 0 0 0
                        1 0 1 1
                      1 0 1 1
                    ------------------
                      1 0 0 0 1 1 1 1
                    ------------------
8-bit Multiplier: Issues
• Algorithm: Shift and Add
• 8 partial products – 7 Adders ?
– Accumulator
– Shift Accumulator right
• Resource sharing
– Multiplier & LSB of result
• Add and Shift in one clock cycle
• If the multiplier bit is ‘0’, re-circulate the result with shift
Resources
• Multiplicand Register, 8 bit
• Result Register (Multiplier), 16 bit
• 9 bit Adder
• 9 bit, 2 to 1 Multiplexer
• Bit counter (Mod-8 / 3-bit)
• Controller (FSM)
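Multiplier: Data Path
[Figure: multiplier data path. MCND REG (clk, rst, load) loads mc(7:0) and outputs md(7:0). The adder forms su(8:0) from md(7:0) and r(15:8). A 2-to-1 multiplexer controlled by sel selects s(8:0) from '0' & r(15:8) (input 0) or su(8:0) (input 1). H. PROD REG (clk, prst, shift) takes s(8:1) into r(15:8). L. PROD / MULT REG (clk, rst, load, shift) loads ml(7:0) or shifts in s(0), giving r(7:0).]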
Counter
[Figure: counter (clk, prst, shift) producing count(2:0), followed by a decoder producing max.]
Controller
[Figure: controller block. Inputs: clk, rst, start, r(0), max. Outputs: prst, load, shift, sel, done.]
Multiplicand Register (MCND)
mcndreg: process (clk, rst)
begin
  if (rst = '1') then
    md <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (load = '1') then
      md <= mc;
    end if;
  end if;
end process mcndreg;
[Figure: md register: D flip-flop with clock clk and asynchronous reset rst; D takes mc when load is '1' and md otherwise.]
H Product Register
hprodreg: process (clk, prst)
begin
  if (prst = '1') then
    r(15 downto 8) <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (shift = '1') then
      r(15 downto 8) <= s(8 downto 1);
    end if;
  end if;
end process hprodreg;
[Figure: r(15:8) register: D flip-flop with clock clk and asynchronous reset prst; D takes s(8:1) when shift is '1' and r(15:8) otherwise.]
L. PRODUCT / MULT Register
mulreg: process (rst, clk)
begin
  if (rst = '1') then
    r(7 downto 0) <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (load = '1') then
      r(7 downto 0) <= ml;
    elsif (shift = '1') then
      r(7 downto 0) <= s(0) & r(7 downto 1);
    end if;
  end if;
end process mulreg;
[Figure: r(7:0) register: D flip-flop with clock clk and asynchronous reset rst; D takes ml(7:0) when load is '1', s(0) & r(7:1) when shift is '1', and r(7:0) otherwise.]
Counter
-- Counter
counter: process (clk, prst)
begin
  if (prst = '1') then
    count <= (others => '0');
  elsif (clk'event and clk = '1') then
    if (shift = '1') then
      count <= count + 1;
    end if;
  end if;
end process;
[Figure: count register: D flip-flop with clock clk and asynchronous reset prst; D takes count + 1 when shift is '1' and count otherwise.]
FSM: 3 Blocks view
[Figure: next state logic producing NS, state flip-flops (clock, asynchronous reset) producing PS, and output logic producing the outputs; inputs feed the logic.]
NS = f (PS, Inputs)
Moore Outputs = f (PS)
Mealy Outputs = f (PS, Inputs)
FSM / Controller: 2 Blocks view
[Figure: a single logic block producing NS and the outputs from PS and the inputs, and state flip-flops (clock, asynchronous reset) producing PS.]
State Diagram
[Figure: controller state diagram, entered at S0 on power_on.]
State  Outputs                                                Transitions
S0     prst = 1, shift = 0, load = 0, sel = 0, done = 0       start: S1; start/: S0
S1     prst = 1, shift = 0, load = 1, sel = 0, done = 0       S2
S2     prst = 0, shift = 1, load = 0, sel = r(0), done = 0    max: S3; max/: S2
S3     prst = 0, shift = 0, load = 0, sel = 0, done = 1       start: S1; start/: S3
Multiplier: VHDL Code
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity mult8 is port
  (clk, rst, start: in std_logic;
   done: out std_logic;
   mc, ml: in std_logic_vector(7 downto 0);
   prod: out std_logic_vector(15 downto 0));
end entity;

architecture arch_mult8 of mult8 is
  type statetype is (s0, s1, s2, s3);
  signal pr_state, nx_state: statetype;
  signal prst, max, load: std_logic;
  signal sel, shift: std_logic;
  signal md: std_logic_vector(7 downto 0);
  signal su, s: std_logic_vector(8 downto 0);
  signal count: std_logic_vector(2 downto 0);
  signal r: std_logic_vector(15 downto 0);
begin
Multiplier: VHDL Code
  -- Multiplicand Register
  mcndreg: process (clk, rst)
  begin
    if (rst = '1') then
      md <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (load = '1') then
        md <= mc;
      end if;
    end if;
  end process mcndreg;

  -- Multiplier cum Lower Product Register
  mulreg: process (rst, clk)
  begin
    if (rst = '1') then
      r(7 downto 0) <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (load = '1') then
        r(7 downto 0) <= ml;
      elsif (shift = '1') then
        r(7 downto 0) <= s(0) & r(7 downto 1);
      end if;
    end if;
  end process mulreg;
Multiplier: VHDL Code
  -- 9 bit 2-to-1 Multiplexer
  s(8 downto 0) <= '0' & r(15 downto 8) when sel = '0' else su(8 downto 0);

  -- Higher Product Register
  hprodreg: process (clk, prst)
  begin
    if (prst = '1') then
      r(15 downto 8) <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (shift = '1') then
        r(15 downto 8) <= s(8 downto 1);
      end if;
    end if;
  end process hprodreg;

  -- prod output
  prod <= r;

  -- Counter
  counter: process (clk, prst)
  begin
    if (prst = '1') then
      count <= (others => '0');
    elsif (clk'event and clk = '1') then
      if (shift = '1') then
        count <= count + 1;
      end if;
    end if;
  end process;

  -- Max decoder
  max <= '1' when (count = 7) else '0';
Multiplier: VHDL Code
  -- Adder
  su <= ('0' & md) + ('0' & r(15 downto 8));

  -- FSM, Next state Logic, Output Logic
  connsl: process (pr_state, start, r(0), max)
  begin
    case pr_state is
      when s0 =>
        prst <= '1'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '0';
        if (start = '1') then nx_state <= s1;
        else nx_state <= s0;
        end if;
      when s1 =>
        prst <= '1'; load <= '1'; shift <= '0';
        sel <= '0'; done <= '0';
        nx_state <= s2;
      when s2 =>
        prst <= '0'; load <= '0'; shift <= '1';
        sel <= r(0); done <= '0';
        if (max = '1') then nx_state <= s3;
        else nx_state <= s2;
        end if;
Multiplier: VHDL Code
      when s3 =>
        prst <= '0'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '1';
        if (start = '1') then nx_state <= s1;
        else nx_state <= s3;
        end if;
      when others =>
        prst <= '0'; load <= '0'; shift <= '0';
        sel <= '0'; done <= '0';
        nx_state <= s0;
    end case;
  end process;

  -- FSM Flip Flops
  conff: process (rst, clk)
  begin
    if (rst = '1') then
      pr_state <= s0;
    elsif (clk'event and clk = '1') then
      pr_state <= nx_state;
    end if;
  end process;
end arch_mult8;
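A minimal testbench sketch for the mult8 entity above is given here. It is not part of the lecture code: the name mult8_tb, the 20 ns clock period and the operand values are assumptions. It reuses the 1011 x 1101 example from the algorithm slide (11 x 13 = 143), extended to 8 bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mult8_tb is
end entity;

architecture sim of mult8_tb is
  signal clk    : std_logic := '0';
  signal rst    : std_logic := '1';
  signal start  : std_logic := '0';
  signal done   : std_logic;
  signal mc, ml : std_logic_vector(7 downto 0);
  signal prod   : std_logic_vector(15 downto 0);
  signal stop   : boolean := false;
begin
  dut: entity work.mult8
    port map (clk => clk, rst => rst, start => start, done => done,
              mc => mc, ml => ml, prod => prod);

  -- 50 MHz clock (assumed), stopped when the test ends
  clkgen: process
  begin
    while not stop loop
      clk <= '0'; wait for 10 ns;
      clk <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process;

  stim: process
  begin
    mc <= X"0B";  -- multiplicand 11
    ml <= X"0D";  -- multiplier 13
    wait for 25 ns;
    rst <= '0';
    -- One-clock start pulse, changed on falling edges to avoid races
    wait until falling_edge(clk);
    start <= '1';
    wait until falling_edge(clk);
    start <= '0';
    -- S1 loads the operands, S2 performs 8 add-and-shift cycles
    wait until done = '1';
    wait until falling_edge(clk);
    assert prod = X"008F"
      report "unexpected product" severity error;
    stop <= true;
    wait;
  end process;
end architecture;
```

Because start is low when S3 is reached, the controller stays in S3 and prod holds the result until the next start pulse.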
Multiplier: VHDL Code version 2
-- Components with clk, prst
subreg1: process (clk, prst)
begin
  if (prst = '1') then
    -- HPROD Register, Counter clear
    r(15 downto 8) <= (others => '0');
    count <= (others => '0');
  elsif (clk'event and clk = '1') then
    -- Higher Product Register
    if (shift = '1') then
      r(15 downto 8) <= s(8 downto 1);
    end if;
    -- Counter
    if (shift = '1') then
      count <= count + 1;
    end if;
  end if;
end process subreg1;
Multiplier: VHDL Code version 2
-- Components with clk, rst
subreg2: process (clk, rst)
begin
  if (rst = '1') then
    md <= (others => '0');
    r(7 downto 0) <= (others => '0');
  elsif (clk'event and clk = '1') then
    -- Multiplicand Register
    if (load = '1') then
      md <= mc;
    end if;
    -- Multiplier cum Lower Product Register
    if (load = '1') then
      r(7 downto 0) <= ml;
    elsif (shift = '1') then
      r(7 downto 0) <= s(0) & r(7 downto 1);
    end if;
  end if;
end process subreg2;
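Xilinx Spartan 6 Atlys Board
Input/Output
[Figure: board input/output. prod(15:8) and prod(7:0) pass through a 2-to-1 multiplexer to the LEDs; the 50 MHz clock feeds a clock divider (CLK DIV, bits 27:0) that produces 0.125 Hz for the multiplexer select; divider bit 15 is used for the start logic; start and ml(3:0) come from board inputs tied to VDD.]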
Extra VHDL Code
entity mult8 is port(clk, rst, st: in std_logic;
  done: out std_logic;
  disp: out std_logic_vector(7 downto 0));
end entity;

-- multiplicand, multiplier
signal mc, ml: std_logic_vector(7 downto 0);
-- start pulse
signal dst, start, stclk: std_logic;
-- 50 MHz -> 0.125 Hz
constant termdcount: std_logic_vector(27 downto 0) := X"BEBC200";
signal dcount: std_logic_vector(27 downto 0);
signal dclk: std_logic;

-- Multiplicand, Multiplier assignment
-- (mli is not declared in the port list shown above)
mc <= X"A5"; ml(7 downto 4) <= X"4";
ml(3 downto 0) <= mli;
Extra VHDL Code
-- Clock for LED multiplexing
-- 50 MHz to 0.125 Hz clock divider.
dispcount: process (rst, clk)
begin
  if (rst = '1') then
    dcount <= (others => '0');
    dclk <= '0';
  elsif (clk'event and clk = '1') then
    if (dcount = termdcount) then
      dcount <= (others => '0'); dclk <= not(dclk);
    else
      dcount <= dcount + '1';
    end if;
  end if;
end process dispcount;

-- LED Multiplexing
disp <= r(7 downto 0) when dclk = '1' else r(15 downto 8);

stclk <= dcount(15);
stpulse: process (rst, stclk)
begin
  if (rst = '1') then
    dst <= '0';
  elsif (stclk'event and stclk = '1') then
    dst <= st;
  end if;
end process stpulse;
start <= st and not(dst);
State Diagram
[Figure: controller state diagram with states S0, S1, S2 and S3, entered on power_on, with transitions labelled start, start/, max and max/. Output values shown on the diagram, in the order they appear:]
prst = 1, shift = 0, load = 1, sel = 0, done = 0
prst = 0, shift = 0, load = 0, sel = 0, done = 0
prst = 0, shift = 1, load = 0, sel = r(0), done = 0
prst = 0, shift = 0, load = 0, sel = 0, done = 1