FPGA-Based System Design: Chapter 6 Copyright 2004 Prentice Hall PTR
Register-transfer Design
Basics of register-transfer design:
– data paths and controllers.
High-level synthesis.
Register-transfer design
A register-transfer system is a sequential machine.
Register-transfer design is structural: complex combinations of state machines may not be easily described solely by a large state transition graph.
Register-transfer design concentrates on functionality, not details of logic design.
Register-transfer system example
A register-transfer machine has combinational logic connecting registers:
[Figure: registers (D/Q) connected by blocks of combinational logic]
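A minimal Python sketch of this structure, assuming a hypothetical two-register machine (the names `clock_tick`, `acc`, and `prev` are illustrative and not from the text). The combinational logic reads only the values held before the clock edge, and all registers load together:

```python
def clock_tick(regs, inputs, logic):
    """Apply one clock edge: every register loads its next value at once."""
    # Combinational logic sees only the register values held before the edge.
    return {name: f(regs, inputs) for name, f in logic.items()}

# Hypothetical machine: acc accumulates the input, prev remembers the old acc.
logic = {
    "acc": lambda r, i: (r["acc"] + i["x"]) & 0xFF,  # 8-bit adder
    "prev": lambda r, i: r["acc"],                   # register-to-register transfer
}

regs = {"acc": 0, "prev": 0}
trace = []
for x in [3, 4, 5]:
    regs = clock_tick(regs, {"x": x}, logic)
    trace.append((regs["acc"], regs["prev"]))

print(trace)  # [(3, 0), (7, 3), (12, 7)]
```

Note that `prev` always lags `acc` by one cycle, which is what distinguishes a register transfer from a plain assignment.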
Block diagrams
Block diagrams specify structure:
[Figure: block diagram with an annotated wire bundle of width 5]
Data path-controller systems
One good way to structure a system is as a data path and a controller:
– Data path
» executes regular operations (arithmetic, etc.), holds registers with data-oriented state;
– Controller
» evaluates irregular functions, sets control signals for data path.
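The split can be sketched in Python as two functions. This is an illustrative model under assumed names (the states `CLEAR`, `ADD1`, `ADD2`, `DONE` and the signals `clear` and `load` are hypothetical): the controller holds the irregular state-dependent decisions, the data path holds one adder and one register.

```python
def controller(state):
    """Irregular part: return (next_state, control signals) for a state."""
    table = {
        "CLEAR": ("ADD1", {"clear": 1, "load": 0}),
        "ADD1":  ("ADD2", {"clear": 0, "load": 1}),
        "ADD2":  ("DONE", {"clear": 0, "load": 1}),
        "DONE":  ("DONE", {"clear": 0, "load": 0}),
    }
    return table[state]

def datapath(acc, data_in, ctrl):
    """Regular part: an accumulator register fed by an adder."""
    if ctrl["clear"]:
        return 0
    if ctrl["load"]:
        return acc + data_in
    return acc  # register holds its value

state, acc = "CLEAR", 99
for data_in in [10, 20, 30, 40]:
    next_state, ctrl = controller(state)
    acc = datapath(acc, data_in, ctrl)
    state = next_state

print(state, acc)  # DONE 50
```

The first input is ignored because the controller is clearing the accumulator in that cycle, and the last is ignored because the controller has reached `DONE`.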
Data and control
[Figure: data path adder (+) and controller (ctrl), connected by carry and select signals]
Data operators
Arithmetic operations are easy to spot in hardware description languages:
» x <= a + b;
Multiplexers are implied by conditionals.
– Must evaluate the entire program to determine the sources of data for each register.
– Multiplexers also come from sharing adders, etc.
Conditionals and multiplexers
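if x = '0' then
  reg1 <= a;
else
  reg1 <= b;
end if;
[Figure: the code above and its register-transfer equivalent, a multiplexer controlled by x that selects a or b as the input to reg1]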
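As a small illustration in Python, the conditional and a two-input multiplexer compute the same next value for `reg1`. The helper names `mux2`, `reg1_next_conditional`, and `reg1_next_mux` are assumptions made for this sketch:

```python
def mux2(sel, in0, in1):
    """Two-input multiplexer: sel = 0 passes in0, sel = 1 passes in1."""
    return in1 if sel else in0

def reg1_next_conditional(x, a, b):
    """Next value of reg1 written as the conditional in the code."""
    if x == 0:
        return a
    else:
        return b

def reg1_next_mux(x, a, b):
    """Next value of reg1 written as the hardware structure: x drives the select."""
    return mux2(x, a, b)

# Exhaustive check over the select line for sample data values.
same = all(
    reg1_next_conditional(x, a, b) == reg1_next_mux(x, a, b)
    for x in (0, 1) for a in (5, 9) for b in (2, 7)
)
print(same)  # True
```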
Alternate data path-controller systems
[Figure: one controller with one data path, compared with two communicating data path-controller systems]
Pipelines
Provide higher utilization of logic:
[Figure: pipelined combinational logic]
Pipeline metrics
Throughput: rate at which new values enter the system.
– Initiation interval: time between successive inputs.
Latency: delay from input to output.
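A worked example of these metrics in Python. The numbers are hypothetical: 20 ns of logic, either left unpipelined or cut into four equal stages, with an assumed 1 ns of register overhead added to each clock period.

```python
def pipeline_metrics(stages, clock_period_ns, initiation_interval_cycles=1):
    """Return (latency_ns, initiation_interval_ns, throughput_per_ns)."""
    latency_ns = stages * clock_period_ns
    interval_ns = initiation_interval_cycles * clock_period_ns
    return latency_ns, interval_ns, 1.0 / interval_ns

# Assumed design points: same logic, with and without pipeline registers.
unpipelined = pipeline_metrics(stages=1, clock_period_ns=20 + 1)
pipelined = pipeline_metrics(stages=4, clock_period_ns=5 + 1)

print(unpipelined[:2])  # (21, 21)
print(pipelined[:2])    # (24, 6)
```

Under these assumptions the pipelined version accepts a new input every 6 ns instead of every 21 ns, while its latency rises from 21 ns to 24 ns because each added register contributes overhead.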
Simple pipelines
Pure pipelines have no control.
Choose latency, throughput.
Choose register locations with retiming.
Overhead:
– Setup, hold times.
– Power.
Complex pipelines
Actions in pipeline depend on data or external events.
Actions on pipe:
– Stall values.
– Abort operation.
– Bypass values.
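Of these actions, a stall is the simplest to model. The Python sketch below assumes a two-stage pipeline held as a list and a hypothetical `step` function; during a stall every stage keeps its value and no output is produced. Abort and bypass are not modeled here.

```python
def step(pipe, new_value, stall=False):
    """Advance the pipeline one cycle; return (new_pipe, output)."""
    if stall:
        return pipe, None  # all stages hold; the waiting input is not accepted
    return [new_value] + pipe[:-1], pipe[-1]

pipe = [None, None]  # two empty stages
outputs = []
# (input value, stall flag) per cycle; value 3 must be presented again after the stall
for value, stall in [(1, False), (2, False), (3, True), (3, False), (4, False)]:
    pipe, out = step(pipe, value, stall)
    outputs.append(out)

print(outputs)  # [None, None, None, 1, 2]
```

The stall delays every value behind it by one cycle, which is why the first output appears in the fourth cycle rather than the third.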
High-level synthesis
Sequential operation is not the most abstract description of behavior.
We can describe behavior without assigning operations to particular clock cycles.
High-level synthesis (behavioral synthesis) transforms an unscheduled behavior into a register-transfer behavior.
Tasks in high-level synthesis
Scheduling determines clock cycle on which each operation will occur.
Allocation chooses which function units will execute which operations.
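The two tasks can be represented as two mappings, one from operations to clock cycles and one from operations to function units. The Python sketch below uses illustrative operation names and hypothetical units `fu1` and `fu2`; the check confirms that no unit is asked to execute two operations in the same cycle.

```python
def unit_conflicts(schedule, binding):
    """Return {(cycle, unit): [ops]} for every unit used twice in one cycle."""
    users = {}
    for op, cycle in schedule.items():
        users.setdefault((cycle, binding[op]), []).append(op)
    return {slot: ops for slot, ops in users.items() if len(ops) > 1}

# Scheduling: operation -> clock cycle (illustrative three-cycle schedule).
schedule = {"x1": 1, "y": 1, "z": 2, "x2": 2, "x3": 3}
# Allocation: operation -> function unit (hypothetical two-unit design).
binding = {"x1": "fu1", "y": "fu2", "z": "fu1", "x2": "fu2", "x3": "fu1"}

print(unit_conflicts(schedule, binding))  # {}

# Binding everything to one unit is invalid under the same schedule.
one_unit = {op: "fu1" for op in schedule}
print(unit_conflicts(schedule, one_unit))
# {(1, 'fu1'): ['x1', 'y'], (2, 'fu1'): ['z', 'x2']}
```

A single-unit design would therefore need a longer schedule, which is the kind of interaction between the two tasks that high-level synthesis has to resolve.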
Functional modeling code in Verilog
o1 = i1 | i2;
if (!i3) begin
  o1 = 1'b1;
  o2 = a + b;
end else begin
  o1 = 1'b0;
end
The clock cycle boundary can be moved to design different register transfers.
Data dependencies
Data dependencies describe relationships between operations:
» x <= a + b; value of x depends on a, b
High-level synthesis must preserve data dependencies.
Data flow graph
Data flow graph (DFG) models data dependencies.
Does not require that operations be performed in a particular order.
Models operations in a basic block of a functional model (no conditionals).
Requires single-assignment form.
Data flow graph construction
original code:
x <= a + b;
y <= a * c;
z <= x + d;
x <= y - d;
x <= x + c;
single-assignment form:
x1 <= a + b;
y <= a * c;
z <= x1 + d;
x2 <= y - d;
x3 <= x2 + c;
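The renaming can be sketched in Python. This is one possible procedure, not necessarily the one the author has in mind: statements are assumed to be tuples `(target, op, left, right)`, variables assigned more than once receive numbered names, and each use refers to the most recent name.

```python
from collections import Counter

def to_single_assignment(stmts):
    """Rename multiply-assigned variables so each name is written once."""
    counts = Counter(target for target, _, _, _ in stmts)
    version = {}  # variable -> number of assignments seen so far
    current = {}  # variable -> name that currently holds its value
    out = []
    for target, op, lhs, rhs in stmts:
        # Uses are resolved before the target is renamed.
        lhs = current.get(lhs, lhs)
        rhs = current.get(rhs, rhs)
        if counts[target] > 1:
            version[target] = version.get(target, 0) + 1
            new_name = f"{target}{version[target]}"
        else:
            new_name = target
        current[target] = new_name
        out.append((new_name, op, lhs, rhs))
    return out

original = [
    ("x", "+", "a", "b"),
    ("y", "*", "a", "c"),
    ("z", "+", "x", "d"),
    ("x", "-", "y", "d"),
    ("x", "+", "x", "c"),
]
for stmt in to_single_assignment(original):
    print(stmt)
```

Run on the original code, this produces the targets `x1`, `y`, `z`, `x2`, `x3` with the same operands as the single-assignment form shown above.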
Data flow graph construction, cont’d
Data flow forms a directed acyclic graph (DAG):
single-assignment form:
x1 <= a + b;
y <= a * c;
z <= x1 + d;
x2 <= y - d;
x3 <= x2 + c;
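A Python sketch of the graph for this code, assuming the same tuple encoding `(target, op, left, right)` and hypothetical input values. It extracts the dependencies between operations and shows that two different execution orders, both respecting those dependencies, give the same results.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

stmts = [
    ("x1", "+", "a", "b"),
    ("y", "*", "a", "c"),
    ("z", "+", "x1", "d"),
    ("x2", "-", "y", "d"),
    ("x3", "+", "x2", "c"),
]

def dependencies(stmts):
    """Map each operation to the operations whose results it consumes."""
    produced = {t for t, _, _, _ in stmts}
    return {t: [s for s in (l, r) if s in produced] for t, _, l, r in stmts}

def evaluate(stmts, order, inputs):
    """Execute the operations in the given order and return their results."""
    by_target = {t: (op, l, r) for t, op, l, r in stmts}
    values = dict(inputs)
    for t in order:
        op, l, r = by_target[t]
        values[t] = OPS[op](values[l], values[r])
    return {t: values[t] for t in by_target}

deps = dependencies(stmts)
inputs = {"a": 1, "b": 2, "c": 3, "d": 4}  # assumed test values
first = evaluate(stmts, ["x1", "y", "z", "x2", "x3"], inputs)
second = evaluate(stmts, ["y", "x2", "x3", "x1", "z"], inputs)
print(deps)
print(first == second)  # True
```

An order that violates a dependency, such as computing `z` before `x1`, would fail with a missing value, which is the constraint the graph edges encode.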
Goals of scheduling and allocation
Preserve behavior: at end of execution, should have received all outputs and be in proper state (ignoring exact times of events).
Utilize hardware efficiently.
Obtain acceptable performance.
Data flow to data path-controller
One feasible schedule for last DFG:
Binding values to registers
Registers fall on clock cycle boundaries.
Allocation creates multiplexers
Same unit used for different values at different times.
– Function units.
– Registers.
Multiplexer controls which value has access to the unit.
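To see where the multiplexers come from, the Python sketch below collects the distinct sources that reach each input port of each function unit. The binding of operations to units `fu1` and `fu2` is a hypothetical choice made for illustration; a port with more than one source needs a multiplexer with that many inputs.

```python
def mux_inputs(stmts, binding):
    """Map (unit, port) -> sorted list of distinct sources feeding that port."""
    sources = {}
    for target, _, lhs, rhs in stmts:
        unit = binding[target]
        sources.setdefault((unit, "left"), set()).add(lhs)
        sources.setdefault((unit, "right"), set()).add(rhs)
    return {port: sorted(srcs) for port, srcs in sources.items()}

stmts = [
    ("x1", "+", "a", "b"),
    ("y", "*", "a", "c"),
    ("z", "+", "x1", "d"),
    ("x2", "-", "y", "d"),
    ("x3", "+", "x2", "c"),
]
# Assumed binding: three operations share fu1, two share fu2.
binding = {"x1": "fu1", "z": "fu1", "x3": "fu1", "y": "fu2", "x2": "fu2"}

muxes = mux_inputs(stmts, binding)
for port, srcs in muxes.items():
    print(port, srcs)
```

With this binding, both ports of `fu1` need three-input multiplexers and both ports of `fu2` need two-input multiplexers. Swapping the operands of a commutative operation can sometimes reduce these counts.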
Choosing function units
Muxes allow function units to be shared for several operations.
Building the sequencer
Sequencer requires three states, even with no conditionals.
Verilog for data path
module dp(reset, clock, a, b, c, d,
          muxctrl1, muxctrl2, muxctrl3, muxctrl4,
          loadr1, loadr2, loadr3, loadr4, x3, z);
  parameter n = 7;
  input reset;
  input clock;
  input [n:0] a, b, c, d; // data primary inputs
  input muxctrl1, muxctrl2, muxctrl4; // mux control
  input [1:0] muxctrl3; // 2-bit mux control
  input loadr1, loadr2, loadr3, loadr4; // register control
  output [n:0] x3, z;

  reg [n:0] r1, r2, r3, r4; // registers
  wire [n:0] mux1out, mux2out, mux3out, mux3bout, mux4out, mult1out, mult2out;

  assign mux1out = (muxctrl1 == 0) ? a : r1;
  assign mux2out = (muxctrl2 == 0) ? b : r4;
  assign mux3out = (muxctrl3 == 0) ? a : (muxctrl3 == 1 ? r4 : r3);
  assign mux4out = (muxctrl4 == 0) ? c : r2;
  assign mult1out = mux1out * mux2out;
  assign mult2out = mux3out * mux4out;
  assign x3 = mult2out;
  assign z = mult1out;

  always @(posedge clock)
  begin
    if (reset) begin // begin/end groups the four register clears under reset
      r1 = 0; r2 = 0; r3 = 0; r4 = 0;
    end
    if (loadr1) r1 = mult1out;
    if (loadr2) r2 = mult2out;
    if (loadr3) r3 = c;
    if (loadr4) r4 = d;
  end
endmodule
Choices during high-level synthesis
Scheduling determines number of clock cycles required; binding determines area, cycle time.
Area tradeoffs must consider shared function units vs. multiplexers, control.
Delay tradeoffs must consider cycle time vs. number of cycles.
Finding schedules
Two simple schedules:
– As-soon-as-possible (ASAP) schedule
» puts every operation as early in time as possible.
– As-late-as-possible (ALAP) schedule
» puts every operation as late in schedule as possible.
Many schedules exist between ALAP and ASAP extremes.
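Both schedules can be computed directly from the dependencies. The Python sketch below assumes one operation per cycle of delay and unlimited function units; the graph is a seven-adder example whose names (`w1` to `w6`, `o1`) follow the Verilog schedule models in this chapter.

```python
def asap(deps):
    """Earliest cycle for each operation: one cycle after its latest predecessor."""
    cycle = {}
    def visit(op):
        if op not in cycle:
            cycle[op] = 1 + max((visit(p) for p in deps[op]), default=0)
        return cycle[op]
    for op in deps:
        visit(op)
    return cycle

def alap(deps, length):
    """Latest cycle for each operation in a schedule of the given length."""
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)
    cycle = {}
    def visit(op):
        if op not in cycle:
            cycle[op] = min((visit(s) for s in succs[op]), default=length + 1) - 1
        return cycle[op]
    for op in deps:
        visit(op)
    return cycle

# operation -> operations it depends on (primary inputs omitted)
deps = {"w1": [], "w2": ["w1"], "w3": [], "w4": ["w3", "w2"],
        "w5": ["w4"], "w6": [], "o1": ["w6", "w5"]}

early = asap(deps)
late = alap(deps, length=max(early.values()))
mobility = {op: late[op] - early[op] for op in deps}
print(early)
print(late)
print(mobility)
```

Operations with zero mobility have no freedom in any five-cycle schedule. Here only `w3` (mobility 1) and `w6` (mobility 3) can move, and the intermediate schedules differ in where those two are placed.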
ASAP and ALAP schedules
[Figure: ASAP and ALAP schedules of a data flow graph]
Verilog model of ASAP schedule
reg [n-1:0] w1reg, w2reg, w6reg1, w6reg2, w6reg3, w6reg4, w3reg1, w3reg2, w4reg, w5reg;
always @(posedge clock)
begin
  // cycle 1
  w1reg = i1 + i2;
  w3reg1 = i4 + i5;
  w6reg1 = i7 + i8;
  // cycle 2
  w2reg = w1reg + i3;
  w3reg2 = w3reg1;
  w6reg2 = w6reg1;
  // cycle 3
  w4reg = w3reg2 + w2reg;
  w6reg3 = w6reg2;
  // cycle 4
  w5reg = i6 + w4reg;
  w6reg4 = w6reg3;
  // cycle 5
  o1 = w6reg4 + w5reg;
end
Verilog of ALAP schedule
reg [n-1:0] w1reg, w2reg, w6reg, w6reg2, w6reg3, w3reg, w4reg, w5reg;
always @(posedge clock)
begin
  // cycle 1
  w1reg = i1 + i2;
  // cycle 2
  w2reg = w1reg + i3;
  w3reg = i4 + i5;
  // cycle 3
  w4reg = w3reg + w2reg;
  w6reg3 = w6reg2;
  // cycle 4
  w5reg = i6 + w4reg;
  w6reg = i7 + i8;
  // cycle 5
  o1 = w6reg + w5reg;
end
Critical path of schedule
Longest path through data flow determines minimum schedule length:
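A Python sketch of the longest-path computation, assuming every operation takes one cycle and using the same seven-adder graph as the Verilog schedule models (the function name `critical_path` is illustrative):

```python
def critical_path(deps):
    """Return the longest chain of dependent operations as a list."""
    best = {}  # operation -> longest path that ends at that operation
    def longest(op):
        if op not in best:
            prefix = max((longest(p) for p in deps[op]), key=len, default=[])
            best[op] = prefix + [op]
        return best[op]
    return max((longest(op) for op in deps), key=len)

# operation -> operations it depends on (primary inputs omitted)
deps = {"w1": [], "w2": ["w1"], "w3": [], "w4": ["w3", "w2"],
        "w5": ["w4"], "w6": [], "o1": ["w6", "w5"]}

path = critical_path(deps)
print(path)       # ['w1', 'w2', 'w4', 'w5', 'o1']
print(len(path))  # 5
```

No schedule of this graph can be shorter than five cycles without chaining operations, regardless of how many adders are available.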
Operator chaining
Operator chaining: may execute several operations in sequence in one cycle.
Delay through function units may not be additive, such as through several adders.
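A Python sketch of chaining a linear sequence of operations into clock cycles. It makes the conservative assumption that delays add, which the note above warns can overestimate the true delay through several adders; the delays and clock periods are hypothetical.

```python
def chain_into_cycles(delays_ns, clock_period_ns):
    """Greedily pack a dependent chain of operations into clock cycles."""
    cycles, current, used = [], [], 0
    for name, delay in delays_ns:
        if delay > clock_period_ns:
            raise ValueError(f"{name} does not fit in one clock period")
        if current and used + delay > clock_period_ns:
            cycles.append(current)  # start a new control step
            current, used = [], 0
        current.append(name)
        used += delay
    if current:
        cycles.append(current)
    return cycles

chain = [("add1", 4), ("add2", 4), ("add3", 4)]  # assumed 4 ns adders

print(chain_into_cycles(chain, clock_period_ns=4))   # [['add1'], ['add2'], ['add3']]
print(chain_into_cycles(chain, clock_period_ns=10))  # [['add1', 'add2'], ['add3']]
print(chain_into_cycles(chain, clock_period_ns=12))  # [['add1', 'add2', 'add3']]
```

A longer clock period allows more chaining and fewer control steps, so the total time depends on both the number of cycles and the cycle time.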
Control implementation
Clock cycles are also known as control steps.
Longer schedule means more states in controller.
Cost of controller may be hard to judge from casual inspection of state transition graph.
Controllers and scheduling
functional model:
x <= a + b;
y <= c + d;
[Figure: two implementations of the functional model, one with one state and one with two states]
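The number of states follows from how many adders are available. The Python sketch below uses a simple list scheduler, which is one common heuristic and an assumption here rather than a method named in the text. With two adders both additions fit in one state; with one adder they need two.

```python
def list_schedule(deps, num_units):
    """Assign each operation a cycle, using at most num_units per cycle."""
    cycle_of = {}
    cycle = 0
    while len(cycle_of) < len(deps):
        cycle += 1
        # Ready: not yet scheduled, and all predecessors finished in earlier cycles.
        ready = [op for op in deps if op not in cycle_of
                 and all(cycle_of.get(p, cycle) < cycle for p in deps[op])]
        for op in ready[:num_units]:
            cycle_of[op] = cycle
    return cycle_of

# x <= a + b; y <= c + d;  (the two additions are independent)
deps = {"x": [], "y": []}

two_adders = list_schedule(deps, num_units=2)
one_adder = list_schedule(deps, num_units=1)
print(two_adders, max(two_adders.values()))  # {'x': 1, 'y': 1} 1
print(one_adder, max(one_adder.values()))    # {'x': 1, 'y': 2} 2
```

The one-adder design saves a function unit but costs an extra controller state and the multiplexers needed to share the adder.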
Distributed control
[Figure: one centralized controller compared with two distributed controllers]
Synchronized communication between FSMs
To pass values between two machines, must schedule output of one machine to coincide with input expected by the other:
Hardwired vs. microcoded control
Hardwired control has a state register and “random logic.”
A microcoded machine has a state register which points into a microcode memory.
Styles are equivalent; choice depends on implementation considerations.
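The equivalence can be illustrated in Python with a hypothetical three-state controller written both ways. The states, the `cond` input, and the `load` and `sel` signals are assumptions made for this sketch: one version computes the next state and outputs with conditional logic, the other looks them up in a table indexed by the state register.

```python
def hardwired(state, cond):
    """Next state and outputs computed by "random logic"."""
    if state == 0:
        return 1, {"load": 1, "sel": 0}
    if state == 1:
        return (2 if cond else 1), {"load": 0, "sel": 1}
    return 0, {"load": 1, "sel": 1}

# Microcode memory: one word per state.
# Fields: (next state if cond = 0, next state if cond = 1, load, sel)
MICROCODE = [
    (1, 1, 1, 0),
    (1, 2, 0, 1),
    (0, 0, 1, 1),
]

def microcoded(state, cond):
    """Next state and outputs read from the microcode word for this state."""
    next0, next1, load, sel = MICROCODE[state]
    return (next1 if cond else next0), {"load": load, "sel": sel}

equivalent = all(
    hardwired(s, c) == microcoded(s, c) for s in range(3) for c in (0, 1)
)
print(equivalent)  # True
```

Changing the behavior of the microcoded version means editing the table contents, while the hardwired version requires redesigning the logic.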
Data path-controller delay
Watch out for long delay paths created by combination of data path and controller: