bluespec - imperial college londoncas.ee.ic.ac.uk/people/ssingh/bluespec_l3l4.pdf · 2010. 12....
Bluespec
Lectures 3 & 4
with some slides from Nikhil Rishiyur at Bluespec and Simon Moore at the University of Cambridge
Course Resources
• http://cas.ee.ic.ac.uk/~ssingh
• Lecture notes (Power Point, PDF)
• Example Bluespec programs used in Lectures
• Complete Photoshop system (Bluespec)
• Links to Bluespec code samples
• User guide, reference guide: doc sub-directory of Bluespec installation
• More information at http://bluespec.com
Rules, not clock edges
• rules are atomic
– they execute within one clock cycle
• structure: rule name (explicit conditions); statements; endrule
• conditions:
– explicit – conditions (Boolean expression) provided
– implicit – conditions that have to be met to allow the statements to fire, e.g. for fifo.enq only if fifo not full
Rules: powerful alternative to always blocks
• rules for state updates instead of always blocks
• Simple concept: think if…then…
• Rule can execute (or “fire”) only when its conditions are TRUE
• Every rule is atomic with respect to other rules
• Powerful ramifications: – Executable specification – design around operations as described in specs
– Atomicity of rules dramatically reduces concurrency bugs
– Automates management of shared resources – avoids many complex errors
rule ruleName (<boolean cond>);
<state update(s)>
endrule
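A minimal sketch of explicit and implicit conditions working together, assuming the standard FIFO library. The module name mkRuleConditions, the rule names and the register count are hypothetical and chosen only for illustration.

```bsv
import FIFO::*;

(* synthesize *)
module mkRuleConditions (Empty);
   Reg#(UInt#(8))  count <- mkReg (0);
   FIFO#(UInt#(8)) fifo  <- mkFIFO;

   // Explicit condition: count < 10.
   // Implicit condition: fifo.enq is only ready when the FIFO is not full.
   rule produce (count < 10);
      fifo.enq (count);
      count <= count + 1;
   endrule

   // No explicit condition. The implicit conditions of first and deq
   // (FIFO not empty) still decide whether the rule can fire.
   rule consume;
      $display ("dequeued %0d", fifo.first);
      fifo.deq;
   endrule
endmodule
```

Neither rule tests "full" or "empty" itself; the compiler folds the method conditions into each rule's firing condition.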
Bits, Bools and conversion
• Bit#(width) – vector of bits
• Bool – single bit for Booleans (True, False)
• pack() – function to convert most things (pack) into a bit representation
• unpack() – opposite of pack()
• extend() – extend an integer (signed, unsigned, bits)
• truncate() – truncate an integer
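The conversion functions above can be sketched as follows. The module name mkConversions and all variable names are hypothetical; the widths were picked only to make the effect of each call visible.

```bsv
(* synthesize *)
module mkConversions (Empty);
   rule show;
      Int#(8)   s    = -3;
      UInt#(8)  u    = 200;
      Bit#(8)   b    = pack (s);         // two's complement bits of -3 (8'hFD)
      Bool      flag = unpack (b[0]);    // a single bit viewed as a Bool
      Int#(16)  s16  = extend (s);       // sign-extends because s is an Int
      UInt#(16) u16  = extend (u);       // zero-extends because u is a UInt
      Bit#(4)   low  = truncate (b);     // keeps the 4 least significant bits
      $display ("b=%h s16=%0d u16=%0d low=%h", b, s16, u16, low);
      $finish (0);
   endrule
endmodule
```

The target width of extend and truncate is taken from the declared type on the left-hand side, so no width argument is passed.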
Reg and Bit/UInt/Int types
• registers (initialised and uninitialised versions):
Reg#(type) name0 <- mkReg(initial_value);
Reg#(type) name1 <- mkRegU;
– Reg: interface type
– type: type parameter (e.g. UInt#(8)), since Reg is generic
– mkReg, mkRegU: name of module to “make” (i.e. instantiate); N.B. modules are typically prefixed “mk”
• some types (unsigned and signed integer, and bits): UInt#(width), Int#(width), Bit#(width)
• example:
Reg#(UInt#(8)) counter <- mkReg(0);
rule count_up;
   counter <= counter+1;
endrule
Registers
interface Reg#(type a);
   method Action _write (a x1);
   method a _read ();
endinterface: Reg
• Polymorphic
• Just library elements
• In one cycle register reads must execute before register writes
• x <= y + 1 is syntactic sugar for x._write (y._read + 1)
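As a small illustration of the sugar, the two rules below perform the same update, once with <= and once with explicit method calls. The module name mkRegSugar and the phase register are hypothetical; phase only keeps the two rules from being enabled in the same cycle.

```bsv
(* synthesize *)
module mkRegSugar (Empty);
   Reg#(int)  x     <- mkReg (0);
   Reg#(int)  y     <- mkReg (5);
   Reg#(Bool) phase <- mkReg (False);

   rule toggle;
      phase <= !phase;
   endrule

   // Sugared form.
   rule sugar (!phase);
      x <= y + 1;
   endrule

   // The same update written with the Reg interface methods.
   rule desugared (phase);
      x._write (y._read + 1);
   endrule
endmodule
```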
Scheduling Annotations
C Conflict
CF Conflict free
SB Sequence before
SBR Sequence before restricted (cannot be in the same rule)
SA Sequence after
SAR Sequence after restricted (cannot be in the same rule)
Scheduling Annotations for a Register
        read    write
read    CF      SB
write   SA      SBR
• Two read methods would be conflict-free (CF), that is, you could have multiple methods that read from the same register in the same rule, sequenced in any order.
• A write is sequenced after (SA) a read.
• A read is sequenced before (SB) a write.
• If you have two write methods, one must be sequenced before the other, and they cannot be in the same rule, as indicated by the annotation SBR.
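A minimal sketch of what "read is SB write" means across rules. The module and rule names (mkRegSchedule, observe, bump) are hypothetical.

```bsv
(* synthesize *)
module mkRegSchedule (Empty);
   Reg#(int) x    <- mkReg (0);
   Reg#(int) seen <- mkReg (0);

   // Reads x. Since read is SB write, this rule is logically ordered
   // before bump when both fire in a cycle, so it sees the old value of x.
   rule observe;
      seen <= x;
   endrule

   // Reads and writes x.
   rule bump;
      x <= x + 1;
   endrule

   // Only reads, so it is ordered before both rules above.
   rule monitor;
      $display ("x = %0d, seen = %0d", x, seen);
   endrule
endmodule
```

All three rules can fire in every cycle because a single order (monitor, observe, bump) places every read of a register before the write to it.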
Updating Registers
Reg#(int) x <- mkReg (0) ;

rule countup (x < 30);
   int y = x + 1;
   x <= x + 1;
   $display ("x = %0d, y = %0d", x, y);
endrule
Rules of Rules (The Three Basics)
1. Rules are atomic
2. Rules fire or don’t at most once per cycle
3. Rules don’t conflict with other rules
[Figure: registers x and y on a common clk; the D input of x is y + 1 and the D input of y is x + 1]
rule r1;
   x <= y + 1;
endrule

rule r2;
   y <= x + 1;
endrule
(* synthesize *)
module rules4 (Empty);
   Reg#(int) x <- mkReg (10);
   Reg#(int) y <- mkReg (100);

   rule r1;
      x <= y + 1;
   endrule

   rule r2;
      y <= x + 1;
   endrule

   rule monitor;
      $display ("x, y = %0d, %0d ", x, y);
   endrule
endmodule

$ ./rules4 -m 5
x, y = 10, 100
x, y = 10, 11
x, y = 10, 11
x, y = 10, 11

(* synthesize *)
module rules5 (Empty);
   Reg#(int) x <- mkReg (10);
   Reg#(int) y <- mkReg (100);

   rule r ;
      x <= y + 1;
      y <= x + 1;
   endrule

   rule monitor;
      $display ("x, y = %0d, %0d ", x, y);
   endrule
endmodule

$ ./rules5 -m 5
x, y = 10, 100
x, y = 101, 11
x, y = 12, 102
x, y = 103, 13
(* synthesize *)
module rules6 (Empty);
   Reg#(int) x <- mkReg (10);
   Reg#(int) y <- mkReg (100);

   rule r1;
      x <= y + 1;
   endrule

   rule r2;
      y <= x + 1;
   endrule

   (* descending_urgency = "r1, r2" *)
   rule monitor;
      $display ("x, y = %0d, %0d ", x, y);
   endrule
endmodule

$ ./rules6 -m 5
x, y = 10, 100
x, y = 101, 100
x, y = 101, 100
x, y = 101, 100
interface Rules7_Interface ;
   method int readValue ;
   method Action setValue (int newXvalue) ;
   method ActionValue#(int) increment ;
endinterface

(* synthesize *)
module rules7 (Rules7_Interface);
   Reg#(int) x <- mkReg (0);

   method readValue ;
      return x ;
   endmethod

   method Action setValue (int newXvalue);
      x <= newXvalue ;
   endmethod

   method ActionValue#(int) increment ;
      x <= x + 1 ;
      return x ;
   endmethod
endmodule
interface Rules7_Interface ;
   (* always_ready *)
   method int readResult ;
   (* always_enabled *)
   method Action setValues (int newX, int newY, int newZ) ;
endinterface

(* synthesize *)
module rules7 (Rules7_Interface) ;
   Reg#(int) x <- mkReg (0) ;
   Reg#(int) y <- mkReg (0) ;
   Reg#(int) z <- mkReg (0) ;
   Reg#(int) result <- mkRegU ;
   Reg#(Bool) b <- mkReg (False) ;

   rule toggle ;
      b <= !b ;
   endrule

   rule r1 (b) ;
      result <= x * y ;
   endrule

   rule r2 (!b) ;
      result <= x * z ;
   endrule

   method readResult = result ;

   method Action setValues (int newX, int newY, int newZ) ;
      x <= newX ;
      y <= newY ;
      z <= newZ ;
   endmethod
endmodule

Generated Verilog (excerpt):
// remaining internal signals
assign x_MUL_y___d8 = x * y ;
assign x_MUL_z___d5 = x * z ;
interface Rules8_Interface ;
   (* always_ready *)
   method int readResult ;
   (* always_enabled *)
   method Action setValues (int newX, int newY, int newZ) ;
endinterface

(* synthesize *)
module rules8 (Rules8_Interface) ;
   Reg#(int) x <- mkReg (0) ;
   Reg#(int) y <- mkReg (0) ;
   Reg#(int) z <- mkReg (0) ;
   Wire#(int) t <- mkWire ;
   Reg#(int) result <- mkRegU ;
   Reg#(Bool) b <- mkReg (False) ;

   rule toggle ;
      b <= !b ;
   endrule

   rule computeT ;
      if (b)
         t <= y ;
      else
         t <= z ;
   endrule

   rule r1 (b) ;
      result <= x * t ;
   endrule

   method readResult = result ;

   method Action setValues (int newX, int newY, int newZ) ;
      x <= newX ;
      y <= newY ;
      z <= newZ ;
   endmethod
endmodule

Generated Verilog (excerpt):
// inlined wires
assign t$wget = b ? y : z ;
…
// remaining internal signals
assign x_MUL_t_wget___d6 = x * t$wget ;
High Level Synthesis
• Most work on high level synthesis focuses on automating scheduling and allocation to achieve resource sharing.
• Perspective: high level synthesis in general applies to many aspects of converting high level descriptions into efficient circuits but there has been an undue level of effort on resource sharing in an ASIC context.
• Bluespec automates many aspects of scheduling (it makes scheduling composable) but resource usage is under the explicit control of the designer.
• For FPGA-based design this is often a better fit as a programming model.
Simple example with concurrency and shared resources
Process 0: increments register x when cond0
Process 1: transfers a unit from register x to register y when cond1
Process 2: decrements register y when cond2
Each register can only be updated by one process on each clock. Priority: 2 > 1 > 0
Just like real applications, e.g.: Bank account: 0 = deposit to checking, 1 = transfer from checking to savings, 2 = withdraw from savings
[Figure: process 0 adds 1 to x when cond0; process 1 moves 1 from x to y when cond1; process 2 subtracts 1 from y when cond2. Process priority: 2 > 1 > 0]
Fundamentally, we are scheduling three potentially concurrent atomic transactions that share resources.
What if the priorities changed: cond1 > cond2 > cond0? What if the processes are in different modules?
always @(posedge CLK) begin
   if (cond2)
      y <= y - 1;
   else if (cond1) begin
      y <= y + 1; x <= x - 1;
   end
   if (cond0 && !cond1)
      x <= x + 1;
end
* There are other ways to write this RTL, but all suffer from same analysis
Resource-access scheduling logic i.e., control logic
Better scheduling:
always @(posedge CLK) begin
   if (cond2)
      y <= y - 1;
   else if (cond1) begin
      y <= y + 1; x <= x - 1;
   end
   if (cond0 && (!cond1 || cond2))
      x <= x + 1;
end
With Bluespec, the design is direct
(* descending_urgency = "proc2, proc1, proc0" *)
rule proc0 (cond0);
   x <= x + 1;
endrule

rule proc1 (cond1);
   y <= y + 1;
   x <= x - 1;
endrule

rule proc2 (cond2);
   y <= y - 1;
endrule
Hand-written RTL:
• Explicit scheduling
• Complex clutter, unmaintainable
BSV:
• Functional correctness follows directly from rule semantics (atomicity)
• Executable spec (operation-centric)
• Automatic handling of shared resource control logic
• Same hardware as the RTL
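To try the three rules in simulation they need registers and some source for the conditions. The sketch below wraps them in a module; mkBankAccount, the 3-bit stim counter that drives cond0 to cond2, the initial balances and the stimulus and monitor rules are all assumptions added for illustration and are not part of the original example.

```bsv
(* synthesize *)
module mkBankAccount (Empty);
   Reg#(int)     x    <- mkReg (100);   // e.g. checking balance (assumed value)
   Reg#(int)     y    <- mkReg (100);   // e.g. savings balance (assumed value)
   Reg#(Bit#(3)) stim <- mkReg (0);     // hypothetical stimulus, one bit per condition

   Bool cond0 = (stim[0] == 1);
   Bool cond1 = (stim[1] == 1);
   Bool cond2 = (stim[2] == 1);

   // Steps through all eight combinations of the conditions.
   rule stimulus;
      stim <= stim + 1;
   endrule

   // Only reads state, so it is scheduled before the updating rules.
   rule monitor;
      $display ("conds = %b, x = %0d, y = %0d", stim, x, y);
      if (stim == 7) $finish (0);
   endrule

   (* descending_urgency = "proc2, proc1, proc0" *)
   rule proc0 (cond0);
      x <= x + 1;
   endrule

   rule proc1 (cond1);
      y <= y + 1;
      x <= x - 1;
   endrule

   rule proc2 (cond2);
      y <= y - 1;
   endrule
endmodule
```

The monitoring and stimulus logic are kept in separate rules on purpose: a single rule that both read x and y and wrote stim would conflict with the proc rules, which read stim in their conditions and write x and y.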
Now, let’s make a small change: add a new process and insert its priority
[Figure: the three processes above plus a new process 3, guarded by cond3, that moves 2 units from y to x. Process priority: 2 > 3 > 1 > 0]
Changing the Bluespec design
Process priority: 2 > 3 > 1 > 0
Pre-Change:
(* descending_urgency = "proc2, proc1, proc0" *)
rule proc0 (cond0);
   x <= x + 1;
endrule

rule proc1 (cond1);
   y <= y + 1;
   x <= x - 1;
endrule

rule proc2 (cond2);
   y <= y - 1;
endrule

Post-Change:
(* descending_urgency = "proc2, proc3, proc1, proc0" *)
rule proc0 (cond0);
   x <= x + 1;
endrule

rule proc1 (cond1);
   y <= y + 1;
   x <= x - 1;
endrule

rule proc2 (cond2);
   y <= y - 1;
   x <= x + 1;
endrule

rule proc3 (cond3);
   y <= y - 2;
   x <= x + 2;
endrule
Changing the Verilog design
Pre-Change:
always @(posedge CLK) begin
   if (!cond2 && cond1)
      x <= x - 1;
   else if (cond0)
      x <= x + 1;
   if (cond2)
      y <= y - 1;
   else if (cond1)
      y <= y + 1;
end

Post-Change:
always @(posedge CLK) begin
   if ((cond2 && cond0) || (cond0 && !cond1 && !cond3))
      x <= x + 1;
   else if (cond3 && !cond2)
      x <= x + 2;
   else if (cond1 && !cond2)
      x <= x - 1;
   if (cond2)
      y <= y - 1;
   else if (cond3)
      y <= y - 2;
   else if (cond1)
      y <= y + 1;
end
Alternate RTL style (more common)
• Combinatorial explosion
• Case 3’b111 is subtle
• Many repetitions of update actions (cut-and-paste errors)
– cf. “WTO Principle” (Write Things Once—Gerard Berry)
• Difficult to maintain/extend
• Difficult to modularize
always @ (posedge clk)
   case ({cond0, cond1, cond2})
      3'b000: begin // nothing happens
         x <= x; y <= y;
      end
      3'b001: begin // proc2 fires
         y <= y-1;
      end
      3'b010: begin // proc1
         x <= x-1; y <= y+1;
      end
      3'b011: begin // proc2 fires (2>1)
         y <= y-1;
      end
      3'b100: begin // proc0
         x <= x+1;
      end
      3'b101: begin // proc2 + proc0
         x <= x+1; y <= y-1;
      end
      3'b110: begin // proc1 (1>0)
         x <= x-1; y <= y+1;
      end
      3'b111: begin // proc2 + proc0
         x <= x+1; // NOTE: subtle!
         y <= y-1;
      end
   endcase
Late Specifications
Late specification changes and feature enhancements are challenging to deal with.
Micro-architectural changes for timing/area/performance, e.g.: Adding a pipeline stage to an existing pipeline
Adding a pipeline stage where pipelining was not anticipated
Spreading a calculation over more clocks (longer iteration)
Moving logic across a register stage (rebalancing)
Restructuring combinational clouds for shallower logic
Fixing bugs
Bluespec makes it easier to try out multiple macro/micro-architectures earlier in the design cycle
Why Rule atomicity improves correctness
Correctness is often couched (formally or informally) as an invariant, e.g.:
“# ingress packets - # egress packets == packet-count register value”
Rule atomicity improves thinking about (and formally proving) invariants, because invariants can be verified one rule at a time
In contrast, in RTL and thread models, one must think of all possible interleavings; cf. The Problem With Threads, Edward A. Lee, IEEE Computer 39(5), May 2006, pp. 33-42
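A sketch of how the packet-count invariant can be checked one rule at a time. The module name mkPacketCounter, the register and rule names, the FIFO depth and the urgency order are all assumptions made for illustration.

```bsv
import FIFO::*;

(* synthesize *)
module mkPacketCounter (Empty);
   FIFO#(Bit#(8)) buffer  <- mkSizedFIFO (4);
   Reg#(UInt#(8)) ingress <- mkReg (0);   // packets accepted so far
   Reg#(UInt#(8)) egress  <- mkReg (0);   // packets released so far
   Reg#(UInt#(8)) count   <- mkReg (0);   // packets currently buffered

   // Taken alone, accept keeps ingress - egress == count:
   // both ingress and count grow by one.
   (* descending_urgency = "release, accept" *)
   rule accept;
      buffer.enq (pack (ingress));
      ingress <= ingress + 1;
      count   <= count + 1;
   endrule

   // Taken alone, release keeps it too: egress grows by one, count shrinks by one.
   rule release;
      buffer.deq;
      egress <= egress + 1;
      count  <= count - 1;
   endrule

   // Runtime check of the invariant (arithmetic is modulo 256).
   rule check (ingress - egress != count);
      $display ("invariant violated");
      $finish (1);
   endrule
endmodule
```

Because each rule is atomic, it is enough to argue that accept preserves the invariant and that release preserves it; no interleaving of the two needs to be considered.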
Bank Account: Key Benefits
• Executable specifications
• Rapid changes
• But, with fine-grained control of RTL:
– Define the optimal architecture/micro-architecture
– Debug at the source OR RTL level – designer understands both
– The Quality of Results (QoR) of RTL!
A more complex example, from CPU design
Speculative, out-of-order
Many, many concurrent activities
[Figure: block diagram of a speculative, out-of-order CPU: Fetch, Decode, Register File, Re-Order Buffer (ROB), Branch, ALU Unit, MEM Unit, Instruction Memory and Data Memory, connected by FIFOs]
Many concurrent actions on common state: nightmare to manage explicitly
[Figure: Re-Order Buffer with Head and Tail pointers. Each entry holds State (E = Empty, W = Waiting), Instruction, Operand 1, Operand 2 and Result fields; the entries for Instr A to Instr D are Waiting and the rest are Empty]
Re-Order Buffer
• Decode Unit: put an instr into ROB
• Register File: get operands for instr; writeback results
• ALU Unit(s): get a ready ALU instr; put ALU instr results in ROB
• MEM Unit(s): get a ready MEM instr; put MEM instr results in ROB
• Resolve branches
Branch Resolution
• …
• …
• …
Commit Instr
• Write results to register file (or allow memory write for store)
• Set to Empty
• Increment head pointer
Write Back Results to ROB
• Write back results to instr result
• Write back to all waiting tags
• Set to done
Dispatch Instr
• Mark instruction dispatched
• Forward to appropriate unit
Insert Instr in ROB
• Put instruction in first available slot
• Increment tail pointer
• Get source operands
- RF <or> prev instr
In Bluespec…
..you can code each operation in isolation, as a rule
..the tool guarantees that operations are INTERLOCKED (i.e. each runs to completion without external interference)
Some Verilog solutions
Process priority: 2 > 1 > 0
always @(posedge CLK) begin
   if (!cond2 || cond1)
      x <= x - 1;
   else if (cond0)
      x <= x + 1;
   if (cond2)
      y <= y - 1;
   else if (cond1)
      y <= y + 1;
end

always @(posedge CLK) begin
   if (!cond2 && cond1)
      x <= x - 1;
   else if (cond0)
      x <= x + 1;
   if (cond2)
      y <= y - 1;
   else if (cond1)
      y <= y + 1;
end
Which one is correct?
Functional code and scheduling code are deeply (inextricably) intertwined.
What’s required to verify that they’re correct? What if the priorities changed: cond1 > cond2 > cond0? What if the processes are in different modules?
Finite State Machines in Bluespec
for making composable, parallel, nested, suspendable/abortable FSMs
Features:
• FSMs automatically synthesized
• Complex FSMs expressed succinctly
• FSM actions have same atomic semantics as BSV rule bodies
• Well-behaved on shared resources: no surprises
• With standard BSV interfaces and BSV’s higher-order functions you can write your own FSM generators
[Figure: FSM composition styles: sequencing, sequential loops, if-then-else, parallel FSMs (fork-join), hierarchy (with suspend and abort)]
This powerful capability is enabled by higher-order functions, polymorphic types, advanced parameterization and atomic transactions
Enables exponentially smaller descriptions compared to flat FSMs
FSM example (from testbench stimulus section)
Stmt s =
   seq
      action
         rand_packets0.init;
         rand_packets1.init;
      endaction
      par
         for (j0 <= 0; j0 < n; j0 <= j0 + 1) action
            let pkt0 <- rand_packets0.next;
            switch.ports[0].put (pkt0);
         endaction
         for (j1 <= 0; j1 < n; j1 <= j1 + 1) action
            let pkt1 <- rand_packets1.next;
            switch.ports[1].put (pkt1);
         endaction
      endpar
      drain_switch;
   endseq;

FSM fsm <- mkFSM (s);

rule go;
   fsm.start;
endrule
Basic FSM statements are “Actions”, just like rule bodies, and have exactly the same atomic semantics. Thus, BSV FSMs are well-behaved with respect to concurrent resource contention and flow control.
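A smaller, self-contained sketch in the same style, assuming the StmtFSM library. The module name mkFsmSketch and the registers i and sum are hypothetical. It uses mkAutoFSM instead of mkFSM with a separate start rule, which is a simplification for a stand-alone test.

```bsv
import StmtFSM::*;

(* synthesize *)
module mkFsmSketch (Empty);
   Reg#(int) i   <- mkReg (0);
   Reg#(int) sum <- mkReg (0);

   Stmt s =
      seq
         sum <= 0;
         // Loop variables are registers, updated with <= like any other state.
         for (i <= 1; i < 5; i <= i + 1)
            sum <= sum + i;            // each iteration is one atomic action
         $display ("sum = %0d", sum);  // sum of 1 + 2 + 3 + 4
      endseq;

   // mkAutoFSM starts the sequence after reset and finishes the
   // simulation once the sequence completes.
   mkAutoFSM (s);
endmodule
```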
Strong support for multiple clock and reset domains
• Rich and mature support for MCD (multiple clock domains and resets)
• Clock is a first-class data type
• Cannot accidentally mix clocks and ordinary signals
• Strong static checking ensures that it is impossible to accidentally cross clock domain boundaries (i.e., without a synchronizer)
• No need for linting tools to check domain discipline
• Clock manipulation
• Clocks can be passed in and out of module interfaces
• Library of clock dividers and other transformations
• Module instantiation can specify an alternative clock (instead of inheriting parent’s default clock)
• (Similarly: Reset and reset domains)
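A hedged sketch of crossing between two clock domains. It assumes the Clocks library provides mkClockDivider, mkAsyncResetFromCR and mkSyncFIFOFromCC with the argument orders used here; these should be checked against the reference guide of the installed release. The module name, register names, divider ratio and FIFO depth are hypothetical.

```bsv
import Clocks::*;

(* synthesize *)
module mkTwoDomains (Empty);
   // Derive a slower clock from the default clock (ratio chosen arbitrarily).
   ClockDividerIfc div     <- mkClockDivider (4);
   Reset           slowRst <- mkAsyncResetFromCR (2, div.slowClock);

   // fastCount lives in the default domain, slowSeen in the slow domain.
   Reg#(int) fastCount <- mkReg (0);
   Reg#(int) slowSeen  <- mkReg (0, clocked_by div.slowClock, reset_by slowRst);

   // Synchronizer FIFO: enq in the current clock domain, deq in the slow one.
   SyncFIFOIfc#(int) cross <- mkSyncFIFOFromCC (4, div.slowClock);

   rule send;
      cross.enq (fastCount);
      fastCount <= fastCount + 1;
   endrule

   rule receive;
      slowSeen <= cross.first;
      cross.deq;
   endrule
endmodule
```

Writing slowSeen <= fastCount directly in a rule would mix methods from two clock domains, which the static checking described above is meant to reject.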
Synthesis of Atomic Actions
[Figure: synthesis of rules. The state is read; predicates p1, p2, p3 and next-state values d1, d2, d3 are computed for each rule; a scheduler turns the predicates into fire signals f1, f2, f3; selector muxes and priority encoders choose the state update]
Predicates computed for each rule with a combinational circuit
Select maximal subset of applicable rules
enabled rules
Potential update functions
Key Issue: How to select the maximal subset of rules for firing?
• Two rules R1 and R2 can execute simultaneously if they are “conflict free” i.e.
– R1 and R2 do not update the same state; and
– Neither R1 nor R2 reads state that the other updates (“sequentially composable” rules)
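The two conditions can be illustrated with three small rules. The module name mkConflictSketch and the registers a, b, c are hypothetical.

```bsv
(* synthesize *)
module mkConflictSketch (Empty);
   Reg#(int) a <- mkReg (0);
   Reg#(int) b <- mkReg (0);
   Reg#(int) c <- mkReg (0);

   // ra and rb touch disjoint state: conflict free, any order works.
   rule ra;
      a <= a + 1;
   endrule

   rule rb;
      b <= b + 2;
   endrule

   // rc reads a, which ra updates, so this pair is not conflict free.
   // It is still sequentially composable: ordering rc before ra keeps
   // the register read before the write, so both can fire each cycle.
   rule rc;
      c <= a;
   endrule

   rule monitor;
      $display ("a, b, c = %0d, %0d, %0d", a, b, c);
   endrule
endmodule
```

Compare with rules4 earlier in these notes, where each rule reads what the other writes: there no order satisfies both rules, so only one of them fires.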
Rules of Rules (The Details 1-5/10)
1. Rules are atomic: rules fire completely or not at all, and you can imagine that nothing else happens during their execution.
2. Explicit and implicit conditions may prevent rules from firing.
3. Every rule fires exactly 0 or 1 times every cycle (at this point in our product's history anyway ;)
4. Rules that conflict in some way may fire together in the same cycle, but only if the compiler can schedule them in a valid order to do so -- that is, where the overall effect is as if they had happened one at a time as in (1) above.
5. Rules determine if they are going to fire or not before they actually do so. They are considered in their order of "urgency" (by a "greedy algorithm"): they "will fire" if they "can fire" and are not prevented by a conflict with a rule which has been selected already. It's OK to think of this phase as being completed (except for wires) before any rules are actually executed. This is what "urgency" is about.
Rules of Rules (The Details 6-10/10)
6. After determining which rules are going to fire, the simulator can then schedule their execution. (In hardware it's all done by combinational logic which has the same effect.) Rules do not need to execute in the same order as they were considered for deciding whether they "will fire". For example rule1 can have a higher urgency than rule2, but it is possible that rule2 executes its logic before rule1. Urgency is used to determine which rules "will fire". Earliness defines the order they fire in.
7. All reads from a register must be scheduled before any writes to the same register: any rule which reads from a register must be scheduled "earlier" than any other rule which writes to it.
8. Constants may be "read" at any time; a register *might* have a write but no read.
9. The compiler creates a sequence of steps, where each step is essentially a rule firing. Its inputs are valid at the beginning of the cycle, its outputs are valid at the end of the cycle. Data is not allowed to be driven "backwards" in the schedule: that is, no action may influence any action that happened "earlier" in the cycle. This would go against causality, and constitutes a "feedback" path that the compiler will not allow.
10. If the compiler is not told otherwise, methods have higher urgency than rules, and will execute earlier than rules, unless there's some reason to the contrary. There is a compiler switch to flip this around and make rules have higher urgency.
The Swap Conundrum
(* synthesize *)
module rules9 (Empty) ;
   Reg#(int) x <- mkReg (12) ;
   Reg#(int) y <- mkReg (17) ;

   rule r1 ;
      x <= y ;
   endrule

   rule r2 ;
      y <= x ;
   endrule

   rule monitor ;
      $display ("x, y = %0d, %0d", x, y) ;
   endrule
endmodule

$ ./rules9 -m 5
x, y = 12, 17
x, y = 12, 12
x, y = 12, 12
x, y = 12, 12

rule r1 (tick 1): x._write (y._read ()), i.e. y read, x write
rule r2 (tick 2): y._write (x._read ()), i.e. x read, y write
PROBLEM: register x must read before write
(* synthesize *)
module rules10 (Empty) ;
   Reg#(int) x <- mkReg (12) ;
   Reg#(int) y <- mkReg (17) ;

   rule r ;
      x <= y ;
      y <= x ;
   endrule

   rule monitor ;
      $display ("x, y = %0d, %0d", x, y) ;
   endrule
endmodule

$ ./rules10 -m 5
x, y = 12, 17
x, y = 17, 12
x, y = 12, 17
x, y = 17, 12

Schedule-wise, step 1 reads x and y at the beginning and writes x and y at the end.
Wires
• In Bluespec from a scheduling perspective registers and wires are dual concepts.
• In one cycle all register reads must execute before register writes.
• In one cycle a wire must be written to (at most once) before it is read (any number of times).
Rules of Wires
• Wires truly become wires in hardware: they do not save “state” between cycles (compare to signal in VHDL).
• A wire’s schedule requires that it be written before it is read (as opposed to a register that is read before it is written).
• A wire can not be written more than once in a cycle.
(* synthesize *)
module rules11 (Empty) ;
   Reg#(int) x <- mkReg (12) ;
   Reg#(int) y <- mkReg (17) ;
   Wire#(int) xwire <- mkWire;

   rule r1 ;
      x <= y ;
   endrule

   rule r2 ;
      y <= xwire ;
   endrule

   rule driveX ;
      xwire <= x ;
   endrule

   rule monitor ;
      $display ("x, y = %0d, %0d", x, y) ;
   endrule
endmodule

$ ./rules11 -m 5
x, y = 12, 17
x, y = 17, 12
x, y = 12, 17
x, y = 17, 12

$ cat rules11.sched
=== Generated schedule for rules11 ===

Rule schedule
-------------
Rule: monitor
Predicate: True
Blocking rules: (none)

Rule: driveX
Predicate: True
Blocking rules: (none)

Rule: r2
Predicate: xwire.whas
Blocking rules: (none)

Rule: r1
Predicate: True
Blocking rules: (none)

Logical execution order: monitor, driveX, r1, r2

=======================================

Question: is monitor, driveX, r2, r1 a valid schedule?
Wire
• Implements Reg interface (_read and _write methods).
• Implicit condition:
– it is not ready if it has not been written
• In any cycle if there is no write to a wire then any rule that reads that wire is blocked (it can not fire).
(* synthesize *)
module rules12 (Empty) ;
   Reg#(int) y <- mkReg (17) ;
   Reg#(int) count <- mkReg (0) ;
   Wire#(int) x <- mkWire;

   rule producer ;
      if (count % 3 == 0)
         x <= count ;
   endrule

   rule consumer ;
      y <= x ;
      $display ("cycle %0d: y set to %0d", count, x) ;
   endrule

   rule counter ;
      count <= count + 1 ;
   endrule
endmodule

$ ./rules12 -m 9
cycle 0: y set to 0
cycle 3: y set to 3
cycle 6: y set to 6
DWire
• A Wire with a default value.
• A DWire is always ready.
• If there is a write to a DWire in a cycle then just like a Wire it assumes that value.
• If there is no write to a DWire in a cycle it assumes a default value (given at instantiation time).
(* synthesize *)
module rules13 (Empty) ;
   Reg#(int) y <- mkReg (17) ;
   Reg#(int) count <- mkReg (0) ;
   Wire#(int) x <- mkDWire (42);

   rule producer ;
      if (count % 3 == 0)
         x <= count ;
   endrule

   rule consumer ;
      y <= x ;
      $display ("cycle %0d: y set to %0d", count, x) ;
   endrule

   rule counter ;
      count <= count + 1 ;
   endrule
endmodule

cycle 1: y set to 42
cycle 2: y set to 42
cycle 3: y set to 3
cycle 4: y set to 42
cycle 5: y set to 42
cycle 6: y set to 6
cycle 7: y set to 42
BypassWire
• Closest thing to a wire in Verilog.
• A BypassWire is always ready.
• Rather than having a default value the compiler must be able to statically determine that this wire is driven on every cycle.
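A minimal sketch of a BypassWire, using mkBypassWire from the standard library. The module name mkBypassSketch and the rule names are hypothetical.

```bsv
(* synthesize *)
module mkBypassSketch (Empty);
   Reg#(int)  count <- mkReg (0);
   Wire#(int) w     <- mkBypassWire;

   // No explicit or implicit condition, so the compiler can establish
   // that w is driven on every cycle.
   rule drive;
      w <= count * 2;
   endrule

   // Reading w needs no ready check: a BypassWire is always ready.
   rule show;
      $display ("count = %0d, w = %0d", count, w);
   endrule

   rule tick;
      count <= count + 1;
   endrule
endmodule
```

If drive were given a condition such as (count < 10), the compiler could no longer show that the wire is written every cycle and would be expected to reject the design.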
FIFOs
• Lots and lots of FIFOs provided in FIFO, FIFOF, SpecialFIFOs libraries
• Examples (2 and 4 element FIFOs):
FIFO#(UInt#(8)) myfifo <- mkFIFO;
FIFO#(UInt#(8)) biggerfifo <- mkSizedFIFO(4);
• Example BypassFIFO (1 storage element, data passes straight through if enq and deq on same cycle when empty):
FIFO#(UInt#(8)) bypassfifo <- mkBypassFIFO;
• Basic interfaces:
– enq(value) // enqueue “value”
– first // returns first element of fifo
– deq // dequeue
import FIFO::*;

(* synthesize *)
module rules14 (Empty) ;
   Reg#(int) count <- mkReg (0) ;
   FIFO#(int) fifo <- mkSizedFIFO (30);

   rule producer (count < 5) ;
      fifo.enq (count*3) ;
      $display ("cycle %0d: enqueuing value %0d", count, count*3) ;
   endrule

   rule consumer (count > 5) ;
      int x = fifo.first ;
      fifo.deq ;
      $display ("cycle %0d: dequeued value %0d", count, x) ;
   endrule

   rule counter ;
      count <= count + 1 ;
   endrule
endmodule

$ ./rules14 -m 20
cycle 0: enqueuing value 0
cycle 1: enqueuing value 3
cycle 2: enqueuing value 6
cycle 3: enqueuing value 9
cycle 4: enqueuing value 12
cycle 6: dequeued value 0
cycle 7: dequeued value 3
cycle 8: dequeued value 6
cycle 9: dequeued value 9
cycle 10: dequeued value 12

import FIFO::*;

(* synthesize *)
module rules15 (Empty) ;
   Reg#(int) count <- mkReg (0) ;
   FIFO#(int) fifo <- mkSizedFIFO (30);

   rule producer (count < 5) ;
      fifo.enq (count*3) ;
      $display ("cycle %0d: enqueuing value %0d", count, count*3) ;
   endrule

   rule consumer (count < 5) ;
      int x = fifo.first ;
      fifo.deq ;
      $display ("cycle %0d: dequeued value %0d", count, x) ;
   endrule

   rule counter ;
      count <= count + 1 ;
   endrule
endmodule
import GetPut::* ;
import Connectable::* ;

module mkProducer (Get#(int)) ;
   Reg#(int) i <- mkReg (0) ;

   rule incrementI ;
      i <= i + 1 ;
   endrule

   method ActionValue#(int) get () ;
      return i ;
   endmethod
endmodule: mkProducer

module mkConsumer (Put#(int)) ;
   Wire#(int) i <- mkWire ;

   rule report ;
      $display ("mkConsumer %d", i) ;
   endrule

   method Action put (int x) ;
      i <= x ;
   endmethod
endmodule: mkConsumer

(* synthesize *)
module mkConnectableExample (Empty) ;
   Get#(int) p <- mkProducer ;
   Put#(int) c <- mkConsumer ;
   mkConnection (p, c) ;
endmodule: mkConnectableExample

Higher Order Types: p and c are interfaces (bundles of methods) which are passed as arguments
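The same connection style also works between FIFOs. The sketch below uses toGet and toPut from the GetPut library to view each FIFO as a Get or Put interface; the module name mkFifoPipe, the FIFO names and the feed and drain rules are hypothetical.

```bsv
import FIFO::*;
import GetPut::*;
import Connectable::*;

(* synthesize *)
module mkFifoPipe (Empty);
   FIFO#(int) src <- mkFIFO;
   FIFO#(int) dst <- mkFIFO;
   Reg#(int)  n   <- mkReg (0);

   // mkConnection generates the rule that moves one item per cycle
   // from src to dst whenever src is not empty and dst is not full.
   mkConnection (toGet (src), toPut (dst));

   rule feed;
      src.enq (n);
      n <= n + 1;
   endrule

   rule drain;
      $display ("received %0d", dst.first);
      dst.deq;
   endrule
endmodule
```

No transfer rule is written by hand; the implicit conditions of get and put provide the flow control.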
ServerFarm
ServerFarm Information Flow
[Figure: a ServerFarm with request and response ports, containing two DividerServer instances, each with its own request and response ports]
Conclusions
• Bluespec:
– provides cleaner interfaces
• quicker to create large systems from libraries of components
• easier to refine design
– creates most of the control for you (unless you don’t want it to)
• less likely to get it wrong!
– has strong typing
• helps remove bugs
– provides powerful static elaboration