Implementing for Correct Concurrency

Nirav Dave
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology

March 9, 2011 (L11)
http://csg.csail.mit.edu/6.375

Dealing with Conflicts

When do conflicts arise?

How do we analyze them?

How do we fix them?

How do we make sure we’re okay?

SFIFO

    interface SFIFO#(type t, type tr, type v);
      method Action    enq(t x);     // enqueue an item
      method Action    deq();        // remove oldest entry
      method t         first();      // inspect oldest item
      method Action    clear();      // make FIFO empty
      method Maybe#(v) find(tr r);   // search FIFO
    endinterface

n = number of bits needed to represent values of type “t”
m = number of bits needed to represent values of type “tr”
v = number of bits needed to represent values of type “v”

[Figure: SFIFO module ports. enq: n-bit data, enab, rdy (not full). deq: enab, rdy (not empty). first: n-bit data, rdy (not empty). clear: enab. find: m-bit key in; bool and v-bit result out.]

Processor Example

[Figure: CPU pipeline with stages fetch, decode, execute, memory, and write-back, plus pc, rf, iMem, and dMem.]

5-stage processor, with 1-element FIFOs in between stages.

Let’s add bypassing

Decode Rule

    rule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ));
      let fetInst = f2dQ.first();
      f2dQ.deq();
      match {.ra, .rb} = getRARB(fetInst);

      // fromMaybe takes the default value first, then the Maybe
      let va0 = rf[ra];
      let va1 = fromMaybe(va0, m2wQ.find(ra));
      let va2 = fromMaybe(va1, e2mQ.find(ra));

      let vb0 = rf[rb];
      let vb1 = fromMaybe(vb0, m2wQ.find(rb));
      let vb2 = fromMaybe(vb1, e2mQ.find(rb));

      // schematic, as on the slide
      let newInst = case (fetInst) matches
                      Add: return (DAdd .va2 .vb2);
                      …
                    endcase;
      d2eQ.enq(newInst);
    endrule

The find calls search through each place in the design that may hold a newer value for the register.

When do we want it to execute?

Decode is also correct any time it is allowed to execute.
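The priority among the three value sources can be pulled out into a small helper. This is only a sketch: the name bypassValue and its parameter names are hypothetical and do not appear in the lecture. It relies on the library function fromMaybe, which takes the default value first and the Maybe second.

```bsv
// Hypothetical helper: pick the youngest available value for a register.
// e2mVal (from the younger instruction) wins over m2wVal,
// which in turn wins over the register-file value rfVal.
function t bypassValue(t rfVal, Maybe#(t) m2wVal, Maybe#(t) e2mVal);
  return fromMaybe(fromMaybe(rfVal, m2wVal), e2mVal);
endfunction
```

Under these assumptions, the decode rule could compute an operand as bypassValue(rf[ra], m2wQ.find(ra), e2mQ.find(ra)).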

Some insight into concurrent rule firing

There are more intermediate states in the rule semantics (a state after each rule step).
In the HW, states change only at clock edges.

[Figure: rule steps Ri, Rj, Rk on the "Rules" timeline versus clocks on the "HW" timeline, with Rj and Rk sharing one clock.]

Parallel execution reorders reads and writes

In the rule semantics, each rule sees (reads) the effects (writes) of previous rules.
In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks.

[Figure: sequences of reads and writes per rule step ("Rules") and per clock ("HW").]

Correctness

Rules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution.
Consequence: the HW can never reach a state unexpected in the rule semantics.
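A minimal BSV sketch of this criterion follows. The module, register, and rule names are hypothetical and are not taken from the lecture. The first pair of rules touches disjoint state, so firing both in one clock matches either sequential order. The second pair cannot be explained by any sequential order when built from ordinary mkReg registers (read < write), so the compiler is expected to report a conflict and fire at most one of the two per clock.

```bsv
module mkParallelFiringSketch(Empty);
  Reg#(Bit#(8)) x <- mkReg(0);
  Reg#(Bit#(8)) y <- mkReg(0);
  Reg#(Bit#(8)) a <- mkReg(1);
  Reg#(Bit#(8)) b <- mkReg(2);

  // Disjoint state: "incX then incY", "incY then incX" and
  // "both in one clock" all give the same net state change.
  rule incX;
    x <= x + 1;
  endrule

  rule incY;
    y <= y + 1;
  endrule

  // Each rule reads the register the other one writes. Firing both in
  // one clock would swap a and b, which matches neither
  // "copyBtoA then copyAtoB" nor "copyAtoB then copyBtoA".
  rule copyBtoA;
    a <= b;
  endrule

  rule copyAtoB;
    b <= a;
  endrule
endmodule
```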

Upshot

Given the concurrency of methods/rules in a system, we can determine viable schedules.
There is some variation due to applicability.

BUT we know what schedule we want (mostly).
We should be able to back-propagate results to submodules.

Determining Concurrency Properties

Processor: Concurrencies

In-order:  F < D < E < M < W
Pipelined: W < M < E < D < F

Concurrency requirements for Full Pipelining – Reg File

In-Order RF: (D calls sub) < (W calls upd)

Pipelined RF: (W calls upd) < (D calls sub)

Concurrency requirements for Full Pipelining – FIFOs

In-Order FIFOs:
1. m2wQ, e2mQ: find < enq < first < deq
2. d2eQ: find < enq < first < deq, clear

Pipeline FIFOs:
3. m2wQ, e2mQ: first < deq < enq < find
4. d2eQ: first < deq < find < enq
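As a point of comparison for plain FIFOs without find, the BSV library package SpecialFIFOs provides one-element FIFOs with both kinds of ordering: mkPipelineFIFO schedules first < deq < enq, and mkBypassFIFO schedules enq < first < deq. The testbench below is a hypothetical sketch (module, rule, and register names are invented here) showing a producer and a consumer that can both fire every clock once the pipeline FIFO is full.

```bsv
import FIFO::*;
import SpecialFIFOs::*;

module mkPipelineFifoSketch(Empty);
  FIFO#(Bit#(8)) q       <- mkPipelineFIFO;  // one element; first < deq < enq
  Reg#(Bit#(8))  nextVal <- mkReg(0);
  Reg#(Bit#(8))  cycle   <- mkReg(0);

  // Stop the simulation after a few clocks.
  rule count;
    cycle <= cycle + 1;
    if (cycle == 10) $finish;
  endrule

  // Consumer: its deq is ordered before the producer's enq, so a full
  // FIFO does not block the producer in the same clock.
  rule consume;
    $display("cycle %0d: got %0d", cycle, q.first);
    q.deq;
  endrule

  // Producer: enqueues a new value whenever the FIFO accepts one.
  rule produce;
    q.enq(nextVal);
    nextVal <= nextVal + 1;
  endrule
endmodule
```

Swapping in mkBypassFIFO would give the in-order style of ordering instead, with a value enqueued into an empty FIFO visible to first and deq in the same clock.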

Constructing Appropriately Concurrent Submodules

From Analysis to Design

We need to create modules which behave as needed.

Construct modules using “unsafe” primitives to have “safe” behaviors.

Three major concepts:
1. Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs)
2. Add RWires for forwarding values intra-cycle
3. Reason carefully to assure that execution appears “atomic”

ConfigReg and RWire

mkReg requires that read < write.
mkConfigReg is a Reg without this restriction. It allows us to read stale values (dangerous).

RWire is a “wire”:
wset :: a -> Action    writes a value
wget :: Maybe#(a)      returns the written value if a write happened this cycle
wset happens before wget each cycle.
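The RWire behavior can be seen in a small testbench. This is a sketch under assumptions: the module and rule names are hypothetical. The writer rule calls wset only on some clocks, and the reader rule, which is scheduled after it in the same clock, sees a Valid value exactly on those clocks.

```bsv
module mkRWireSketch(Empty);
  RWire#(Bit#(8)) w <- mkRWire;
  Reg#(Bit#(8))   n <- mkReg(0);

  // Writer: fires only when n is even.
  rule writer (n[0] == 0);
    w.wset(n);
  endrule

  // Reader: ordered after writer within the clock (wset before wget),
  // so wget is Valid when writer fired and Invalid otherwise.
  rule reader;
    case (w.wget) matches
      tagged Valid .v: $display("n=%0d: wire carries %0d", n, v);
      tagged Invalid:  $display("n=%0d: wire not written", n);
    endcase
    n <= n + 1;
    if (n == 5) $finish;
  endrule
endmodule
```

The value on the wire does not persist: nothing written in one clock is visible through wget in the next.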

Let’s implement some modules

Processor Redux

In-order:  F < D < E < M < W
Pipelined: W < M < E < D < F

Concurrency: RegFile

The standard library RegFile is implemented with concurrency (sub < upd).
This handles the in-order case.

We need to build a RegisterFile for the pipelined case.

BypassRegFile

    module mkBypassRegFile#(a lo, a hi)(RegFile#(a, d))
        provisos (Bits#(a, asz), Bits#(d, dsz), Eq#(a));
      RegFile#(a, d)        rfInt    <- mkRegFileWCF(lo, hi);
      RWire#(Tuple2#(a, d)) curWrite <- mkRWire();

      method Action upd(a x, d v);
        rfInt.upd(x, v);
        curWrite.wset(tuple2(x, v));   // expose this cycle's write
      endmethod

      method d sub(a x);
        case (curWrite.wget()) matches
          // a write to the same index this cycle is bypassed to the reader
          tagged Valid {.wa, .wd} &&& (wa == x): return wd;
          default: return rfInt.sub(x);
        endcase
      endmethod
    endmodule

One Element SFIFO (Naïve)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)    data <- mkRegU();
      Reg#(Bool) full <- mkReg(False);

      method Action enq(t x) if (!full);
        full <= True;
        data <= x;
      endmethod

      method Action deq() if (full);
        full <= False;
      endmethod

      method t first() if (full);
        return data;
      endmethod

      method Maybe#(v) find(tr r);
        return full ? findf(r, data) : tagged Invalid;
      endmethod
      // clear is not shown on the slide
    endmodule

Concurrency: find < first < (enq C deq), where C marks a conflict: enq and deq cannot fire in the same cycle.
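The findf parameter is an ordinary function from a search key and a stored entry to a Maybe result. A minimal sketch of one such function follows; the type names RName, Value, and Entry and the function name findDest are hypothetical and chosen only for illustration. It assumes each FIFO entry pairs a destination register with the value headed for it.

```bsv
typedef Bit#(5)  RName;   // hypothetical register-name type
typedef Bit#(32) Value;   // hypothetical data type

// Hypothetical FIFO entry: destination register and its pending value.
typedef Tuple2#(RName, Value) Entry;

// Search function in the shape mkSFIFO1 expects:
//   Maybe#(v) findf(tr r, t x)
function Maybe#(Value) findDest(RName r, Entry x);
  match {.dst, .val} = x;
  if (dst == r) return tagged Valid val;
  else          return tagged Invalid;
endfunction
```

With these definitions, a searchable FIFO could be instantiated as SFIFO#(Entry, RName, Value) q <- mkSFIFO1(findDest);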

One Element SFIFO (In-Order d2eQ #1)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)    data <- mkConfigRegU();
      Reg#(Bool) full <- mkConfigReg(False);
      RWire#(t)  enqv <- mkRWire();

      method Action enq(t x) if (!full);
        full <= True;
        data <= x;
        enqv.wset(x);
      endmethod

      // deq may follow an enq in the same cycle
      method Action deq() if (full || isValid(enqv.wget()));
        full <= False;
      endmethod

      method t first() if (full);
        return data;
      endmethod

      method Maybe#(v) find(tr r);
        return full ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

Concurrency: find < first < enq < deq

One Element SFIFO (In-Order e2mQ, m2wQ #2)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)    data <- mkRegU();
      Reg#(Bool) full <- mkConfigReg(False);
      RWire#(t)  enqv <- mkRWire();

      method Action enq(t x) if (!full);
        full <= True;
        data <= x;
        enqv.wset(x);
      endmethod

      method Action deq() if (full || isValid(enqv.wget()));
        full <= False;
      endmethod

      // first sees a value enqueued earlier in the same cycle
      method t first() if (full || isValid(enqv.wget()));
        return fromMaybe(data, enqv.wget());
      endmethod

      method Maybe#(v) find(tr r);
        return full ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

Concurrency: find < enq < first < deq

One Element Searchable SFIFO (Pipelined #3)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)      data <- mkConfigRegU();
      Reg#(Bool)   full <- mkConfigReg(False);
      RWire#(void) deqw <- mkRWire();
      RWire#(t)    enqw <- mkRWire();   // carries the enqueued value to find

      method Action enq(t x) if (!full || isValid(deqw.wget()));
        full <= True;
        data <= x;
        enqw.wset(x);
      endmethod

      method Action deq() if (full);
        full <= False;
        deqw.wset(?);
      endmethod

      method t first() if (full);
        return data;
      endmethod

      method Maybe#(v) find(tr r);
        return (full && !isValid(deqw.wget())) ? findf(r, data)
             : isValid(enqw.wget())            ? findf(r, fromMaybe(?, enqw.wget()))
             :                                   tagged Invalid;
      endmethod
    endmodule

Concurrency: first < deq < enq < find

One Element Searchable SFIFO (Pipelined #4)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)      data <- mkConfigRegU();
      Reg#(Bool)   full <- mkConfigReg(False);
      RWire#(void) deqw <- mkRWire();

      method Action enq(t x) if (!full || isValid(deqw.wget()));
        full <= True;
        data <= x;
      endmethod

      method Action deq() if (full);
        full <= False;
        deqw.wset(?);
      endmethod

      method t first() if (full);
        return data;
      endmethod

      method Maybe#(v) find(tr r);
        return (full && !isValid(deqw.wget())) ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

Concurrency: first < deq < find < enq

One Element Searchable SFIFO (Pipelined #4)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x))
                    (SFIFO#(t, tr, v))
        provisos (Bits#(t, tsz));
      Reg#(t)      data  <- mkRegU();
      Reg#(Bool)   full  <- mkConfigReg(False);
      RWire#(void) deqEN <- mkRWire();

      // True when deq has fired earlier in this cycle
      Bool deqp = isValid(deqEN.wget());

      method Action enq(t x) if (!full || deqp);
        full <= True;
        data <= x;
      endmethod

      method Action deq() if (full);
        full <= False;
        deqEN.wset(?);
      endmethod

      method t first() if (full);
        return data;
      endmethod

      method Maybe#(v) find(tr r);
        return (full && !deqp) ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

Concurrency: first < deq < find < enq

Up-Down Counter

Counter Module Interface

    interface Counter;
      method Action   up();
      method Action   down();
      method Bit#(32) _read();
    endinterface

Concurrency: up and down should be independent

Naïve Counter Example

    module mkCounter(Counter);
      Reg#(Bit#(32)) r <- mkReg(0);

      method Bit#(32) _read();
        return r;
      endmethod

      method Action up();
        r <= r + 1;
      endmethod

      method Action down();
        r <= r - 1;
      endmethod
    endmodule

Counter Example

    module mkCounter(Counter);
      Reg#(Bit#(32)) r     <- mkConfigReg(0);
      RWire#(void)   upW   <- mkRWire();
      RWire#(void)   downW <- mkRWire();

      method Bit#(32) _read();
        return r;
      endmethod

      method Action up();
        upW.wset(?);
      endmethod

      method Action down();
        downW.wset(?);
      endmethod

      // the only writer of r, so up and down never conflict
      rule updateR (True);
        r <= r + (isValid(upW.wget())   ? 1 : 0)
               - (isValid(downW.wget()) ? 1 : 0);
      endrule
    endmodule

What if we want to call up and then _read?
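One possible answer, offered as a sketch and not as the lecture's solution, is to let _read look through the same wires that the update rule uses, so that it returns the value with this clock's requests already applied. The module name mkBypassCounter and the name updated are hypothetical. The Counter interface is repeated so the block stands alone.

```bsv
import ConfigReg::*;

interface Counter;
  method Action   up();
  method Action   down();
  method Bit#(32) _read();
endinterface

module mkBypassCounter(Counter);
  Reg#(Bit#(32)) r     <- mkConfigReg(0);
  RWire#(void)   upW   <- mkRWire;
  RWire#(void)   downW <- mkRWire;

  // Counter value with this clock's up/down requests already applied.
  Bit#(32) updated = r + (isValid(upW.wget)   ? 1 : 0)
                       - (isValid(downW.wget) ? 1 : 0);

  // The only writer of r.
  rule updateR;
    r <= updated;
  endrule

  method Action up();
    upW.wset(?);
  endmethod

  method Action down();
    downW.wset(?);
  endmethod

  // Reads through the wires, which orders up and down before _read.
  method Bit#(32) _read();
    return updated;
  endmethod
endmodule
```

The price of this choice is that a caller can no longer read the counter before calling up or down in the same clock, and the read path now includes the adder.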

Completion Buffer

Completion buffer: Interface

    interface CBuffer#(type t);
      method ActionValue#(Token) getToken();
      method Action              put(Token tok, t d);
      method ActionValue#(t)     getResult();
    endinterface

    typedef Bit#(TLog#(n)) TokenN#(numeric type n);
    typedef TokenN#(16)    Token;

[Figure: cbuf with getToken, put (result & token), and getResult.]

IP-Lookup module with the completion buffer

    module mkIPLookup(IPLookup);
      rule recirculate … ;
      rule exit … ;

      method Action enter(IP ip);
        Token tok <- cbuf.getToken();
        ram.req(ip[31:16]);
        fifo.enq(tuple2(tok, ip[15:0]));
      endmethod

      method ActionValue#(Msg) getResult();
        let result <- cbuf.getResult();
        return result;
      endmethod
    endmodule

[Figure: enter takes a token from cbuf (getToken) and feeds the RAM and the fifo; at "done?", "no" recirculates and "yes" goes to cbuf, which feeds getResult.]

For enter and getResult to execute simultaneously, cbuf.getToken and cbuf.getResult must execute simultaneously.

IP Lookup rules with completion buffer

    rule recirculate (!isLeaf(ram.peek()));
      match {.tok, .rip} = fifo.first();
      fifo.enq(tuple2(tok, (rip << 8)));
      ram.req(ram.peek() + rip[15:8]);
      fifo.deq();
      ram.deq();
    endrule

    rule exit (isLeaf(ram.peek()));
      match {.tok, .*} = fifo.first();
      cbuf.put(tok, ram.peek());   // put takes the token and the result
      fifo.deq();
      ram.deq();
    endrule

For rule exit and method enter to execute simultaneously, cbuf.put and cbuf.getToken must execute simultaneously.

For no dead cycles, cbuf.getToken, cbuf.put, and cbuf.getResult must all be able to execute simultaneously.

Naïve Completion Buffer

    // Max and nextPointer are not defined on the slides
    module mkCBuffer(CBuffer#(t)) provisos (Bits#(t, tsz));
      Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
      RegFile#(Token, t)      data   <- mkRegFileFull();
      Reg#(Token)             rdP    <- mkReg(0);
      Reg#(Token)             wrP    <- mkReg(0);
      Reg#(Token)             cnt    <- mkReg(0);

      method ActionValue#(Token) getToken() if (cnt < Max);
        cnt <= cnt + 1;
        rdP <= nextPointer(rdP);
        valids[rdP] <= False;
        return rdP;
      endmethod

      method Action put(Token tok, t d);
        valids[tok] <= True;
        data.upd(tok, d);
      endmethod

      method ActionValue#(t) getResult() if (valids[wrP]);
        cnt <= cnt - 1;
        wrP <= nextPointer(wrP);
        return data.sub(wrP);
      endmethod
    endmodule

Completion buffer: Interface Requirements

Rules and methods concurrency requirement to avoid dead cycles:
exit < getResult < enter

cbuf methods’ concurrency:
cbuf.getResult < cbuf.put < cbuf.getToken

Completion Buffer

    // Max is not defined on the slides
    module mkCBuffer(CBuffer#(t)) provisos (Bits#(t, tsz));
      Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
      RegFile#(Token, t)      data   <- mkRegFileFull();
      Reg#(Token)             rdP    <- mkConfigReg(0);
      Reg#(Token)             wrP    <- mkConfigReg(0);
      Counter                 cnt    <- mkCounter();

      method ActionValue#(Token) getToken() if (cnt < Max);
        cnt.up();
        rdP <= rdP + 1;
        valids[rdP] <= False;
        return rdP;
      endmethod

      method Action put(Token tok, t d);
        valids[tok] <= True;
        data.upd(tok, d);
      endmethod

      method ActionValue#(t) getResult() if (valids[wrP]);
        cnt.down();
        wrP <= wrP + 1;
        return data.sub(wrP);
      endmethod
    endmodule

getResult < put < getToken

Is the ordering correct?

Is valids okay?