automatic verification and fence inference for relaxed memory models
DESCRIPTION
Martin Vechev. Eran Yahav. Michael Kuperstein. Automatic verification and fence inference for relaxed memory models. Technion. IBM Research. Technion. (FMCAD’10, PLDI’11). Textbook Example: Peterson’s Algorithm. Process 0 : while (true) { store ent0 = true; store turn = 1; - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/1.jpg)
AUTOMATIC VERIFICATION AND FENCE INFERENCE
FOR RELAXED MEMORY MODELS
Martin VechevIBM Research
Michael KupersteinTechnion
Eran YahavTechnion
(FMCAD’10, PLDI’11)
![Page 2: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/2.jpg)
2
Textbook Example: Peterson’s Algorithm
Specification: mutual exclusion over critical section
Process 0:while(true) { store ent0 = true; store turn = 1; do { load e = ent1; load t = turn; } while(e == true && t == 1); //Critical Section here store ent0 = false;}
Process 1:while(true) { store ent1 = true; store turn = 0; do { load e = ent0; load t = turn; } while(e == true && t == 0); //Critical Section here store ent1 = false;}
![Page 3: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/3.jpg)
3
Beyond Textbooks: Relaxed Memory Models
Specification: mutual exclusion over critical section
Process 0:while(true) { store ent0 = true; store turn = 1; do { load e = ent1; load t = turn; } while(e == true && t == 1); //Critical Section here store ent0 = false;}
Process 1:while(true) { store ent1 = true; store turn = 0; do { load e = ent0; load t = turn; } while(e == true && t == 0); //Critical Section here store ent1 = false;}
Re-ordering of operations Non-atomic stores
![Page 4: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/4.jpg)
4
Memory Fences
Enforce order… at a cost Fences are expensive
10s-100s of cycles collateral damage (e.g., prevent compiler opts)
example: removing a single fence yields 3x speedup in a work-stealing queue
Required fences depend on memory model
Where should I put fences?
![Page 5: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/5.jpg)
5
Where should I put fences?
On the other, synchronization bugs can be very difficult to track down, so memory barriers should be used liberally, rather than relying on complex platform-specific guarantees about limits to memory instruction reordering.
On the one hand, memory barriers are expensive (100s of cycles, maybe more), and should be used only when necessary.
– Herlihy and Shavit, The Art of Multiprocessor Programming.
![Page 6: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/6.jpg)
6
May Seem Easy…
Specification: mutual exclusion over critical section
Process 0:while(true) { store ent0 = true; fence; store turn = 1; fence; do { load e = ent1; load t = turn; } while(e == true && t == 1); //Critical Section here store ent0 = false;}
Process 1:while(true) { store ent1 = true; fence; store turn = 0; fence; do { load e = ent0; load t = turn; } while(e == true && t == 0); //Critical Section here store ent1 = false;}
![Page 7: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/7.jpg)
7
Chase-Lev Work-Stealing Queue1 int take() {2 long b = bottom – 1;3 item_t * q = wsq;4 bottom = b5 long t = top6 if (b < t) {7 bottom = t;8 return EMPTY;9 }10 task = q->ap[b % q->size];11 if (b > t)12 return task13 if (!CAS(&top, t, t+1))14 return EMPTY;15 bottom = t + 1;16 return task;17 }
1 void push(int task) {2 long b = bottom;3 long t = top;4 item_t * q = wsq;5 if (b – t >= q->size – 1) {6 wsq = expand();7 q = wsq;8 }9 q->ap[b % q->size] = task;10 bottom = b + 1;11 }
1 int steal() {2 long t = top;3 long b = bottom;4 item_t * q = wsq;5 if (t >= b)6 return EMPTY;7 task = q->ap[t % q->size];8 if (!CAS(&top, t, t+1))9 return ABORT;10 return task;11 }
Specification: no lost items, no phantom items, memory safety
![Page 8: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/8.jpg)
8
Fender
Help the programmer place fences Find optimal fence placement
Ideal setting for synthesis Restrict non-determinism s.t.
program stays within set of safe executions
![Page 9: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/9.jpg)
9
Goal
P’ satisfies the specification S under M
FENDER
ProgramP
Specification S
MemoryModel
M
Program P’
with Fences
![Page 10: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/10.jpg)
10
Naïve Approach: Recipe
Compute reachable states for the program
Compute constraints on execution that guarantee that all “bad states” are avoided
Implement the constraints with fences
Bad News [Atig et. al POPL’10]
Reachability undecidable for RMO
Non-primitive recursive complexity for TSO/PSO
![Page 11: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/11.jpg)
11
Store Buffers
…P0
MainMemor
y
…P1
……
……
ent0ent1turn
ent0ent1turn
![Page 12: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/12.jpg)
12
Unbounded Store Buffers…
Even for very simple patterns e.g. spin-loops with writes in loop body
flag := true while other_flag = true { flag := false //Do something flag := true }
true
false
true
false
true
false
true
![Page 13: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/13.jpg)
13
What can we do?
Under-approximate Bound the length of execution buffers
[FMCAD’10] Implies a bound on state space
Bound context switches (Ahmed’s talk from yesterday)
Dynamic synthesis of fences (In progress)
Over-approximate Sound abstraction of buffer content [PLDI’11]
![Page 14: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/14.jpg)
14
Abstract Interpretation for RMM Main Idea: Bounded over-
approximation of unbounded buffers
Hierarchy of abstract memory models
Strictly weaker than concrete TSO/PSO Maintain partial-coherence
![Page 15: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/15.jpg)
15
First Attempt: Set Abstraction
Abstract each store buffer as a set
Process 0:while(true) { store ent0 = true; fence; store turn = 1; fence; do { load e = ent1; load t = turn; } while(e == true && t == 1); //Critical Section here store ent0 = false;}
Process 1:while(true) { store ent1 = true; fence; store turn = 0; fence; do { load e = ent0; load t = turn; } while(e == true && t == 0); //Critical Section here store ent1 = false;}
…P0
MainMemor
y
…P1
……
……
ent0ent1turn
ent0ent1turn
{ }{ }{ }
{1}
{true} { }{ false }{ false, true }
{ } { } { } { }
![Page 16: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/16.jpg)
16
Second Attempt: record most recent value
Process 0
X := 1
Process 1
while (X != 1) { nop } X := 2fence e := Xassert e == 2;
Initially X == 0
P0Main
Memory
P1
{ }
{ }
{ 1 }
X = 0X = 1
{2 }X = 2
flush (1st time)
{ }
flush (2nd time)
![Page 17: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/17.jpg)
17
Abstract Memory Models - Requirements
Intra-process coherence: a process should see the most recent value it wrote
Preserve fence semantics
Partial inter-process coherence: preserve as much order information as feasible (bounded) Enable strong flushes
Sound Simple construction!
![Page 18: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/18.jpg)
18
Partial Coherence Abstractions
…P0
MainMemor
y
…P1
……
……
ent0ent1turn
ent0ent1turn
P0
MainMemor
y
P1
ent0
turn
ent0ent1turn
Recent value
Bounded
length kUnordered elements
ent1
Allows precise fence semantics
Allows precise loads from bufferKeeps the analysis precise for “normal” programs
Fallback for soundness
![Page 19: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/19.jpg)
19
Intuition
Making unbounded number of writes to a memory location without a flush? Seems very esoteric Your program is suspicious
flag := true while other_flag = true { flag := false //Do something flag := true }
![Page 20: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/20.jpg)
Adjusted Recipe
Compute (abstract) reachable states for the program using partial-coherence abstraction
Compute constraints on execution that guarantee that all “bad states” are avoided
Implement the constraints with fences
![Page 21: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/21.jpg)
21
Precision/fences tradeoff
Different abstractions lead to different fences More precise abstraction – potentially
fewer fences
In the paper SC-Recoverability PSO as an abstraction of TSO Partially disjunctive abstraction allows
more aggressive merging of buffer, scales better
![Page 22: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/22.jpg)
22
Inference Results
Mostly mutual exclusion primitives Different variations of the abstractionProgram FD k=0 FD k=1 PD k=0 PD k=1 PD k=2Sense0 Pet0 Dek0 Lam0 T/O T/O T/O
Fast0 T/O T/O T/O T/O
Fast1a Fast1b Fast1c T/O T/O
![Page 23: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/23.jpg)
23
Limitations…
More relaxed memory models Need reasonable concrete semantics
first
What if the program is infinite-space? Or simply large Partial-coherence isn’t enough We want to also use “standard” abstract
interpretation
![Page 24: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/24.jpg)
24
Composing Abstractions
Trivial for value abstractions Abstract buffers of abstract values
Challenging for more complex abstractions Heap abstractions Predicate abstractions Buffers for “abstract locations” (?)
![Page 25: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/25.jpg)
25
Summary
Partial-coherence abstractions Verification without context bounds but possibly with false alarms
Automatic fence inference Precision affects quality of results
In progress Combination of abstractions Scalable (dynamic) fence inference
![Page 26: Automatic verification and fence inference for relaxed memory models](https://reader035.vdocuments.us/reader035/viewer/2022062310/56816692550346895dda7248/html5/thumbnails/26.jpg)
26
Invited Questions
1. Do you ever need to use k>1 ?2. In your abstraction, do you ever fall
into the unordered set?3. How far can we expect this to scale?4. Why over-approximation for fence
inference?5. Can you explain how the inference
works?