abstractions for relaxed memory models
![Page 1: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/1.jpg)
1
Abstractions for Relaxed Memory Models
Andrei Dan, Yuri Meshman, Martin Vechev, Eran Yahav
![Page 2: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/2.jpg)
2
Verification under a relaxed memory model
P ⊨_M S ?
![Page 3: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/3.jpg)
3
Really about
P ⊭_M S → P’ ⊨_M S: change the program
P M S
…
Logozzo, Ball. Modular and Verified Automatic Program Repair, OOPSLA'12
Automatic Inference of Memory Fences, FMCAD’10
Abstraction-Guided Synthesis of Synchronization, POPL’10
![Page 4: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/4.jpg)
4
Example: Correct and Efficient Synchronization with Barriers
• Boids simulation
• Craig Reynolds, "Flocks, herds and schools: A distributed behavioral model", SIGGRAPH '87
• Sequential implementation by Paul Richmond (U. of Sheffield)
…
Global state contains shared array of Boid locations
N Boids
![Page 5: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/5.jpg)
5
Boid Task
…Read locations of other boids
…Update my location
Specification: guarantee conflict-freedom
Conflict: two threads are enabled to access the same memory location and (at least) one of these accesses is a write
Boid task pid=2
pid=2
…Read locations of other boids
…Update my location
Boid task pid=3
![Page 6: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/6.jpg)
6
Boids Simulation
…
Main task:
Generate initial locations
Spawn N Boid Tasks
While (true) {
  Read locations of all boids
  Render boids
}

Boid task:
While (true) {
  Read locations of other boids
  Compute new location
  Update my location
}
Shared Memory (Global State)
Locations of N Boids
Where should I put synchronization barriers?
![Page 7: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/7.jpg)
7
Textbook Example: Dekker’s Algorithm
Thread 0:
flag0 := true
while flag1 = true {
  if turn ≠ 0 {
    flag0 := false
    while turn ≠ 0 { }
    flag0 := true
  }
}
// critical section
turn := 1
flag0 := false

Thread 1:
flag1 := true
while flag0 = true {
  if turn ≠ 1 {
    flag1 := false
    while turn ≠ 1 { }
    flag1 := true
  }
}
// critical section
turn := 0
flag1 := false

initial: flag0 = false, flag1 = false, turn = 0
spec: mutual exclusion over critical section
sequential consistency: Yes
relaxed model x86 TSO: No
![Page 8: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/8.jpg)
8
Concrete PSO semantics using store buffers
[Figure: processes P0 … T0, each with per-variable store buffers (flag0, flag1, turn) in front of main memory. Operations: store, flush, load, fence]
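The store, load, flush and fence operations named on this slide can be sketched as a small simulator. This is only an illustration, assuming one FIFO buffer per thread and per variable (the PSO flavor); the class `PSOMachine` and the helper `dekker_entry` are hypothetical names, not taken from the talk or from FENDER.

```python
from collections import defaultdict, deque


class PSOMachine:
    """Toy PSO machine: one FIFO store buffer per (thread, variable)."""

    def __init__(self, initial):
        self.memory = dict(initial)
        self.buffers = defaultdict(deque)  # (thread, var) -> pending values

    def store(self, thread, var, value):
        # A store only reaches the local buffer, not main memory.
        self.buffers[(thread, var)].append(value)

    def load(self, thread, var):
        # A thread sees its own newest buffered value, else main memory.
        buf = self.buffers[(thread, var)]
        return buf[-1] if buf else self.memory[var]

    def flush(self, thread, var):
        # Move the oldest buffered value for var to main memory.
        buf = self.buffers[(thread, var)]
        if buf:
            self.memory[var] = buf.popleft()

    def fence(self, thread):
        # Drain every buffer owned by this thread.
        for (owner, var), buf in self.buffers.items():
            if owner == thread:
                while buf:
                    self.memory[var] = buf.popleft()


def dekker_entry(with_fence):
    """Both threads raise their flag, then read the other thread's flag."""
    m = PSOMachine({"flag0": False, "flag1": False, "turn": 0})
    m.store(0, "flag0", True)
    m.store(1, "flag1", True)
    if with_fence:
        m.fence(0)
        m.fence(1)
    return m.load(0, "flag1"), m.load(1, "flag0")


print(dekker_entry(with_fence=False))  # (False, False)
print(dekker_entry(with_fence=True))   # (True, True)
```

Without the fences both threads read `False` and would skip the waiting loop, which is the mutual exclusion violation of Dekker's algorithm under a relaxed model.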
![Page 9: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/9.jpg)
9
Where should I put fences?
On the one hand, memory barriers are expensive (100s of cycles, maybe more), and should be used only when necessary.
On the other, synchronization bugs can be very difficult to track down, so memory barriers should be used liberally, rather than relying on complex platform-specific guarantees about limits to memory instruction reordering.
– Herlihy and Shavit
![Page 10: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/10.jpg)
10
May seem easy…
Thread 0:
flag0 := true
fence
while flag1 = true {
  if turn ≠ 0 {
    flag0 := false
    while turn ≠ 0 { }
    flag0 := true
    fence
  }
}
// critical section
turn := 1
flag0 := false

Thread 1:
flag1 := true
fence
while flag0 = true {
  if turn ≠ 1 {
    flag1 := false
    while turn ≠ 1 { }
    flag1 := true
    fence
  }
}
// critical section
turn := 0
flag1 := false

initial: flag0 = false, flag1 = false, turn = 0
spec: mutual exclusion over critical section
relaxed model x86 TSO: Yes
![Page 11: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/11.jpg)
11
Chase-Lev Work-Stealing Queue

int take() {
  long b = bottom - 1;
  item_t *q = wsq;
  bottom = b;
  long t = top;
  if (b < t) {
    bottom = t;
    return EMPTY;
  }
  task = q->ap[b % q->size];
  if (b > t)
    return task;
  if (!CAS(&top, t, t+1))
    return EMPTY;
  bottom = t + 1;
  return task;
}

void push(int task) {
  long b = bottom;
  long t = top;
  item_t *q = wsq;
  if (b - t >= q->size - 1) {
    wsq = expand();
    q = wsq;
  }
  q->ap[b % q->size] = task;
  bottom = b + 1;
}

int steal() {
  long t = top;
  long b = bottom;
  item_t *q = wsq;
  if (t >= b)
    return EMPTY;
  task = q->ap[t % q->size];
  if (!CAS(&top, t, t+1))
    return ABORT;
  return task;
}

[Slide overlay: five fence markers placed in the code]
![Page 12: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/12.jpg)
12
Goal
• Help the programmer place fences
– Find optimal fence placement
• Principle
– Restrict non-determinism s.t. program stays within set of safe executions
![Page 13: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/13.jpg)
13
Our Approach: Overview
[Diagram: Program P, Specification S and Memory Model M are the inputs to FENDER, which outputs Program P’ with fences]
• P’ satisfies the specification S under M
![Page 14: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/14.jpg)
14
1. Compute reachable states for the program
2. Compute weakest constraints on execution that guarantee all “bad states” are avoided
3. Implement the constraints with fences <Formula>
Our Approach: Recipe
![Page 15: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/15.jpg)
15
• Compute reachable states of the program
• Compute constraints on execution that guarantee that all “bad states” are avoided
• Implement the constraints with fences
Bad News [Atig et al. POPL’10]
Even for programs that are finite-state under SC:
• Reachability is undecidable for RMO
• Reachability has non-primitive recursive complexity for TSO/PSO
Our Approach: Recipe
![Page 16: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/16.jpg)
16
Challenges
• Automatic verification and fence inference that works for realistic programs
• Handle two sources of unboundedness
– Unbounded SC state
– Unbounded store buffers
![Page 17: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/17.jpg)
17
Bound state and buffer (under-approx) [Automatic Inference of Memory Fences, FMCAD’10]
![Page 18: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/18.jpg)
18
Demonic scheduling for dynamic exploration of executions [Dynamic Synthesis for Relaxed Memory Models, PLDI’12]
Can infer fences for large tricky programs such as a lock-free memory allocator
![Page 19: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/19.jpg)
DFENCE: support for concurrency and RMM
[Architecture diagram. Components: Concurrent C/C++ code, Client, LLVM-GCC, LLVM Interpreter, Threading, Demonic Scheduler, Memory Model Specification, Trace Analysis, SAT Solver, Fence Enforcement. Data passed between them: .bc, trace, order formula, satisfying assignment, modified .bc. Output: fixed bytecode and fence location report. Legend: our extension, existing work]
Open-source, available at: http://practicalsynthesis.org/fender/
![Page 20: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/20.jpg)
20
Idea 1: Abstraction-Guided Synthesis
• Synthesis of synchronization via abstract interpretation
– Compute over-approximation of all possible program executions
– Add minimal synchronization to avoid (over-approximation of) bad schedules
• Interplay between abstraction and synchronization
– Finer abstraction may enable finer synchronization
– Coarse synchronization may allow coarser abstraction
![Page 21: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/21.jpg)
21
A Standard Approach: Abstraction Refinement
Change the abstraction to match the program
[Diagram: Program P, Specification S and Abstraction feed Verify; the outcome is either Valid or an abstract counterexample, which drives Abstraction Refinement]
![Page 22: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/22.jpg)
22
Abstraction-Guided Synthesis [VYY-POPL’10]
Change the program to match the abstraction
[Diagram: Program P, Specification S and Abstraction feed Verify; an abstract counterexample leads either to Abstraction Refinement or to Program Restriction, and the restricted program is implemented as P’]
![Page 23: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/23.jpg)
23
1. Compute over-approximation of reachable states for the program using sound abstractions
2. Compute weakest constraints on abstract execution that guarantee all “bad abstract states” are avoided
3. Implement the constraints with fences
Our Approach Revisited: Recipe
<Formula>
![Page 24: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/24.jpg)
24
Conservative Abstractions
![Page 25: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/25.jpg)
25
Different Approaches
[Axes on the slide: Infinite-State, Unbounded-buffer]
Automatic Inference of Memory Fences [FMCAD’10]
Dynamic Synthesis for Relaxed Memory Models [PLDI’12]
Partial-Coherence Abstractions for Relaxed Memory Models [PLDI’11]
Predicate Abstraction for Relaxed Memory Models [SAS’13]
Synthesis of Memory Fences via Refinement Propagation [SAS’14]
Effective Program Transformation for Verification under Relaxed Models [in progress]
![Page 26: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/26.jpg)
26
Partial Coherence Abstractions [PLDI’11]
Sound abstractions of store buffers
[Figure, concrete: processes P0 and P1, each with store buffers for flag0, flag1 and turn in front of main memory]
[Figure, abstract: each buffer is summarized by three parts]
• Recent value
• Bounded length k
• Unordered elements: record what values appeared (without order or number)
Properties noted on the slide:
• Allows precise fence semantics
• Allows precise loads from buffer
• Keeps the analysis precise for “well behaved” programs
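The three parts of the abstract buffer (recent value, bounded ordered part, unordered elements) could be represented roughly as below. This is a loose sketch, not the construction from the PLDI’11 paper: it assumes the ordered part holds the first k pending stores, it covers only store and load, and the name `AbstractBuffer` is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractBuffer:
    """Abstract store buffer for one (process, variable) pair."""
    k: int                                        # bound on the ordered part
    ordered: list = field(default_factory=list)   # at most k values, in order
    unordered: set = field(default_factory=set)   # values seen beyond the bound
    recent: object = None                         # most recently stored value

    def store(self, value):
        if not self.unordered and len(self.ordered) < self.k:
            self.ordered.append(value)   # still within the precise part
        else:
            self.unordered.add(value)    # forget order and multiplicity
        self.recent = value

    def is_empty(self):
        return not self.ordered and not self.unordered

    def load(self, memory_value):
        # Intra-process coherence: read own latest store if one is pending.
        return memory_value if self.is_empty() else self.recent


buf = AbstractBuffer(k=2)
for v in [1, 2, 3, 3]:
    buf.store(v)
print(buf.ordered, buf.unordered, buf.recent)  # [1, 2] {3} 3
```

Storing 3 twice leaves a single 3 in the unordered set, which is the "without order or number" loss of information; a program that never exceeds k pending stores is tracked exactly.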
![Page 27: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/27.jpg)
27
Abstract Memory Models - Requirements
• Intra-process coherence
– A process should see the most recent value it wrote
• Preserve fence semantics
– The value written to main memory when flushed by a fence is the most recent value stored before the fence
• Preserve buffer emptiness
– Values do not appear out of nowhere
• Partial inter-process coherence
– Preserve as much order information as feasible (bounded)
• Simple construction!
![Page 28: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/28.jpg)
28
State Abstraction Techniques
• Predicate abstraction
– Simple
– Requires initial set of predicates
• Numerical domains
– Octagon and Polyhedra abstractions
– Automatically handle programs with (linear) numerical invariants
![Page 29: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/29.jpg)
29
Predicate Abstraction
• Successful for sequential program analysis
– Graf and Saidi (CAV '97)
– Microsoft's SLAM (PLDI’01)
– …
• Some work for SC concurrent programs
– Symmetry-Aware Predicate Abstraction for Shared-Variable Concurrent Programs. Kroening et al. (CAV '11)
– Threader: A constraint-based verifier for multi-threaded programs. Gupta et al. (CAV '11)
– …
How can we apply standard predicate abstraction to verification under relaxed memory models?
![Page 30: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/30.jpg)
30
Classical predicate abstraction
Thread 0:
X = Y+1
fence(X)
Thread 1:
Y = X+1
fence(Y)
initial: X=Y=0
assert(X ≠ Y)
Predicates V: B0: X=Y, B1: X=1, B2: Y=1, B3: X=0, B4: Y=0
Boolean program BP(P,V), built from program P and predicates V:
/* Statement X = 0 */
21: store B1 = false; /* update predicate - B1: (X = 1) */
22: store B3 = true; /* update predicate - B3: (X = 0) */
23: store B0 = false
…
/* Statement Y = X + 1 */
54: store B0 = false; /* update predicate - B0: (X = Y) */
55: store B2 = choose(t3(t0t4), t1t3(t0t2)(t0 t4)(t0t4)); /* B2: (Y = 1) */
56: store B4 = choose(false, (t1)(t3)(t0t2)(t0t4)); /* B4: (Y = 0) */
…
BP(P,V) ⊨_SC S entails P ⊨_SC S
![Page 31: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/31.jpg)
31
Direct application is not sound
Thread 0:
X = Y+1
fence(X)
Thread 1:
Y = X+1
fence(Y)
initial: X=Y=0
assert(X ≠ Y)
Predicates: B0: X=Y, B1: X=1, B2: Y=1, B3: X=0, B4: Y=0
P ⊭_PSO S but BP(P,V) ⊨_PSO S
(predicates with false value other than (X=Y) have been omitted)

Concrete:

| Step | T0 (X,Y) | T1 (X,Y) | Global (X,Y) |
|---|---|---|---|
| initial | (0,0) | (0,0) | (0,0) |
| T0: X = Y+1 | (1,0) | (0,0) | (0,0) |
| T1: Y = X+1 | (1,0) | (0,1) | (0,0) |
| T0: flush(X) | (1,0) | (1,1) | (1,0) |
| T1: flush(Y) | (1,1) | (1,1) | (1,1) |

Predicate Abstraction:

| Step | T0 | T1 | Global |
|---|---|---|---|
| initial | X=Y, X=0, Y=0 | X=Y, X=0, Y=0 | X=Y, X=0, Y=0 |
| T0: X = Y+1 | ¬(X=Y), X=1, Y=0 | X=Y, X=0, Y=0 | X=Y, X=0, Y=0 |
| T1: Y = X+1 | ¬(X=Y), X=1, Y=0 | ¬(X=Y), Y=1, X=0 | X=Y, X=0, Y=0 |
| T0: flush(X) | ¬(X=Y), X=1, Y=0 | ¬(X=Y), Y=1, X=0 | ¬(X=Y), X=1, Y=0 |
| T1: flush(Y) | ¬(X=Y), X=1, Y=0 | ¬(X=Y), Y=1, X=0 | ¬(X=Y), X=1, Y=1 |
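The concrete trace can be replayed with a few lines of code. This is a minimal sketch that takes the asserted property to be X ≠ Y and keeps at most one pending store per variable and thread, which is all this example needs; the helper names are hypothetical.

```python
# Main memory and per-thread store buffers (thread -> {var: pending value}).
memory = {"X": 0, "Y": 0}
buffers = {0: {}, 1: {}}


def load(thread, var):
    # Read the thread's own pending store if any, else main memory.
    return buffers[thread].get(var, memory[var])


def store(thread, var, value):
    buffers[thread][var] = value


def flush(thread, var):
    memory[var] = buffers[thread].pop(var)


store(0, "X", load(0, "Y") + 1)  # T0: X = Y+1, reads Y = 0
store(1, "Y", load(1, "X") + 1)  # T1: Y = X+1, still reads X = 0
flush(0, "X")
flush(1, "Y")

violated = memory["X"] == memory["Y"]
print(memory, violated)  # {'X': 1, 'Y': 1} True
```

The final global state (1,1) violates the property, while the last row of the predicate abstraction table claims ¬(X=Y) together with X=1 and Y=1, an inconsistent state that hides the violation.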
![Page 32: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/32.jpg)
32
How do we restore soundness?
• Option 0: restrict programs/properties
• Option 1: BP(P,V), specialized
– Capture dependencies between updates
– Invalidation of predicates
– Synchronized updates of multiple predicates
• Option 2: BP(PM,V) under SC
– Capture all relaxed memory model effects in the program itself
– Boolean program construction as usual
– Verification as usual (using SC tools)
![Page 33: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/33.jpg)
33
Encode memory model effects in the Program
P ⊨_M S ?
PM ⊨_SC S ?
The behavior of PM under sequential consistency is an over-approximation of the behavior of P running under model M
![Page 34: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/34.jpg)
34
Encode RMM effects into the program
• Pick a bound k for store buffers (sound)
• Encode store buffers as program variables
• Shared variable X encoded as
– Xcnt: a counter for the buffer position
– X1, …, Xk: buffer contents
[Figure: buffer slots X1, X2, …, Xk in front of X (PSO)]
![Page 35: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/35.jpg)
35
Encode Program: Example for k=1
load t = X:
  if (Xcnt == 0) t = X
  if (Xcnt == 1) t = X1

store X = t:
  if (Xcnt == k) “overflow”
  Xcnt++
  if (Xcnt == 1) X1 = t
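The encoded load and store generalize to any bound k. The sketch below is an illustration only: `EncodedVar` and `Overflow` are hypothetical names, the object stands for one thread's view of one shared variable, and the `flush` method follows the shift-by-one flush shown later in the deck.

```python
class Overflow(Exception):
    """Raised when a store would exceed the buffer bound k."""


class EncodedVar:
    """Shared variable X encoded as X, Xcnt and slots X1..Xk."""

    def __init__(self, value, k):
        self.x = value            # X: the value in main memory
        self.cnt = 0              # Xcnt: number of buffered stores
        self.slots = [None] * k   # X1..Xk (slots[i-1] holds Xi)
        self.k = k

    def load(self):
        # if (Xcnt == 0) t = X; if (Xcnt == i) t = Xi
        return self.x if self.cnt == 0 else self.slots[self.cnt - 1]

    def store(self, t):
        if self.cnt == self.k:
            raise Overflow("store buffer bound exceeded")
        self.cnt += 1
        self.slots[self.cnt - 1] = t

    def flush(self):
        # Move the oldest entry X1 to X and shift the rest down by one.
        if self.cnt > 0:
            self.x = self.slots[0]
            self.slots = self.slots[1:] + [None]
            self.cnt -= 1


v = EncodedVar(0, k=1)
v.store(5)
print(v.load(), v.x)  # 5 0
v.flush()
print(v.load(), v.x)  # 5 5
```

Raising on overflow mirrors the “overflow” marker on the slide: executions that need more than k buffered stores are reported rather than silently dropped.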
![Page 36: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/36.jpg)
36
Where do predicates come from?
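[Diagram: Program P and Memory Model M go through Reduction to Program PM; Predicate Abstraction takes a program and Predicates V and builds Boolean Program B; the Model Checker reports Verified or Counterexample. A question mark marks the missing predicates for PM]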
![Page 37: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/37.jpg)
37
Idea 2: Proof Extrapolation
• Leverage the similarity between behaviors in PM and those in PSC
• Verify program under SC using a given vocabulary V
• Extrapolate predicates VM for PM from the SC proof
![Page 38: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/38.jpg)
38
Step 1: Verify program under SC
• Find a set of predicates V
• Construct the Boolean program B(P,V)
• Verify B(P,V) ⊨_SC S
![Page 39: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/39.jpg)
39
Step 2: Predicate Extrapolation
• Discover new predicates for RMM based on the predicates used in the SC proof
• Generic predicates
– Buffer size, overflow
• SC-based extrapolated predicates
– from SC relationships as captured in V
![Page 40: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/40.jpg)
40
Predicate Extrapolation Example
• ∀X ∈ shared variables, 0 ≤ i ≤ k
– (Xcnt == i) tracks buffer size
– (Xi == Xi-1), i > 0, for flush actions
• ∀p ∈ V where p is of the form “(X < Y)”, 0 ≤ i ≤ k
– (Xi < Y)
– (X < Yi)
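The extrapolation rules above can be written as a small generator of predicate strings. This is a sketch under stated assumptions: predicates are plain strings, index 0 is read as the variable in main memory (so X0 is X), and the names `slot` and `extrapolate` are hypothetical.

```python
def slot(var, i):
    # Assumption: index 0 denotes the variable in main memory itself.
    return var if i == 0 else f"{var}{i}"


def extrapolate(shared_vars, sc_predicates, k):
    """Return extrapolated predicates (as strings) for buffer bound k."""
    preds = []
    for x in shared_vars:
        # (Xcnt == i) tracks the buffer size.
        preds += [f"({x}cnt == {i})" for i in range(k + 1)]
        # (Xi == Xi-1), i > 0, relates adjacent slots for flush actions.
        preds += [f"({slot(x, i)} == {slot(x, i - 1)})" for i in range(1, k + 1)]
    for lhs, op, rhs in sc_predicates:
        for i in range(k + 1):
            preds.append(f"({slot(lhs, i)} {op} {rhs})")
            preds.append(f"({lhs} {op} {slot(rhs, i)})")
    return list(dict.fromkeys(preds))  # drop duplicates, keep order


for p in extrapolate(["X", "Y"], [("X", "<", "Y")], k=1):
    print(p)
```

For k = 1 this yields nine predicates from a single SC predicate over two variables, which hints at why the vocabulary grows quickly with k.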
![Page 41: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/41.jpg)
41
Dekker with extrapolated predicates
Thread 0:
flag0 := true
while flag1 = true {
  if turn ≠ 0 {
    flag0 := false
    while turn ≠ 0 { }
    flag0 := true
  }
}
// critical section
turn := 1
flag0 := false

Thread 1:
flag1 := true
while flag0 = true {
  if turn ≠ 1 {
    flag1 := false
    while turn ≠ 1 { }
    flag1 := true
  }
}
// critical section
turn := 0
flag1 := false

initial: flag0 = false, flag1 = false, turn = 0

SC: (t2 = 0), (t1 = 0), (f1 = 0), (f2 = 0), (flag0 = 0), (flag1 = 0), (turn = 0)

PSO: (overflow = 0), (t2 = 0), (t1 = 0), (f1 = 0), (f2 = 0), (flag0 = 0), (flag1 = 0), (turn = 0), (turn_cnt_T0 = 0), (turn_cnt_T0 = 1), (turn_cnt_T1 = 0), (turn_cnt_T1 = 1), (turn_1_T0 = 0), (turn_1_T1 = 0), (flag0_cnt_T0 = 0), (flag0_cnt_T0 = 1), (flag0_1_T0 = 0), (flag1_cnt_T1 = 0), (flag1_cnt_T1 = 1), (flag1_1_T1 = 0)

TSO: (overflow = 0), (t2 = 0), (t1 = 0), (f1 = 0), (f2 = 0), (flag0 = 0), (flag1 = 0), (turn = 0), (T0_cnt = 0), (T0_cnt = 1), (lhs_1_T0 = 0), (lhs_1_T0 = 1), (T1_cnt = 0), (T1_cnt = 1), (lhs_1_T1 = 0), (lhs_1_T1 = 2), (rhs_1_T0 = 0), (rhs_1_T1 = 0)
![Page 42: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/42.jpg)
42
Our approach so far
[Diagram: Program P and Memory Model M go through Reduction to Program PM; Predicates V go through Extrapolation to Predicates VM; Predicate Abstraction builds Boolean Program B; the Model Checker reports Verified or Counterexample]
![Page 43: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/43.jpg)
43
Unfortunately…
• Building the Boolean program is exponential in the number of predicates
• Infeasible for some benchmarks
– For example: Bakery runs for more than 10 hours

Number of predicates |V| per memory model, for k = 2:

| Algorithm | V_SC | V_PSO | V_TSO |
|---|---|---|---|
| Dekker | 7 | 28 | 26 |
| Szymanski | 20 | 47 | 51 |
| Bakery | 15 | 38 | 36 |
| Ticket | 11 | 56 | 48 |
![Page 44: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/44.jpg)
44
Core problem: abstract transformers
Literals: qi = pi or qi = ¬pi, pi ∊ VM
Cubes(VM) = {qi1 ∧ … ∧ qij, j ≤ |VM|}
|Cubes(VM)| = 3^|VM|

for st ∈ Statements
  for pi ∈ V
    f = wp(pi, st)
    for c ∈ Cubes(VM)
      if c ⇒ f   // SMT call
        add c to the transformer
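The count 3^|VM| comes from the three choices per predicate: positive, negated, or absent. The enumeration below is a minimal sketch with hypothetical names; the empty cube is included so that the count is exactly 3^n.

```python
from itertools import product


def cubes(predicates):
    """Yield every cube: each predicate is positive, negated, or absent."""
    for choice in product((None, True, False), repeat=len(predicates)):
        yield tuple(
            p if polarity else f"¬{p}"
            for p, polarity in zip(predicates, choice)
            if polarity is not None
        )


def count_cubes(n):
    """Number of cubes over n predicates."""
    return 3 ** n


print(len(list(cubes(["p1", "p2"]))))  # 9
print(count_cubes(7))                  # 2187
print(count_cubes(28))
```

With the Dekker numbers from the table above (7 predicates under SC, 28 under PSO), the inner loop goes from about two thousand candidate cubes to more than 10^13, each needing an SMT call per statement and predicate.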
![Page 45: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/45.jpg)
45
Cube Extrapolation
• Reuse more information from the SC proof
• In addition to input predicates, extrapolate from the cubes used in the Boolean program
• Cube search space restricted only to extrapolated cubes
![Page 46: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/46.jpg)
46
Cube Extrapolation Example
Cube in the SC Boolean Program B Potential Cubes for the RMM Boolean Program
(X 0 X < Y) (X1 0 X1 < Y)…(Xk 0 Xk < Y)(X 0 X < Y1) …(X 0 X < Yk)
![Page 47: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/47.jpg)
47
Abstract transformers with extrapolated Cubes
Literals: qi = pi or qi = ¬pi, pi ∊ VM
Cubes(VM) = {qi1 ∧ … ∧ qij, j ≤ |VM|}
ExtCubes(B,VM) = CubeExtrapolation(B)
|ExtCubes(B,VM)| << |Cubes(VM)|

for st ∈ Statements
  for pi ∈ V
    f = wp(pi, st)
    for c ∈ ExtCubes(B,VM)
      if c ⇒ f   // SMT call
        add c to the transformer
![Page 48: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/48.jpg)
48
Complete Approach
[Diagram: Program P and Memory Model M go through Reduction to Program PM; Predicates V go through Extrapolation to Predicates VM; Cube Extraction takes the cubes from the SC Boolean Program BSC; Predicate Abstraction builds Boolean Program B; the Model Checker reports Verified or Counterexample]
![Page 49: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/49.jpg)
49
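Results: Predicate Extrapolation
(columns "# input preds" to "cube size": Build Boolean Program; last three columns: Model Check)

| algorithm | memory model | # input preds | # SMT calls (K) | time (sec) | # cubes used | cube size | # states (K) | memory (MB) | time (sec) |
|---|---|---|---|---|---|---|---|---|---|
| Dekker | SC | 7 | 0.7 | 0.1 | 0 | 1 | 14 | 6 | 1 |
| Dekker | PSO | 20 | 26 | 6 | 0 | 1 | 80 | 31 | 5 |
| Dekker | TSO | 18 | 22 | 5 | 0 | 1 | 45 | 20 | 3 |
| Peterson | SC | 7 | 0.6 | 0.1 | 2 | 2 | 7 | 3 | 1 |
| Peterson | PSO | 20 | 15 | 3 | 2 | 2 | 31 | 13 | 3 |
| Peterson | TSO | 18 | 13 | 3 | 2 | 2 | 25 | 11 | 2 |
| ABP | SC | 8 | 2 | 0.5 | 5 | 2 | 0.6 | 1 | 0.6 |
| ABP | PSO | 15 | 20 | 4 | 5 | 2 | 2 | 3 | 1 |
| ABP | TSO | 17 | 23 | 5 | 5 | 2 | 2 | 3 | 1 |
| Szymanski | SC | 20 | 16 | 3.3 | 1 | 2 | 12 | 6 | 2 |
| Szymanski | PSO | 35 | 152 | 33 | 1 | 2 | 61 | 30 | 4 |
| Szymanski | TSO | 37 | 165 | 35 | 1 | 2 | 61 | 31 | 5 |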
![Page 50: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/50.jpg)
50
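Results: Cube Extrapolation
(columns "# input preds" to "cube size": Build Boolean Program; last three columns: Model Check. Methods: Trad = traditional, PE = predicate extrapolation, CE = cube extrapolation; T/O = timeout)

| algorithm | memory model | method | # input preds | # input cubes | # SMT calls (K) | time (sec) | # cubes used | cube size | # states (K) | memory (MB) | time (sec) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Queue | SC | Trad | 7 | - | 20 | 5 | 50 | 4 | 1 | 2 | 1 |
| Queue | PSO | PE | 15 | - | 5,747 | 1,475 | 412 | 4 | 1 | 4 | 1 |
| Queue | PSO | CE | 15 | 99 | 98 | 17 | 99 | 4 | 11 | 6 | 2 |
| Queue | TSO | PE | 16 | - | 11,133 | 2,778 | 412 | 4 | 12 | 4 | 1 |
| Queue | TSO | CE | 16 | 99 | 163 | 31 | 99 | 4 | 12 | 7 | 2 |
| Bakery | SC | Trad | 15 | - | 1,552 | 355 | 161 | 4 | 20 | 8 | 2 |
| Bakery | PSO | PE | 38 | - | - | T/O | - | 4 | - | - | - |
| Bakery | PSO | CE | 38 | 422 | 9,018 | 1,773 | 381 | 4 | 979 | 375 | 104 |
| Bakery | TSO | PE | 36 | - | - | T/O | - | 4 | - | - | - |
| Bakery | TSO | CE | 36 | 422 | 7,048 | 1,386 | 383 | 4 | 730 | 285 | 121 |
| Ticket | SC | Trad | 11 | - | 218 | 51 | 134 | 4 | 2 | 2 | 1 |
| Ticket | PSO | PE | 56 | - | - | T/O | - | 4 | - | - | - |
| Ticket | PSO | CE | 56 | 622 | 15,644 | 2,163 | 380 | 4 | 193 | 123 | 40 |
| Ticket | TSO | PE | 48 | - | - | T/O | - | 4 | - | - | - |
| Ticket | TSO | CE | 48 | 622 | 6,941 | 1,518 | 582 | 4 | 71 | 67 | 545 |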
![Page 51: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/51.jpg)
51
Numerical analysis under SC
Thread 0:
0:
1: flag0 := true
2: while flag1 = true {
3:   if turn ≠ 0 {
4:     flag0 := false
5:     while turn ≠ 0 { }
6:     flag0 := true
7:   }
8: }
9: // critical section
A: turn := 1
B: flag0 := false

Thread 1:
0:
1: flag1 := true
2: while flag0 = true {
3:   if turn ≠ 1 {
4:     flag1 := false
5:     while turn ≠ 1 { }
6:     flag1 := true
7:   }
8: }
9: // critical section
A: turn := 0
B: flag1 := false

initial: flag0 = false, flag1 = false, turn = 0

Invariants at pairs of labels:
(0,0) {turn=0; flag1=0; flag0=0}
(9,9) { }
(2,2) {flag1-1=0; flag0-1=0; -turn+1>=0; turn>=0}
(2,9) {flag1-1=0; flag0-1=0;}

// a line number indicates the state at the end of that line (i.e., after executing it)
![Page 52: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/52.jpg)
52
Use same encoding for PM
(shown for k=2)

load t = X:
  if (Xcnt == 0) t = X
  if (Xcnt == 1) t = X1
  if (Xcnt == 2) t = X2

store X = t:
  if (Xcnt == 2) “overflow”
  Xcnt++
  if (Xcnt == 1) X1 = t
  if (Xcnt == 2) X2 = t
![Page 53: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/53.jpg)
Flush operation

while random do
  if flag0_cnt_0 > 0 then
    flag0 = flag0_1_0;
    if flag0_cnt_0 > 1 then
      flag0_1_0 = flag0_2_0;
    flag0_cnt_0 = flag0_cnt_0 - 1;
  yield;

// At this point we can't know whether there was a flush or not, due to the nondeterministic loop.
![Page 54: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/54.jpg)
flush is a problem for convex domains
● The nondeterministic flush captures two possible buffer states
1. value is flushed: buffer content shifted one slot
2. value is not flushed: buffer does not change
● To avoid losing precision, we have to track disjunctions in a convex numerical domain
[Figure: the flushed state (slots 3, 3; cnt_t1=1) joined with the non-flushed state (slots 1, 3; cnt_t1=2) gives first slot [1,3], second slot 3 and cnt_t1=[1,2]]
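The loss of precision at the join can be reproduced with a plain interval domain. This is a minimal sketch: the variable names `slot1` and `slot2` and the concrete values are hypothetical stand-ins for the two buffer states in the figure.

```python
def join(a, b):
    """Interval join: smallest box containing both abstract states."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1])) for v in a}


# Each variable maps to an interval (low, high).
flushed = {"cnt": (1, 1), "slot1": (3, 3), "slot2": (3, 3)}
not_flushed = {"cnt": (2, 2), "slot1": (1, 1), "slot2": (3, 3)}

joined = join(flushed, not_flushed)
print(joined)  # {'cnt': (1, 2), 'slot1': (1, 3), 'slot2': (3, 3)}
```

The joined box also admits states such as cnt = 1 with slot1 = 1, which neither branch produces; the relation between the counter and the slot contents is lost.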
![Page 55: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/55.jpg)
55
Refine the abstraction
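• Leverage boolean-numerical domains
– Add boolean flags
– Similar to trace partitioning domain
– Supported by our SC verifier, ConcurInterproc
[Figure: at the join the two states are kept apart by a boolean flag b: the flushed state (slots 3, 3; cnt_t1=1) and the non-flushed state (slots 1, 3; cnt_t1=2), one under b and the other under ¬b]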
![Page 56: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/56.jpg)
Refined flush operation
b_f1_flag0_0_t0 = false;
b_f1_flag0_1_t0 = false;
yield;
while random do
  if flag0_cnt_0 > 0 then
    flag0 = flag0_1_0;
    if flag0_cnt_0 > 1 then
      b_f1_flag0_1_t0 = true;
      flag0_1_0 = flag0_2_0;
    else
      b_f1_flag0_0_t0 = true;
    flag0_cnt_0 = flag0_cnt_0 - 1;
  yield;
![Page 57: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/57.jpg)
57
Challenge: state explosion
• Using refined flush operations everywhere is not feasible – state explosion
• We would like to find a minimal refinement that enables verification with a minimal fence placement
• Search space exponential in number of fence placements and in number of refinement placements
![Page 58: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/58.jpg)
58
Idea 3: Refinement propagation
propagation of: program correctness + abstraction refinements
[Figure: programs labeled with pairs f1,r1; f2,r2; f3,r3. Legend: program has been explored; program to be explored; program need not be explored; a successful abstraction refinement used to verify a program; an attempt to verify with a combined abstraction refinement]
![Page 59: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/59.jpg)
59
Two dimensional search
• Start from full fence placement
• A verification attempt produces new options to explore
– If verified: smaller placements should be explored
  • Either fewer fences, or coarser abstraction
– If failed: larger placements should be explored
  • Either additional fences, or finer abstraction
– If “unknown”
  • Try both directions
• We keep a worklist from which we choose the next placement to explore
– Do not try a subset of a failed placement or a superset of a verified one
– A small verified placement or a large failed placement reduces the search space substantially, so we guide the search
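The worklist with its two pruning rules can be sketched as follows. This is an illustration, not the SAS’14 algorithm: a placement is modeled as a frozenset of enabled fence and refinement locations, `verify` is a hypothetical oracle assumed to be monotone (adding locations never breaks a proof), and the "unknown" outcome is left out.

```python
def should_skip(placement, verified, failed):
    """Prune placements whose outcome is already implied."""
    # A superset of a verified placement cannot be a smaller solution.
    if any(placement >= v for v in verified):
        return True
    # A subset of a failed placement fails as well.
    return any(placement <= f for f in failed)


def search(full, verify):
    """Start from the full placement; return the minimal verified ones."""
    full = frozenset(full)
    verified, failed, seen = [], [], set()
    worklist = [full]
    while worklist:
        p = worklist.pop()
        if p in seen or should_skip(p, verified, failed):
            continue
        seen.add(p)
        if verify(p):
            verified.append(p)
            worklist += [p - {x} for x in p]          # try smaller placements
        else:
            failed.append(p)
            worklist += [p | {x} for x in full - p]   # try larger placements
    return [v for v in verified if not any(w < v for w in verified)]


# Toy oracle: the program verifies exactly when fence "f2" is present.
print(search({"f1", "f2", "f3"}, lambda p: "f2" in p))  # [frozenset({'f2'})]
```

Each verified or failed placement removes a whole cone of the lattice from consideration, which is why an early small success or large failure pays off.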
![Page 60: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/60.jpg)
60
Benchmark
● 15 concurrent algorithms
● 8 infinite state
● Safety specifications: either mutual exclusion or reachability invariants involving labels of different threads
![Page 61: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/61.jpg)
61
Example: PC1
• 9 possible fences
• 27 possible locations for flush refinements
• BFS
– explores various boolean placements for the fully fenced placement for 3:30 hours
• DFS
– Verifies 5 fence placements in under 5 mins
– State explosion leads to exploring failing placements for the rest of the time
• Propagation
– finds that a single fence is needed
![Page 62: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/62.jpg)
Results
[Charts: number of fences over time for ABP TSO and Loop2_TLM TSO, and #r locations over time for WSQ-Chase TSO and Queue TSO; each chart compares prop, bfs and dfs]
![Page 63: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/63.jpg)
63
Summary
• Abstraction-guided synthesis
– Compute over-approximation of all possible program executions
– Add minimal synchronization to avoid (over-approximation of) bad schedules
• Proof extrapolation
– Use information from the SC proof to help the proof under RMM
– Extrapolate predicates and cubes
• Refinement propagation
– Implied correctness/incorrectness in the space of fence/refinement placements
– Combining information from different fence/refinement placements
![Page 64: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/64.jpg)
64
Back to Boids
• With synchronization barriers
• Numerical abstractions for tracking array indices
• Establishing conflict-freedom of array accesses that may happen in parallel
• Computing forces can be done in parallel
• Different Boids write to disjoint parts of the array
![Page 65: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/65.jpg)
65
Boids Simulation
…
Main task:
Generate initial locations
Spawn N Boid Tasks
While (true) {
  Wait on display-barrier
  Read locations of all boids
  Render boids
}

Boid task:
While (true) {
  Read locations of other boids
  Wait on message-barrier
  Compute new location
  Update my location
  Wait on display-barrier
}
Shared Memory (Global State)
Locations of N Boids
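The barrier structure of the two tasks can be tried out with Python's `threading.Barrier`. This is a toy sketch, not the simulation from the talk: the update rule (sum of the other boids' locations), the bounded number of steps and the function name `simulate` are assumptions made to keep the example small and deterministic.

```python
import threading


def simulate(initial, steps):
    locations = list(initial)                   # shared array, one slot per boid
    n = len(locations)
    message_barrier = threading.Barrier(n)      # boid tasks only
    display_barrier = threading.Barrier(n + 1)  # boid tasks plus main task

    def boid(pid):
        for _ in range(steps):
            # Read locations of other boids.
            others = [locations[j] for j in range(n) if j != pid]
            message_barrier.wait()              # every boid has finished reading
            locations[pid] = sum(others)        # update only my own slot
            display_barrier.wait()              # every boid has finished writing

    threads = [threading.Thread(target=boid, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for _ in range(steps):
        display_barrier.wait()
        _frame = list(locations)                # read all locations ("render")
    for t in threads:
        t.join()
    return locations


print(simulate([1, 2, 3], steps=2))  # [7, 8, 9]
```

The message barrier separates the read phase from the write phase among boids, and each boid writes only its own slot, which matches the observation that different Boids write to disjoint parts of the array.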
![Page 66: Abstractions for Relaxed Memory Models](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813c76550346895da60c40/html5/thumbnails/66.jpg)
66
http://practicalsynthesis.org/fender/