Heap Decomposition for Concurrent Shape Analysis
R. Manevich, T. Lev-Ami, M. Sagiv (Tel Aviv University); G. Ramalingam (MSR India); J. Berdine (MSR Cambridge). Dagstuhl 08061, February 7, 2008.
![Page 1: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/1.jpg)
Heap Decomposition for Concurrent Shape Analysis
R. Manevich, T. Lev-Ami, M. Sagiv (Tel Aviv University)
G. Ramalingam (MSR India)
J. Berdine (MSR Cambridge)
Dagstuhl 08061, February 7, 2008
![Page 2: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/2.jpg)
Thread-modular analysis for coarse-grained concurrency
- E.g., [Qadeer & Flanagan, SPIN’03], [Gotsman et al., PLDI’07], …
- With each lock lk, associate
  - a subheap h(lk), so that the heap is partitioned: H = h(lk1) * … * h(lkn)
  - a local invariant I(lk), inferred or specified
- When thread t
  - acquires lk, it assumes I(lk)
  - releases lk, it ensures I(lk)
- Can analyze each thread “separately”
- Avoids explicitly enumerating all thread interleavings
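The assume/ensure discipline above can be mimicked at run time. The following is a minimal sketch, not the static analysis of the cited papers: the type `LockedList`, the spinlock built on `atomic_flag`, and the choice of invariant (a cached length that matches the list) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct cell { int val; struct cell *next; } Cell;

/* Hypothetical lock lk bundled with the subheap h(lk) it protects. */
typedef struct {
    atomic_flag held;   /* the lock itself (a simple spinlock) */
    Cell *head;         /* h(lk): a singly linked list */
    size_t len;         /* cached length, part of the protected state */
} LockedList;

/* I(lk): the cached length matches the list. Bounding the walk by len
   also rejects cyclic lists. */
static bool invariant(const LockedList *l) {
    size_t n = 0;
    for (const Cell *c = l->head; c != NULL; c = c->next)
        if (++n > l->len) return false;
    return n == l->len;
}

static void acquire(LockedList *l) {
    while (atomic_flag_test_and_set(&l->held)) { /* spin */ }
    assert(invariant(l));   /* on acquire, the thread may assume I(lk) */
}

static void release(LockedList *l) {
    assert(invariant(l));   /* on release, the thread must ensure I(lk) */
    atomic_flag_clear(&l->held);
}

/* Inside the critical section the invariant may be broken temporarily. */
static void locked_push(LockedList *l, Cell *c) {
    acquire(l);
    c->next = l->head;
    l->head = c;            /* I(lk) does not hold here: len is stale */
    l->len++;               /* I(lk) holds again */
    release(l);
}
```

Because every thread only relies on I(lk) at acquire and re-establishes it at release, each thread's code can be checked without looking at what the other threads do inside their critical sections.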
![Page 3: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/3.jpg)
Thread-modular analysis for fine-grained concurrency?
[Figure: a shared structure accessed through several CAS (Compare And Swap) operations]
- No locks means more interference between threads
- No nice heap partitioning
- Still, the idea of reasoning about threads separately is appealing
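For readers unfamiliar with CAS, here is a minimal sketch of its semantics using C11 atomics. The helper names `cas_int` and `fetch_and_increment` are hypothetical; the slides only name the primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically replaces *addr with new_val only if
   *addr still equals expected, and reports whether the swap happened. */
static bool cas_int(atomic_int *addr, int expected, int new_val) {
    /* atomic_compare_exchange_strong writes the observed value into its
       second argument on failure; expected is a local copy, so the
       caller's variable is untouched. */
    return atomic_compare_exchange_strong(addr, &expected, new_val);
}

/* Typical lock-free retry loop: read, compute, and retry the CAS until no
   other thread changed the location in between. */
static int fetch_and_increment(atomic_int *counter) {
    int seen;
    do {
        seen = atomic_load(counter);
    } while (!cas_int(counter, seen, seen + 1));
    return seen;
}
```

A failed CAS is the only signal a thread gets that another thread interfered, which is why interference has to be modeled explicitly by the analysis.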
![Page 4: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/4.jpg)
Overview
- State space is too large for two reasons
  - Unbounded number of objects, hence infinite
    - Apply finitary abstractions to data structures (e.g., abstract away the length of a list)
  - Exponential in the number of threads
- Observation: threads operate on part of the state
  - Correlations between different substates are often irrelevant for proving safety properties
- Our approach: develop an abstraction for substates
  - Abstract away correlations between substates of different threads
  - Reduce the exponential state space
![Page 5: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/5.jpg)
Non-blocking stack [Treiber 1986]

    #define EMPTY -1
    typedef int data_type;
    typedef struct node_t { data_type d; struct node_t *n; } Node;
    typedef struct stack_t { struct node_t *Top; } Stack;

    [1]  void push(Stack *S, data_type v) {
    [2]    Node *x = alloc(sizeof(Node));
    [3]    x->d = v;
    [4]    do {
    [5]      Node *t = S->Top;            // snapshot of the current top
    [6]      x->n = t;
    [7]    } while (!CAS(&S->Top,t,x));   // retry if Top changed meanwhile
    [8]  }

    [9]  data_type pop(Stack *S){
    [10]   do {
    [11]     Node *t = S->Top;
    [12]     if (t == NULL)
    [13]       return EMPTY;
    [14]     Node *s = t->n;
    [15]     data_type r = t->d;          // value of the node being removed
    [16]   } while (!CAS(&S->Top,t,s));
    [17]   return r;
    [18] }
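A compilable C11 sketch of this stack follows. It assumes that the slide's `CAS` corresponds to `atomic_compare_exchange_strong` and that `alloc` is `malloc`; neither mapping is stated on the slides. Local variables are hoisted out of the `do` block because, in C, declarations made inside the loop body are not visible in the loop condition.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define EMPTY -1
typedef int data_type;
typedef struct node_t { data_type d; struct node_t *n; } Node;
typedef struct stack_t { _Atomic(Node *) Top; } Stack;

/* Assumed meaning of the slide's CAS: succeed only if *addr still holds
   the expected pointer. */
static bool CAS(_Atomic(Node *) *addr, Node *expected, Node *desired) {
    return atomic_compare_exchange_strong(addr, &expected, desired);
}

void push(Stack *S, data_type v) {
    Node *x = malloc(sizeof(Node));
    Node *t;
    if (x == NULL) abort();            /* allocation failure: give up */
    x->d = v;
    do {
        t = atomic_load(&S->Top);      /* snapshot of the current top */
        x->n = t;
    } while (!CAS(&S->Top, t, x));     /* retry if Top changed meanwhile */
}

data_type pop(Stack *S) {
    Node *t, *s;
    data_type r;
    do {
        t = atomic_load(&S->Top);
        if (t == NULL)
            return EMPTY;
        s = t->n;
        r = t->d;                      /* value of the node being removed */
    } while (!CAS(&S->Top, t, s));
    /* The removed node is not freed: freeing it naively would expose the
       ABA problem, and safe reclamation is outside this sketch. */
    return r;
}
```

Used from a single thread, the stack behaves like an ordinary LIFO; the retry loops only matter when several producers and consumers run concurrently.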
![Page 6: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/6.jpg)
Example: successful push
[Figure: the push code next to a heap diagram showing Top, the local variables t and x, and the n (next) links of the list]
![Page 7: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/7.jpg)
Example: successful push
[Figure: the push code next to the heap diagram (Top, t, x, n links) for the case where the CAS succeeds]
![Page 8: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/8.jpg)
Example: unsuccessful push
[Figure: the push code next to the heap diagram (Top, t, x, n links) for the case where the CAS fails]
![Page 9: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/9.jpg)
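Concrete states with storable threads
[Figure: a concrete state of the stack with four thread objects: prod1 (pc=7), prod2 (pc=6), cons1 (pc=14), cons2 (pc=16)]
Legend: a thread object carries a name and a program location; edges labeled x, t, s are local variables; edges labeled n are next fields of the list.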
![Page 10: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/10.jpg)
Full state S1
[Figure: the concrete state from the previous slide, labeled S1: prod1 (pc=7), prod2 (pc=6), cons1 (pc=14), cons2 (pc=16)]
![Page 11: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/11.jpg)
Decomposition(S1)
[Figure: S1 split into four substates, one per thread, each showing Top, the list, and that thread's local variables: M1 (prod1, pc=7), M2 (prod2, pc=6), M3 (cons1, pc=14), M4 (cons2, pc=16)]
- Decomposition(S1) = M1 ∧ M2 ∧ M3 ∧ M4
- Note that S1 ⇒ Decomposition(S1)
  - A substate represents all full states that contain it
- Decomposition is state-sensitive (depends on values of pointers and heap connectivity)
![Page 12: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/12.jpg)
Full states S1 ∨ S2
[Figure: two full states. S1: prod1 (pc=7), prod2 (pc=6), cons1 (pc=14), cons2 (pc=16). S2: prod1 (pc=6), prod2 (pc=7), cons1 (pc=16), cons2 (pc=14)]
![Page 13: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/13.jpg)
Decomposition(S1 ∨ S2)
[Figure: the substates M1 to M4 of S1 (prod1 pc=7, prod2 pc=6, cons1 pc=14, cons2 pc=16) and K1 to K4 of S2 (prod1 pc=6, prod2 pc=7, cons1 pc=16, cons2 pc=14)]
- Decomposition(S1 ∨ S2) = (M1 ∨ K1) ∧ (M2 ∨ K2) ∧ (M3 ∨ K3) ∧ (M4 ∨ K4)
- (S1 ∨ S2) ⇒ Decomposition(S1 ∨ S2)
  - Cartesian abstraction ignores correlations between substates
- State space exponentially more compact
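The effect of the Cartesian abstraction can be sketched on a toy model in which a substate is reduced to a thread's program counter and the heap is ignored. This is an assumption made to keep the example small; the names `FullState`, `Decomposed`, and the helper functions are hypothetical and are not taken from HeDec.

```c
#include <stdbool.h>
#include <stddef.h>

#define NTHREADS 4
#define MAX_SUB 8            /* toy capacity of each per-thread set */

/* A full state, reduced here to one program counter per thread. */
typedef struct { int pc[NTHREADS]; } FullState;

/* Decomposed state: one set of substates per thread, with no record of
   which substates occurred together. */
typedef struct {
    int sub[NTHREADS][MAX_SUB];
    size_t count[NTHREADS];
} Decomposed;

static bool has_sub(const Decomposed *d, size_t t, int pc) {
    for (size_t i = 0; i < d->count[t]; i++)
        if (d->sub[t][i] == pc) return true;
    return false;
}

/* Adds the substates of one full state; returns false if a set is full. */
static bool add_state(Decomposed *d, const FullState *s) {
    for (size_t t = 0; t < NTHREADS; t++) {
        if (has_sub(d, t, s->pc[t])) continue;
        if (d->count[t] == MAX_SUB) return false;
        d->sub[t][d->count[t]++] = s->pc[t];
    }
    return true;
}

/* A full state is represented if each of its substates is present. */
static bool represents(const Decomposed *d, const FullState *s) {
    for (size_t t = 0; t < NTHREADS; t++)
        if (!has_sub(d, t, s->pc[t])) return false;
    return true;
}

/* Number of substates actually stored (sum over threads). */
static size_t stored_substates(const Decomposed *d) {
    size_t n = 0;
    for (size_t t = 0; t < NTHREADS; t++) n += d->count[t];
    return n;
}

/* Number of full states represented (product over threads). */
static size_t represented_states(const Decomposed *d) {
    size_t n = 1;
    for (size_t t = 0; t < NTHREADS; t++) n *= d->count[t];
    return n;
}
```

With S1 = (7, 6, 14, 16) and S2 = (6, 7, 16, 14) in the order prod1, prod2, cons1, cons2, the decomposition stores 8 substates but represents 16 full states, including mixed ones such as (7, 7, 14, 14) that neither S1 nor S2 contains. That loss of correlation is the price of the compact representation.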
![Page 14: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/14.jpg)
Abstraction properties
- Substates in each subdomain correspond to a single thread
  - Abstract away correlations between threads
  - Exponential reduction of state space
- Substates preserve information on part of the heap (relevant to one thread)
- Substates may overlap
  - Useful for reasoning about programs with fine-grained concurrency
  - Better approximates interference between threads
![Page 15: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/15.jpg)
Main results
- New parametric abstraction for heaps
  - Heap decomposition + Cartesian abstraction
  - Parametric in underlying abstraction + decomposition
- Parametric sound transformers
  - Allows balancing efficiency and precision
- Implementation in HeDec
  - Heap Decomposition + Canonical Abstraction
- Used to prove interesting properties of heap-manipulating programs with fine-grained concurrency
  - Linearizability
- Analysis scales linearly in the number of threads
![Page 16: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/16.jpg)
Sound transformers
[Figure: an abstract transformer # maps the input subdomains j1, …, j4 (sets of substates X) to the output subdomains j1’, …, j4’ (sets of substates Y)]
![Page 17: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/17.jpg)
Pointwise transformers
[Figure: each input subdomain j1, …, j4 is mapped by its own transformer # to the corresponding output subdomain j1’, …, j4’]
- Efficient
- Often too imprecise
![Page 18: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/18.jpg)
Imprecision example
[Figure: the push code next to substate M2 (prod2, pc=6)]
- # applied to M2: schedules prod1 and executes x->n = t
- But where do x and t of prod1 point to?
![Page 19: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/19.jpg)
Imprecision example
[Figure: the push code next to a full state with prod2 at pc=7, prod1 at pc=6, cons1 at pc=14, and cons2 at pc=16, and the substate of prod2 (pc=7) obtained by applying #]
- False alarm: possible cyclic list
![Page 20: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/20.jpg)
Full composition transformers
[Figure: the input subdomains j1, …, j4 are all composed, # is applied to the composition, and the result is split into the output subdomains j1’, …, j4’]
- Precise
- Exponential space blow-up
![Page 21: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/21.jpg)
Partial composition
[Figure: subdomain j1 is composed separately with each of the subdomains j2, j3, and j4]
![Page 22: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/22.jpg)
Partial composition
[Figure: # is applied to each pairwise composition (j1 with j2, j1 with j3, j1 with j4) to produce the output subdomains j1’, …, j4’]
- Efficient and precise
![Page 23: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/23.jpg)
Partial composition example
[Figure: composing subdomains j1 and j2: substates M1 (prod1, pc=7) and K1 (prod1, pc=6) with substates M2 (prod2, pc=6) and K2 (prod2, pc=7)]
![Page 24: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/24.jpg)
Partial composition example
[Figure: compositions of subdomains j1 and j2, including K2 with K1 and K2 with M1 (prod2 at pc=7 together with prod1 at pc=6 or pc=7)]
- False alarm avoided
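The difference between a pointwise transformer and one based on partial composition can be sketched on a toy model. Here a substate is a thread's view of the shared Top plus its local x, both reduced to node ids, and the statement being executed is simplified to "Top = x" by the other thread. These simplifications and all names below are assumptions for illustration, not the transformers implemented in HeDec.

```c
#include <stdbool.h>
#include <stddef.h>

#define NODES 3        /* toy universe of node ids: 0, 1, 2 */
#define MAX_SUB 16

/* Substate of one thread: its view of the shared Top and its local x. */
typedef struct { int top; int x; } Sub;
typedef struct { Sub s[MAX_SUB]; size_t n; } SubSet;

static void add(SubSet *set, Sub v) {
    for (size_t i = 0; i < set->n; i++)
        if (set->s[i].top == v.top && set->s[i].x == v.x) return;
    if (set->n < MAX_SUB) set->s[set->n++] = v;
}

/* Pointwise: update the observer's substates for "Top = x" executed by
   the other thread without consulting that thread's substates. Its x is
   unknown, so Top may become any node. */
static SubSet pointwise(const SubSet *observer) {
    SubSet out = { .n = 0 };
    for (size_t i = 0; i < observer->n; i++)
        for (int node = 0; node < NODES; node++)
            add(&out, (Sub){ .top = node, .x = observer->s[i].x });
    return out;
}

/* Partial composition: pair each actor substate with each observer
   substate that agrees on the overlap (Top), apply "Top = x" to the pair,
   and project the result back onto the observer. */
static SubSet partial(const SubSet *actor, const SubSet *observer) {
    SubSet out = { .n = 0 };
    for (size_t i = 0; i < actor->n; i++)
        for (size_t j = 0; j < observer->n; j++)
            if (actor->s[i].top == observer->s[j].top)
                add(&out, (Sub){ .top = actor->s[i].x,
                                 .x = observer->s[j].x });
    return out;
}

/* The alarm in this toy: Top aliases the observer's own new node. */
static bool may_alias(const SubSet *set) {
    for (size_t i = 0; i < set->n; i++)
        if (set->s[i].top == set->s[i].x) return true;
    return false;
}
```

With prod1 holding (top=0, x=1) and prod2 holding (top=0, x=2), the pointwise update of prod2 yields three substates, one of which has Top equal to prod2's own x and so raises a spurious alarm. The partial composition yields the single substate (top=1, x=2) and no alarm, at the cost of enumerating pairs rather than single substates.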
![Page 25: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/25.jpg)
Experimental results
- List-based fine-grained algorithms
  - Non-blocking stack [Treiber 1986]
  - Non-blocking queue [Doherty and Groves FORTE’04]
  - Two-lock queue [Michael and Scott PODC’96]
    - Benign data races
- Verified absence of null dereferences and memory leaks
- Verified linearizability
- Analysis built on top of the existing full-heap analysis of [Amit et al. CAV’07]
- Scaled analysis from 2/3 threads to 20 threads
- Extended to unbounded threads (different work)
![Page 26: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/26.jpg)
Experimental results
- Exponential time/space reduction
- Non-blocking stack + linearizability
[Figure: two plots comparing Decomp and Full as the number of threads grows from 0 to 20: number of states (axis up to 250000) and time in seconds (axis up to 4000)]
![Page 27: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/27.jpg)
Related work
- Disjoint regions decomposition [TACAS’07]
  - Fixed decomposition scheme
  - Most precise transformer is FNP-complete
- Partial join [Manevich et al. SAS’04] [Yang et al.]
  - Orthogonal to decomposition
  - In HeDec we combine decomposition + partial join
- Handling concurrency for an unbounded number of threads
  - Thread-modular analysis [Gotsman et al. PLDI’07]
  - Rely-guarantee [Vafeiadis et al. CAV’07]
  - Thread quantification (submitted)
![Page 28: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/28.jpg)
More related work
- Local transformers
  - Works by Reynolds, O’Hearn, Berdine, Yang, Gotsman, Calcagno
- Heap analysis by separation [Yahav & Ramalingam PLDI’04] [Hackett & Rugina POPL’05]
  - Decompose the verification problem itself and conservatively approximate contexts
- Heap decomposition for interprocedural analysis [Rinetzky et al. POPL’05] [Rinetzky et al. SAS’05] [Gotsman et al. SAS’06] [Gotsman et al. PLDI’07]
  - Decompose/compose at procedure boundaries
- Predicate/variable clustering [Clarke et al. CAV’00]
  - Statically determined decomposition
![Page 29: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/29.jpg)
Conclusion
- Parametric framework for shape analysis
  - Scaling analyses of programs with fine-grained concurrency
  - Generalizes thread-modular analysis
  - Key idea: state decomposition
  - Also useful for sequential programs
- Used to prove intricate properties like linearizability
- HeDec tool
  - http://www.cs.tau.ac.il/~tvla#HEDEC
![Page 30: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/30.jpg)
Future/ongoing work
- Extended analysis for an unbounded number of threads via thread quantification
  - Orthogonal technique
  - Both techniques compose very well
- Can we automatically infer good decompositions?
- Can we automatically tune transformers?
- Can we reuse these ideas in non-shape analyses?
![Page 31: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/31.jpg)
Invited questions
- How do you choose a decomposition?
- How do you choose transformers?
- How does it compare to separation logic?
- What is a general principle and what is specific to shape analysis?
- Caveats / limitations?
![Page 32: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/32.jpg)
How do you choose a decomposition?
- In general this is an open problem
  - Perhaps counterexample refinement can help
- Depends on the property you want to prove
- Aim at causes of combinatorial explosion
  - Threads
  - Iterators
- For linearizability we used
  - For each thread t: thread node, objects referenced by local variables, objects referenced by global variables
  - Objects referenced by global variables and objects correlated with seq. execution
  - Locks component: for each lock, the thread that acquires it
![Page 33: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/33.jpg)
How do you choose transformers?
- In general a challenging problem
  - Have to balance efficiency and precision
- Have some heuristics
  - Core subdomains
![Page 34: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/34.jpg)
How does it compare to separation logic?
- Relevant separating conjunction *r
  - Like * but without the disjointness requirement
- Do you have an analog of the frame rule?
  - For disjoint regions decomposition [TACAS’07]
  - In general no, but instead we can use transformers of different levels of precision
    - #(I1 I2) = #precise(I1) #less-precise(I2), where #less-precise is cheap to compute
  - Perhaps we can find conditions for which #(I1 I2) = #precise(I1) I2
    - Relativized formulae
![Page 35: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/35.jpg)
What is a general principle and what is specific to shape analysis?
- Decomposing abstract domains is general
  - Substate abstraction + Cartesian product
- Parametric transformers for Cartesian abstractions are general
- Chopping down heaps by heterogeneous abstractions is shape-analysis specific
![Page 36: Heap Decomposition for Concurrent Shape Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062518/5681462f550346895db33cfb/html5/thumbnails/36.jpg)
Caveats / limitations?
- Decomposition + transformers are defined by the user
  - Not specialized for program/property
- Too much overlap between substates can lead to more expensive analyses
- Too fine a decomposition requires lots of composition
  - Partial composition is a bottleneck
  - We have the theory for finer-grained compositions + incremental transformers but no implementation
- Instantiated the framework for just one abstraction (Canonical Abstraction)
  - Can this be useful for separation logic-based analyzers?