The Reachability-Bound Problem
Sumit Gulwani (Microsoft Research, Redmond)
Joint work with Florian Zuleger (TU Darmstadt) and Sudeep Juvekar (UC-Berkeley)
The Reachability-Bound Problem
Let π be some control location inside a procedure.
• Safety: Is π never visited?
  – Violation is a finite trace.
• Liveness: Is π visited at most a finite number of times?
  – Violation is an infinite trace.
• Reachability-Bound: Symbolic bound on the maximum number of visits to π.
  – Quantitative question as opposed to Boolean.
  – Checking validity of a given bound is a safety property.
  – Checking precision is not even a trace property.
  – The problem is challenging!
Motivation 1: Resource Bound Analysis
• Programs consume a variety of resources.
  – CPU time, memory, network bandwidth, power
• It is important to bound use of such resources.
  – Economic incentives
  – Better user experience
  – Hard constraints on availability of resources
    • Real-time/embedded systems, low power/bandwidth devices
• This requires computing bounds on the number of visits to control locations that consume these resources.
  – Memory Allocated = Σ_π [Visits(π) × BytesAllocated(π)]
  – Asymptotic Time Complexity = Σ_H [Visits(H)], where H ranges over loop headers.
Motivation 2: Quantitative Analysis of Data
• Program execution affects certain quantitative properties of data.
  – Secrecy: information leakage.
  – Robustness: error/uncertainty propagation.
• Bounding such properties requires computing a bound on the number of visits to control locations that affect such properties of the data.
Example (.Net Library)
Inputs: int n, bool[] A
    i := 0;
π1: while (i < n) {
      j := i+1;
π2:   while (j < n) {
        if (A[j]) {
π3:       B[n] := new C();
        }
        j++;
      }
      i++;
    }
• Time Complexity = Visits(π1) + Visits(π2)
  – Visits(π1) ≤ n and Visits(π2) ≤ n²
• Memory Allocated = Visits(π3) × SizeOf(C)
  – Visits(π3) ≤ n²
Example (.Net Library)
Inputs: int n, bool[] A
    i := 0;
π1: while (i < n) {
      j := i+1;
π2:   while (j < n) {
        if (A[j]) {
π3:       B[n] := new C();
          j--;
          n--;
        }
        j++;
      }
      i++;
    }
• Time Complexity = Visits(π1) + Visits(π2)
  – Visits(π1) ≤ n and Visits(π2) ≤ n²
• Memory Allocated = Visits(π3) × SizeOf(C)
  – Visits(π3) ≤ n
  – A nested loop does not necessarily imply quadratic complexity.
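The two example programs can be compared by instrumenting them. The following is a minimal Python sketch, not the tool's analysis: it assumes a visit is counted each time the loop body (or the allocation) is entered, it only counts the allocation rather than performing it, and the names `count_visits` and `check_all` are illustrative.

```python
import itertools

def count_visits(n, A, shrink):
    """Run the example loop nest and count body entries at pi1, pi2, pi3.

    `shrink` selects the second variant, which runs j--; n-- after the
    allocation. The allocation B[n] := new C() is only counted here.
    """
    visits = {"pi1": 0, "pi2": 0, "pi3": 0}
    i = 0
    while i < n:
        visits["pi1"] += 1
        j = i + 1
        while j < n:
            visits["pi2"] += 1
            if A[j]:
                visits["pi3"] += 1
                if shrink:
                    j -= 1
                    n -= 1
            j += 1
        i += 1
    return visits

def check_all(max_n=6):
    """Check the stated bounds on every boolean array of length <= max_n."""
    for n in range(max_n + 1):
        for A in itertools.product([False, True], repeat=n):
            first = count_visits(n, A, shrink=False)
            second = count_visits(n, A, shrink=True)
            assert first["pi1"] <= n and first["pi2"] <= n * n
            assert first["pi3"] <= n * n
            assert second["pi3"] <= n
    return True
```

Exhaustive checking on small inputs only illustrates the bounds; it does not prove them for all n.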
Algorithm: A variety of fixed-point techniques
Examine the loop induced by the control-flow graph starting at π, and the next visit to it.
• Loop has one path.
  – Compute a ranking function using a constraint-based or proof-rule-based technique.
• Loop has multiple paths.
  – Compose ranking functions for paths using proof rules.
  – One proof rule each for Max, Sum, and Product composition.
• Loop has inner loops.
  – Summarize inner loops by precise disjunctive invariants using a forward iterative technique (abstract interpretation).
• Loop has other loops before it.
  – Perform backward symbolic execution (using proof rules to trace across loops) to express the bound in terms of inputs.
Algorithm: Loop has one path.
  – Compute a ranking function using a constraint-based or proof-rule-based technique.
Ranking Function: Arithmetic Loops
Inputs: uint n,m
   i := j := 0;
π: while (i<n ∧ j<m) {
     j++;
     i++;
   }
Visits(π) ≤ Min(n,m)
• There is one path between π and the next visit to it.
    Path 1: i<n ∧ j<m ∧ j'=j+1 ∧ i'=i+1 ∧ Same({n,m})
• n-i is a ranking function for Path 1 because
  – n-i > 0
  – n-i decreases in each iteration, i.e., (n'-i') < (n-i)
• Visits(π) ≤ value of n-i immediately before the loop = n
• Similarly, m-j is also a ranking function and Visits(π) ≤ m
Computing Ranking Functions: Proof Rule Technique
• Guess a ranking function e.
  – For each (syntactically appearing) inequality e1 ≥ e2 in P, guess e1-e2 to be a candidate.
• Check whether e is a ranking function by validating the following constraints using an SMT solver.
    P ⇒ e ≥ 0
    P ⇒ (e[X'/X] ≤ e-1)
• The proof-rule-based technique extends readily to cases other than integer arithmetic.
  – E.g., loops that iterate over bit-vectors or data structures
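The guess-and-check step can be illustrated without a solver. The sketch below is an assumption-laden stand-in: it replaces the SMT validity check with brute-force enumeration over a small grid of states, so it can refute a candidate but cannot prove one for all inputs. The helper name `is_ranking_function` and the candidate list are illustrative.

```python
from itertools import product

def is_ranking_function(e, guard, step, states):
    """Brute-force stand-in for the SMT check of
    P => e >= 0  and  P => e[X'/X] <= e - 1  over a finite set of states."""
    for s in states:
        if not guard(s):
            continue
        if e(s) < 0 or e(step(s)) > e(s) - 1:
            return False
    return True

# Path 1 of the arithmetic loop: i<n and j<m, with i'=i+1 and j'=j+1.
guard = lambda s: s["i"] < s["n"] and s["j"] < s["m"]
step = lambda s: {**s, "i": s["i"] + 1, "j": s["j"] + 1}

# Sampled states: every assignment of i, j, n, m in 0..4.
states = [dict(zip("ijnm", v)) for v in product(range(5), repeat=4)]

# Candidates of the form e1 - e2 read off the guard, plus one bad guess.
candidates = {
    "n-i": lambda s: s["n"] - s["i"],
    "m-j": lambda s: s["m"] - s["j"],
    "i-n": lambda s: s["i"] - s["n"],
}
results = {name: is_ranking_function(e, guard, step, states)
           for name, e in candidates.items()}
```

On this grid `n-i` and `m-j` pass both constraints while `i-n` is rejected, matching the ranking functions named for the arithmetic loop.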
Computing Ranking Functions: Constraint-based Technique
• The proof-rule-based technique is not complete. Consider the following example.
  – P: x≥0 ∧ y≥0 ∧ x'=y ∧ y'=x-1
  – Neither x nor y is a ranking function, but x+y is.
• There is a "complete" method to find linear ranking functions [Podelski, Rybalchenko, VMCAI '04].
  – Let the ranking function be of the form a1x + a2y + a3.
  – We want to find a1, a2, a3 such that for all x,y
    • P ⇒ (a1x+a2y+a3) ≥ 0 and
    • P ⇒ (a1x'+a2y'+a3) ≤ (a1x+a2y+a3) - 1
  – Farkas' lemma can be used to reduce the above system of quantified inequalities to a system of linear inequalities.
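To make the template idea concrete, here is a small Python sketch that searches for the coefficients a1, a2, a3 of the example P. It is not the Farkas-based method: as a labeled simplification it enumerates small non-negative integer coefficients and tests the two constraints on a finite grid of (x, y) values. The function name and the search ranges are assumptions for illustration.

```python
from itertools import product

def find_linear_ranking_function(coeffs=range(0, 3), grid=range(0, 6)):
    """Search small integer (a1, a2, a3) such that, on the sampled grid,
    P => a1*x + a2*y + a3 >= 0 and
    P => a1*x' + a2*y' + a3 <= a1*x + a2*y + a3 - 1,
    where P is x>=0, y>=0, x'=y, y'=x-1."""
    for a1, a2, a3 in product(coeffs, repeat=3):
        ok = True
        for x, y in product(grid, repeat=2):  # the grid enforces x>=0, y>=0
            xn, yn = y, x - 1                  # x' = y, y' = x - 1
            before = a1 * x + a2 * y + a3
            after = a1 * xn + a2 * yn + a3
            if before < 0 or after > before - 1:
                ok = False
                break
        if ok:
            return (a1, a2, a3)
    return None
```

The first triple accepted in enumeration order is (1, 1, 0), i.e. x+y, while templates using x or y alone are rejected on the grid.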
Ranking Function: Bitvector Loops (SQL)
Ones(b): # of 1 bits in bitvector b
RMB(b): position of right-most 1-bit

Input: bitvector b
π: while (b ≠ 0)
     b := b << 1;
Visits(π) ≤ RMB(b)

Input: bitvector b
π: while (BitScanForward(&id1,b)) {
     b := b | ((1 << id1)-1);
     if (BitScanForward(&id2,~x)) break;
     b := b & (~((1 << id2)-1));
   }
Visits(π) ≤ Min { Ones(b), RMB(b)/2 }

Input: bitvector b
π: while (b ≠ 0)
     b := b & (b-1);
Visits(π) ≤ Ones(b)
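The Ones(b) bound for the loop b := b & (b-1) is easy to observe directly. A minimal Python sketch, using non-negative Python integers as a stand-in for fixed-width bitvectors (an assumption of this example):

```python
def visits_clear_lowest_bit(b):
    """Count iterations of: while (b != 0) b := b & (b-1)."""
    visits = 0
    while b != 0:
        b &= b - 1      # clears the right-most 1-bit
        visits += 1
    return visits

def ones(b):
    """Number of 1 bits in b."""
    return bin(b).count("1")
```

Each iteration clears exactly one set bit, so the iteration count equals `ones(b)` for every non-negative input.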
Ranking Function: Data-structure Loops
Input: List L
π: while (L ≠ Null)
     L := L.Next;
Visits(π) ≤ Length(L, Next)

Input: ICollection C
π: foreach (Element e in C) …
• Requires analysis of the C.MoveNext() method.
• In the case of a virtual method, we define Visits(π) to be C.count.
Algorithm: Loop has multiple paths.
  – Compose ranking functions for paths using proof rules.
  – One proof rule each for Max, Sum, and Product composition.
Composition of Ranking Functions
Inputs: uint n,m
   i := j := 0;
π: while (i<n)
     if (j<m) j++;
     else i++;
Path 1: i<n ∧ j<m ∧ j'=j+1
Path 2: i<n ∧ j≥m ∧ i'=i+1
Visits(π) ≤ n + m

Inputs: uint n,m
   i := j := 0;
π: while (i<n)
     if (j<m) j++;
     else {i++; j:=0;}
Path 1: i<n ∧ j<m ∧ j'=j+1
Path 2: i<n ∧ j≥m ∧ i'=i+1 ∧ j'=0
Visits(π) ≤ n × (1+m)

Inputs: uint n,m
   i := j := 0;
π: while (j<m ∨ i<n) {
     j++;
     i++;
   }
Path 1: j<m ∧ j'=j+1 ∧ i'=i+1
Path 2: i<n ∧ j'=j+1 ∧ i'=i+1
Visits(π) ≤ Max(n,m)
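The three bounds above can be observed by running the loops. The following Python sketch transcribes the three programs and counts loop-body entries as visits to π (that counting convention and the function names are assumptions of this example):

```python
def loop_sum(n, m):
    """while (i<n) if (j<m) j++ else i++"""
    i = j = visits = 0
    while i < n:
        visits += 1
        if j < m:
            j += 1
        else:
            i += 1
    return visits

def loop_product(n, m):
    """while (i<n) if (j<m) j++ else {i++; j:=0}"""
    i = j = visits = 0
    while i < n:
        visits += 1
        if j < m:
            j += 1
        else:
            i += 1
            j = 0
    return visits

def loop_max(n, m):
    """while (j<m or i<n) {j++; i++}"""
    i = j = visits = 0
    while j < m or i < n:
        visits += 1
        j += 1
        i += 1
    return visits
```

For small unsigned inputs the counts stay within n + m, n × (1+m), and Max(n,m) respectively.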
Proof Rule for Additive Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Non-Interference NI(p1,p2,r2):
• Non-enabling condition: p1 ∘ p2 = false
• Rank preserving condition: p1 ⇒ r2[x'/x] ≤ r2
Proof Rule: If NI(p1,p2,r2) and NI(p2,p1,r1), then:
    Bound(p1 ∨ p2) = Max(0, r1) + Max(0, r2)
Example:
    p1: (i<n ∧ i'=i+1 ∧ Same({j,n,m}))
    p2: (j<m ∧ j'=j+1 ∧ Same({i,n,m}))
    r1: n-i, r2: m-j
    Bound(p1 ∨ p2) = Max(0, n-i) + Max(0, m-j) = n + m
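For the additive example, the rank preserving condition and the resulting bound can be checked on sample states. This Python sketch is illustrative only: it tests the rank preserving condition by enumeration (it does not examine the non-enabling condition), and it runs the two transitions in a random order as a stand-in for nondeterministic choice. All helper names are hypothetical.

```python
import random
from itertools import product

# Example transitions: p1 increments i while i<n, p2 increments j while j<m.
def p1_enabled(s): return s["i"] < s["n"]
def p1_step(s): return {**s, "i": s["i"] + 1}
def p2_enabled(s): return s["j"] < s["m"]
def p2_step(s): return {**s, "j": s["j"] + 1}

r1 = lambda s: s["n"] - s["i"]
r2 = lambda s: s["m"] - s["j"]

def rank_preserving(enabled, step, r, states):
    """Check p => r[x'/x] <= r on the sampled states."""
    return all(r(step(s)) <= r(s) for s in states if enabled(s))

def additive_bound(s):
    return max(0, r1(s)) + max(0, r2(s))

def run(s, rng):
    """Take enabled transitions in random order and count the steps."""
    steps = 0
    while True:
        choices = [st for en, st in ((p1_enabled, p1_step),
                                     (p2_enabled, p2_step)) if en(s)]
        if not choices:
            return steps
        s = rng.choice(choices)(s)
        steps += 1

states = [dict(zip("ijnm", v)) for v in product(range(5), repeat=4)]
```

Starting from i = j = 0, every interleaving takes n + m steps, which matches the bound evaluated at the initial state.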
Proof Rule for Multiplicative Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Proof Rule: If NI(p2,p1,r1), then:
    Bound(p1 ∨ p2) = Max(0,r1) + Max(0,r2) + Max(0,u2)*Max(0,r1)
where u2(X) is an upper bound on r2[X'/X] as implied by p1.
Example:
    p1: (i<n ∧ i'=i+1 ∧ j'=0 ∧ Same({n,m}))
    p2: (j<m ∧ j'=j+1 ∧ Same({i,n,m}))
    r1: n-i, r2: m-j
    Bound(p1 ∨ p2) = Max(0,n-i) * [1 + Max(0,m-j)] = n * (1+m)
Proof Rule for Max Composition
Let r1, r2 be ranking functions for p1, p2 respectively.
Cooperative Interference CI(p1,r1,p2,r2):
• Non-enabling condition: p1 ∘ p2 = false
• Rank decrease condition: p1 ⇒ r2[x'/x] ≤ Max(r1,r2)-1
Proof Rule: If CI(p1,r1,p2,r2) and CI(p2,r2,p1,r1), then:
    Bound(p1 ∨ p2) = Max(0, r1, r2)
Example:
    p1: (i<n ∧ i'=i+1 ∧ j'=j+1 ∧ Same({n,m}))
    p2: (j<m ∧ i'=i+1 ∧ j'=j+1 ∧ Same({n,m}))
    r1: n-i, r2: m-j
    Bound(p1 ∨ p2) = Max(0, n-i, m-j) = Max(n,m)
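The rank decrease condition for the Max example can be checked the same way. A hedged Python sketch: enumeration over a small grid replaces the logical implication, only the rank decrease condition is tested, and the helper names are illustrative.

```python
from itertools import product

def step(s):
    # Both paths of the example perform i' = i+1 and j' = j+1.
    return {**s, "i": s["i"] + 1, "j": s["j"] + 1}

p1 = lambda s: s["i"] < s["n"]      # guard of p1
p2 = lambda s: s["j"] < s["m"]      # guard of p2
r1 = lambda s: s["n"] - s["i"]
r2 = lambda s: s["m"] - s["j"]

def rank_decrease(guard, r_other, states):
    """Check guard => r_other[x'/x] <= Max(r1, r2) - 1 on sampled states."""
    return all(r_other(step(s)) <= max(r1(s), r2(s)) - 1
               for s in states if guard(s))

def max_bound(s):
    return max(0, r1(s), r2(s))

def iterations(s):
    """Run the loop while either path is enabled and count the steps."""
    count = 0
    while p1(s) or p2(s):
        s = step(s)
        count += 1
    return count

states = [dict(zip("ijnm", v)) for v in product(range(5), repeat=4)]
```

On every sampled state the number of iterations stays within Max(0, n-i, m-j).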
Algorithm: Loop has inner loops.
  – Summarize inner loops by precise disjunctive invariants using a forward iterative technique (abstract interpretation).
Transitive Closure
• A loop with body T can be replaced by TransitiveClosure(T).
• We say that a relation R is TransitiveClosure(T) if Id ⇒ R and R ∘ T ⇒ R, where Id is the relation X'=X.
• Precise transitive closures can be computed using iterative fixed-point techniques such as abstract interpretation or model checking.
• Example of TransitiveClosure(s1 ∨ s2):
    s1: i'=i+1 ∧ j'=0
    s2: i'=i ∧ j'=j+1
    TransitiveClosure(s1 ∨ s2): (i'≥i+1 ∧ j'≥0) ∨ (i'=i ∧ j'≥j)
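The example closure can be sanity-checked by exploring a few steps of s1 ∨ s2 explicitly. This Python sketch is not a fixed-point computation: it enumerates states reachable within a bounded number of steps and tests that each (start, end) pair lies in the stated relation. It assumes j starts non-negative; the function names are illustrative.

```python
def s1(state):
    i, j = state
    return (i + 1, 0)          # i' = i+1 and j' = 0

def s2(state):
    i, j = state
    return (i, j + 1)          # i' = i and j' = j+1

def in_closure(before, after):
    """The example relation: (i'>=i+1 and j'>=0) or (i'=i and j'>=j)."""
    (i, j), (i2, j2) = before, after
    return (i2 >= i + 1 and j2 >= 0) or (i2 == i and j2 >= j)

def reachable(start, depth):
    """States reachable from `start` in at most `depth` steps of s1 or s2."""
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        frontier = {t(s) for s in frontier for t in (s1, s2)} - seen
        seen |= frontier
    return seen
```

Zero steps correspond to Id, which the second disjunct covers; any run containing s1 ends in the first disjunct.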
Example (.Net Library)
Inputs: int n, bool[] A
    i := 0;
π1: while (i < n) {
      j := i+1;
π2:   while (j < n) {
        if (A[j]) {
π3:       B[n] := new C();
          j--;
          n--;
        }
        j++;
      }
      i++;
    }
Visits(π3) ≤ n
Split Control Location
[Figure: control-flow graph of the example before and after splitting control location π3 into π3a and π3b.]
Transition System Generation
[Figure: control-flow graph of the example with π3 split into π3a and π3b; the inner loop is replaced by T1' and then the outer loop by T2'.]
• Transition-system T1 of inner loop: (j≥n ∧ i<n-1 ∧ i'=i+1 ∧ j'=i+2)
• T1' = TransitiveClosure(T1) = (i'=i ∧ j'=j) ∨ (j≥n ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
• Transition-system T2 of outer loop: (j<n-1 ∧ j'=j+1 ∧ i'=i) ∨ (i<n-1 ∧ i'>i ∧ j'≥i+2)
• T2' = TransitiveClosure(T2) = (j'≥j ∧ i'=i) ∨ (i<n-1 ∧ i'>i ∧ j'≥i+2)
• Transition-system(π3): (n'=n-1 ∧ j<n-1 ∧ j'≥j ∧ i'=i) ∨ (n'=n-1 ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
Reachability-Bound Computation
[Figure: the transformed graph with π3a, π3b, and T2'.]
1. Transition-system(π3):
     P1: (n'=n-1 ∧ j<n-1 ∧ j'≥j ∧ i'=i) ∨
     P2: (n'=n-1 ∧ i<n-1 ∧ i'>i ∧ j'≥i+2)
2. n-1-j is a ranking function for P1. n-1-i is a ranking function for P2.
3. The proof rule for Max composition yields a bound of Max(0, n-1-i, n-1-j), which involves variables live at π3.
4. During the first visit to π3, we have i≥0 ∧ j≥1. This yields a bound of Max(0, n-1) in terms of procedure inputs.
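The final bound Max(0, n-1) can be compared against actual executions of the example. A minimal Python sketch, assuming the allocation is only counted and that exhaustive testing over small arrays is an acceptable illustration (it is not a proof); `visits_pi3` and `bound_holds` are illustrative names.

```python
from itertools import product

def visits_pi3(n, A):
    """Count visits to pi3 in the example with j--; n-- after the allocation."""
    visits = 0
    i = 0
    while i < n:
        j = i + 1
        while j < n:
            if A[j]:
                visits += 1      # pi3: B[n] := new C()
                j -= 1
                n -= 1
            j += 1
        i += 1
    return visits

def bound_holds(max_n=7):
    """Check Visits(pi3) <= Max(0, n-1) for every boolean array up to max_n."""
    return all(visits_pi3(n, A) <= max(0, n - 1)
               for n in range(max_n + 1)
               for A in product([False, True], repeat=n))
```

The all-true array of length n reaches the bound exactly, since every visit to π3 decrements n while j stays at least 1.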
Algorithm: Loop has other loops before it.
  – Perform backward symbolic execution (using proof rules to trace across loops) to express the bound in terms of inputs.
Backward Symbolic Execution (.Net Library)
Inputs: List<int> C1, List<int> C2
   List<int> C3 = new List<int>();
   AddElements(C3,C1);
   DeleteElements(C3,C2);
π: foreach (int e in C3) …

AddElements(List<int> L1, List<int> L2)
   foreach (int e in L2) L1.Add(e);

DeleteElements(List<int> L1, List<int> L2)
   foreach (int e in L2)
     if (L1.Contains(e)) L1.Delete(e);

Visits(π) = C3.Count ≤ C1.Count
• Backward propagation may require tracing back across procedure calls and loops.
Backward Symbolic Execution across Loops
Use the algorithm for computing Visits to relate values of a variable before and after a loop.
   n := m
   while (e) {
     S1
π:   n := n+3;
     S2
   }
n_after ≤ n_before + 3×Visits(π)
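The relation between n before and after the loop can be illustrated with a small simulation. This Python sketch makes assumptions the slide does not state: the loop condition e is modeled by a fixed iteration count, S1 leaves n unchanged, and S2 is a hypothetical statement that never increases n (here it subtracts 0 or 1 at random).

```python
import random

def run_loop(n_before, iterations, rng):
    """Model of: while (e) { S1; pi: n := n+3; S2 }.

    Returns the final value of n and the number of visits to pi.
    """
    n = n_before
    visits = 0
    for _ in range(iterations):      # stands in for `while (e)`
        visits += 1                  # pi
        n += 3
        n -= rng.choice([0, 1])      # hypothetical S2 that never increases n
    return n, visits
```

Under these assumptions each visit to π adds at most 3 to n, so the final value stays within n_before + 3×Visits(π).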
SPEED Tool
• Computes symbolic computational complexity of procedures.
• Built over the Phoenix Compiler Infrastructure and analyzes .Net binaries.
• Uses the Z3 SMT solver as the logical reasoning engine.
  – Can reason about various data types: arithmetic, bit-vector, boolean, list/collection variables.
• Takes between 0.1 and 1 second to analyze each loop.
• Success ratio of 60-90% for computing loop bounds.
• Representative failure cases:
  – Lack of global invariant analysis.
    • for (i:=0; i<n; i := i+g);
    • for (i:=0; i≠g; i := i+1);
  – Failure to resolve virtual method calls.
Limitations and Potential Extensions
• Worst-case bounds (as opposed to average bounds)
  – Challenge: Requires modeling average/representative inputs.
  – Use profiling/user annotations to rule out exceptional paths.
• Static cost model for timing analysis
  – Challenge: Difficult to model low-level architectural details like caches and pipelines.
  – Profiling may help generate a precise cost model.
• Imprecision (may generate higher bounds than possible)
  – Challenge: Undecidable problem in general.
  – Possible to generate proof of precision of bounds.
• Sequential programs (as opposed to concurrent programs)
  – Challenge: Variety of concurrent programming models, scheduling policies, and number of processors.
  – Might be possible to model some of them.
Related Work
• Detailed lecture notes available at http://www.cs.uoregon.edu/research/summerschool/summer09
• Bound computation using recurrence relations
  – Albert, Arenas, Genaim, Puebla, SAS '08
• Termination
  – Disjunctively well-founded ranking functions
    • Cook, Podelski, Rybalchenko, PLDI 2006
  – Size-change abstraction
    • Ben-Amram, CAV 2009
• Worst-case execution time
  – R. Wilhelm et al., ACM TECS 2007
Conclusion
• Bound computation: an important application area that can leverage advances in static program analysis.
• An effective solution involved a variety of techniques for reasoning about loops/fixed-points.
  – Iterative techniques for summarizing inner loops.
  – Constraint-based techniques for ranking functions.
  – Proof-rule-based technique for composition of ranking functions and bound computation in terms of inputs.
• Several important/open/challenging problems.
  – Concurrent procedures, average-case bounds