new results in program slicing
Post on 30-Dec-2015
1
New Results in Program Slicing
Aharon Abadi, Ran Ettinger, and Yishai Feldman
IBM Haifa Research Lab
2
Context
• The Programmer’s Apprentice
  – The Plan Calculus
• Bogart
• Midas
• Sliding
• Painless
  – Paz
  – Aderet
3
Improving Slice Accuracy by Compression of Data and Control Flow Paths
Presented at ESEC/FSE 2009
4
Program Slicing
[Figure: a Program and its Slice, each running from Start to the statement x := exp]
The same sequence of values
5
Control-Flow Path Compression
Work in two stages:
- Compute the ‘traditional’ slice
  - Control dependences
  - Data dependences
- Compute the necessary branches to prevent infeasible control paths
[Figure: example program containing the statements test X; if-zero-go-to A; L: test Y; if-zero-; go-to L; B:; A: Z0; go-to B, shown together with its slice]
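The first stage can be pictured as reverse reachability over dependence edges. The following is only a minimal sketch under stated assumptions: the statement names and the dependence map in `exampleDeps` are hypothetical, and data and control dependences are merged into one map rather than computed from a real program.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependenceSlicer {
    // deps.get(s) lists the statements that s depends on (data or control).
    // The 'traditional' slice is everything reachable backwards from the criterion.
    static Set<String> backwardSlice(Map<String, List<String>> deps, String criterion) {
        Set<String> slice = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(criterion);
        while (!work.isEmpty()) {
            String s = work.pop();
            if (slice.add(s)) {                       // visit each statement once
                for (String d : deps.getOrDefault(s, List.of())) {
                    work.push(d);
                }
            }
        }
        return slice;
    }

    // Hypothetical dependence graph, used for illustration only.
    static Map<String, List<String>> exampleDeps() {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("print(z)", List.of("z := x"));
        deps.put("z := x", List.of("x := 1", "if (x > 0)"));
        deps.put("if (x > 0)", List.of("x := 1"));
        deps.put("w := y", List.of("y := 2", "if (x > 0)"));
        return deps;
    }
}
```

Slicing this graph from `print(z)` keeps the definition of `x`, the test and the copy into `z`, and leaves out the statements that only affect `w`. Jumps carry no dependences in this sketch, which is why the second stage is still needed.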
6
Control-Flow Path Compression
Limitations of previous approaches:
- insert the whole loop;
- add branches not from the program; or
- do not preserve behavior
This algorithm:
- preserves behavior
- yields a sub-program
- one version may turn conditional branches into unconditional ones (“rhetorization”)
[Figure: the example program from the previous slide (test X; if-zero-go-to A; L: test Y; if-zero-; go-to L; B:; A: Z0; go-to B) and the resulting slices]
7
Data-Flow Path Compression
Program:
Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7      ; spill R7 to memory
       …             ; code that uses all registers
       R7:=Temp      ; restore R7
       go-to Loop
Out:   R0:=R7 + 1
Previous syntax-preserving algorithms insert the loop and the assignments inside it:
Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       Temp:=R7
       R7:=Temp
       go-to Loop
Out:   R0:=R7 + 1
The result is too large. The value of R7 does not depend on the loop:
       R7:=exp1
Out:   R0:=R7 + 1
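To see why the two-statement slice is enough, the register program can be transliterated into Java and compared with the small slice. This is a rough sketch, not the authors' tooling: the elided "code that uses all registers" is modeled by an assumed clobber of R7, and `exp1` and `r9` are passed in as plain integers.

```java
class SpillLoopExample {
    // Transliteration of the register program; returns the final value of R0.
    static int original(int exp1, int r9) {
        int r2 = 0;                      // Start: R2:=0
        int r7 = exp1;                   //        R7:=exp1
        while (true) {
            r2 = r2 + 1;                 // Loop:  R2:=R2 + 1
            if (!(r2 < r9)) break;       //        compare R2, R9; if-not-less-go-to Out
            int used = r7;               //        use R7
            int temp = r7;               //        Temp:=R7 (spill)
            r7 = -1;                     //        assumed stand-in for code that uses all registers
            r7 = temp;                   //        R7:=Temp (restore)
        }                                //        go-to Loop
        return r7 + 1;                   // Out:   R0:=R7 + 1
    }

    // The small slice: R7:=exp1; Out: R0:=R7 + 1
    static int compressedSlice(int exp1) {
        int r7 = exp1;
        return r7 + 1;
    }
}
```

For any inputs on which the loop terminates, both methods return `exp1 + 1`, because the spill and restore only copy the value of R7 around the clobbering code.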
8
Control-Flow Path Compression
[Figure: flowchart of the program below, with decision nodes x<11, y<T and x<9 (T/F branches), assignments x:=x+1, y:=y-1, x:=x-1 and x:=x+2, the goto statements, and print(x)]
if (x<11)
x := x+1
goto A2
A1: if (y<T)
y := y-1
goto A1
goto A2
goto A4
x := x-1
A4: if (x<9) goto A3
A3: x := x+2
A2: print(x)
9
Compute the ‘Traditional’ Slice
[Figure: flowchart of the program below, with decision nodes x<11, y<T and x<9 (T/F branches), assignments x:=x+1, y:=y-1, x:=x-1 and x:=x+2, the goto statements, and print(x)]
if (x<11)
x := x+1
goto A2
A1: if (y<T)
y := y-1
goto A1
goto A2
goto A4
x := x-1
A4: if (x<9) goto A3
A3: x := x+2
A2: print(x)
[Figure: flowchart nodes repeated on the slide: print(x), x:=x+1, x:=x+2, x:=x-1, x<11, x<9, y<T]
10
Completing Control Flow Paths: Main Lemma
• precisely identifies the possible sets of branches that may be added to the slice
• any path in the original program can be chosen
• optimizations can be performed
All paths from the same point in the slice enter the slice at a single point
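One way to read the lemma is as a property that can be checked on a control-flow graph: starting just outside the slice, every path should first re-enter the slice at the same node. The sketch below is only an illustration under assumptions; the graph in `exampleCfg`, its node names, and the choice of slice are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SliceEntryPoints {
    // Collects the first slice nodes reached from 'start' along paths
    // that run only through non-slice nodes.
    static Set<String> entryPoints(Map<String, List<String>> cfg, Set<String> slice, String start) {
        Set<String> entries = new TreeSet<>();
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(start);
        while (!work.isEmpty()) {
            String n = work.poll();
            if (!seen.add(n)) continue;          // already explored
            if (slice.contains(n)) {             // path enters the slice here; stop following it
                entries.add(n);
                continue;
            }
            work.addAll(cfg.getOrDefault(n, List.of()));
        }
        return entries;
    }

    // Hypothetical CFG: a branch whose two arms (one containing a loop) meet at "j".
    static Map<String, List<String>> exampleCfg() {
        return Map.of(
            "if c", List.of("t1", "f1"),
            "t1", List.of("j"),
            "f1", List.of("loop"),
            "loop", List.of("body", "j"),
            "body", List.of("loop"),
            "j", List.of("use"));
    }
}
```

With the slice taken to be `{"if c", "use"}`, both arms of the branch yield the single entry point `use`, which is the situation the lemma describes. A result with more than one element would mean the chosen slice does not satisfy the property on that graph.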
11
Compute the Necessary Branches
[Figure: flowchart of the program below, with decision nodes x<11, y<T and x<9 (T/F branches), assignments x:=x+1, y:=y-1, x:=x-1 and x:=x+2, the goto statements, and print(x)]
if(x<11)
x:=x+1
goto A2
A1: if(y<T)
y:=y-1
goto A1
goto A2
goto A4
x:=x-1
A4: if(x<9) goto A3
A3: x:=x+2
A2: print(x)
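Since any path between two slice points may be chosen, a natural choice is the shortest one, so that as few branches as possible are added. The sketch below shows that choice as a plain breadth-first search. It is an assumption-laden illustration: the graph and node names in the usage note are hypothetical, and picking which statements on the path are branches is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class ShortestCompletion {
    // Breadth-first search for a shortest path from 'from' to 'target'.
    // Returns the nodes on the path (both ends included), or an empty list if unreachable.
    static List<String> shortestPath(Map<String, List<String>> cfg, String from, String target) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(from, null);                       // marks 'from' as visited, with no predecessor
        queue.add(from);
        while (!queue.isEmpty()) {
            String n = queue.poll();
            if (n.equals(target)) {
                LinkedList<String> path = new LinkedList<>();
                for (String p = n; p != null; p = parent.get(p)) {
                    path.addFirst(p);                 // walk predecessors back to 'from'
                }
                return path;
            }
            for (String s : cfg.getOrDefault(n, List.of())) {
                if (!parent.containsKey(s)) {
                    parent.put(s, n);
                    queue.add(s);
                }
            }
        }
        return List.of();
    }
}
```

For a hypothetical graph in which `s1` reaches `s2` either through `a` and `b` or through `c` alone, the search returns the route through `c`; only the branches on that route would then need to be added to the slice.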
12
Data-Flow Path Compression
[Figure: plan of the register program from the Data-Flow Path Compression example, with nodes R7:=exp1, R2:=0, R2:=R2+1, compare R2,R9, if-not-less, use R7, Temp:=R7, R7:=Temp, goto Loop, go-to Out, R0:=R7+1 and exit; the slice shown is R7:=exp1; Out: R0:=R7 + 1]
13
Data-Flow Path Compression
• R7, Temp carry the value of exp1
• Use data edges instead of variables
• An out data port holds the last value
• An in data port holds the next value
• The Plan Calculus: The Programmer’s Apprentice, Rich and Waters, 1990
[Figure: plan of the register program, with data edges labeled d1 and d2]
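The idea that R7 and Temp merely carry the value of exp1 can be illustrated by resolving chains of copy assignments back to the expression they originate from. This is a simplified sketch that assumes straight-line code and represents each assignment as a hypothetical `{target, source}` pair; it is not the plan representation itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CopyChains {
    // For each assigned variable, finds the expression whose value it carries.
    // A source that names an earlier target is treated as a copy; anything else
    // is treated as an original expression. Straight-line code only.
    static Map<String, String> origins(List<String[]> assignments) {
        Map<String, String> origin = new LinkedHashMap<>();
        for (String[] a : assignments) {
            String target = a[0];
            String source = a[1];
            // Follow the copy: reuse the origin of the source if it has one.
            origin.put(target, origin.getOrDefault(source, source));
        }
        return origin;
    }
}
```

Applied to `R7:=exp1; Temp:=R7; R7:=Temp`, both `R7` and `Temp` resolve to `exp1`, so a consumer of R7 can be connected to exp1 directly and the two copies need not appear in the slice.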
14
[Figure: plan of the register program from entry to exit, with data ports R7, R0, R9 and R2 and the test compare R2,R9 (T/F); the statements R7:=exp1 and Out: R0:=R7 + 1 are shown beside the full listing]
15
Decompression
[Figure: the sliced plan mapped back to program text; statements shown: R7:=exp1, go-to Out, Out: R0:=R7 + 1, alongside Temp:=R7 and R7:=Temp]
16
Properties of the Slices
• Syntax preserving, possibly rhetorizing
• Behavior preserving
• Executable
• For structured programs
  – At least as accurate as previous algorithms
  – Strictly smaller in interesting cases
• For unstructured programs
  – Empirically shown to be superior
  – Modification of the algorithm guaranteed at least as accurate
17
Implementation
• A family of slicing algorithms
  – rhetorizing (*RB, *RM)
  – strictly syntax-preserving (*PB, *PM)
  – amorphous (*AB, *AM)
    • adds new branches (not from the program)
[Figure: code fragments illustrating the variants: A1: if(y<T) goto A2; and an assembly example with test X; if-zero-go-to A; L: test Y; if-zero-go-to B; go-to L; B: go-to C; C:; A: Z 0; go-to exit]
18
Empirical Study
• Corpus of 15 manually-written assembly-language modules from a large mainframe product
• 8578 non-comment source lines
• Computed slices from all lines
• 5801 non-empty slices
19
Empirical Results
Effect of | % slices better | % average decrease | % slices worse | % average decrease
Rhetorization: 177.5
Control path compression
  Lenient BH: 3017
  Strict BH: 9465
Data path compression
  implemented: 124815
  modified
20
Related Work
Behavior
  Preserve behavior: BH, CF1, Ag, HLB, *P, *R, *A
  May add infinite loops: HLB, HD
  Not executable: KH
Subset of the original program (for flat languages)
  Syntax-preserving: BH, CF1, Ag, HD, HLB, *P
  Rhetorizing: *R
  Amorphous: HLB, CF, *A
Comparison to traditional algorithm on structured programs
  Smaller than traditional: *P, *R, *A
  Equal to traditional / Larger than traditional: BH, CF1, Ag, HD, KH, HLB, CF2
BH: Ball & Horwitz 1993
CF: Choi & Ferrante 1994
Ag: Agrawal 1994
KH: Kumar & Horwitz 2002
HD: Harman & Danicic 1998
HLB: Harman, Lakhotia & Binkley 2006
21
Conclusions
• Two techniques for reducing slice size
  – Control-Flow Path Compression
    • Precise identification of all correct solutions
    • Shortest paths significantly improve slice accuracy
      – 17-22% improvement for 30-37% of the cases
  – Data-Flow Path Compression
    • Eliminates copy assignments
    • Yields significant improvement in a few cases
      – 24% improvement for 1% of the slices computed
• Strictly smaller even for structured programs
22
Fine Slicing for Program Transformation
23
Refactoring’s Rubicon: Extract Method
• Automating Extract Method is Refactoring’s Rubicon (Fowler*)
  – The one that demonstrates “serious tool support”
  – Precondition for many other transformations
• This Rubicon has not yet been crossed
  – Getting it right requires more analysis capability than is available in current IDEs
*http://www.martinfowler.com/articles/refactoringRubicon.html
24
Fowler’s Example (website)
Before:
void printOwing() {
  printBanner();

  //print details
  System.out.println("name: " + _name);
  System.out.println("amount " + getOutstanding());
}
After Extract Method:
void printOwing() {
  printBanner();
  printDetails(getOutstanding());
}

void printDetails(double outstanding) {
  System.out.println("name: " + _name);
  System.out.println("amount " + outstanding);
}
25
A Case Study in Enterprise Refactoring
• Converted a Java Servlet to use the MVC pattern*
• Used as much automated support as available
  – The whole conversion could be described as a series of cataloged (“small”) refactorings
  – Most steps were inadequately supported by the IDE
  – Some were not supported at all
* Based on Alex Chaffee’s “Refactoring to Model-View-Controller” article (http://www.purpletech.com/articles/mvc/refactoring-to-mvc.html)
26
Case-Study: Automation (1)
Fully Supported Refactorings | Uses
Extract Method | 3
Extract Temp | 3
(Self) Encapsulate Field | 2
Replace Magic Number with Symbolic Constant | 1
Inline Temp | 1
Extract Superclass | 1
Delete Methods | 1
Move Method | 1
Total | 13
27
Case-Study: Automation (2)
Partial (*) or No (**) Support | Uses
Extract Method * | 10
Substitute Expression ** | 5
Replace Temp with Query * | 3
Replace Method with Method Object ** | 2
Substitute Statement ** | 1
Extract Class * | 1
Move Statement (or Swap Statements) ** | 1
Total | 23
28
Currently Unsupported Cases of Extract Method
(a) Extract multiple fragments
(b) Extract a partial fragment
  – select sub-expressions as parameters
(c) Extract loop with partial body
  – loop duplication with data flow
(d) Extract code with conditional exits
Program slicing pulls related code together!
29
slice (v.): to cut with or as if with a knife
Merriam-Webster
slice (n.): a thin flat piece cut from something
30
A (backward) slice of a given program with respect to selected “interesting” variables is a subprogram that computes the same values as the original program for the selected variables
A (backward) fine slice of a given program with respect to selected “interesting” variables and other “oracle” variables is a subprogram that computes the same values as the original program for the selected variables, given values for the oracle variables
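The difference between the two definitions can be seen on a tiny example. The program below and both of its slices are hypothetical and written by hand, not produced by a slicing tool: `sliceOnSum` is a backward slice with respect to `sum`, while `fineSliceOnAvg` is a fine slice with respect to `avg` that treats `sum` as an oracle variable, so the loop computing `sum` is cut away and its value is supplied from outside.

```java
class SliceDefinitions {
    // Original program: computes sum, product and average of the input.
    static int[] original(int[] a) {
        int sum = 0;
        int prod = 1;
        for (int v : a) {
            sum += v;
            prod *= v;
        }
        int avg = a.length == 0 ? 0 : sum / a.length;
        return new int[] {sum, prod, avg};
    }

    // Backward slice with respect to sum: the product and the average are gone.
    static int sliceOnSum(int[] a) {
        int sum = 0;
        for (int v : a) {
            sum += v;
        }
        return sum;
    }

    // Fine slice with respect to avg, with sum as an oracle variable:
    // same value of avg as the original, given the original's value of sum.
    static int fineSliceOnAvg(int[] a, int sumOracle) {
        return a.length == 0 ? 0 : sumOracle / a.length;
    }
}
```

Feeding `fineSliceOnAvg` the value of `sum` computed by the original program reproduces the original `avg`, which is the sense in which the fine slice computes the same values "given values for the oracle variables".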
31
Fine Slicing
• A generalization of traditional program slicing
• Fine slices can be precisely bounded
  – Slicing criteria include set of data and control dependences to ignore
• Fine slices are executable and extractable
• Complement slices (co-slices) are also fine slices
• Oracle-based semantics for fine slices
• Algorithm for computing data-structure representing the oracle
• Forward fine slices are executable, may be slightly larger than traditional forward slices
• Confines generalize blocks for unstructured programs
32
Extract Computation
• A new refactoring
• Extracts a fine slice into contiguous code
• Computes the co-slice
• Computation can then be extracted into a separate method using Extract Method
• Passes necessary “oracle” variables between slice and co-slice
• Generates new containers if series of values need to be passed
33
(a) Extract multiple fragments
User user = getCurrentUser(request);
if (user == null) {
response.sendRedirect(LOGIN_PAGE_URL);
return;
}
response.setContentType("text/html");
disableCache(response);
String albumName = request.getParameter("album");
PrintWriter out = response.getWriter();
34
(b) Extract a partial fragment
out.println(DOCTYPE_HTML);
out.println("<html>");
out.println("<head>");
out.println("<title>Error</title>");
out.println("</head>");
out.print("<body><p class='error'>");
out.print("Could not load album '" +
albumName + "'");
out.println("</p></body>");
out.println("</html>");
35
(c) Extract loop with partial body
 1  out.println("<table border=0>");
 2  int start = page * 20;
 3  int end = start + 20;
 4  end = Math.min(end,
 5                 album.getPictures().size());
 6  for (int i = start; i < end; i++) {
 7    Picture picture = album.getPicture(i);
 8    printPicture(out, picture);
 9  }
10  out.println("</table>");
36
 2   int start = page * 20;
 3   int end = start + 20;
 4   end = Math.min(end,
 5                  album.getPictures().size());
***  Queue<Picture> pictures =
***      new LinkedList<Picture>();
 6   for (int i = start; i < end; i++) {
 7     Picture picture = album.getPicture(i);
***    pictures.add(picture);
 9   }
 1   out.println("<table border=0>");
 6   for (int i = start; i < end; i++)
 8     printPicture(out, pictures.remove());
10   out.println("</table>");
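The before and after versions can be run side by side to check that the split loop prints the same table. The following is a hedged, self-contained sketch: pictures are modeled as plain strings, the album as a `List<String>`, and `printPicture` as a one-line stub, all of which are stand-ins for the servlet's real types.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class ExtractLoopSketch {
    // Stub for the servlet's printPicture; the row markup is made up.
    static void printPicture(PrintWriter out, String picture) {
        out.println("<tr><td>" + picture + "</td></tr>");
    }

    // Original code: selection and printing interleaved in one loop.
    static String original(List<String> album, int page) {
        StringWriter sw = new StringWriter();
        PrintWriter out = new PrintWriter(sw);
        out.println("<table border=0>");
        int start = page * 20;
        int end = Math.min(start + 20, album.size());
        for (int i = start; i < end; i++) {
            printPicture(out, album.get(i));
        }
        out.println("</table>");
        out.flush();
        return sw.toString();
    }

    // After the transformation: the first loop fills a container,
    // the extracted display method drains it.
    static String transformed(List<String> album, int page) {
        StringWriter sw = new StringWriter();
        PrintWriter out = new PrintWriter(sw);
        int start = page * 20;
        int end = Math.min(start + 20, album.size());
        Queue<String> pictures = new LinkedList<>();
        for (int i = start; i < end; i++) {
            pictures.add(album.get(i));
        }
        display(out, start, end, pictures);
        out.flush();
        return sw.toString();
    }

    static void display(PrintWriter out, int start, int end, Queue<String> pictures) {
        out.println("<table border=0>");
        for (int i = start; i < end; i++) {
            printPicture(out, pictures.remove());
        }
        out.println("</table>");
    }
}
```

The queue preserves the order in which pictures were selected, so the second loop prints them exactly as the original loop did. The cost is one extra pass and the memory for the container.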
37
(d) Extract code with conditional exits
if (album == null) {
  new ErrorPage("Could not load album '"
      + albumName + "'").printMessage(out);
  return;
}
//...
38
if (invalidAlbum(album, albumName, out))
  return;
//...
boolean invalidAlbum(Album album, String albumName,
    PrintWriter out) {
  boolean invalid = album == null;
  if (invalid) {
    new ErrorPage("Could not load album '"
        + albumName + "'").printMessage(out);
  }
  return invalid;
}
39
Token Semantics
[Figure: plan of the code from example (c), from entry to exit, with operation nodes println, *, +, min, getPictures, size, >, ++, getPicture and printPicture, data edges for out, album, page, start, end and i, the constants "<table border=0>", 20 and "</table>", and tokens p1 and p2]
40
Fine Slicing
[Figure: the plan of the code from example (c), with printPicture marked]
41
The Fine Slice
out.println("<table border=0>");
for (int i = start; i < end; i++) {
  printPicture(out, picture);
}
out.println("</table>");
[Figure: plan of the fine slice, with start and picture labeled]
42
Co-Slicing
[Figure: the plan of the code from example (c), with printPicture marked]
43
The Co-Slice
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
}
[Figure: plan of the co-slice, with start and picture labeled]
44
Fine slice / Co-slice
[Figure: plans of the fine slice and of the co-slice, with start and picture labeled in both]
45
Adding a Container
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
  printPicture(out, pictures.remove());
}
out.println("</table>");
[Figure: plan of the code above; the nodes new, add and remove and the data edges for pictures are added to the plan of example (c)]
46
The Fine Slice
void display(PrintWriter out, int start, int end, Queue<Picture> pictures) {
  out.println("<table border=0>");
  for (int i = start; i < end; i++) {
    printPicture(out, pictures.remove());
  }
  out.println("</table>");
}
[Figure: plan of the fine slice, with inputs out, start, end and pictures]
47
Program with Container
[Figure: plan of the program with the pictures container, with the nodes new, add and remove added to the plan of example (c)]
48
The Co-Slice
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
}
display(out, start, end, pictures);
[Figure: plan of the co-slice, ending in a display node that receives out, start, end and pictures]
49
Conclusions
• Fine slicing algorithm yields executable slices whose boundaries can be precisely controlled
• Can be used to make any subset of a program executable by adding some control structures but not the data on which they depend
  – including forward slices, thin slices, barrier slices, chops, and barrier chops
  – Conjecture: the size of these executable programs will not be substantially larger
50
Conclusions
• New Extract Computation refactoring is an important step towards the automation of Extract Method in difficult cases
  – Enables the automation of big refactorings from smaller building blocks
• Uses new fine-slicing algorithm
• Automatically computes complement slices (co-slices)
• Automatically generates containers to pass series of values if necessary
51
Related Work (I): Non-Executable Slices
• Traditional backward slicing (e.g., Weiser [ICSE81] or Ottenstein & Ottenstein [PSDE84]), when applied to unstructured code
  – Solved by path-completion stage in plan-based slicing (Abadi, Ettinger & Feldman [FSE09])
• Forward slicing (Horwitz, Reps & Binkley [TOPLAS90])
• Barrier slicing (Krinke [SCAM03])
• Chopping (Jackson & Rollins [FSE94]) and Barrier Chopping (Krinke [SCAM03])
• Thin slicing (Sridharan, Fink & Bodik [PLDI07])
• All the above can be made executable with an appropriate oracle, by adding the required control structure
52
Related Work (II): Executable Slices with Reduced Scope or Size
• Block-based slicing (Maruyama [SSR01]): structured code only, no correctness proof
• Co-slicing (Ettinger's thesis, Oxford 2006): limited to slicing from the end and oracle of final values only; proof on toy language
• Parametric slicing (Field, Ramalingam & Tip [POPL95]): an executable generalization of static and dynamic slices; like oracle semantics, they formalize programs with holes; however, their holes stand for expressions whose values are irrelevant, while our holes stand for significant (oracle) values
• Some forms of dynamic and forward slicing are executable (Binkley et al. [SCAM04]): forward slices made excessively large through the addition of backward slices
53
Related Work (III): Behavior-Preserving Procedure Extraction
• Contiguous code
  – Bill Opdyke's thesis (UIUC 1992): for C++
  – Griswold and Notkin [ToSE93]: for Scheme
• Arbitrary selections
  – Tucking (Lakhotia & Deprez [IST98]): the complement is a slice too; no dataflow from the extracted slice to its complement yields over-duplication; strong preconditions (e.g., no global variables involved, and no live-on-exit variable defined in both the slice and complement)
  – Semantics-Preserving Procedure Extraction (Komondoor & Horwitz [POPL00]): considers all permutations of selected and surrounding statements; no duplication allowed; not practical (exponential time complexity); very strong preconditions
  – Effective Automatic Procedure Extraction (Komondoor & Horwitz [IWPC03]): improves on their previous algorithm by improving complexity (cubic time and space), allowing some duplication (of conditionals and jumps); might miss some correct permutations; no duplication of assignments or loops; allows dataflow from complement to extracted code and from extracted code to (the second portion of the) complement; supports extraction of returns
  – Extraction of block-based slices (Maruyama [SSR01]): extracts a slice of one variable only; restricted to structured code; no proof given
  – Ettinger's thesis (Oxford 2006): sliding transformation sequentially composes a slice and its complement, allowing dataflow from the former to the latter; supports loop untangling and duplication of assignments; restricted to slicing from the end, and only final values from the extracted slice can be reused in the complement; proof for toy language