Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities
Luca Della Toffola – ETH Zurich
Michael Pradel – TU Darmstadt
Thomas R. Gross – ETH Zurich
October 30th, 2015 - OOPSLA15
MemoizeIt
Dynamic analysis
Memoization opportunities
Automatic
9 new real-world memoization opportunities
Apache POI – Issue 55611
Performance issue
public boolean DateUtil.isADateFormat(int idx, String format) {
    StringBuilder sb = new StringBuilder(format.length());
    // Iterate over the input string (sb is still empty at this point)
    for (int i = 0; i < format.length(); i++) {
        // Modify format and write to sb
    }
    String f = sb.toString();
    // Process f using date pattern matching
    return date_ptrn.matcher(f).matches();
}
Java profiler: ranked 10 (189), 4000 calls
Java profiler: no additional bottleneck info
Research tools: symptoms are not there*
No nested loops
No memory bloat
* [Nistor, ICSE13], [Xu, OOPSLA12]
Observation: many calls have the same input and output values!
Input (parameters + accessed fields)    Output (returned value)
0, “m/d/yy”                             true
0, “m/d/yy”                             true
0, “m/d/yy”                             true
1, “h:mm”                               false
1, “h:mm”                               false
Memoization?
Purity analysis? Too conservative!
Side effects (three locations marked in the method)
Ignore side effects!
MemoizeIt: 1st ranked method!
MemoizeIt finds calls with the same input and output values.
Memoization!
boolean cache_value;
int cache_key1;
String cache_key2;

public boolean isADateFormatSlow(int idx, String format) {
    // Slow isADateFormat code
}

public boolean isADateFormat(int idx, String format) {
    // format.equals(cache_key2) also works while the cache is still empty (cache_key2 == null)
    if (cache_key1 == idx && format.equals(cache_key2)) {
        return cache_value;
    }
    // Update cache keys and value
    cache_key1 = idx;
    cache_key2 = format;
    cache_value = isADateFormatSlow(idx, format);
    return cache_value;
}
Single entry instance cache
Up to 25% speed-up!
MemoizeIt – Contributions
1. Automatic analysis to find memoization opportunities
2. Suggest fix configurations for candidate methods
Challenge: boolean DateUtil.isADateFormat(int idx, MyClass format)
[Figure: the MyClass argument is an object on the heap]
MemoizeIt == Memoization + Iterative
MemoizeIt
Program + profiling input
CPU-time profiling
Filtering of methods:
1. Number of executions
2. Average execution time
3. Relative execution time
Initial method candidates
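The three filtering criteria above can be sketched roughly as follows. This is only an illustration: the MethodProfile and CandidateFilter classes and all threshold parameters are hypothetical, and the slides do not state which threshold values MemoizeIt uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-method summary taken from a CPU-time profile.
final class MethodProfile {
    final String name;
    final long calls;          // number of executions
    final double totalMillis;  // total time spent in the method

    MethodProfile(String name, long calls, double totalMillis) {
        this.name = name;
        this.calls = calls;
        this.totalMillis = totalMillis;
    }

    double averageMillis() {
        return calls == 0 ? 0.0 : totalMillis / calls;
    }
}

final class CandidateFilter {
    // All three thresholds are illustrative assumptions, not MemoizeIt's values.
    static List<String> initialCandidates(List<MethodProfile> profiles,
                                          double programMillis,
                                          long minCalls,
                                          double minAverageMillis,
                                          double minRelativeTime) {
        List<String> candidates = new ArrayList<>();
        for (MethodProfile p : profiles) {
            boolean calledOften = p.calls >= minCalls;                          // 1. number of executions
            boolean slowEnough = p.averageMillis() >= minAverageMillis;         // 2. average execution time
            boolean relevant = p.totalMillis / programMillis >= minRelativeTime; // 3. relative execution time
            if (calledOften && slowEnough && relevant) {
                candidates.add(p.name);
            }
        }
        return candidates;
    }
}
```

A method that runs only once cannot profit from a cache, and a method that is very cheap or contributes little to total run time is unlikely to repay the cost of looking up a cached value.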
Input-Output Profiling
Input: parameters + accessed fields
Output: returned value
Input-output tuple (T)
1. For each call of candidate method
2. Trace method input-output values
3. Select method candidates
Example: boolean DateUtil.isADateFormat(int idx, String format)
T1: input (0, “m/d/yy”), output true     multiplicity(T1) = 3
T2: input (1, “h:mm”), output false      multiplicity(T2) = 2
Repeated input-output: memoization
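A minimal sketch of how input-output tuples and their multiplicities could be recorded, assuming inputs and outputs are simple values that implement equals and hashCode. The InputOutputProfile class and its record hook are hypothetical names for illustration, not MemoizeIt's instrumentation API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InputOutputProfile {
    // Maps each observed tuple (inputs followed by the output) to its multiplicity.
    private final Map<List<Object>, Integer> multiplicity = new LinkedHashMap<>();

    private static List<Object> tuple(Object output, Object... inputs) {
        List<Object> t = new ArrayList<>(Arrays.asList(inputs));
        t.add(output);
        return t;
    }

    // Hypothetical hook: called once per traced call of a candidate method.
    void record(Object output, Object... inputs) {
        multiplicity.merge(tuple(output, inputs), 1, Integer::sum);
    }

    int multiplicityOf(Object output, Object... inputs) {
        return multiplicity.getOrDefault(tuple(output, inputs), 0);
    }

    // A method stays a candidate only if at least one tuple was seen more than once.
    boolean hasRepeatedTuple() {
        for (int m : multiplicity.values()) {
            if (m > 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        InputOutputProfile profile = new InputOutputProfile();
        for (int i = 0; i < 3; i++) profile.record(true, 0, "m/d/yy");
        for (int i = 0; i < 2; i++) profile.record(false, 1, "h:mm");
        System.out.println(profile.multiplicityOf(true, 0, "m/d/yy"));
        System.out.println(profile.multiplicityOf(false, 1, "h:mm"));
    }
}
```

Replaying the five calls from the slide gives the two tuples T1 and T2 with multiplicities 3 and 2.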
Challenge – Complex Objects
boolean DateUtil.isADateFormat(int idx, MyClass format)
Are two MyClass arguments equal? Structural and content equivalence
[Figure: two MyClass objects, one with x: 45, y: 1, z: B (field a), the other with x: 45, y: 0, z: B (field a)]
flat(object): (MyClass1, [45, 1, (B1, [...])])
Heap: can’t keep everything!
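Comparing ref1 and ref2 with equals?
depth = 1: the MyClass objects themselves (x, y, z)
depth = 2: the B objects reached through z (a)
[Figure: heap with ref1 pointing to MyClass (x: 45, y: 0, z: B) and ref2 pointing to MyClass (x: 45, y: 1, z: B)]
Exhaustive traversal is expensive!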
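A depth-bounded variant of flat(object) could look like the sketch below. It is an assumption-laden illustration, not MemoizeIt's implementation: the Flattener class is hypothetical, MyClass and B are stand-ins for the classes in the figure, only boxed primitives and strings are treated as plain values, and arrays and collections are not handled.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes mirroring the figure: MyClass has fields x, y and z, and z refers to a B.
class B {
    int a;
    B(int a) { this.a = a; }
}

class MyClass {
    int x;
    int y;
    B z;
    MyClass(int x, int y, B z) { this.x = x; this.y = y; this.z = z; }
}

final class Flattener {
    // Returns [class name, [field values]]; references are followed at most `depth` levels.
    static Object flat(Object obj, int depth) {
        if (obj == null || isLeaf(obj)) {
            return obj;
        }
        List<Object> values = new ArrayList<>();
        if (depth > 0) {
            for (Class<?> c = obj.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                        continue;
                    }
                    try {
                        f.setAccessible(true);
                        values.add(flat(f.get(obj), depth - 1));
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        List<Object> result = new ArrayList<>();
        result.add(obj.getClass().getName());
        result.add(values);
        return result;
    }

    // Assumption: boxed primitives and strings are compared by value, everything else by structure.
    private static boolean isLeaf(Object o) {
        return o instanceof Number || o instanceof Boolean
                || o instanceof Character || o instanceof String;
    }
}
```

Two objects that differ only inside the referenced B compare equal at depth 1 and different at depth 2, which is why the chosen depth affects both the cost of the analysis and which methods look memoizable.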
Solution - Iterative Profiling
[Figure: the same heap, with ref1 and ref2 compared at depth = 1 first and at depth = 2 afterwards]
Iterative approach can analyze programs with complex structures
MemoizeIt
Program + profiling input
1. CPU-time profiling: initial method candidates
2. d = 1
3. Input-output profiling
4. Filter method candidates
5. If max depth || time limit: exit(); otherwise: new candidates, depth++, and back to input-output profiling
6. Candidates ranking
7. Fix suggestions
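The loop in the diagram can be sketched as below. The predicate repeatsAtDepth is a hypothetical stand-in for one input-output profiling run of a method at a given depth, and the class and parameter names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class IterativeProfiling {
    // repeatsAtDepth.test(method, depth) stands in for profiling `method` at `depth`
    // and checking whether repeated input-output tuples are still observed.
    static List<String> run(List<String> initialCandidates,
                            BiPredicate<String, Integer> repeatsAtDepth,
                            int maxDepth,
                            long timeLimitMillis) {
        long start = System.currentTimeMillis();
        List<String> candidates = new ArrayList<>(initialCandidates);
        int depth = 1;
        while (true) {
            // Input-output profiling at the current depth, then filtering.
            List<String> next = new ArrayList<>();
            for (String method : candidates) {
                if (repeatsAtDepth.test(method, depth)) {
                    next.add(method);
                }
            }
            candidates = next;
            boolean timeUp = System.currentTimeMillis() - start >= timeLimitMillis;
            if (candidates.isEmpty() || depth >= maxDepth || timeUp) {
                return candidates; // handed on to ranking and fix suggestions
            }
            depth++; // deeper traversal, but only for the surviving candidates
        }
    }
}
```

The point of the design is that expensive deep traversals are only spent on methods that still look memoizable at shallower depths.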
MemoizeIt
Program + profiling input
CPU-time profiling
Input-output profiling
Ranking of candidates: ranked candidate methods
Ranking based on:
1. Estimated saved time
2. Estimated hit ratio
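One way to estimate the two ranking criteria from tuple multiplicities is sketched below. The formulas are assumptions chosen for illustration (every repeated occurrence of a tuple counts as a cache hit, and each hit saves one average call); the slides do not give MemoizeIt's exact definitions, and the Candidate and CandidateRanking classes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class Candidate {
    final String method;
    final int[] multiplicities;  // one entry per distinct input-output tuple
    final double averageMillis;  // average time of one call, from CPU-time profiling

    Candidate(String method, int[] multiplicities, double averageMillis) {
        this.method = method;
        this.multiplicities = multiplicities;
        this.averageMillis = averageMillis;
    }

    int calls() {
        int sum = 0;
        for (int m : multiplicities) sum += m;
        return sum;
    }

    // Assumption: with an unbounded cache, all but the first occurrence of a tuple are hits.
    int estimatedHits() {
        int hits = 0;
        for (int m : multiplicities) hits += m - 1;
        return hits;
    }

    double estimatedHitRatio() {
        return calls() == 0 ? 0.0 : (double) estimatedHits() / calls();
    }

    // Assumption: every hit saves one average call and lookups are free.
    double estimatedSavedMillis() {
        return estimatedHits() * averageMillis;
    }
}

final class CandidateRanking {
    // Highest estimated saved time first; ties broken by estimated hit ratio.
    static List<Candidate> rank(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> {
            int bySaved = Double.compare(b.estimatedSavedMillis(), a.estimatedSavedMillis());
            return bySaved != 0
                    ? bySaved
                    : Double.compare(b.estimatedHitRatio(), a.estimatedHitRatio());
        });
        return sorted;
    }
}
```

For the isADateFormat example with multiplicities 3 and 2, this estimate gives 3 hits in 5 calls, a hit ratio of 0.6.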
MemoizeIt
Program + profiling input
CPU-time profiling
Input-output profiling
Ranking of candidates: ranked candidate methods
Fix suggestions: optimal cache configuration
Suggests a configuration among:
SingleInstance
SingleGlobal
MultiInstance
MultiGlobal
+ need for invalidation
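As a contrast to the single-entry instance cache shown for Apache POI, the sketch below illustrates what a MultiGlobal configuration with invalidation might look like: one static cache shared by all callers, holding many entries. The DateFormatChecker class, the invalidate method, and the placeholder body of isADateFormatSlow are assumptions for illustration and do not reproduce POI's logic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DateFormatChecker {
    // "Global": a single static cache. "Multi": one entry per distinct input.
    private static final Map<List<Object>, Boolean> CACHE = new HashMap<>();

    // Counts cache misses; present only to make the effect observable.
    static int slowCalls = 0;

    static synchronized boolean isADateFormat(int idx, String format) {
        List<Object> key = Arrays.<Object>asList(idx, format);
        Boolean cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        boolean result = isADateFormatSlow(idx, format);
        CACHE.put(key, result);
        return result;
    }

    // To be called whenever state that the method reads may have changed.
    static synchronized void invalidate() {
        CACHE.clear();
    }

    private static boolean isADateFormatSlow(int idx, String format) {
        slowCalls++;
        // Placeholder for the expensive computation, not the real POI check.
        return format.matches("[dmy/]+");
    }
}
```

A single-entry cache only pays off when consecutive calls repeat the same input, while a multi-entry cache also covers interleaved inputs at the price of more memory. An instance cache would use non-static fields so that each object keeps its own entries.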
Experimental Setup
Program              Description
DaCapo 2006 MR2      antlr, bloat, chart, fop, luindex, pmd
Checkstyle 5.6       Source-code style checker
Soot ae0cec69c0      Static program analysis / manipulation
Apache Tika 1.3      Content analysis toolkit
Apache POI 3.9       MS Office documents manipulation
Evaluation – Research Question
Is MemoizeIt effective at finding new memoization opportunities?
1. Manually select realistic input
2. Execute MemoizeIt
3. Manually inspect methods
4. Implement MemoizeIt’s suggestions
Timeout for profiling: 1 hour
Evaluation – Results
9 new opportunities in DaCapo-antlr, DaCapo-bloat, DaCapo-fop, Soot, Apache-Tika, Apache-POI, Checkstyle
1 duplicate method in Apache-Tika, Apache-POI
31 memoization opportunities
Evaluation – Results
Program              Small workload [speed-up]    Large workload [speed-up]
DaCapo-antlr         1.04 ± 0.03                  1.05 ± 0.02
DaCapo-bloat         1.08 ± 0.03                  -
DaCapo-fop           1.05 ± 0.01                  NA
Checkstyle           -                            9.95 ± 0.10
Soot                 1.27 ± 0.03                  12.93 ± 0.05
Apache-Tika Excel    -                            1.25 ± 0.02
Apache-Tika Jar      1.09 ± 0.01                  1.12 ± 0.02
Apache-POI (1)       1.11 ± 0.01                  1.92 ± 0.01
Apache-POI (2)       1.07 ± 0.01                  1.12 ± 0.01
Evaluation – Research Question
Is the iterative or exhaustive approach more efficient?
Evaluation – Results
Program              Iterative time [minutes]    Exhaustive time [minutes]
DaCapo-antlr         timeout                     timeout
DaCapo-bloat         timeout                     timeout
DaCapo-chart         2                           2
DaCapo-fop           18                          timeout
DaCapo-luindex       32                          timeout
DaCapo-pmd           timeout                     timeout
Checkstyle           6                           22
Soot                 timeout                     timeout
Apache-Tika Excel    58                          56
Apache-Tika Jar      41                          35
Apache-POI           23                          37
Iterative wins: DaCapo-fop, DaCapo-luindex, Checkstyle, Apache-POI
Exhaustive wins: Apache-Tika Excel, Apache-Tika Jar
Related Work
Performance problems
Detecting: [Xu, OOPSLA12], [Zaparanuks, PLDI12]
Understanding: [Song, OOPSLA14], [Yu, ASPLOS14]
Fixing: [Nistor, ICSE15]
Compiler optimizations: [Ding, CGO04], [Costa, CGO13], [St-Amour, OOPSLA12]
Incremental computations: [Pugh, POPL89]
Other caching techniques: [Ma, WWW15]
Conclusions
Profiling of memoization opportunities
• New real-world opportunities
• Relevant speed-ups
• Iterative strategy beneficial
Suggests cache configurations
• Suggestions easy to implement
Artifact evaluated
• https://github.com/lucadt/memoizeit