![Page 1: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/1.jpg)
Variational Path Profiling
Erez Perelman*, Trishul Chilimbi†, Brad Calder*
*University of California, San Diego
†Microsoft Research, Redmond
![Page 2: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/2.jpg)
Observation: Variation in Paths Exists
• Goal: find the paths to focus on for optimization
• What is a path?
  – An acyclic control-flow trace through the binary (e.g., a loop body)
• Variation in path performance is optimization potential
![Page 3: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/3.jpg)
What is variation?
• Performance is not constant between iterations of a path
  – Underlying architecture effects (e.g., cache misses) can cause the variation
• Example of the amount of variation seen
  – One common path in gzip was observed to execute in 48,409 cycles on one execution and 4,004,226 cycles on another
![Page 4: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/4.jpg)
Goal: Optimize Away Variation
• Hypothesis:
  – Every execution of a path can take the minimum time (if architecture effects are ignored)
• Want: reduce the variation of a path to improve program performance
  – Ideal time = the fastest execution of a path
  – Optimize the path to execute near its ideal time every time
• Result
  – Balanced path execution time (smaller net variation for a path)
![Page 5: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/5.jpg)
How to Find the Variation
• Sample path executions and measure performance variations
  – Rank the top varying paths in the program
• Highly optimized paths won’t have much variation
  – Traditional hot path profilers won’t find the variation
    • Optimized paths still execute the same number of times
  – VPP focuses on good optimization points that have not yet been exploited
![Page 6: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/6.jpg)
Outline
• Variational Path Profiling
  – Profiling
  – Analysis
  – Measuring Stability
• Optimizations
  – Apply simple optimizations on top paths
  – Speedup results
  – Comparison to other path profiling techniques
• Future Work
  – Discovering structure in variation and its implications
![Page 7: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/7.jpg)
VPP: Profiling
• Sample execution of acyclic paths with Bursty Tracing
  – Measure time in path
  – Unique path signature
    • Entry PC and branch history, e.g. 0x0040211F-110
• Accurate measurement of performance is essential
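
The signature format on this slide can be illustrated with a small sketch. It assumes, without confirmation from the slides, that the suffix after the dash records one bit per branch on the path (1 for taken, 0 for not taken). The helper name `format_path_signature` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes "<entry PC in hex>-<branch bits>" into buf.
 * Assumption: each branch outcome is recorded as 1 (taken) or 0 (not taken).
 * Returns 0 on success, -1 if buf is too small for the full signature. */
int format_path_signature(uint32_t entry_pc, const int *taken, size_t n_branches,
                          char *buf, size_t buf_len)
{
    /* snprintf returns the length it wanted to write, even when truncating. */
    int written = snprintf(buf, buf_len, "0x%08X-", (unsigned)entry_pc);
    if (written < 0 || (size_t)written + n_branches + 1 > buf_len)
        return -1;
    for (size_t i = 0; i < n_branches; i++)
        buf[written + i] = taken[i] ? '1' : '0';
    buf[written + n_branches] = '\0';
    return 0;
}
```

Under that assumption, an entry PC of 0x0040211F followed by the outcomes taken, taken, not taken produces the string shown on the slide.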
![Page 8: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/8.jpg)
Bursty Tracing
[Figure: the original procedure (blocks A and B) next to the modified procedure for Bursty Tracing (blocks A′, A, B′, B).]
![Page 9: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/9.jpg)
Sampling Overhead
• Accuracy is critical when measuring the time spent in a path
  – Bursty Tracing has less than 5% instrumentation overhead
  – Timing a path has even lower overhead
    • The time spent in instrumentation code is not measured
• A small bias exists, but it is consistent and can be accounted for
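
One way a small, consistent bias could be accounted for is sketched below. The slides do not say how the authors corrected it. This sketch assumes the bias is estimated as the smallest reading obtained when timing an empty interval, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the bias is the smallest cycle count seen when timing nothing.
 * Returns 0 when no calibration samples are available. */
uint64_t estimate_timer_bias(const uint64_t *empty_samples, size_t n)
{
    if (n == 0)
        return 0;
    uint64_t bias = empty_samples[0];
    for (size_t i = 1; i < n; i++)
        if (empty_samples[i] < bias)
            bias = empty_samples[i];
    return bias;
}

/* Subtracts the bias from a measurement, clamping at zero. */
uint64_t remove_timer_bias(uint64_t measured_cycles, uint64_t bias)
{
    return measured_cycles > bias ? measured_cycles - bias : 0;
}
```

Because the bias is constant, subtracting it shifts every sample of a path by the same amount and leaves the difference between a slow execution and the fastest one unchanged.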
![Page 11: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/11.jpg)
VPP: Analysis
• Compute the net variation time for each path
  – Basetime(i) = fastest execution time of path i
  – Net variation(i) = Total time(i) - [Frequency(i) × Basetime(i)]
• Rank paths according to net variation
  – The top few paths dominate the program's total variation
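
The analysis step above can be sketched in C as follows. This is a minimal illustration of the formula on the slide, not the authors' implementation. The `path_stats` record, its field names, and the use of a numeric `path_id` in place of a full path signature are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-path record built from the timing samples. */
typedef struct {
    uint32_t path_id;       /* stand-in for the path signature */
    uint64_t total_cycles;  /* Total time(i): sum over all samples */
    uint64_t base_cycles;   /* Basetime(i): fastest sample seen */
    uint64_t frequency;     /* Frequency(i): number of samples */
    uint64_t net_variation; /* filled in by rank_by_net_variation */
} path_stats;

/* Folds one timing sample into the running statistics of a path. */
void record_sample(path_stats *p, uint64_t cycles)
{
    if (p->frequency == 0 || cycles < p->base_cycles)
        p->base_cycles = cycles;
    p->total_cycles += cycles;
    p->frequency++;
}

/* Net variation(i) = Total time(i) - Frequency(i) * Basetime(i). */
uint64_t net_variation(const path_stats *p)
{
    return p->total_cycles - p->frequency * p->base_cycles;
}

/* qsort comparator: larger net variation sorts first. */
static int by_net_variation_desc(const void *a, const void *b)
{
    const path_stats *x = a, *y = b;
    return (x->net_variation < y->net_variation) -
           (x->net_variation > y->net_variation);
}

/* Computes net variation for every path and sorts in descending order. */
void rank_by_net_variation(path_stats *paths, size_t n)
{
    for (size_t i = 0; i < n; i++)
        paths[i].net_variation = net_variation(&paths[i]);
    qsort(paths, n, sizeof paths[0], by_net_variation_desc);
}
```

As a worked case, a path sampled twice at 48,409 and 4,004,226 cycles would have a base time of 48,409 and a net variation of 4,052,635 - 2 × 48,409 = 3,955,817 cycles. A path that always takes its base time has a net variation of zero, however often it runs.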
![Page 12: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/12.jpg)
Structure within Variation
[Figure: histogram. x-axis: time variations relative to the fastest path execution; y-axis: number of occurrences (0 to 250).]
• Bzip2 Top 5 Varying Paths
![Page 13: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/13.jpg)
VPP: Top 10 Paths
[Figure: bar chart. y-axis: % execution time (0 to 70); x-axis: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg.]
![Page 15: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/15.jpg)
Stability
• Do the top varying paths change when system load or program input changes?
  – System load measures resource utilization (processor, memory, buses, etc.)
• Measure stability of top paths across system loads
  – Heavy system load vs. light system load
• Across program inputs
  – Program execution varies; how does that affect the top paths?
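
A simple way to quantify this kind of stability is sketched below. It is not the metric plotted on the following slides, which report % execution time. As an assumption, it measures the share of one run's top-path net variation that comes from paths also ranked in the top list of another run. The `ranked_path` type and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of a top-N list produced by one profiling run. */
typedef struct {
    uint32_t path_id;       /* stand-in for the path signature */
    uint64_t net_variation; /* net variation measured in that run */
} ranked_path;

/* Fraction of the net variation in top_a that comes from paths which also
 * appear in top_b. Returns 0.0 if top_a carries no variation at all. */
double shared_variation_fraction(const ranked_path *top_a, size_t na,
                                 const ranked_path *top_b, size_t nb)
{
    uint64_t total = 0, shared = 0;
    for (size_t i = 0; i < na; i++) {
        total += top_a[i].net_variation;
        for (size_t j = 0; j < nb; j++) {
            if (top_a[i].path_id == top_b[j].path_id) {
                shared += top_a[i].net_variation;
                break; /* count each path of top_a at most once */
            }
        }
    }
    return total == 0 ? 0.0 : (double)shared / (double)total;
}
```

Passing the top-10 list from a light-load run as `top_a` and the list from a heavy-load run as `top_b` gives a value near 1.0 when the same paths dominate in both runs.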
![Page 16: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/16.jpg)
Stability: System Load
[Figure: bar chart. y-axis: % execution time (0 to 100); x-axis: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg. Series: low load top 10; low load w/ top in both; high load top 10; high load w/ top in both.]
![Page 17: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/17.jpg)
Stability: Input
[Figure: bar chart. y-axis: % execution time (0 to 70); x-axis: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg. Series: top 10 self trained; top 10 cross trained.]
![Page 19: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/19.jpg)
VPP: Optimize Top Paths
• Simple optimization strategy for top paths, to show the optimization potential
  – Prefetch the loads in a path one or two loop iterations ahead
  – Check loop bounds so that prefetches stay within the bounds of the data accesses
• After optimization, paths lost 41% of their net variation on average
• More elaborate optimizations can reduce variation further
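
The strategy above, prefetching a couple of iterations ahead with a bounds check, can be sketched on a generic loop. This is an illustrative example, not code from the talk. The `node` type, `sum_costs`, and the `PREFETCH` macro are hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin, so the macro compiles to nothing on other compilers.

```c
#include <stddef.h>

/* __builtin_prefetch is a GCC/Clang builtin; elsewhere the hint is dropped. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(addr) __builtin_prefetch((addr))
#else
#define PREFETCH(addr) ((void)0)
#endif

typedef struct { long cost; } node; /* hypothetical element type */

enum { PREFETCH_DISTANCE = 2 };     /* "one or two iterations ahead" */

/* Sums the cost of every node reachable through an array of pointers. */
long sum_costs(node *const *items, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Bounds check first: items[i + 2] must exist before it is read. */
        if (i + PREFETCH_DISTANCE < n)
            PREFETCH(&items[i + PREFETCH_DISTANCE]->cost);
        total += items[i]->cost;
    }
    return total;
}
```

The prefetch is only a hint and does not change the result. The bounds check matters because forming the prefetch address requires reading `items[i + 2]`, which would be out of range near the end of the array.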
![Page 20: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/20.jpg)
Optimization Example: VPR
```c
while (ito < heap_tail) {
    if (heap[ito+1]->cost < heap[ito]->cost)
        ito++;
    if (heap[ito]->cost > heap[ifrom]->cost)
        break;
    /* Lines added by the optimization: bounds-checked prefetch */
    if (ito*8 < heap_tail)
        _mm_prefetch((char*)&heap[ito*8]->cost, 1);
    temp_ptr = heap[ito];
    heap[ito] = heap[ifrom];
    heap[ifrom] = temp_ptr;
    ifrom = ito;
    ito = 2*ifrom;
}
```
• This optimization results in a 9% speedup!
![Page 21: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/21.jpg)
VPP: Spec 2K Speedup
[Figure: bar chart. y-axis: % speedup (0 to 25); x-axis: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg.]
![Page 23: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/23.jpg)
Comparing to other Profiling Techniques
• Path profiling techniques often base hotness on frequency
  – The most executed paths are considered hot
  – Once these are optimized
    • They are still hot based on frequency
    • Their variation is lower, so their ranking goes down with VPP
• VPP dynamically ranks paths
  – Once a path is optimized, its ranking can change
![Page 24: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/24.jpg)
Comparing to other Profiling Techniques
[Figure: bar chart. y-axis: difference in top 10 variational paths (0 to 10); x-axis: art, equake, ammp, gzip, vpr, gcc, mcf, crafty, parser, gap, vortex, bzip2, twolf, foxpro, pc game, multimedia, avg. Series: Hot Path Ranking (frequency of path); Time Ranking (net execution time of path).]
![Page 26: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/26.jpg)
Observation: Variation Structure
• Is there a pattern in the variation?
  – If we plot the variation over time, we can see interesting structure
• Future work:
  – Does the context leading up to a path correlate with the path's performance?
  – Can specific hardware structures be identified as the cause of variation?
  – Can specific optimizations be recommended based on the variation structure?
![Page 27: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/27.jpg)
Structure within Variation
[Figure: histogram. x-axis: time variations relative to the fastest path execution; y-axis: number of occurrences (0 to 250).]
• Bzip2 Top 5 Varying Paths
![Page 28: Variational Path Profiling](https://reader035.vdocuments.us/reader035/viewer/2022062314/56812ad0550346895d8eaec8/html5/thumbnails/28.jpg)
Conclusion
• VPP finds the top varying paths with good optimization potential
  – A few top paths account for the majority of variation
  – Top variational paths are stable
• Applying simple optimizations yields an 8.5% average speedup for Spec 2K on a P4
• VPP finds hot paths that are not found with other techniques
  – Once a path is optimized, its variation (its hotness in VPP) is reduced