a probabilistic pointer analysis for speculative optimization jeff dasilva greg steffan jeff dasilva...
Post on 21-Dec-2015
222 views
TRANSCRIPT
A Probabilistic Pointer Analysis for Speculative
Optimization
A Probabilistic Pointer Analysis for Speculative
Optimization
Jeff DaSilva Greg Steffan
Jeff DaSilva Greg Steffan
Electrical and Computer EngineeringUniversity of TorontoToronto, ON, CanadaOct 17th, 2005
Electrical and Computer EngineeringUniversity of TorontoToronto, ON, CanadaOct 17th, 2005
2 University Of Toronto
Pointers Impede OptimizationPointers Impede Optimization
Many optimizations come to a halt when they encounter an ambiguous pointer
foo(int *a) { … while(…) {
x = *a; … }}
Loop InvariantCode Motion
Parallelize
Pointer Analysis is Important
3 University Of Toronto
Pointer AnalysisPointer Analysis
Do pointers a and b point to the same location? Do this for every pair of pointers at every program point
*a = ~ ~ = *b
*a = ~ ~ = *b
Definitely Not
Definitely
Maybe
PointerAnalysis
optimize
4 University Of Toronto
Pointer analysisPointer analysis is a difficult problem scalablescalable and overly conservativeoverly conservative
fails-to-scalefails-to-scale and accurateaccurate
Ambiguous pointers will persist even when using the most accurate accurate of algorithms output is often unavoidable
What can be done with ?
Pointer Analysis is DifficultPointer Analysis is Difficult
Maybe
Maybe
oror
5 University Of Toronto
Lets SpeculateLets Speculate
Compilers make conservativeconservative assumptions They must alwaysalways preserve program correctness
“It's easier to apologize than ask for permission.” Author: Anonymous
Implement a potentially unsafeunsafe optimizationVerifyVerify and RecoverRecover if necessary
6 University Of Toronto
Speculation applied to PointersSpeculation applied to Pointers
int *a, x;…while(…){
x = *a; …}
a is probably loop invariant
int *a, x, tmp;…tmp = *a;while(…){
x = tmp; …}
<verify, recover?><verify, recover?>
7 University Of Toronto
Data Speculative OptimizationsData Speculative Optimizations The EPIC Instruction set
Explicit support for speculative load/store instructions (eg. Itanium)
Speculative compiler transformations Dead store elimination, redundancy elimination, copy propagation,
strength reduction, register promotion
Thread-level speculation (TLS) Hardware support for tracking speculative parallel threads
Transactional programming Rollback support for aborted transactions
When to speculate? Techniques rely on profiling
8 University Of Toronto
Quantitative Output RequiredQuantitative Output Required
Estimate the potential benefit for speculating:
Speculate?
ExpectedExpectedspeedupspeedup(if successful)(if successful)
RecoveryRecoverypenaltypenalty
(if unsuccessful)(if unsuccessful)
OverheadOverheadfor verifyfor verify
Probabilistic output neededMaybe
Maybe
Maybe
ProbabilityProbabilityof successof success
9 University Of Toronto
Conventional Pointer AnalysisConventional Pointer Analysis
Do pointers a and b point to the same location? Do this for every pair of pointers at every program point
*a = ~ ~ = *b
*a = ~ ~ = *b
Definitely Not
Definitely
Maybe
PointerAnalysis
optimize
Probabilistic Pointer Analysis Probabilistic Pointer Analysis (PPA)(PPA)
PPApp = 0.0
pp = 1.0
0.0 < pp < 1.0
With what probability pp, do pointers a and b point to the same location? Do this for every pair of pointers at every program point
optimize
10 University Of Toronto
PPA Research ObjectivesPPA Research Objectives Accurate points-to probability information
at every static pointer dereference
Scalable analysis Goal: The entire SPEC integer benchmark suite
Understand scalabilityscalability/accuracyaccuracy tradeoff through flexible static memory model
Improve our understanding of programs
11 University Of Toronto
Algorithm Design ChoicesAlgorithm Design Choices FixedFixed
Bottom Up / Top Down Approach LinearLinear transfer functions (for scalability)
One-level contextcontext and flowflow sensitive FlexibleFlexible
Edge profiling (or static prediction)
Safe (or unsafe)
Field sensitive (or field insensitive)
12 University Of Toronto
Traditional Points-To GraphTraditional Points-To Graphint x, y, z, *b = &x;void foo(int *a) {
if(…) b = &y;
if(…) a = &z; else(…) a = b; while(…) { x = *a; … }}
y UND
a
z
b
x
= pointer
= pointed at
Definitely
Maybe
=
=
Results are inconclusive
13 University Of Toronto
Probabilistic Points-To GraphProbabilistic Points-To Graphint x, y, z, *b = &x;void foo(int *a) {
if(…) b = &y;
if(…) a = &z; else(…) a = b; while(…) { x = *a; … }}
y UND
a
z
b
x
0.10.1 taken taken((edge profileedge profile))
0.20.2 taken taken((edge profileedge profile))
= pointer
= pointed at
p = 1.0
0.0<p< 1.0
=
=p
0.10.9
0.72
0.08
0.2
Results provide more information
14 University Of Toronto
LOLLOLLLIPIPOOPP Our PPA AlgorithmOur PPA Algorithm
LLinear inear
OOne -ne -
LLevel evel
IInterprocedural nterprocedural
PProbabilistic robabilistic
PPointer Analysisointer Analysis
15 University Of Toronto
Points-To MatrixPoints-To Matrix
Location Sets
AreaOf
Interest
I
1
2
…
N-1
N
…
M-1 M
All matrix rows sum to 1.0
Location Sets
Pointer Sets
16 University Of Toronto
Points-To Matrix ExamplePoints-To Matrix Example
y UNDzx
y
UND
z
x
a
b
0.72 0.08 0.20
0.90 0.10
1.0
1.0
1.0
1.0
Iy UND
a
z
b
x
0.10.9
0.72
0.08
0.2
17 University Of Toronto
Solving for a Points-To MatrixSolving for a Points-To Matrix
IAny InstructionAny Instruction
I
Points-ToMatrix In
Points-ToMatrix Out
18 University Of Toronto
The Fundamental PPA EquationThe Fundamental PPA Equation
Points-ToMatrix Out
Points-ToMatrix In
TransformationMatrix=
This can be applied to any instruction (incl. function calls)
19 University Of Toronto
Transformation MatrixTransformation Matrix
1
2
N-1 N
ø I
1 2 3 …
…
N-1
N
…
Area of Interest
Location Sets Pointer Sets
All matrix rows sum to 1.0
Location Sets
Pointer Sets
20 University Of Toronto
Transformation Matrix ExampleTransformation Matrix Example
y
UND
z
x
a
b
y UNDzxa b
1.0
1.0
1.0
1.0
1.0
1.0
S1:S1: a = &z;a = &z;
=TS1
21 University Of Toronto
Example - The PPA EquationExample - The PPA EquationS1:S1: a = &z;a = &z;PTout = TS1 PTin
y UNDzx
y
UND
z
x
a
b
y UND
a
z
b
x
0.10.9
0.72
0.08
0.2
0.72 0.08 0.20
0.90 0.10
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
=PTout
22 University Of Toronto
Example - The PPA EquationExample - The PPA Equation
y UNDzx
y
UND
z
x
a
b
1.0
0.90 0.10
1.0
1.0
1.0
1.0
I y UNDz
b
x
0.10.9
a
S1:S1: a = &z;a = &z;PTout = TS1 PTin
=PTout
23 University Of Toronto
TS2
Combining Transformation MatricesCombining Transformation Matrices
IBasic Block
S1: S1: InstrInstr
S2: S2: InstrInstr
S3: S3: InstrInstr
I
PTout TS1=
PTin
TS3PTin PTin PTin
PTout = TBB
TS1
24 University Of Toronto
Control flow - if/elseControl flow - if/else
YYXX
TX TY= +
p qp q
p + q = 1.0
25 University Of Toronto
Control flow - loopsControl flow - loops
TX=XXNN
U
LiTY=
1U-L+1
i<L,U>
YY
Both operations can be implemented efficiently<L,U> <min,max>
26 University Of Toronto
Safe vs. Unsafe Safe vs. Unsafe Pointer Assignment InstructionsPointer Assignment Instructions
x = &y Address-of Assignment
x = y Copy Assignment
x = *y Load Assignment
*x = y Store Assignment
Safe?
27 University Of Toronto
LOLLOLLLIPIPOOPP Implementation Implementation
.spd
EdgeProfile
Results
SUIF Infrastructure
Static Memory
Model
MATLABC Library
TF-Matrix Collector
Points-ToMatrix
Propagator Stats
ICFGICFG SMMSMM BUBU TDTD .spx
Implementation details:•Sparse matrices•Efficient matrix algorithms•Result memoization
28 University Of Toronto
MeasuringMeasuring
LOLLOLLLIPIPOOPP’s ’s EfficiencyEfficiency and and AccuracyAccuracy
29 University Of Toronto
SPEC2000 Benchmark DataSPEC2000 Benchmark Data
Experimental Framework: 3GHz P4 with 2GB of RAM
Benchmark LOC Matrix Size N
PPA Analysis Time
[Unsafe]PPA Analysis Time
[Safe]Bzip2 4686 251 0.3 seconds
Mcf 2429 354 0.39 seconds
Gzip 8616 563 0.71 seconds
Crafty 21297 1917 5.49 seconds
Vpr 17750 1976 9.33 seconds
Twolf 20469 2611 16.59 seconds
Parser 11402 2732 30.72 seconds
0.3 seconds
0.61 seconds
0.77 seconds
5.51 seconds
10.34 seconds
20.64 seconds
50.04 seconds
Vortex 67225 11018 3min 59seconds
Gap 71766 25882 54min 56seconds
Perlbmk 85221 20922 44min 15seconds
Gcc 22225 42109 5hour 10 min
4min 56seconds
83min 38seconds
89min 43seconds
Still Running…
Scales to all of SPECint
30 University Of Toronto
Comparison with Das’s GOLFComparison with Das’s GOLF
GOLF LOLLIPOPProbabilistic No Yes
Context Sensitive One-level One-level
Flow Sensitive No Yes
Field Sensitive No Turned Off
Indirect Calls Solved Profiled
Library Calls Modeled All Modeled Some
Heap Model Callsite Alloc Callsite Alloc
Safe Yes Yes
Analysis Time on GCC < 10 seconds > 5 hours
31 University Of Toronto
Comparison with Das’s GOLFComparison with Das’s GOLFA
ve
rag
e D
ere
fere
nc
e S
ize
mor
e ac
cura
te
LOLLIPOP is very AccurateAccurate (even without probability information)
14.8
3.37.7
1.2
11.9
21.9
59.3
3.3 1.8 2.6 1.1
80.1
6.7
18.5
6.1
0
10
20
30
40
50
60
70
80
90
100
go
m88
ksim
*gcc
com
press li
ijpeg per
l
vorte
x
GOLF
LOLLI POP
185.6
32 University Of Toronto
Easy SPEC2000 BenchmarksEasy SPEC2000 Benchmarks
1.41.2
1.51.8
6.1
1.01.3
0
1
2
3
4
5
6
7
gzip vpr mcf crafty vortex bzip2 twolf
unsafe
safe
p > 0.001
Av
era
ge
De
refe
ren
ce
Siz
e
mor
e ac
cura
te
A one-level Analysis is often adequate (i.e. safe=unsafe)
33 University Of Toronto
Challenging SPEC 95/2000 BenchmarksChallenging SPEC 95/2000 Benchmarks
42.5
89.0
143.8
80.1
6.7
18.5
0
20
40
60
80
100
120
140
parser perlbmk gap li ijpeg perl
unsafe
safe
p > 0.001
Av
era
ge
De
refe
ren
ce
Siz
e
mor
e ac
cura
te
Many improbable points-to relations can be pruned away
34 University Of Toronto
Metric: Average Certainty Metric: Average Certainty
while(…){
x = *a; …}
{ (0.72, ), (0.08, ), (0.2, ) }y zx
Σ(max probability value)
(num of loads & stores)Avg Certainty =
Max probability value = 0.72
35 University Of Toronto
SPEC2000 Average CertaintySPEC2000 Average Certainty
0.9050.949 0.970
0.9200.946 0.969
0.870
0.783
0.908
1.000
0.901
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
gzip vp
r*g
cc mcf
craf
ty
pars
er
perlb
mk
gap
vorte
xbz
ip2
twolf
Pro
bab
ilis
tic
Cer
tain
ty
mor
e ac
cura
te
On average, LOLLIPOP can predict a single likely points-to relation
36 University Of Toronto
Conclusions and Future WorkConclusions and Future Work
LOLLOLLLIPIPOOPPA novel PPA algorithmScales to SPECint 95/2000As accurate as the most precise algorithms
Future Ongoing WorkFuture Ongoing WorkMeasure the probabilistic accuracyOptimize LOLLIPOP’s implementationApply PPA
Provides the key puzzle piece for a speculation compiler
37 University Of Toronto
ReferencesReferences Manuvir Das, Ben Liblit, Manuel Fahndrich, and Jakob Rehof. Estimating
the Impact of Scalable Pointer Analysis on Optimization. SAS 2001, 260-278.
Peng-Sheng Chen, Ming-Yu Hung, Yuan-Shin Hwang, Roy Dz-Ching Ju, and Jenq Kuen Lee. Compiler support for speculative multithreading architecture with probabilistic points-to analysis. PPOPP 2003, 25-36.
Jin Lin, Tong Chen, Wei-Chung Hsu, Peng-Chung Yew, Roy Dz-Ching Ju, Tin-Fook Ngai and Sun Chan, A Compiler Framework for Speculative Analysis and Optimizations. PLDI 2003, 289-299.
R.D. Ju, J. Collard, and K. Oukbir. Probabilistic Memory Disambiguation and its Application to Data Speculation. SIGARCH Comput. Archit. News 27 1999, 27-30.
Manel Fernandez and Roger Espasa. Speculative Alias Analysis for Executable Code. PACT 2002, 221-231.
38 University Of Toronto
Backup SlidesBackup Slides
39 University Of Toronto
Metric: Average Dereference SizeMetric: Average Dereference Size
while(…){
x = *a; …}
{ (0.72, ), (0.08, ), (0.2, ) }y zx
Σ(set cardinality)
(num of loads & stores)Avg Deref Size =
Dereference set cardinality = 3
40 University Of Toronto
SPEC2000 Benchmark DataSPEC2000 Benchmark DataBenchmar
kLOC Matrix
Size NPPA Analysis
Time
[Unsafe]
PPA Analysis Time
[Safe]
Bzip2 4686 251 0.3 seconds 0.3 seconds
Mcf 2429 354 0.39 seconds 0.61 seconds
Gzip 8616 563 0.71 seconds 0.77 seconds
Crafty 21297 1917 5.49 seconds 5.51 seconds
Vpr 17750 1976 9.33 seconds 10.34 seconds
Twolf 20469 2611 16.59 seconds 20.64 seconds
Parser 11402 2732 30.72 seconds 50.04 seconds
Vortex 67225 11018 3min 59seconds 4min 56seconds
Gap 71766 25882 54min 56seconds 83min 38seconds
Perlbmk 85221 20922 44min 15seconds 89min 43seconds
Gcc 22225 42109 5hour 10 min Still Running
41 University Of Toronto
Measure the probabilistic accuracy Compare PPA with a runtime points-to profiler Evaluate analysis sensitivity to edge profiling Investigate the scalabilityscalability/accuracyaccuracy tradeoff
Optimize LOLLIPOP’s implementation Apply PPA
Thread level speculation (TLS) compiler Speculative compiler optimizations Helper threads Speculative ___________
Conslusions and Future Ongoing WorkConslusions and Future Ongoing Work
42 University Of Toronto
Transformation Matrix ExampleTransformation Matrix Example
y
UND
z
x
a
b
y UNDzxa b
1.0
1.0
1.0
1.0
TS2 =
S2: b = a;
1.0
1.0
43 University Of Toronto
Transformation Matrix ExampleTransformation Matrix Example
S1: a = &z;S2: b = a;
Tx y
UND
z
x
a
b
y UNDzxa b
1.0
1.0
1.0
1.0
1.0
1.0
= = TS2TS1
xx
44 University Of Toronto
< 9 >
0.50.5
45 University Of Toronto
Shadow Variable AliasingShadow Variable Aliasing
p
p_1
a
a
= pointer
= pointed at
= shadow ptr
+ =
Unsafe SMMUse FICI PA toaccount for SVA
b
UND
p
p_1
a
ap_1 a
p_1 and a are considered
aliased
Safe SMM
46 University Of Toronto
Pointer AnalysisPointer Analysis
Do pointers a and b point to the same location? Do this for every pair of pointers at every program point
*a = ~ ~ = *b
*a = ~ ~ = *b
Definitely Not
Definitely
Maybe
PointerAnalysis
optimize
47 University Of Toronto
*a = ~ ~ = *b
*a = ~ ~ = *b
Probabilistic Pointer Analysis Probabilistic Pointer Analysis (PPA)(PPA)
PPApp = 0.0
pp = 1.0
0.0 < pp < 1.0
With what probability pp, do pointers a and b point to the same location? Do this for every pair of pointers at program point
optimize