Automatically Proving the Correctness of Compiler Optimizations
Sorin Lerner, Todd Millstein, Craig Chambers
University of Washington
Goal: correct compilers
• The compiler is usually part of the trusted computing base.
• “But I use gcc, and it works great!”
gcc-bugs mailing list
Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results:
• c/9525: incorrect code generation on SSE2 intrinsics
• target/7336: [ARM] With -Os option, gcc incorrectly computes the elimination offset
• optimization/9325: wrong conversion of constants: (int)(float)(int)(INT_MAX)
• optimization/6537: For -O (but not -O2 or -O0) incorrect assembly is generated
• optimization/6891: G++ generates incorrect code when -Os is used
• optimization/8613: [3.2/3.3/3.4 regression] -O2 optimization generates wrong code
• target/9732: PPC32: Wrong code with -O2 -fPIC
• c/8224: Incorrect joining of signed and unsigned division
• …
And this is only for February 2003! On a mature compiler!
Testing
[Diagram: Source → compiler → Compiled Prog; run! on input, then DIFF output against expected output]
• No correctness guarantees:
– neither for the compiled prog
– nor for the compiler
• To get benefits, must:
– run over many inputs
– compile many test cases
Verify each compilation
[Diagram: Source → compiler → Compiled Prog, with a Semantic DIFF between Source and Compiled Prog]
• Translation validation [Pnueli et al 98, Necula 00]
• Credible compilation [Rinard 99]
• Compiler can still have bugs.
• Compile time increases.
• “Semantic Diff” is hard.
Proving the whole compiler correct
[Diagram: Source → compiler → Compiled Prog; the Correctness checker is applied to the compiler]
• Option 1: Prove compiler correct by hand.
– Proofs are long…
– And hard.
– Compilers are proven correct as written on paper. What about the implementation?
[Diagram labels: Proof, Link?, Correctness checker]
Our Approach
• Our approach: prove compiler correct automatically.
[Diagram: Automatic Theorem Prover applied to the compiler]
This seems really hard!
[Diagram: Automatic Theorem Prover; Task of proving compiler correct]
• Complexity that an automatic theorem prover can handle.
• Complexity of proving a compiler correct.
Making the problem easier
[Diagram: Automatic Theorem Prover; Task of proving optimizer correct]
• Only prove optimizer correct.
• Trust front-end and code-generator.
• Write optimizations in Cobalt, a domain-specific language.
• Separate correctness from profitability.
• Factor out the hard and common parts of the proof, and prove them once by hand.
Results
• Cobalt language
– realistic C-like IL
– implemented const prop and folding, branch folding, CSE, PRE, DAE, partial DAE, and simple forms of points-to analyses
• Correctness checker for Cobalt opts
– using the Simplify theorem prover
• Execution engine for Cobalt opts
– in the Whirlwind compiler
Caveats
• May not be able to express your opt in Cobalt:
– no interprocedural optimizations for now.
– optimizations that build complicated data structures may be difficult to express.
• A sound Cobalt optimization may be rejected by the correctness checker.
• Trusted computing base (TCB) includes:
– front-end and code-generator, execution engine, correctness checker, proofs done by hand once
Outline
• Overview
• Forward optimizations (see paper for backwards)
– Example: constant propagation
– Strategy for proving forward optimizations sound
• Profitability heuristics
• Pure analyses
Constant Prop (straight-line code)
y := 5      ← statement y := 5
...         ← statements that don’t define y
x := y      ← statement x := y; REPLACE with x := 5
Adding arbitrary control flow
[Diagram: control-flow graph with several y := 5 statements reaching x := y; REPLACE x := y with x := 5]
if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5
Constant prop in English
if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5
Constant prop in Cobalt
English version:
if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5
Cobalt version:
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
• stmt(Y := C) and ¬mayDef(Y) are boolean expressions evaluated at nodes in the CFG
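The rule above can be read as a forward "must" dataflow problem. What follows is a minimal Python sketch of that reading, not the Cobalt execution engine from the talk. The mini-IL (one assignment per CFG node, no pointers), the function name `const_prop`, and the dictionary-based CFG encoding are all assumptions made for illustration. In this pointer-free IL, mayDef(Y) is simply "the statement assigns to Y".

```python
from typing import Dict, List, Optional, Set, Tuple, Union

# Hypothetical mini-IL: every CFG node holds one assignment `lhs := rhs`,
# where rhs is an integer constant or the name of another variable.
Stmt = Tuple[str, Union[int, str]]
Fact = Tuple[str, int]  # (Y, C): "Y == C" holds on every path reaching this point


def const_prop(stmts: Dict[int, Stmt],
               succs: Dict[int, List[int]],
               entry: int) -> Dict[int, Stmt]:
    """Rewrite X := Y to X := C when stmt(Y := C) is followed by
    statements that do not define Y on every path reaching X := Y."""
    preds: Dict[int, List[int]] = {n: [] for n in stmts}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    # out[n] stays None until node n has been reached from the entry.
    out: Dict[int, Optional[Set[Fact]]] = {n: None for n in stmts}

    def facts_in(n: int) -> Optional[Set[Fact]]:
        if n == entry:
            return set()  # nothing is known on entry to the procedure
        seen = [out[p] for p in preds[n] if out[p] is not None]
        return set.intersection(*seen) if seen else None

    changed = True
    while changed:  # iterate to a fixed point; fact sets only shrink
        changed = False
        for n, (lhs, rhs) in stmts.items():
            incoming = facts_in(n)
            if incoming is None:
                continue
            # A statement that defines lhs ends every region about lhs.
            new_out = {(y, c) for (y, c) in incoming if y != lhs}
            if isinstance(rhs, int):
                new_out.add((lhs, rhs))  # stmt(Y := C) starts a region
            if new_out != out[n]:
                out[n] = new_out
                changed = True

    result = dict(stmts)
    for n, (lhs, rhs) in stmts.items():
        incoming = facts_in(n)
        if incoming is None or not isinstance(rhs, str):
            continue
        consts = [c for (y, c) in incoming if y == rhs]
        if consts:  # at most one constant per variable can survive
            result[n] = (lhs, consts[0])
    return result
```

On a diamond-shaped CFG where y := 5 reaches x := y along both branches, the sketch rewrites x := y; if one branch redefines y, it leaves the statement alone.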
Outline
• Overview
• Forward optimizations (see paper for backwards)
– Example: constant propagation
– Strategy for proving forward optimizations sound
• Profitability heuristics
• Pure analyses
Proving correctness automatically
[Diagram: control-flow graph with several y := 5 statements reaching x := y, which is replaced by x := 5]
• Witnessing region
• Invariant: y == 5
Constant prop revisited
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C
Ask a theorem prover to show:
1. A statement satisfying stmt(Y := C) establishes Y == C
2. A statement satisfying ¬mayDef(Y) maintains Y == C
3. The statements X := Y and X := C have the same semantics in a program state satisfying Y == C
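To make the three obligations concrete, here is a small Python sketch that checks them by exhaustive enumeration over a tiny state space. This stands in for the theorem prover only for illustration: the talk uses Simplify, which proves the obligations for all states, whereas brute force over three variables and five values proves nothing in general. The names `assign`, `check_const_prop`, and `may_def` are hypothetical.

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = range(-2, 3)  # small finite domain standing in for the integers


def all_states():
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))


def assign(lhs, rhs):
    """Build the statement `lhs := rhs`; rhs is a constant or a variable name."""
    def run(state):
        new = dict(state)
        new[lhs] = state[rhs] if isinstance(rhs, str) else rhs
        return new
    run.defines = lhs  # remembered so that may_def can inspect the statement
    return run


def all_statements():
    for lhs in VARS:
        for rhs in list(VARS) + list(VALUES):
            yield assign(lhs, rhs)


def check_const_prop(X, Y, C, may_def=lambda st, var: st.defines == var):
    witness = lambda s: s[Y] == C
    # 1. A statement satisfying stmt(Y := C) establishes Y == C.
    ob1 = all(witness(assign(Y, C)(s)) for s in all_states())
    # 2. A statement satisfying not mayDef(Y) maintains Y == C.
    ob2 = all(witness(st(s))
              for st in all_statements() if not may_def(st, Y)
              for s in all_states() if witness(s))
    # 3. X := Y and X := C agree in every state satisfying Y == C.
    ob3 = all(assign(X, Y)(s) == assign(X, C)(s)
              for s in all_states() if witness(s))
    return ob1, ob2, ob3
```

Passing a deliberately wrong `may_def` (one that claims no statement ever defines Y) makes obligation 2 fail, which is the kind of mistake the correctness checker is meant to catch.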
Generalize to any forward optimization
ψ1
followed by ψ2
until s ⇒ s’
with witness P
Ask a theorem prover to show:
1. A statement satisfying ψ1 establishes P
2. A statement satisfying ψ2 maintains P
3. The statements s and s’ have the same semantics in a program state satisfying P
We showed by hand once that these conditions imply correctness.
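One way to write the three conditions symbolically is sketched below. The notation is an assumption and not taken from the slides: σ ranges over program states, t over statements, and [[t]](σ) is the state after executing t in σ. It also simplifies by treating the two conditions as predicates on statements alone. The block needs `amsmath`.

```latex
\begin{align*}
\text{(1)}\quad & \forall \sigma, t.\ \psi_1(t) \implies P\bigl([\![ t ]\!](\sigma)\bigr) \\
\text{(2)}\quad & \forall \sigma, t.\ \psi_2(t) \land P(\sigma) \implies P\bigl([\![ t ]\!](\sigma)\bigr) \\
\text{(3)}\quad & \forall \sigma.\ P(\sigma) \implies [\![ s ]\!](\sigma) = [\![ s' ]\!](\sigma)
\end{align*}
```

For constant propagation, ψ1 is stmt(Y := C), ψ2 is ¬mayDef(Y), P is Y == C, s is X := Y, and s’ is X := C.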
Outline
• Overview
• Forward optimizations (see paper for backwards)
• Profitability heuristics
• Pure analyses
Profitability heuristics
• Optimization correct ⇒ safe to perform any subset of the matching transformations.
• So far, all transformations were also profitable.
• In some cases, many transformations are legal, but only a few are profitable.
The two pieces of an optimization
ψ1
followed by ψ2
until s ⇒ s’
with witness P
filtered through choose
• Transformation pattern:
– defines which transformations are legal.
• Profitability heuristic:
– describes which of the legal transformations to actually perform.
– does not affect soundness.
– can be written in a language of the user’s choice.
• This way of factoring an optimization is crucial to our ability to prove optimizations sound automatically.
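The split between pattern and heuristic can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the Whirlwind execution engine: programs are straight-line lists of `(lhs, rhs)` assignments, and `const_prop_pattern`, `apply_optimization`, and the `choose` signature are hypothetical names. The engine applies only those selections that the pattern marked legal, so an arbitrary `choose` cannot break soundness.

```python
def const_prop_pattern(program):
    """Transformation pattern: map each legal rewrite site to its replacement."""
    known = {}  # variable -> constant established by an earlier Y := C
    legal = {}
    for i, (lhs, rhs) in enumerate(program):
        if isinstance(rhs, str) and rhs in known:
            legal[i] = (lhs, known[rhs])
        known.pop(lhs, None)  # this statement defines lhs, ending its region
        if isinstance(rhs, int):
            known[lhs] = rhs
    return legal


def apply_optimization(program, pattern, choose):
    """Perform the subset of legal transformations picked by `choose`."""
    legal = pattern(program)
    selected = choose(legal, program)  # untrusted profitability heuristic
    result = list(program)
    for i in selected:
        if i in legal:  # anything the pattern did not justify is ignored
            result[i] = legal[i]
    return result
```

A heuristic that selects every legal site, one that selects only the first, and one that returns nonsense indices all yield programs equivalent to the input; only the amount of rewriting differs.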
Profitability heuristic example: PRE
• PRE as code duplication followed by CSE
Profitability heuristic example: PRE
• PRE as code duplication followed by CSE
• Code duplication
a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
  x := a + b;    ← duplicated
}
x := a + b;
Profitability heuristic example: PRE
• PRE as code duplication followed by CSE
• Code duplication
• CSE
• self-assignment removal
a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
  x := a + b;
}
x := x;    ← was x := a + b; removed as a self-assignment
Profitability heuristic example: PRE
a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
}
x := a + b;
[Diagram labels: Legal placements of x := a + b; Profitable placement]
Outline
• Overview
• Forward optimizations (see paper for backwards)
• Profitability heuristics
• Pure analyses
Constant prop revisited (again)
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C
mayDef in Cobalt
stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C
• Very conservative!
• Can we do better?
• mayPntTo is a pure analysis.
• It computes dataflow info, but performs no transformations.
mayPntTo in Cobalt
stmt(decl X)
followed by ¬stmt(... := &X)
defines addrNotTaken(X)
with witness “no location in the store points to X”
mayPntTo(X,Y) ⇔ ¬addrNotTaken(Y)
[Diagram: region from decl X to statement s, labeled addrNotTaken(X)]
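A pure analysis only attaches labels. The Python sketch below computes addrNotTaken labels for straight-line code and derives mayPntTo from them. It is a simplified model under assumptions: the statement encoding (`"decl"`, `"addr_of"`, `"other"`) and function names are hypothetical, there is no control flow, and the label is taken to hold on entry to a statement, which is one possible reading of the rule.

```python
def addr_not_taken(program):
    """For each statement, the set of variables X such that decl X has
    executed and no `... := &X` has executed before this statement."""
    labels = []
    current = set()
    for stmt in program:
        labels.append(frozenset(current))  # label holds on entry to stmt
        if stmt[0] == "decl":              # stmt(decl X) starts the region
            current.add(stmt[1])
        elif stmt[0] == "addr_of":         # stmt(... := &X) ends the region
            current.discard(stmt[2])
    return labels


def may_pnt_to(labels, point, x, y):
    """mayPntTo(X, Y) is the negation of addrNotTaken(Y), regardless of X."""
    return y not in labels[point]
```

No statement is rewritten here; the labels would be consumed by other rules, for example to give a less conservative mayDef.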
Future work
• Improving expressiveness
– interprocedural optimizations
– one-to-many and many-to-many transformations
• Inferring the witness
• Generate specialized compiler binary from the Cobalt sources.
Summary and Conclusion
• Optimizations written in a domain-specific language can be proven correct automatically.
• Our correctness checker found several subtle bugs in Cobalt optimizations.
• A good step towards proving compilers correct automatically.