verifying the llvm · 2017-11-05 · • open-source compiler infrastructure – see llvm.org for...
TRANSCRIPT
Verifying the LLVM
Steve Zdancewic, DeepSpec Summer School 2017
Thanks To
• Dmitri Garbuzov
• Nicolas Koh
• Olek Gierczak
And… collaborators on Vellvm
• Jianzhou Zhao
  – developed the "legacy" Vellvm Coq framework
• Santosh Nagarakatte
• Milo Martin
• William Mansky
• Christine Rizkallah
Low-Level Virtual Machine (LLVM)
• Open-source compiler infrastructure
  – see llvm.org for full documentation
• Created by Chris Lattner (advised by Vikram Adve) at UIUC
  – LLVM: An Infrastructure for Multi-stage Optimization, 2002
  – LLVM: A Compilation Framework for Lifelong Program Analysis and Transformation, 2004
• 2005: Adopted by Apple for XCode 3.1
• Front ends:
  – llvm-gcc (drop-in replacement for gcc)
  – Clang: C, Objective-C, C++ compiler supported by Apple
  – various languages: Ada, Scala, Haskell, …
• Back ends:
  – x86 / ARM / Power / etc.
• Used in many academic/research projects
  – Here at Penn: SoftBound, Vellvm
Zdancewic, CIS 341: Compilers
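LLVM Compiler Infrastructure [Lattner et al.]
[Figure: the LLVM infrastructure. Front Ends feed LLVM, which consists of a Typed SSA IR together with Optimizations/Transformations and Analysis, and which feeds Code Gen/JIT.]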
Motivation: SoftBound/CETS
• Buffer overflow vulnerabilities.
• Detect spatial/temporal memory safety violations in legacy C code.
• Implemented as an LLVM pass.
• What about correctness?
[Nagarakatte, et al. PLDI ’09, ISMM ’10]
http://www.cis.upenn.edu/acg/softbound/
Context: Penn's POPLMark challenge: using Coq was becoming cool. Xavier Leroy's CompCert: provided inspiration!
The Vellvm Project [Zhao et al. POPL 2012, CPP 2012, PLDI 2013]
[Figure: the Optimizations/Transformations, Typed SSA IR, and Analysis components of the LLVM infrastructure diagram.]
• Formal semantics
• Facilities for creating simulation proofs
• Implemented in Coq
• Extract passes for use with LLVM compiler
• Example: verified memory safety instrumentation
Vellvm Framework
[Figure: the Vellvm framework, shown in two builds. Compiler pipeline: C Source Code, LLVM IR, Transform, LLVM IR, Other Optimizations, Target. The LLVM OCaml Bindings (Parser, Printer) connect the LLVM IR to Coq. Coq components: Syntax, Operational Semantics, Memory Model, Type System and SSA, Proof Techniques & Metatheory. In the first build the Transform is obtained from Coq by Extract; in the second build it is labeled as an extracted Verified Transform.]
Plan
• Introduction to LLVM
  – static single-assignment
• Vminus: simplified SSA IR
  – Operational Semantics
  – SSA Properties
  – Static Properties
• Verified Compilation: Imp to Vminus
  – Parallels Xavier's Imp to stack-machine compiler
  – Case study for QuickChick
  – Monotonic state (freshness!)
• Scaling up: Vellvm
  – Taste of the full LLVM IR
  – Operational Semantics
  – Metatheory + Proof Techniques
• Case studies:
  – SoftBound memory safety
  – mem2reg
• Conclusion:
  – challenges & research directions
Example LLVM Code
• LLVM offers a textual representation of its IR
  – files ending in .ll

factorial64.c

    #include <stdio.h>
    #include <stdint.h>

    int64_t factorial(int64_t n) {
      int64_t acc = 1;
      while (n > 0) {
        acc = acc * n;
        n = n - 1;
      }
      return acc;
    }

factorial-pretty.ll

    define @factorial(%n) {
      %1 = alloca
      %acc = alloca
      store %n, %1
      store 1, %acc
      br label %start

    start:
      %3 = load %1
      %4 = icmp sgt %3, 0
      br %4, label %then, label %else

    then:
      %6 = load %acc
      %7 = load %1
      %8 = mul %6, %7
      store %8, %acc
      %9 = load %1
      %10 = sub %9, 1
      store %10, %1
      br label %start

    else:
      %12 = load %acc
      ret %12
    }
Real LLVM
• Decorates values with type information: i64, i64*, i1
• Permits numeric identifiers
• Has alignment annotations
• Keeps track of entry edges for each block: preds = %5, %0

factorial64-pretty.ll

    ; Function Attrs: nounwind ssp
    define i64 @factorial(i64 %n) #0 {
      %1 = alloca i64, align 8
      %acc = alloca i64, align 8
      store i64 %n, i64* %1, align 8
      store i64 1, i64* %acc, align 8
      br label %2

    ; <label>:2                 ; preds = %5, %0
      %3 = load i64* %1, align 8
      %4 = icmp sgt i64 %3, 0
      br i1 %4, label %5, label %11

    ; <label>:5                 ; preds = %2
      %6 = load i64* %acc, align 8
      %7 = load i64* %1, align 8
      %8 = mul nsw i64 %6, %7
      store i64 %8, i64* %acc, align 8
      %9 = load i64* %1, align 8
      %10 = sub nsw i64 %9, 1
      store i64 %10, i64* %1, align 8
      br label %2

    ; <label>:11                ; preds = %2
      %12 = load i64* %acc, align 8
      ret i64 %12
    }
Basic Blocks
• A sequence of instructions that is always executed starting at the first instruction and always exits at the last instruction.
  – Starts with a label that names the entry point of the basic block.
  – Ends with a control-flow instruction (e.g. branch or return), the “link”
  – Contains no other control-flow instructions
  – Contains no interior label used as a jump target
• Basic blocks can be arranged into a control-flow graph
  – There is a directed edge from node A to node B if the control-flow instruction at the end of basic block A might jump to the label of basic block B.
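Example Control-flow Graph
[Figure: control-flow graph of factorial with nodes entry, loop, body, and post. The node names correspond to the unlabeled first block and to the labels start, then, and else of the listing; the branch targets inside the nodes keep the listing's names. Edges: entry to loop, loop to body, loop to post, body to loop.]

    define @factorial(%n) {
    entry:
      %1 = alloca
      %acc = alloca
      store %n, %1
      store 1, %acc
      br label %start

    loop:
      %3 = load %1
      %4 = icmp sgt %3, 0
      br %4, label %then, label %else

    body:
      %6 = load %acc
      %7 = load %1
      %8 = mul %6, %7
      store %8, %acc
      %9 = load %1
      %10 = sub %9, 1
      store %10, %1
      br label %start

    post:
      %12 = load %acc
      ret %12
    }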
OPTIMIZED LLVM CODE
See factorial64.ll
Static Single Assignment (SSA)
• Compiler intermediate representation developed in the late 1980’s and early 1990’s:
  – Detecting Equality of Variables in Programs [Alpern, Wegman, Zadeck 1988]
  – Global Value Numbers and Redundant Computations [Rosen, Wegman, Zadeck 1988]
  – An Efficient Method of Computing Static Single Assignment Form [Cytron, Ferrante, + RWZ, 1989]
  – Efficiently Computing Static Single Assignment Form and the Control Dependence Graph [Cytron, et al., TOPLAS 1991]
• Makes optimizing imperative programming languages clean and efficient… by making the code more purely functional
  – Used in gcc, clang, intel, Jikes, HotSpot, Open64, …
INTUITION ABOUT SEMANTICS
See factorial.ml
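The file factorial.ml is not reproduced in this transcript. As a stand-in, here is a minimal Python sketch of the usual "SSA is functional programming" intuition, not the content of that file: each basic block becomes a function, the values that flow around the loop become parameters (the role φ nodes play), and jumps become calls in tail position. The function names mirror the CFG node names entry, loop, body, and post and are otherwise an assumption.

```python
def factorial(n):
    # entry block: jump to the loop header with the initial values.
    return loop(n, 1)

def loop(n, acc):
    # The parameters stand in for phi nodes: their values depend on
    # which block (entry or body) made the call.
    if n > 0:
        return body(n, acc)
    return post(acc)

def body(n, acc):
    # Each name is bound exactly once, as in SSA.
    acc1 = acc * n
    n1 = n - 1
    return loop(n1, acc1)

def post(acc):
    return acc
```

Calling `factorial(5)` walks the same path through the blocks as the loop in factorial64.c, with no mutable variables involved.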
SSA IRs in Practice
• SSA yields an efficient representation
  – Simplifies def-use information needed in dataflow analysis
  – Imperative data structure to map a definition to its uses
• SSA enables good register allocation:
  – Good register allocation is (arguably) the most important optimization for performance on modern processors
  – The left-hand sides of SSA "assignments" can be thought of as “registers”
  – Register promotion: move stack-allocated data into registers
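To make the def-use point concrete, the sketch below builds a map from each definition to the instructions that use it. It is an illustration under stated assumptions: instructions are plain strings in the simplified `%uid = op args` style of the earlier listings, and `def_use` and the sample `code` list are hypothetical, not taken from LLVM's actual data structures.

```python
import re
from collections import defaultdict

# A uid is '%' followed by letters, digits, '_' or '.'.
UID = re.compile(r"%[A-Za-z0-9_.]+")

def def_use(instructions):
    """Return (defs, uses): uid -> defining index, uid -> list of using indices."""
    defs = {}
    uses = defaultdict(list)
    for i, text in enumerate(instructions):
        lhs, sep, rhs = text.partition(" = ")
        if sep:
            if lhs in defs:
                raise ValueError(f"{lhs} is defined twice; not in SSA form")
            defs[lhs] = i
        else:
            rhs = text  # no left-hand side, e.g. 'ret %5'
        for uid in UID.findall(rhs):
            uses[uid].append(i)
    return defs, dict(uses)

code = [
    "%4 = mul %1, %2",
    "%5 = add %3, %4",
    "%6 = icmp sge %5, 100",
    "ret %5",
]

defs, uses = def_use(code)
# Definitions with no uses are candidates for dead-code elimination.
dead = [uid for uid in defs if uid not in uses]
```

Because every uid has exactly one definition, a single dictionary lookup answers "where is this value defined?", which is what makes the representation cheap for dataflow analyses.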
LLVM IR ⇒ Vminus
• Vastly Simplify! (For now…)
• Throw out:
  – types, complex & structured data
  – local storage allocation, complex pointers
  – functions
  – undefined values & nondeterminism
• What’s left?
  – basic arithmetic
  – control flow
  – global, preallocated state (as in Imp)
Vminus by Example
Control-flow Graphs:
+ Labeled blocks
+ Binary Operations
+ Branches/Return
+ Static Single Assignment (each local identifier assigned only once, statically); a local identifier is a.k.a. uid or SSA variable
+ φ nodes (choose values based on predecessor blocks)

    entry:
      r0 = ...
      r1 = ...
      r2 = ...
      br r0 loop exit

    loop:
      r3 = φ[0;entry][r5;loop]
      r4 = r1 * r2
      r5 = r3 + r4
      r6 = r5 ≥ 100
      br r6 loop exit

    exit:
      r7 = φ[0;entry][r5;loop]
      r8 = r1 * r2
      r9 = r7 + r8
      ret r9
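Two static properties of this example can be checked mechanically: every uid is defined exactly once, and every φ node lists exactly the predecessors of its block. The sketch below assumes a hypothetical dictionary encoding of the example (it is not the Vminus Coq syntax), and the helper names are illustrative.

```python
from collections import Counter

# Hypothetical encoding: per block, the uids it defines, its phi nodes
# (uid -> {predecessor label: incoming value}), and its successor labels.
blocks = {
    "entry": {"defs": ["r0", "r1", "r2"],
              "phis": {},
              "succs": ["loop", "exit"]},
    "loop": {"defs": ["r3", "r4", "r5", "r6"],
             "phis": {"r3": {"entry": 0, "loop": "r5"}},
             "succs": ["loop", "exit"]},
    "exit": {"defs": ["r7", "r8", "r9"],
             "phis": {"r7": {"entry": 0, "loop": "r5"}},
             "succs": []},
}

def single_assignment(blocks):
    """True if each uid is defined at most once across the whole CFG."""
    counts = Counter(uid for b in blocks.values() for uid in b["defs"])
    return all(n == 1 for n in counts.values())

def predecessors(blocks):
    """label -> set of labels of blocks that may branch to it."""
    preds = {label: set() for label in blocks}
    for label, b in blocks.items():
        for succ in b["succs"]:
            preds[succ].add(label)
    return preds

def phis_match_predecessors(blocks):
    """True if every phi node has one incoming value per predecessor block."""
    preds = predecessors(blocks)
    return all(set(incoming) == preds[label]
               for label, b in blocks.items()
               for incoming in b["phis"].values())
```

Note that the single-assignment check is purely static: r3, r4, r5, and r6 are each assigned many times at run time as the loop iterates, but each has only one defining occurrence in the program text.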
VMINUS SYNTAX
Vminus.v, CFG.v, ListCFG.v
Vminus Operational Semantics
• Only 5 kinds of instructions:
  – Binary arithmetic
  – Memory Load
  – Memory Store
  – Terminators
  – Phi nodes
• What is the state of a Vminus program?
Subtlety of Phi Nodes
• Phi nodes admit “cyclic” dependencies:

    pred:
      ...
      br loop

    loop:
      %x = φ[0;pred][y;loop]
      %y = φ[1;pred][x;loop]
      %b = %x ≤ %y
      br %b loop exit
Semantics of Phi Nodes
• The value of the RHS of a phi-defined uid is relative to the state at the entry to the block.
• Option 1:
  – Require all phi nodes to be at the beginning of the block
  – Execute them “atomically, in parallel”
  – (Original Vellvm followed this model)
• Option 2:
  – Keep track of the state upon entry to the block
  – Calculate the RHS of phi nodes relative to the entry state
  – (Vminus follows this model)
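The cyclic x/y example shows why the entry state matters. The sketch below is a hedged Python illustration, not the Vminus Coq semantics: `run_phis_entry_state` follows the idea of Option 2 by reading every right-hand side from a snapshot taken at block entry, while `run_phis_sequentially` is a deliberately naive variant in which later φ nodes see earlier updates. Both function names and the list-of-pairs encoding are assumptions.

```python
def lookup(value, state):
    # A string names a uid; anything else is a constant.
    return state[value] if isinstance(value, str) else value

def run_phis_entry_state(phis, pred, locals_):
    """Evaluate phi nodes relative to the state at entry to the block."""
    entry_state = dict(locals_)      # snapshot taken on entry
    new_locals = dict(locals_)
    for uid, incoming in phis:
        new_locals[uid] = lookup(incoming[pred], entry_state)
    return new_locals

def run_phis_sequentially(phis, pred, locals_):
    """Naive variant: each phi reads the already-updated state."""
    new_locals = dict(locals_)
    for uid, incoming in phis:
        new_locals[uid] = lookup(incoming[pred], new_locals)
    return new_locals

# %x = phi [0;pred][y;loop]    %y = phi [1;pred][x;loop]
loop_phis = [
    ("x", {"pred": 0, "loop": "y"}),
    ("y", {"pred": 1, "loop": "x"}),
]

first = run_phis_entry_state(loop_phis, "pred", {})      # arriving from pred
second = run_phis_entry_state(loop_phis, "loop", first)  # x and y swap
wrong = run_phis_sequentially(loop_phis, "loop", first)  # y reads the new x
```

With the entry-state reading, x and y swap on every trip around the loop. The sequential reading overwrites x before y is computed, so the two values collapse to the same number. Executing the φ nodes atomically and in parallel, as in Option 1, gives the same result as the snapshot approach for this block.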