Tools Related to Compiler Backends
Manish Vasani
Department of Computer Science, Columbia University
COMS W4115 – Programming Languages and Compilers
April 14, 2010
Outline
• Compiler Backend Frameworks
  – Purpose
  – Design Philosophy
  – Examples & Case study
• Pointer Analysis
  – Implementing using compiler frameworks
• Debuggers
  – High-level working:
    • Call stacks, breakpoints, locals/params, source view, etc.
  – Role of the compiler backend
Additional Slides
• Metrics of success for shipping compilers:
  – Code Quality, or performance of the target code
  – Build Throughput, or compile time
• Optimized Code Debugging
Let’s start with a simple program

    #include "stdio.h"

    int main(int argc, char* argv[]) {
        int x = argc;
        int *y = &x;
        while (argc != 10) {
            printf("%d", *y);
            ++argc;
        }
        return argc;
    }
Can you point out an optimization opportunity?
Loop hoist “*y”?
Let’s start with a simple program

    #include "stdio.h"

    int main(int argc, char* argv[]) {
        int x = argc;
        int *y = &x;
        int tmp = *y;   /* loop-invariant load of *y, hoisted out of the loop */
        while (argc != 10) {
            printf("%d", tmp);
            ++argc;
        }
        return argc;
    }
Loop hoist optimization
• Goal: move loop-invariant expressions outside the loop
• What are the basic high-level steps for such an optimization?
  – Identify loops in a function
  – Iterate over the instructions in a loop
  – Look at operands, symbols and types
  – Identify loop-invariant expressions
  – Modify the IR (intermediate representation)
Our Focus for today
• Only Step 1: Identify loops in the program (Control Flow Analysis)
• Input:
  – Intermediate code for the program
• Output:
  – Number of loops in the program
  – For every loop (nested to any depth):
    • Start source line of the loop
    • Function name
Identify loops in a program
• Steps:
  – Lex/parse the input
  – Transform it into the format understood by the backend
  – Build a control flow graph
    • Nodes: basic blocks
    • Edges: control transfers
  – Control Flow Analysis
    • Graph traversal: iterate through basic blocks
      – Say, in depth-first order
    • Edge traversal: iterate through successor/predecessor edges
      – Edge properties
        • Forward, Back, Cross
    • Instructions: iterate through instructions/operands
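The control flow analysis step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CFG is a plain dictionary from block name to successor list, the block names BB1 to BB4 are a hypothetical layout for the while loop in the sample program, and an edge counts as a back edge when its target is still on the depth-first search stack.

```python
def find_back_edges(cfg, entry):
    """Return the back edges (tail, head) found by a depth-first traversal."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in cfg}
    back_edges = []

    def visit(node):
        color[node] = GRAY
        for succ in cfg[node]:
            if color[succ] == GRAY:       # target is an ancestor: back edge
                back_edges.append((node, succ))
            elif color[succ] == WHITE:    # unvisited block: keep descending
                visit(succ)
        color[node] = BLACK

    visit(entry)
    return back_edges


# Hypothetical CFG for the while loop in main().
cfg = {
    "BB1": ["BB2"],          # entry: x = argc; y = &x
    "BB2": ["BB3", "BB4"],   # loop header: argc != 10 ?
    "BB3": ["BB2"],          # loop body: printf(...); ++argc
    "BB4": [],               # exit: return argc
}
loops = find_back_edges(cfg, "BB1")
for tail, head in loops:
    print(f"Found loop: header {head}, back edge from {tail}")
```

Each back edge identifies one loop, and the edge's target is the loop header whose first instruction would carry the start source line.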
Guess…
• How many lines of code would it take to implement it?
  – 1000+?
  – 100-1000?
  – Less than 100?
• Your surprise assignment for this semester: Implement it in your compiler backend and find out!
• Just kidding
Design
• How would you design it though?
  – Recommendation: use compiler frameworks
  – Your friends: you don’t need to implement most of the building blocks!
  – They provide infrastructure for implementing:
    • An entire compiler backend
    • Specific parts of a backend
      – Optimization phases
      – Code instrumentation phases
    • Code analysis tools
    • Binary Raise tools
Current Compiler Infrastructures
• Microsoft Phoenix Compiler Framework
  – Under development over the last decade
  – Phoenix framework based Code Analysis tools shipping in Visual Studio 2010, compiler under development
• LLVM: Low Level Virtual Machine compiler infrastructure
  – Open source
  – Under development over the last decade at UIUC
  – Widely used for compilers research at various universities
• SUIF, ROSE, etc.
Common Philosophy
• Libraries
  – Expose an object model for compiler constructs
  – Expose commonly used compiler algorithms
• Modular
• Extensible
• Configurable
Philosophy
• Phase/Pass based architecture
• Plug-in architecture:
  – Write your custom pass
  – Plug the phase into the existing pass chain
• Researchers should do research, not plumbing!
[Figure: pass chain with the phases Front End, IL Reader, TypeChecker, Inliner, RegisterAlloc and Emitter, plus a LoopOpts phase]
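The plug-in idea can be illustrated with a small Python sketch. Everything here is hypothetical: `Pass`, `PassManager`, `insert_after` and `CountInstructions` are made-up names rather than the Phoenix or LLVM API, and a "unit" is just a dictionary. The point is only that a custom pass is written once and spliced into an existing chain by name.

```python
class Pass:
    """Base class: a pass has a name and transforms or inspects a unit."""
    name = "pass"

    def run(self, unit):
        raise NotImplementedError


class NamedPass(Pass):
    """Stand-in for a built-in phase; it only records that it ran."""

    def __init__(self, name):
        self.name = name

    def run(self, unit):
        unit["log"].append(self.name)


class CountInstructions(Pass):
    """Hypothetical custom pass that counts the unit's instructions."""
    name = "CountInstructions"

    def run(self, unit):
        unit["log"].append(self.name)
        unit["instruction_count"] = len(unit["instructions"])


class PassManager:
    """Runs passes in order and lets a new pass be plugged into the chain."""

    def __init__(self, passes):
        self.passes = list(passes)

    def insert_after(self, existing_name, new_pass):
        for index, existing in enumerate(self.passes):
            if existing.name == existing_name:
                self.passes.insert(index + 1, new_pass)
                return
        raise KeyError(existing_name)

    def run(self, unit):
        for current in self.passes:
            current.run(unit)
        return unit


builtin = ["IL Reader", "TypeChecker", "Inliner", "RegisterAlloc", "Emitter"]
chain = PassManager(NamedPass(name) for name in builtin)
chain.insert_after("Inliner", CountInstructions())
unit = chain.run({"instructions": ["load", "add", "ret"], "log": []})
print(unit["log"], unit["instruction_count"])
```

The custom pass never touches the built-in phases; it depends only on the unit it is handed, which is what keeps the plumbing out of the researcher's way.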
Case Study: Phoenix
[Figure: Phoenix unit hierarchy. A ProgramUnit (whole program) or ModuleUnit (single compiland) is shown with FuncUnits and DataUnits. Labels in the diagram: Symbol Table, Instruction Stream, Flow Graph, Alias Info, Type Table, Region Graph, Exception Handling Info, Data Instrs.]
Phoenix Based Compiler And Tool Object Model
[Figure: Phoenix Core (AST, IR, Syms, Types, CFG, SSA) exposed through Phx APIs. Compilers side: HL Opts, LL Opts, Code Gen. Tools side: Xlator, Formatter, Browser, Profiler, Obfuscator, Visualizer, Security Checker, Refactor, Lint. Inputs shown: Delphi, Cobol, C#, VB, C++, Eiffel, Tiger, Lex/Yacc, C++ AST, Phx AST, C++ IR assembly, Native Image, PREfast, Profile.]
Identifying loops in a program
• Second round of guesses: how many lines of code would it take to implement it?
  – 1000+?
  – 100-1000?
  – Less than 100?
• Let’s find out!
Code

    void MyCustomPhase::Execute(Unit unit) {
        Phx.FunctionUnit functionUnit = unit.AsFunctionUnit;

        // Build the control flow graph and number its blocks depth first.
        functionUnit.BuildFlowGraph();
        Phx.Graphs.FlowGraph cfg = functionUnit.FlowGraph;
        cfg.BuildDepthFirstNumbers();

        foreach (Phx.Graphs.BasicBlock bb in cfg.BasicBlocks) {
            foreach (Phx.Graphs.FlowEdge edge in bb.SuccessorEdges) {
                // A back edge points at a loop header block.
                if (edge.IsBack) {
                    Phx.Graphs.BasicBlock headblock = edge.SuccessorNode;
                    Phx.IR.Instruction instr = headblock.FirstInstruction;
                    Console.WriteLine("Found loop: Function: {0}, File: {1}, Line: {2}",
                        Phx.Utility.Undecorate(functionUnit.NameString, false),
                        functionUnit.DebugInfo.GetFileName(instr.DebugTag),
                        functionUnit.DebugInfo.GetLineNumber(instr.DebugTag));
                }
            }
        }

        functionUnit.DeleteFlowGraph();
    }
[Figure: control flow graphs over the basic blocks BB1, BB2 and BB3]
Pointer Analysis with LLVM
Pointer Analysis
• Implementing custom pointer analysis phase using LLVM: Extensibility
• Pointer Analysis is a static code analysis technique that establishes which pointers, or heap references, can point to which variables or storage locations

    int x, *w, **z;
    z = &w;
    *z = &x;

[Figure: points-to graph z → w → x]
Pointer Analysis

    int main() {
        int x, y, *v, *w, **z;
        z = &w;
        *z = &x;
        z = &v;
        *z = &y;
    }

[Figure: points-to graph over z, w, v, x and y]
Does a single pass always work?
Pointer Analysis

    int main() {
        int x, y, *v, *w, **z;
        z = &w;
        *z = &x;
        z = &v;
        while (…) {
            *z = &y;
            z = &w;
        }
    }

[Figure: points-to graph over z, w, v, x and y]
Flow-Sensitive Analysis
1) Precise
2) Slow
3) Points-to set for every program point
Pointer Analysis

    int main() {
        int x, y, *v, *w, **z;
        z = &w;
        *z = &x;
        z = &v;
        while (…) {
            *z = &y;
            z = &w;
        }
    }

[Figure: points-to graph over z, w, v, x and y]
Flow-Insensitive Analysis
1) Fast
2) Imprecise
3) Conservative
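To make the flow-insensitive result concrete, here is a minimal Python sketch in the style of an inclusion-based analysis. It is an illustration, not the algorithm of any particular framework: statement order is ignored, the statements are applied repeatedly until no points-to set grows, and the tuple encoding (`"addr"` for `p = &a`, `"store_addr"` for `*p = &a`) is an assumption made for this example.

```python
from collections import defaultdict


def flow_insensitive_points_to(statements):
    """Compute one points-to set per variable, ignoring statement order."""
    pts = defaultdict(set)
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for kind, lhs, rhs in statements:
            if kind == "addr":              # lhs = &rhs
                targets = [lhs]
            elif kind == "store_addr":      # *lhs = &rhs
                targets = list(pts[lhs])    # every variable lhs may point to
            else:
                raise ValueError(kind)
            for target in targets:
                if rhs not in pts[target]:
                    pts[target].add(rhs)
                    changed = True
    return {var: sorted(s) for var, s in pts.items() if s}


# The loop example from the slide, with control flow discarded.
program = [
    ("addr", "z", "w"),        # z = &w;
    ("store_addr", "z", "x"),  # *z = &x;
    ("addr", "z", "v"),        # z = &v;
    ("store_addr", "z", "y"),  # *z = &y;   (inside the loop)
    ("addr", "z", "w"),        # z = &w;    (inside the loop)
]
result = flow_insensitive_points_to(program)
print(result)
```

Because z may point to either w or v at any time in this view, both w and v end up possibly pointing to both x and y. That is the conservative, imprecise answer the slide refers to.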
Pointer Analysis Research
• Hybrid Approach
  – Start with a conservative points-to set using a fast, imprecise algorithm (e.g. flow insensitive)
  – Implement a custom analysis phase that refines the points-to set
[Figure: points-to graph over z, w, v, x and y, with edges marked as coming from the flow-insensitive phase or from the custom phase]
LLVM (Low Level Virtual Machine)
• A compilation strategy designed to enable effective program optimization across the entire lifetime of a program. LLVM supports effective optimization at compile time, link time (particularly interprocedural), run time and offline (i.e., after software is installed).
• A virtual instruction set: LLVM is a low-level object code representation that uses simple RISC-like instructions, but provides rich, language-independent type information and dataflow (SSA) information about operands. This combination enables sophisticated transformations on object code, while remaining lightweight enough to be attached to the executable.
• A compiler infrastructure: LLVM is also a collection of source code that implements the language and compilation strategy.
Pointer analysis with LLVM
• LLVM: Provides a framework for writing custom pointer analysis phases
• A custom phase only needs to implement minimal functionality:
  – Register the phase
  – Plug in the phase
  – Initialize the phase
  – Override the primary points-to function
Pointer Analysis with LLVM
• In the box: standard pointer analysis algorithms (flow insensitive analysis)
• Chaining:
  – Ability to invoke multiple pointer analysis phases in sequence
  – Our custom phase only needs to worry about refining the points-to set, not creating or maintaining it
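Chaining can be pictured with a short Python analogy. This is not LLVM's C++ interface: the class names, the `alias` method and the string results are hypothetical stand-ins. The custom phase answers only the queries it can prove from its refined points-to sets and delegates everything else to the next analysis in the chain.

```python
MAY_ALIAS, NO_ALIAS = "MayAlias", "NoAlias"


class ConservativeAnalysis:
    """End of the chain: with no information, assume the pointers may alias."""

    def alias(self, p, q):
        return MAY_ALIAS


class PointsToRefinement:
    """Hypothetical custom phase that refines answers using points-to sets."""

    def __init__(self, points_to, next_analysis):
        self.points_to = points_to      # pointer name -> set of pointees
        self.next = next_analysis

    def alias(self, p, q):
        a, b = self.points_to.get(p), self.points_to.get(q)
        # Disjoint, known points-to sets prove the pointers cannot alias.
        if a is not None and b is not None and not (a & b):
            return NO_ALIAS
        # Otherwise defer to the next analysis in the chain.
        return self.next.alias(p, q)


# Assumed refined points-to sets, for illustration only.
refined = {"w": {"x"}, "v": {"y"}, "z": {"w", "v"}}
chain = PointsToRefinement(refined, ConservativeAnalysis())
print(chain.alias("w", "v"))   # disjoint sets, answered by the custom phase
print(chain.alias("z", "u"))   # u is unknown, so the query falls through
```

The custom class holds no fallback logic of its own, which mirrors the slide's point that it only refines results and leaves the rest to the analyses already in the box.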
Resources
• Phoenix: http://en.wikipedia.org/wiki/Phoenix_(compiler_framework)
• LLVM: http://llvm.org/
• ROSE: http://en.wikipedia.org/wiki/ROSE_compiler_framework
• SUIF: http://suif.stanford.edu/suif/suif2/
Debuggers
Our focus for today
• Basic working of source-level debuggers:
  – Generating call stacks
  – Breakpoints
  – AddWatch for variables
  – Primary debugger event loop
Overview
• Dynamic information (run time: OS provided)
  – Current Instruction Pointer (IP)
  – Debuggee process info
    • Process ID
    • Register context
    • Process memory
    • Loaded modules/libraries (exe, dll, etc.)
• Static information (generated at compile time)
  – Compiler-generated DebugInfo
DebugInfo
• Information generated by the compiler backend/linker for debugging support
• Database of tables:
  – Types
  – Symbols
  – Locations
  – Source line numbers
  – Source file info
  – Compilation environment, command line, etc.
• Stored in standard formats: e.g. DWARF is one of the standard debug file formats used by many C/C++ compilers (gcc -g)
Sample test code

    // main.cpp -> main.exe (Module 1)
    __declspec(dllimport) int dll_method1(int i);
    int main(int argc) {
        return dll_method1(argc);
    }

    // dll1.cpp -> dll1.dll (Module 2)
    __declspec(dllexport) int dll_method1(int i) {
        return dll_method2(i);
    }
    int dll_method2(int i) {
        __debugbreak();
        return i;
    }
[Figure: call chain main → dll_method1 → dll_method2]
Call Stack

    dll1.dll!dll_method2(int i=1) at line 7, dll1.cpp
    dll1.dll!dll_method1(int i=1) at line 4, dll1.cpp
    main.exe!main(int argc=1) at line 5, main.cpp
    main.exe!mainCRTStartup at xxx bytes

• Components of each stack frame
• Generating them from:
  – Debuggee runtime info
  – Compiler-generated debug info
Relative Virtual Address (RVA)
• Current IP or Virtual Address (VA) = 0x3600
• Module loaded at that VA = dll1.dll
• Base virtual address of the module containing the IP = 0x3000
• Current Relative Virtual Address (RVA) = VA - Base VA = 0x600
[Figure: virtual address space showing main.exe and dll1.dll, with address marks 0x1000, 0x3000 and 0x5000 and the IP at 0x3600]
Relative Virtual Address (RVA)
• Importance
  – Used for referring to address offsets within a module
  – Generated at compile time
  – RVAs act as primary keys for many DebugInfo database tables
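The VA to RVA step is simple arithmetic once the owning module is known. The Python sketch below assumes a list of loaded modules with a base address and size. The base of dll1.dll (0x3000) and the IP (0x3600) come from the slide, while the base of main.exe and both sizes are assumptions made for illustration.

```python
def va_to_module_rva(va, modules):
    """Map a virtual address to (module name, RVA), or None if unmapped."""
    for name, base_va, size in modules:
        if base_va <= va < base_va + size:
            return name, va - base_va       # RVA = VA - module base VA
    return None


# (name, base VA, size); main.exe's base and both sizes are assumed values.
modules = [
    ("main.exe", 0x1000, 0x2000),
    ("dll1.dll", 0x3000, 0x2000),
]

hit = va_to_module_rva(0x3600, modules)
print(hit[0], hex(hit[1]))
```

Because the RVA does not depend on where the loader placed the module, the same value can be used as a key into the compile-time DebugInfo tables on every run.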
Example: Source Line table

Source (lines 1-4 of dll1.cpp):

    // dll1.cpp -> dll1.dll (Module 2)
    __declspec(dllexport) int dll_method1(int i) {
        return dll_method2(i);
    }

Disassembly:

    00000010: push ebp
    00000011: mov  ebp,esp
    00000013: mov  eax,dword ptr [ebp+8]
    00000016: push eax
    00000017: call ?dll_method2@@YAHH@Z
    0000001C: add  esp,4
    0000001F: pop  ebp
    00000020: ret

Source line table:

    RVA      SrcFile   SrcLine   SrcColumn
    0x0010   1         2         0
    0x0011   1         2         0
    0x0013   1         3         0
    0x0016   1         3         0
    …        …         …         …
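A lookup against such a table can be sketched in Python with a binary search: the row with the largest RVA that does not exceed the queried RVA gives the source position. Only the four rows shown on the slide are used here. The function name and the tuple layout are assumptions, and real formats such as DWARF encode the table far more compactly.

```python
import bisect

# (RVA, SrcFile, SrcLine, SrcColumn): only the rows visible on the slide.
LINE_TABLE = [
    (0x0010, 1, 2, 0),
    (0x0011, 1, 2, 0),
    (0x0013, 1, 3, 0),
    (0x0016, 1, 3, 0),
]


def rva_to_source(rva, table):
    """Return (SrcFile, SrcLine) for the row covering rva, or None."""
    rvas = [row[0] for row in table]           # table is sorted by RVA
    index = bisect.bisect_right(rvas, rva) - 1  # last row with RVA <= rva
    if index < 0:
        return None                             # rva precedes the first row
    _, src_file, src_line, _ = table[index]
    return src_file, src_line


print(rva_to_source(0x0012, LINE_TABLE))   # inside the instruction at 0x0011
print(rva_to_source(0x0013, LINE_TABLE))   # exact match on a table row
```

The same lookup, run in the other direction (source line to RVAs), is what a breakpoint request needs.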
dll1.dll!dll_method2(int i=1) at line 7, dll1.cpp
• Debuggee runtime info:
  – Instruction Pointer (IP)
  – Module name
    • IP or virtual address (VA) -> Module
  – Module base virtual address (load address)
    • Module -> Base VA
  – Base Pointer (BP), Stack Pointer (SP)
  – Register context
  – Read process memory
  – Return address, to process the next stack frame
• Compiler-generated debug info
  – Function name
    • VA - Base VA -> Relative VA (RVA)
    • RVA, Module -> Function Symbol (from Symbol table)
  – Type table, Symbol table (per module/function)
    • Function Symbol -> Locals/Params Symbols & Types
  – Location (register/stack)
    • Local Symbol -> Register ID, or Base Register ID + Offset
  – Source line number
    • RVA -> Source Line (from Line number table)
  – Source file name
    • RVA -> Source File (from Source file table)
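The "return address to process next stack frame" step can be sketched as a frame-pointer walk. The Python below assumes 32-bit x86 code that keeps a frame pointer (the `push ebp` / `mov ebp,esp` prologue seen in the disassembly), so that the word at `[bp]` is the caller's saved BP and the word at `[bp + 4]` is the return address. The memory contents, the stack addresses and the zero BP that ends the chain are all made-up values for illustration.

```python
def walk_stack(ip, bp, read_word, max_frames=64):
    """Return one code address per stack frame, innermost frame first."""
    addresses = [ip]
    while bp != 0 and len(addresses) < max_frames:
        return_address = read_word(bp + 4)   # pushed by the call instruction
        saved_bp = read_word(bp)             # pushed by the callee's prologue
        addresses.append(return_address)
        bp = saved_bp                        # move to the caller's frame
    return addresses


# Hypothetical debuggee stack memory: address -> 32-bit word.
memory = {
    0x8F00: 0x8F10, 0x8F04: 0x301C,   # dll_method2 frame, returns into dll_method1
    0x8F10: 0x8F20, 0x8F14: 0x1040,   # dll_method1 frame, returns into main
    0x8F20: 0x0000, 0x8F24: 0x1080,   # main frame, returns into mainCRTStartup
}

frames = walk_stack(ip=0x3600, bp=0x8F00, read_word=memory.__getitem__)
print([hex(address) for address in frames])
```

Each address in the result is then resolved through the mappings listed on the slide (VA to module, VA to RVA, RVA to function symbol and source line) to print one line of the call stack.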
Breakpoints

    SetBreakpoint (SourceFile, SourceLine)
      for each Module loaded in debuggee address space        (RunTime Info)
        for each SrcFile in SrcFileTable of the Module        (CompileTime DebugInfo)
          if SourceFile == SrcFile                            (CompileTime DebugInfo)
            SrcLineTable = SourceLineTable (SrcFile)          (CompileTime DebugInfo)
            RVAList = Lookup (SrcLineTable, SourceLine)       (CompileTime DebugInfo)
            StartRVA = Head (RVAList)                         (CompileTime DebugInfo)
            VA = StartRVA + BaseVA                            (RunTime Info)
            WriteProcessMemory (VA, “int 3”)                  (RunTime Info)

    // dll1.cpp -> dll1.dll (Module 2)
    __declspec(dllexport) int dll_method1(int i) {
        return dll_method2(i);
    }

    00000010: push ebp
    00000011: mov  ebp,esp
    00000013: mov  eax,dword ptr [ebp+8]
    00000016: push eax
    00000017: call ?dll_method2@@YAHH@Z
    0000001C: add  esp,4
    0000001F: pop  ebp
    00000020: ret

    RVA      SrcFile   SrcLine   SrcCol
    0x0013   1         3         0
    0x0016   1         3         0
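The SetBreakpoint pseudocode can be exercised against simulated data. In this Python sketch the debuggee's memory is a dictionary, the module record layout is hypothetical, and `WriteProcessMemory` is replaced by a dictionary store. The one-byte `int 3` opcode is 0xCC. Saving the overwritten byte is an addition not shown on the slide; a debugger needs it to restore the instruction later.

```python
INT3 = 0xCC   # one-byte x86 breakpoint instruction ("int 3")


def set_breakpoint(source_file, source_line, modules, memory):
    """Patch int 3 at the first RVA of the line; return {VA: original byte}."""
    patched = {}
    for module in modules:                                   # run-time info
        for file_id, file_name in module["src_files"].items():
            if file_name != source_file:
                continue
            rvas = [rva for rva, src, line, _col in module["line_table"]
                    if src == file_id and line == source_line]
            if not rvas:
                continue
            va = module["base_va"] + min(rvas)               # VA = StartRVA + BaseVA
            patched[va] = memory[va]                         # remember original byte
            memory[va] = INT3                                # simulated WriteProcessMemory
    return patched


dll1 = {
    "base_va": 0x3000,
    "src_files": {1: "dll1.cpp"},
    "line_table": [(0x0010, 1, 2, 0), (0x0011, 1, 2, 0),
                   (0x0013, 1, 3, 0), (0x0016, 1, 3, 0)],
}
memory = {0x3013: 0x8B}   # first opcode byte of "mov eax,dword ptr [ebp+8]"

saved = set_breakpoint("dll1.cpp", 3, [dll1], memory)
print({hex(va): hex(byte) for va, byte in saved.items()}, hex(memory[0x3013]))
```

For line 3 the table yields RVAs 0x0013 and 0x0016; the breakpoint goes on the first one, which becomes VA 0x3013 once the module base is added.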
Another example: Watch window
• AddWatch(Local Variable Name)
  – IP or VA -> Module
  – If the Module’s DebugInfo is available AND not loaded
    • Load DebugInfo (Module)
  – VA -> RVA
  – RVA -> Function Symbol
  – Function Symbol -> Local Symbol (by name)
  – Local Symbol -> Type (Type Table)
  – Local Symbol -> Location -> Value
Debugger Main Loop
• CreateProcess / AttachToProcess (Debuggee FileName/ProcessID, DEBUG_PROCESS)
• while (Wait For Debug Event != EXIT_PROCESS)
  – Handle different debug events: exceptions (access violation), CreateThread, etc.
  – Handle loader events: load dynamic link library
    • Set/clear breakpoints
  – Handle breakpoint event
    • Read debuggee RegisterContext
    • GenerateCallStack (IP)
    • Display Source File (IP)
    • Display locals/watch window
And a lot more…
• Other debugging features:
  – Edit & Continue debugging: incremental linking
  – Expression evaluator
  – Disassembly-level debugging
  – Conditional breakpoints/tracepoints
  – Remote debugging
  – Native/managed interop debugging
  – User mode vs kernel mode debugging
  – Crash dump or post-mortem debugging
Code Quality and Throughput
Metrics of Success
• New Language/Compiler
  – Compiles valid programs
  – Generates correct target code
  – Generates helpful error/warning messages
• Shipping compilers
  – Code quality or performance (code size & execution time of target code)
  – Build throughput (compile time)
  – Memory footprint
Code Quality (CQ)
• Code Quality measures how good the compiled binary is, in terms of execution time, code size, energy consumed, etc.
• CQ analysis serves two purposes: exposing optimization opportunities and addressing regressions in a timely manner.
• Benchmarks
  – SPEC (Standard Performance Evaluation Corporation): a non-profit organization that establishes and endorses benchmarks
  – Micro-benchmarks
  – Real-world code
• C++ team at MS has a dedicated full time Performance team for measuring, analyzing and reporting CQ. Additionally, every developer needs to measure CQ impact of any significant code change prior to the check-in.
Build Throughput (TP)
• Build Throughput is the time taken to compile and link the program
• TP is as important as CQ
• C++ compiler team at MS: Approx. half of the customer requests are to improve compiler/linker TP!
• Tests:
  – Daily benchmark runs for TP
  – Weekly TP builds of Windows, SQL, Office
• Greater than 1% TP regression blocks the check-in and needs to be analyzed
Relation between CQ and TP
• Inversely related
  – Adding more optimizations improves CQ, but hurts the build TP
• Need a fine balance of CQ gain vs TP overhead
  – Even a perfectly good and useful optimization for a certain code base could be completely useless for another
  – Challenge: figuring out which optimizations to implement (or rather leave out) based on target customer usage
Importance of BE
• CQ and TP are mainly owned and affected by the backend.
• Front end (Parsing) takes up a significant chunk of build TP, but stabilizes over time.
• Can you guess the ratio of FE devs to BE devs in the C++ team at MS?
  – Around 1:5
• BE plays a significant role!
Optimized Code Debugging
• Why debug optimized code?
  – A program crashes in a shipped product with no concrete steps to reproduce the bug
  – Debug builds generate binaries and debug info files that are twice as big as optimized retail builds
  – Test passes in software companies happen on retail builds. Regenerating the same environment with patched debug builds is very painful and time consuming
Difficulties
• Target code is vastly different from source code due to optimizations, which leads to a bad debugging experience:
  – Local variables/parameters optimized away, CSE, dead code elimination
    • Can’t trust the locals/watch window
  – Function call inlining
    • Can’t trust call stacks
  – Code motion, code merging
    • Single stepping makes the cursor jump around in the source file
  – Loop unrolling, scope merging
    • Can’t trust source-level scopes: optimized code doesn’t respect them
Debugger Approaches
• Don’t care!
  – Used by a lot of shipping debuggers!
  – There is no well-defined end-to-end debugging experience
• Use the optimization info to generate a mapping from target code to source code
  – Virtual mapping
  – Generate a modified source file from target code using reverse engineering
• Don’t de-optimize
  – Users are made aware of optimization effects
  – Debugging has to be done at source + disassembly level
Resources
• DWARF: http://dwarfstd.org/
• Optimized Code Debugging: http://sourceware.org/gdb/current/onlinedocs/gdb/Optimized-Code.html