Generalized Code Relocation © 2006 Andrew R. Bernat, March 2006
Generalized Code Relocation for Instrumentation and Efficiency
Andrew R. Bernat, University of Wisconsin
– 2 – Design Objectives
Whole-program instrumentation
• Instrument every instruction in the program
• … and all control flow edges as well
Efficient instrumentation
• No traps!
• Minimize extraneous jumps
• Restrict register save/restores
Flexible, extensible instrumentation system
• Laying the groundwork for binary rewriting
– 3 – Multitramps
Whole-program instrumentation
• All instructions, including neighbors
• All control flow edges
One trampoline per basic block
• Reduces number of extra branches
Hierarchical code generation
• Extensible
• Allows for a variety of optimizations
– 4 – Function Relocation
Efficient instrumentation
• Blocks too small to hold a branch to instrumentation
• Instrumentation too far away
• No traps!
Shared functions
• Copy to remove sharing
Function rewriting
• Undo optimizations
– 5 – Old Instrumentation Overview
[Diagram: Application Program, Function foo (instr1, instr2, instr3); a Base Trampoline holding Save Regs / Restore Regs pairs around the relocated instr2; Mini Trampolines holding the Instrumentation Code.]
– 6 – Old Instrumentation - Consecutive
[Diagram: Application Program, Function foo (instr1, instr2, instr3); Multiple Base Trampolines, one holding the relocated instr1 and one holding the relocated instr2; Mini Trampolines.]
– 7 – Old Instrumentation – Uninstrumentable Neighbors
[Diagram: Application Program, Function foo (instr1, instr2, instr3); a single Base Trampoline holding relocated copies of instr1, instr2, and instr3 together with Save Regs / Restore Regs pairs; Mini Trampolines holding the Instrumentation Code.]
– 8 – Edge instrumentation
Instrument edges via another level of indirection (plus extra branches)
[Diagram: Application Program, Function foo with a branch; Base Trampolines for the pre-branch, fallthrough, and jump-taken points, each with a save/restore; an ‘Edge’ Trampoline containing a save/restore and a branch.]
– 9 – Limitations of Old Instrumentation
Incomplete instrumentation coverage
• Often could not instrument nearby instructions
Inefficient instrumentation
• Edges and consecutive instructions require extra branches
Platform-specific implementation
• Inextensible and bug-prone
– 10 – Multitramp Principles
Basic-block instrumentation
• One jump to/from per block
• Efficient instrumentation of neighbor instructions
Logical view: a control flow graph
• Relocated instructions + instrumentation
• Apply compiler techniques to dynamic instrumentation
– 11 – Multitramps
[Diagram: Application Program, Function foo; one Basic Block maps to one Multitramp containing, in order: Base Tramp, Instruction, Instruction, Base Tramp, Branch; the last label is Fallthrough Target.]
– 12 – Multitramp Implementation
A multitramp is a tree of code objects
Code objects provide the following:
• Maximum space required (worst case)
• Generate, install, and link callbacks
• Map relocated to original address
Single mechanism for both instruction and edge instrumentation
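The code-object idea can be sketched in a few lines. This is only an illustrative model, not Dyninst's implementation: the class name `CodeObject`, its fields, and all sizes and addresses are assumptions made for the example, and only the worst-case size, a generate step, and the relocated-to-original address map are modeled (install and link are left out). For simplicity the sketch lays each node out at its worst-case size.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeObject:
    """Hypothetical node in a multitramp tree (all names are illustrative)."""
    name: str
    worst_case_size: int = 0            # bytes this node itself may need
    orig_addr: Optional[int] = None     # set when the node relocates an original instruction
    children: List["CodeObject"] = field(default_factory=list)

    def max_size(self) -> int:
        """Worst-case space for this node plus all of its children."""
        return self.worst_case_size + sum(c.max_size() for c in self.children)

    def generate(self, addr: int, addr_map: Dict[int, int]) -> int:
        """Lay out this subtree starting at addr and return the next free address.

        Relocated -> original addresses are recorded in addr_map.
        """
        if self.orig_addr is not None:
            addr_map[addr] = self.orig_addr
        addr += self.worst_case_size
        for child in self.children:
            addr = child.generate(addr, addr_map)
        return addr


# A multitramp for one basic block: two instrumented points, two relocated instructions.
multitramp = CodeObject("multitramp", children=[
    CodeObject("base tramp 1", worst_case_size=24),
    CodeObject("relocated instruction", worst_case_size=4, orig_addr=0x1000),
    CodeObject("base tramp 2", worst_case_size=24),
    CodeObject("relocated branch", worst_case_size=8, orig_addr=0x1004),
])

addr_map: Dict[int, int] = {}
end = multitramp.generate(0x8000, addr_map)   # allocate max_size() bytes at 0x8000 first
```

Because every node reports its own worst case, the parent can reserve space before any code is generated, and the same traversal serves instruction and edge instrumentation alike.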
– 13 – Multitramp Example
[Diagram: tree with nodes Base Tramp 1, Instruction, Base Tramp 2, Branch, Base Tramp 3, and Mini Tramps 1 to 4.]
Generated code:
save ; BT 1
branch <MT 1>
restore ; BT 1
<relocated instr>
branch <BT 3>
save ; BT 2
branch <MT 3>
restore ; BT 2
return
save ; BT 3
branch <MT 4>
restore ; BT 3
return
– 14 – In-Line Instrumentation
Current out-of-line model is based on the requirements of Paradyn
• Frequent insertion/removal of instrumentation
Limited opportunity for optimization
• Particularly register saves and restores
What about long-lived instrumentation?
– 15 – In-Line Instrumentation
In-line instrumentation into a single code sequence:
• Relocated instructions
• Save/restore code
• Instrumentation
Replace entire sequence when something changes!
BPatch::setMergeTramp(true)
– 16 – Multitramp Status
Extensible implementation
• Can add new code objects to multitramp CFG:
  – Raw binary sections
  – Control flow-altering code
In-line instrumentation
• POWER, x86-64
Platform-independent design
• Encapsulated platform-dependent sections
• Included with all platforms in Dyninst 5.0
– 17 – Multitramp Results
Whole-program instrumentation
• Instrument every instruction in the program
• … and all control flow edges as well
Efficient instrumentation
• No traps!
• Minimize extraneous jumps
• Restrict register save/restores
Flexible, extensible instrumentation system
• Laying the groundwork for binary rewriting
– 18 – Function Relocation
The basic block may be too small to contain a branch to instrumentation
• IA-32, x86-64
We may not have the available registers to construct a long branch
• POWER, SPARC
Solution: relocate on a function level
• Sufficient space to fit large branches
• Dead registers that can be used to branch
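The "too small" and "too far away" cases can be made concrete for x86, where a near jump with a 32-bit displacement occupies 5 bytes and its displacement is measured from the end of the jump. The sketch below is a simplified check under those two facts only; the function name `can_patch_jump` and the example addresses are assumptions, and the register-availability problem on POWER and SPARC is not modeled.

```python
JMP_REL32_SIZE = 5                          # x86: opcode 0xE9 plus a 32-bit displacement
REL32_MIN, REL32_MAX = -(2**31), 2**31 - 1  # range of a signed 32-bit displacement


def can_patch_jump(block_addr: int, block_size: int, target_addr: int) -> bool:
    """Return True if a 5-byte jump to target_addr fits in the block and can reach it."""
    if block_size < JMP_REL32_SIZE:
        return False                        # block too small to hold the jump
    displacement = target_addr - (block_addr + JMP_REL32_SIZE)
    return REL32_MIN <= displacement <= REL32_MAX


# Hypothetical addresses, for illustration only.
too_small = can_patch_jump(0x401000, 3, 0x500000)        # 3-byte block: jump does not fit
fits = can_patch_jump(0x401000, 8, 0x500000)             # big enough and within reach
too_far = can_patch_jump(0x401000, 8, 0x7F0000000000)    # x86-64: target beyond rel32 range
```

When the check fails, relocating the whole function gives the instrumenter room to enlarge the block instead of falling back to a trap.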
– 19 – Old Approach
One-time relocation
• Preemptively expand possible instrumentation sites:
  – Function entry, exit, call sites; loop entry, exits
  – But what about everything else?
Linear scan of the function, ignoring control flow
• Dangerous with in-lined data
– 20 – Incremental Function Relocation
A function is a list of basic blocks
Accumulate modifications to each block
• Ex: block must be 5 bytes long
Generate relocated versions on-the-fly
• Only modify what is necessary
Add instrumentation to the new function
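A minimal model of this scheme treats the function as an ordered list of blocks, records per-block size requirements as they accumulate, and computes the layout of a relocated copy on demand. Everything here is hypothetical (the names `Block`, `require_min_size`, `relocate`, and all sizes and addresses), and a real relocator must also rewrite branch targets and other PC-relative operands, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Block:
    name: str
    size: int   # bytes occupied in the original function


def require_min_size(mods: Dict[str, int], block_name: str, size: int) -> None:
    """Accumulate a 'block must be at least size bytes' modification."""
    mods[block_name] = max(mods.get(block_name, 0), size)


def relocate(blocks: List[Block], mods: Dict[str, int],
             new_base: int) -> List[Tuple[str, int, int]]:
    """Return (name, new_address, new_size) for each block of the relocated copy."""
    layout = []
    addr = new_base
    for block in blocks:
        new_size = max(block.size, mods.get(block.name, 0))  # unmodified blocks keep their size
        layout.append((block.name, addr, new_size))
        addr += new_size
    return layout


foo = [Block("block 1", 12), Block("block 2", 2), Block("block 3", 9),
       Block("block 4", 7), Block("block 5", 6)]

mods: Dict[str, int] = {}
require_min_size(mods, "block 2", 5)   # block 2 must hold a 5-byte jump
require_min_size(mods, "block 2", 2)   # a weaker request does not shrink it

layout = relocate(foo, mods, 0x9000)
```

Only block 2 grows in the copy; every other block keeps its size and simply shifts to a new address, which matches the "only modify what is necessary" goal.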
– 21 – Function Relocation - Example
Block 2 is too small to patch in a jump
1. Copy the function
2. Enlarge block 2
3. Replace
[Diagram: the original function (block 1 to block 5) next to a relocated copy in which block 2 is larger; label: Add modification.]
– 22 – Other Uses for Relocation
Overlapping functions
• Relocation disambiguates code
• Instrument unique per-function copy
Undo optimizations
• Rewrite the function during relocation
• Example: unwinding a tail call
– 23 – Function Relocation Status
Platform-independent function relocation engine
• IA-32, x86-64, POWER, SPARC
Support for multiple relocated versions
• On-the-fly code relocation
Extensible modification interface
• Block must be 5 bytes long
• Modify the instructions in the block
– 24 – Design Objectives
Whole-program instrumentation
• Instrument every instruction in the program
• … and all control flow edges as well
Efficient instrumentation
• No traps!
• Minimize extraneous jumps
• Restrict register save/restores
Flexible, extensible instrumentation system
• Laying the groundwork for binary rewriting
– 25 – Conclusion
Multitramps
• Whole-program instrumentation approach
Function relocation
• Instrument everywhere (without traps)
People
• Drew Bernat – Multitramps
• Nate Rosenblum – Function relocation
• Nick Rutar – Register optimizations