a framework for binary code analysis, and static and

22
Binary Code Analysis and Editing © 2007 Barton P. Miller July 2007 A Framework for Binary Code Analysis, and Static and Dynamic Patching Barton P. Miller University of Wisconsin [email protected]

Upload: others

Post on 16-Oct-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Framework for Binary Code Analysis, and Static and

Binary Code Analysis and Editing© 2007 Barton P. Miller July 2007

A Framework for Binary CodeAnalysis, and Static and

Dynamic Patching

Barton P. MillerUniversity of Wisconsin

[email protected]

Page 2: A Framework for Binary Code Analysis, and Static and

– 2 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Motivation

• Multi-platform• Open architecture• Extensible• Open source

• Testable• Suitable for batch processing• Accurate• Efficient

Binary code analysis is a basic tool of securityanalysts, application developers, system designers andtool developers.

We are designing and building a new foundation tosupport such analysis.

Existing binary analysis tools have significantlimitations.

Page 3: A Framework for Binary Code Analysis, and Static and

– 3 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Why Binary Code? Access to the source code often is not possible:

• Proprietary software packages.• Stripped executables.• Proprietary libraries: communication (MPI, PVM), linear

algebra (NGA), database query (SQL libraries). Binary code is the only authoritative version of the

program.• Changes occurring in the compile, optimize and link

steps can create non-trivial semantic differences fromthe source and binary.

Worms and viruses are rarely provided with sourcecode

Page 4: A Framework for Binary Code Analysis, and Static and

– 4 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Our Starting Point: Dyninst A machine-independent library for machine level code

patching.• Functions for binary code analysis• Functions for binary code patching

Clean abstractions to encapsulate the tool complexity.

Originally designed as part of the Paradynperformance profiling tool, but now widely used inmany areas, including cyber-security.

Page 5: A Framework for Binary Code Analysis, and Static and

– 5 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Dynamic Instrumentation Does not require recompiling or relinking

• Saves time: compile and link times aresignificant in real systems.

• Can instrument without the source code (e.g.,proprietary libraries).

• Can instrument without linking (relinking is notalways possible.

Instrument optimized code.

Page 6: A Framework for Binary Code Analysis, and Static and

– 6 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Dynamic Instrumentation (con’d) Only instrument what you need, when you need

• No hidden cost of latent instrumentation.• Enables “one pass” tools.

Can instrument running programs (such asWeb or database servers)• Production systems.• Embedded systems.• Systems with complex start-up procedures.

Page 7: A Framework for Binary Code Analysis, and Static and

– 7 – Binary Code Analysis and Editing© 2007 Barton P. Miller

The Basic MechanismApplication

Program

Function foo

Trampoline

Instrumentation

RelocatedInstruction(s)

Page 8: A Framework for Binary Code Analysis, and Static and

– 8 – Binary Code Analysis and Editing© 2007 Barton P. Miller

The DynInst Interface Machine independent representation

Write-once, analyze/instrument-many (portable)

Object-based interface to insert new code: AbstractSyntax Trees (AST’s)

Hides most of the complexity in the API• Easy to build tools: e.g., an MPI tracer: 250 lines of C++

code.

Page 9: A Framework for Binary Code Analysis, and Static and

– 9 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Basic DynInst Operations Code query routines:

• Find control-flow elements: modules,procedures, loops, basic blocks, instructions– For functions, find entry, exit, call sites.– For loops, find entry, exit, body.

• Find data elements: variables and parameters• Call graph (parent/child) queries• Intra-procedural control-flow graph

Page 10: A Framework for Binary Code Analysis, and Static and

– 10 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Basic DynInst Operations Code modification routines:

• Remove Function Call– Disable an existing function call in the application

• Replace Function Call– Redirect a function call to a new function

• Replace Function– Redirect all calls (current and future) to a function to a new

function.• Replace Instruction

– Code snippet executes instead of specified instruction.• Wrap Function

– Allow the new function to call the replaced one (potentially withall its original parameters).

Page 11: A Framework for Binary Code Analysis, and Static and

– 11 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Basic DynInst Operations Process control:

• Attach/create process• Monitor process status changes• Callbacks for fork/exec/exit

Inferior (application processor) operations:• Malloc/free

– Allocate heap space in application process• Inferior RPC

– Asynchronously execute a function in the application.• Load module

– Cause a new .so/.dll to be loaded into the application.

Page 12: A Framework for Binary Code Analysis, and Static and

– 12 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Basic DynInst Operations Building AST code sequences:

• Control structures: if and goto• Arithmetic and Boolean expressions• Get effective address• Generate instruction with calculated address.• Get PID/TID operations• Read/write registers and global variables• Read/write parameters and return value• Function call

Page 13: A Framework for Binary Code Analysis, and Static and

– 13 – Binary Code Analysis and Editing© 2007 Barton P. Miller

incl ctr

sethi %hi(ctr)

ld [. . .],%o1

add %o1,%o1,1

st %o1,[. . .]

SPARC Code

Machine Independent CodeAbstract Syntax Trees:

cau r3,r0,hi%ctrl r4,lo%ctr(r3)addi r4,1(r4)st r4,lo%ctr(r3)

Power Code

IA32 Code

Page 14: A Framework for Binary Code Analysis, and Static and

– 14 – Binary Code Analysis and Editing© 2007 Barton P. Miller

BinaryCode

Symbol TableParser

PE

ELF

COFF

IA32

AMD64

Power Disassembly

Symbol Table Dump

CodeParser

InstructionDecoder

Code Queriesand

InstrumentationRequests

AST

InstrControl

CodeGen

ProcessControl

CallGraph

IntraProcCFG

IdiomSignatures

StackWalker

IdiomDetector

* *

* *

* *

Page 15: A Framework for Binary Code Analysis, and Static and

– 15 – Binary Code Analysis and Editing© 2007 Barton P. Miller

SymtabAPI Version 1.0 available as of June 5, 2007.

• Supports ELF, XCoff, PE (Linux, Solaris, AIX,Windows).

Debug information available in next release:line numbers, local variables, types.

Unstrip - SymtabAPI demo tool thatregenerates a stripped binary’s symbol table• Uses code parser to find function entry points• Uses SymtabAPI to write new symbol table into

binary.

Page 16: A Framework for Binary Code Analysis, and Static and

– 16 – Binary Code Analysis and Editing© 2007 Barton P. Miller

DynStackwalker Available soon on all Dyninst platforms. Cross-platform API for collecting first and

third party stackwalks. Callback interface allows users to plug in their

own stack walking mechanisms, e.g:• Walking through non-standard stack frames

created by optimized functions.• Use stackwalking debug information provided by

another library

Page 17: A Framework for Binary Code Analysis, and Static and

– 17 – Binary Code Analysis and Editing© 2007 Barton P. Miller

InstructionAPI Decodes machine code into abstract

instruction representation Interface allows straightforward data flow

and control flow analysis• Query interface is designed for analysis, e.g.:

– Control flow targets– Registers read/written– Memory addresses accessed

• Instructions can be annotated with analysisresults

Provides disassembly interface• Pluggable formatters

Page 18: A Framework for Binary Code Analysis, and Static and

– 18 – Binary Code Analysis and Editing© 2007 Barton P. Miller

BinInst Design Goals Tool-kit component architecture for binary

analysis and editing Open source Open data structure definitions Machine-independent abstract interfaces Batch-enabled analyses Static and dynamic code patching All major analysis products are exportable Enhanced testability and accompanying test

suites

Page 19: A Framework for Binary Code Analysis, and Static and

– 19 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Disassembly

Symbol Table Dump

CallGraph

Intra-ProcCFG

BinaryDecode

andParsing

Code Queriesand

InstrumentationRequests

BinaryCode

AST

Static Editing Scenario (Binary Rewriting)

InstrControl

CodeGen

IdiomSignatures

Page 20: A Framework for Binary Code Analysis, and Static and

– 20 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Disassembly

Symbol Table Dump

BinaryDecode

andParsing

BinaryCode

Interactive Editing Scenario (Static or Dynamic)

InstrControl

CodeGen

CallGraph

Intra-ProcCFG

IdiomSignatures

Page 21: A Framework for Binary Code Analysis, and Static and

– 21 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Disassembly

Symbol Table Dump

BinaryDecode

andParsing

Code Queriesand

InstrumentationRequests

BinaryCode

AST

Dynamic Editing Scenario (Dynamic Instrumentation)

InstrControl

CodeGen

ProcessControl

UserProcess

CallGraph

Intra-ProcCFG

IdiomSignatures

StackWalker

Page 22: A Framework for Binary Code Analysis, and Static and

– 22 – Binary Code Analysis and Editing© 2007 Barton P. Miller

Disassembly

Symbol Table Dump

BinaryDecode

andParsing

BinaryCode

Analysis Scenario

Connector 2

CodeSurfer

VSA BufferOverrun

OtherTool

CallGraph

Intra-ProcCFG

IdiomSignatures