
Automatic Discovery of API-Level Exploits

Vinod Ganapathy, Sanjit Seshia, Somesh Jha, Thomas Reps & Randal Bryant

{vg,jha,reps}@cs.wisc.edu, {sanjit,bryant}@cs.cmu.edu

ICSE 2005 Automatic Discovery of API-Level Exploits 2

Two Definitions

Exploit: A sequence of operations that attacks the vulnerability.

Vulnerability: An error in a software package that allows for unintended behavior.


Motivation

// Format & enter into LOG
void log(char *fmt, ...) {
    fprintf(LOG, fmt, ...);
    return;
}

// Call log on user input
int foo(void) {
    char buf[LEN];
    …
    fgets(buf, LEN-1, FILE);
    log(buf);
    …
}

Format-string vulnerability

buf = “%s%s%s”, so the call becomes fprintf(LOG, “%s%s%s”).

Insufficient arguments to fprintf. Possible outcomes:
• Unintelligible log entry.
• Program crash.
• Hacker takes over program!
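The usual repair for this class of bug is to pass user input as an argument to a constant format string rather than as the format string itself. The sketch below is illustrative only: the helper name format_entry is hypothetical, and it formats into a caller-supplied buffer instead of the LOG stream used on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the talk): format a log entry from
 * untrusted input. The input is passed as an ARGUMENT to the constant
 * format "%s", so any '%' directives it contains are copied verbatim
 * instead of being interpreted by the printf family. */
int format_entry(char *out, size_t out_len, const char *user_input)
{
    assert(out != NULL && user_input != NULL);
    /* The vulnerable variant would be: snprintf(out, out_len, user_input); */
    return snprintf(out, out_len, "%s", user_input);
}
```

Calling format_entry with the input “%s%s%s” stores those six characters unchanged, because the input is never parsed as a format.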


Motivation

Format-string vulnerabilities:
• Well-understood class of security vulnerabilities.
• Commonly used by attackers [CERT].

Tools to find format-string vulnerabilities exist. Examples: Percent-S [Shankar et al. USENIX Security 2001], Avots et al. [ICSE 2005 (later in this session!!)].

What about systematic exploit-finding tools?


Vulnerabilities versus Exploits

Classification   | Vulnerability-finding tools                   | Exploit-generation
Buffer-overflows | BOON, Archer, CSSV, Rats, Its4, Codesurfer, … | Solar-designer, Aleph-one, …
Format-strings   | Percent-S, Avots et al., …                    | Newsham, Thuemmel, …
Race-conditions  | Eraser, RacerX, SVD, Atomicity, …             | Several online manuals.
SQL-injection    | JDBC-checker, …                               | Several online manuals.



Vulnerability-finding tools:
• Huge body of research.
• Based upon systematic techniques.



Exploit generation:
• Ad-hoc (but clever!) techniques.
• Mostly restricted to the hacker community.



[Figure: nested sets of programs. Within the set of all programs lies the set of programs with vulnerabilities, which existing work finds; within that lies the set of programs with exploitable vulnerabilities.]

This paper: What does it take to identify the programs with exploitable vulnerabilities?


Main Message of This Talk

Exploit-finding can benefit, and improve the quality of, vulnerability-finding tools.


Exploit-finding: Possible Benefits

Benefits to vulnerability-finding tools:
• Enhance vulnerability reports with exploits.
• Helps differentiate between exploitable and benign vulnerabilities.
• Prioritizes bug-fixes.

Test cases for future versions of the software.

Derive signatures for signature-based misuse-detection systems (e.g., Snort, Bro).


Exploit-finding: Challenges

Key difference: Modeling low-level details of the runtime environment.

Format-string exploits [CERT, Newsham, Thuemmel, …]
• Machine architecture.
• Layout of the runtime execution stack.
• Pointer movements along the runtime stack.
• Discussed in detail in the rest of this talk.

IBM CCA API exploit [Bond and Anderson, 2001]
• Intruder knowledge.
• Bit operations on cryptographic keys.
• Analysis is similar to protocol verification techniques.
• Paper has details.



Analysis of both case studies is done using the same generic framework.


Talk structure
• Motivation and Overview.
• Framework for finding API-level exploits.
• Example: format-string exploit-detector.
  • Overview of printf and format-string exploits.
  • Instantiating printf in our framework.
  • Results.
  • Comparison with other tools.
• Related work and Conclusions.


API-Level Exploits

Exploit: A sequence of operations that attacks the vulnerability.

API-Level Exploit: A formal way to think about any exploit. View steps of the exploit as operations from an appropriately defined API.


API-Level Exploits

Example [Chen and Wagner, CCS 2002]: setuid(0) followed by execl can lead to root privileges. View system calls as the API to UNIX.

API-Level Exploit: View steps of the exploit as operations from an appropriately defined API.





This sequence of system calls is allowed by UNIX, but compromises security.


Framework for Modeling APIs

Goal: Find exploits at the API-Level.

Requirement 1: Model low-level details (stack layouts, pointer movements) and reason about control and data.

Solution: Model using expressive logics.

Requirement 2: Only check allowed sequences.

Solution: Encode sets of allowed sequences.



System modeled using:
• Set of variables that describe its state.
• API operations that change state.
• Language of allowed API operations.

System S<v1,v2,…,vn>



[Figure: system S<v1,v2,…,vn> with API operations OP1, OP2, and OP3 changing its state.]



Checker: Basic idea
• Specify what is Bad for the system.
• Check all allowed interleavings of API operations.
• Problem: Arbitrarily long API operation sequences.
• Solution: Only check finite prefixes. Use bounded model checking.
• Implication: Only find exploits of bounded length.
• See paper for details.
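The checker idea can be illustrated on a toy system. The sketch below is not the tool from the talk: it replaces symbolic bounded model checking with explicit enumeration of operation sequences, it assumes every sequence is allowed, and all names (the op enum, step, find_exploit) are hypothetical. The toy state and the Bad condition loosely follow the setuid/execl example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy API: three operations on a two-variable state (all hypothetical). */
enum op { OP_SETUID_ROOT, OP_SETUID_USER, OP_EXECL, NUM_OPS };

struct state {
    int uid;   /* 0 means root */
    int bad;   /* Bad: a program was executed while uid == 0 */
};

/* Semantics of each API operation: how it changes the state. */
static struct state step(struct state s, enum op o)
{
    switch (o) {
    case OP_SETUID_ROOT: s.uid = 0; break;
    case OP_SETUID_USER: s.uid = 1000; break;
    case OP_EXECL:       if (s.uid == 0) s.bad = 1; break;
    default: break;
    }
    return s;
}

/* Depth-first search over all sequences of at most `bound` operations.
 * Returns the length of a sequence reaching Bad (stored in trace),
 * or -1 if no sequence within the bound reaches Bad. */
static int search(struct state s, int depth, int bound, enum op *trace)
{
    if (s.bad) return depth;
    if (depth == bound) return -1;
    for (int o = 0; o < NUM_OPS; o++) {
        trace[depth] = (enum op)o;
        int len = search(step(s, (enum op)o), depth + 1, bound, trace);
        if (len >= 0) return len;
    }
    return -1;
}

/* Try bounds 0..bound in increasing order so that a shortest exploit is
 * reported. trace must have room for `bound` operations. */
int find_exploit(int bound, enum op *trace)
{
    assert(bound >= 0 && trace != NULL);
    struct state init = { 1000, 0 };   /* start as an unprivileged user */
    for (int k = 0; k <= bound; k++) {
        int len = search(init, 0, k, trace);
        if (len >= 0) return len;
    }
    return -1;
}
```

With a bound of 1 no exploit exists in this toy system; with a bound of 3 the search reports the two-step sequence OP_SETUID_ROOT, OP_EXECL. As on the slide, a negative answer only rules out exploits up to the chosen bound.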








Format-string vulnerabilities

• Allow intruder to assume privileges of the victim program.
• Highly prevalent [CERT].
• Vulnerability-detection tools available. Example: Percent-S, Avots et al.

Goals of our tool:
• Systematically find exploits against such vulnerabilities.
• Work with real-world applications.


Overview of printf

[Figure: animation of the runtime stack for the call log(buf) in foo. The stack grows from high addresses toward low addresses. Pushed in order: buf (LEN bytes) in the frame of foo, a pointer to buf, the stack frame of log, a pointer to buf, and the stack frame of fprintf. fmtptr points into the format string held in buf; argptr points into the stack, separated from buf by DIS bytes.]

Example: buf = “%x%x%s”. Each %x makes printf read 4 bytes at argptr as an integer; the %s makes it read 4 bytes at argptr as an address.


Format-string Exploits

[Figure: stack with buf (LEN bytes), argptr, DIS, and fmtptr.]

What if we move argptr into buf? Remember, the attacker can control buf!



Example exploit scenario:
• fmtptr is at a “%s”.
• buf contains an attacker-chosen address.
• argptr points to this location within buf.

Can read from an arbitrary memory location!

Writes are also possible; the paper has details.



The exploit technique just discussed is well-known.

Our key observations:
1. DIS and LEN completely characterize any printf call.
2. Each byte in buf instructs printf what to do next.



Each byte to printf is an instruction.

View the format-string as a sequence of API operations.

Format-string exploits = API-level exploits.
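To make the "each byte is an instruction" view concrete, the sketch below walks a format string and tracks how far argptr has advanced when the first %s or %n is reached. It is an illustration, not the model from the talk. The argument sizes are assumptions for a 32-bit x86 stack (4-byte words, 8-byte doubles, 12-byte long doubles, and the L modifier on integer conversions treated as 8 bytes). The function names are hypothetical, and "AAAA" stands in for the four bytes of an attacker-chosen address.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed argument sizes in bytes on a 32-bit x86 stack. These are
 * illustrative assumptions, not values taken from the talk. */
enum { SZ_WORD = 4, SZ_LONGLONG = 8, SZ_DOUBLE = 8, SZ_LONGDOUBLE = 12 };

/* Walk fmt one byte at a time, as fmtptr would, and track how many
 * argument bytes argptr has skipped. Returns that count when the first
 * %s or %n is reached (the directive that dereferences the word under
 * argptr), or -1 if there is none or a directive is not supported. */
int bytes_before_deref(const char *fmt)
{
    assert(fmt != NULL);
    int consumed = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;                 /* ordinary byte: argptr unchanged */
        p++;
        if (*p == '%')
            continue;                 /* "%%" is a literal percent sign */
        while (isdigit((unsigned char)*p))
            p++;                      /* field width, e.g. 234 in %234Lg */
        int big = 0;
        if (*p == 'L') { big = 1; p++; }
        else if (*p == 'l') { p++; }  /* long assumed to be one word */
        switch (*p) {
        case 's': case 'n':
            return consumed;
        case 'd': case 'x': case 'u':
            consumed += big ? SZ_LONGLONG : SZ_WORD;
            break;
        case 'g': case 'f': case 'e':
            consumed += big ? SZ_LONGDOUBLE : SZ_DOUBLE;
            break;
        default:
            return -1;                /* unsupported or truncated directive */
        }
    }
    return -1;
}

/* Offset inside buf from which the dereferencing directive takes its
 * address, given DIS; -1 if argptr has not reached buf by then. */
int deref_offset_in_buf(const char *fmt, int dis)
{
    int consumed = bytes_before_deref(fmt);
    if (consumed < 0 || consumed < dis)
        return -1;
    return consumed - dis;
}
```

Under these assumptions, “AAAA%Lx%s” with DIS = 8 dereferences offset 0 of buf, where the address bytes sit, and “%Lg%Lg%sAAAA” with DIS = 16 dereferences offset 8, again the address bytes.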


Finding Format-string Exploits

Format-string = sequence of API operations.
• Modeling printf: from the code in libc. Various flags and variables that describe printf, e.g., # of bytes printed, # of bytes to read off the stack, …
• API operations: all ASCII characters (|Σ| = 256).
• Allowed API operations: Σ*.

[Figure: system printf<flags> with API operations %, x, s, d, g, f.]



• Construct model of printf (one-time activity). Model is parameterized by DIS and LEN.
• At each printf call site, DIS and LEN can be obtained by disassembly.
• Specialize printf to the call site using values of DIS and LEN.
• Formulate Bad. Paper has examples of Bad that can be used to read from, or write to, an arbitrary memory location.
• Run the bounded model checker. Only need to check format-strings shorter than LEN: have a bound for the model checker!
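The steps above can be sketched for one simple Bad (a read through %s). The sketch stands in for the real tool in several ways, all of them assumptions: it enumerates candidates explicitly instead of using a symbolic model checker, and it places the attacker's address (shown as "AAAA") at the start of buf. It only considers three argptr-advancing directives with assumed 32-bit x86 sizes. Names such as find_read_exploit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Directives used to advance argptr, with assumed argument sizes in
 * bytes on a 32-bit x86 stack (illustrative assumptions). */
struct directive { const char *text; int arg_bytes; };

static const struct directive DIRS[] = {
    { "%Lg", 12 }, { "%g", 8 }, { "%d", 4 },
};
enum { NUM_DIRS = sizeof DIRS / sizeof DIRS[0] };

/* Extend the candidate in out (used bytes so far) until exactly
 * `remaining` more argument bytes are consumed, then append "%s".
 * Candidates must stay shorter than len, which bounds the search. */
static int extend(char *out, size_t used, size_t len, int remaining)
{
    if (remaining == 0) {
        if (used + 2 >= len) return 0;       /* "%s" would not fit */
        strcpy(out + used, "%s");
        return 1;
    }
    for (int i = 0; i < NUM_DIRS; i++) {
        size_t n = strlen(DIRS[i].text);
        if (DIRS[i].arg_bytes > remaining || used + n + 2 >= len)
            continue;
        memcpy(out + used, DIRS[i].text, n);
        if (extend(out, used + n, len, remaining - DIRS[i].arg_bytes))
            return 1;
    }
    return 0;
}

/* Bad for this sketch: %s is reached when argptr has advanced exactly
 * DIS bytes, so it dereferences the address at the start of buf.
 * Writes a format string shorter than len into out (which must hold
 * len bytes) and returns 1, or returns 0 if none is found. */
int find_read_exploit(int dis, size_t len, char *out)
{
    assert(out != NULL);
    if (dis < 0 || len < 6) return 0;
    memcpy(out, "AAAA", 4);                  /* placeholder address bytes */
    return extend(out, 4, len, dis);
}
```

For DIS = 4 and LEN = 16 the search returns “AAAA%d%s”, while for DIS = 4 and LEN = 7 it returns nothing, because no candidate fits within the length bound. Explicit enumeration is only practical for small values; it can take exponential time when no candidate exists.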



The model of printf:
• Requires precise reasoning about stack locations, in particular, the format-string.
• Has integer operations: pointer arithmetic to advance fmtptr and argptr.

Quantifier-free Presburger arithmetic with the theory of uninterpreted functions and arrays. UCLID tool [Bryant et al. CAV 2002].


Format-string Exploit-detector

• Finds exploits against vulnerabilities in real-world software packages.
• Can find different kinds of exploits.
• Can find an arbitrary number of variations of a given exploit.
• Can work on binary executables.
• Can improve the quality of format-string vulnerability-detection tools.


Use Scenario

Percent-S [Shankar et al. USENIX Security 2001] finds possibly vulnerable locations. No exploits.

Run our tool at each vulnerable location:
• Exploit generated: true vulnerability.
• No exploit generated: possibly a false alarm.

[Figure: a format-string vulnerability-finding tool produces a list of vulnerabilities; our tool labels each entry as either "Exploit" or "Possible false alarm".]


Results

Exploits against vulnerabilities in real-world software (see paper for details):

Software      | DIS  | LEN  | Exploit description
php-3.0.16    | 24   | 1024 | Overwrite memory location
qpopper-2.53  | 2120 | 1024 | Read a memory location
wu-ftpd-2.6.0 | 9364 | 4096 | Overwrite memory location


Results

DIS | LEN | Read exploit         | Write exploit
0   | 7   | “a1a2a3a4%s”         | No exploit
4   | 7   | No exploit           | No exploit
4   | 16  | “a1a2a3a4%d%s”       | “%234Lg%na1a2a3a4”
4   | 16  | “%Lx%ld%sa1a2a3a4”   | “a1a2a3a4%%%229x%n”
8   | 16  | “a1a2a3a4%Lx%s”      | “a1a2a3a4%230g%n”
16  | 16  | “%Lg%Lg%sa1a2a3a4”   | “a1a2a3a4%137g%93g%n”
20  | 20  | “a1a2a3a4%Lg%g%s”    | “a1a2a3a4%210Lg%20g%n”
24  | 20  | “a1a2a3a4%Lg%Lg%s”   | “a1a2a3a4%61Lg%169Lg%n”
32  | 24  | “a1a2a3a4%g%Lg%Lg%s” | “a1a2a3a4%78Lg%80g%72Lg%n”


• Ability to find false alarms.
• Ability to find different kinds of exploits: parameterized by the predicate Bad.
• Ability to find variants of an exploit.








Related Work
• Software model checking [Blast, SLAM, Magic, CBMC, Saturn]: Counter-example guided abstraction refinement. Exploits ≈ concrete counter-examples.
• Test generation [Beyer et al. ICSE04, Boyapati et al. ISSTA02]: Bounded exhaustive checking. Exploits can be used as test cases.
• Supercompilers [Massalin ASPLOS87, Joshi et al. PLDI02]: Model low-level semantics. Exhaustive checking to generate compact code sequences.
• Ad-hoc techniques [Thuemmel 2001, Newsham 2000]: No soundness guarantees. Cannot find variants.


Summary of Important Ideas

Exploit-finding requires modeling low-level details of the runtime system.

Exploit-finding can benefit vulnerability-finding tools.

Framework for producing API-level exploits.

Real-world instantiation: format-string exploits.

