Automatic Diagnosis and Response to Memory Corruption Vulnerabilities

Authors: Jun Xu, Peng Ning, Chongkyung Kil, Yan Zhai, Chris Bookholt

In ACM CCS’05

Presenter: Tai Do

CDA 6938, Spring 2007

Introduction ([KJB+06])

• Memory Corruption Vulnerability
– Popular means to take control of a target program
– 49% of all attacks in 2006*
– Successful attacks cause remote code execution
– Attack techniques: stack overrun, heap overflows, etc.

Why Randomization ([KJB+06])

• Most memory corruption attacks rely on absolute memory addresses.

• What is address space randomization?
– Randomizes the layout of process memory
– Makes the critical memory addresses unpredictable and breaks the hard-coded address assumption.

• With address space randomization, a memory corruption attack will most likely cause a vulnerable program to crash, rather than allow the attacker to take control of the program.
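The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the address range, page size, region size, and the function names (random_base, run_exploit) are hypothetical and do not come from the paper or from [KJB+06]. It models an exploit that jumps to one hard-coded address and checks whether that address is still mapped once the region's base is randomized.

```python
import random

PAGE = 0x1000                        # 4 KiB page granularity (assumed)
REGION_SIZE = 0x10000                # size of the simulated region (assumed)
LOW, HIGH = 0x40000000, 0xC0000000   # assumed range for randomized bases


def random_base(rng):
    """Pick a page-aligned base address for the simulated region."""
    pages = (HIGH - LOW - REGION_SIZE) // PAGE
    return LOW + rng.randrange(pages + 1) * PAGE


def is_mapped(addr, base):
    """True if addr falls inside the region mapped at base."""
    return base <= addr < base + REGION_SIZE


def run_exploit(hardcoded_addr, base):
    """'takeover' if the hard-coded address is mapped, otherwise 'crash'."""
    return "takeover" if is_mapped(hardcoded_addr, base) else "crash"


if __name__ == "__main__":
    rng = random.Random(2005)
    fixed_base = 0xBFFF0000
    target = fixed_base + 0x200      # address the attacker hard-codes
    print("fixed layout:", run_exploit(target, fixed_base))
    outcomes = [run_exploit(target, random_base(rng)) for _ in range(10000)]
    print("crash fraction under randomization:",
          outcomes.count("crash") / len(outcomes))
```

With these assumed sizes, only 16 of roughly half a million possible bases leave the hard-coded address mapped, so nearly every randomized run ends in a crash rather than a takeover.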

Buffer Overflow Attack ([N02])

[Figure: stack diagrams labeled with the instruction pointer (IP or PC), stack pointer, and base pointer, showing the attack payload (NOP, attacker's code, retAddr) in a normal process layout and in a randomized process layout. Adapted from [KJB+06].]

How to identify a security vulnerability from a program crash

• Open source web server ghttpd-1.4.

• Log(): server logging functionalities.

• Buffer overflow vulnerability: temp[]

• Address space randomization causes a working exploit to crash the server process.

• A manual debugging session (gdb) is time-consuming and difficult: both the PC register and the stack trace are corrupted.
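Why the crash leaves so little to debug can be seen in a toy model of a stack frame. This is a simulation, not ghttpd's code: the buffer size, the frame layout, and the helper names (make_frame, unchecked_copy, function_return) are assumptions made for illustration. An unchecked copy into a temp[]-style buffer overwrites the saved return address, and the later return jumps to an address that is no longer mapped.

```python
import struct

BUF_SIZE = 16   # size of the simulated temp[] buffer (assumed)


def make_frame(return_addr):
    """Simulated frame: temp[] followed by a 32-bit saved return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_addr))


def unchecked_copy(frame, data):
    """Copy data into the frame with no bounds check against BUF_SIZE."""
    end = min(len(data), len(frame))
    frame[:end] = data[:end]


def saved_return(frame):
    """Read back the saved return address stored after temp[]."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]


def function_return(frame, mapped):
    """Simulate ret: jump to the saved address, or crash if it is unmapped."""
    addr = saved_return(frame)
    if addr in mapped:
        return f"jump to {addr:#x}"
    return f"crash: invalid PC {addr:#x}"


if __name__ == "__main__":
    frame = make_frame(0x08049000)
    payload = b"A" * BUF_SIZE + struct.pack("<I", 0xBFFFF000)
    unchecked_copy(frame, payload)              # overflow past temp[]
    mapped = range(0x08048000, 0x0804A000)      # assumed mapped code range
    print(function_return(frame, mapped))
```

After the crash, the PC holds the attacker-supplied value and the frame contents are overwritten, which mirrors the corrupted PC register and stack trace noted on the slide.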

What are the goals of this paper?

• The proposed approach:
– Automatically diagnose memory corruption vulnerabilities (automatic backward tracing)
– Automatically respond to attacks (automatic signature generation)

Outline

• System Description:
– Modeling Memory Corruption Attacks
– Diagnosing Memory Corruption Vulnerabilities (Monitor and Diagnosis Engine)
– Automatic Response to Attacks (Signature Generator)

• Evaluation

• Conclusion: strengths, weaknesses, suggestions

System Overview

[Figure: system architecture]

Modeling Memory Corruption Attacks

• Why model memory corruption attacks on a randomized program?
– It is a useful abstract model to guide the later discussion.
– The model is just a finite state machine.
– It shows the possible cases that lead to a program crash.

Modeling: Two Cases with a Buffer Overflow Attack

• Dereferencing a corrupted local variable

• The corrupted return address is invalid, so the program crashes when the ret instruction is executed

Modeling: State Transition For A Memory Corruption

c: corrupting instruction
t: takeover instruction
f: faulting instruction

• Case 1 (green): Format String

• Case 2 and 3 (red and blue): buffer overflow

• Case 4 (purple): not sure
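Since the model is a finite state machine over the events c, t, and f, a minimal sketch can make the idea concrete. The state names and the transition table below are assumptions chosen for illustration; they are not copied from the paper's state transition figure, and no attempt is made to map paths onto the four colored cases.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    CORRUPTED = "corrupted"
    TAKEN_OVER = "taken over"
    CRASHED = "crashed"


# (state, event) -> next state. Events follow the slide's legend:
# c = corrupting instruction, t = takeover instruction, f = faulting instruction.
# The table itself is an assumed, simplified model.
TRANSITIONS = {
    (State.NORMAL, "c"): State.CORRUPTED,
    (State.CORRUPTED, "c"): State.CORRUPTED,
    (State.CORRUPTED, "t"): State.TAKEN_OVER,
    (State.CORRUPTED, "f"): State.CRASHED,
    (State.TAKEN_OVER, "f"): State.CRASHED,
}


def run(events, start=State.NORMAL):
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state.value} on {event!r}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(run("cf").value)    # corruption, then a fault before any takeover
    print(run("ctf").value)   # corruption, takeover, then a fault
```

In this simplified model, every path that ends in a crash passes through a corrupting instruction first, which is why diagnosis works backward from f toward c.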

Diagnosing Memory Corruption Vulnerabilities

• Diagnosis means backward tracing to automatically locate the memory corruption vulnerabilities (the corrupting instruction)

• Similar to automated debugging process.

Diagnosis = Trace Back

• Tracing back to the initial corrupting instruction consists of two steps:
– Step 1: Convert Case-IV crashes to one of the three other cases
– Step 2: Trace the corrupting instructions.

• Locating the faulting instruction is critical in both steps. This is the starting point of the backward tracing.

Diagnosis: Locating the Faulting Instruction

[Figure labels: Simple Case, Complex Case]

• PC points to the address of the next instruction that would be executed if f completed.

• The process image at the time of the crash does not include the specific address of f.

• For the complex case: use monitored re-execution and programmable breakpoints.

[Figure: flow chart for identifying the faulting instruction in the complex case]
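The idea of recovering f through monitored re-execution can be sketched on a toy machine. Everything here is an assumption made for illustration: the program is a dictionary mapping an instruction address to the next address, and a monitor callback stands in for breakpoints. It is not the paper's flow chart. The crash only reveals the invalid PC, while a monitored second run records the last instruction that executed.

```python
def execute(program, entry, monitor=None):
    """Run a toy program given as {address: next_address}.

    Execution 'crashes' when the PC leaves the program, i.e. points to an
    unmapped address. Returns that crash PC. The toy program must not loop.
    """
    pc = entry
    while pc in program:
        if monitor is not None:
            monitor(pc)          # breakpoint-style hook before each instruction
        pc = program[pc]
    return pc


def locate_faulting(program, entry):
    """Re-execute under a monitor and return the last executed address."""
    last = []

    def monitor(pc):
        last[:] = [pc]           # remember only the most recent instruction

    execute(program, entry, monitor)
    return last[0] if last else None


if __name__ == "__main__":
    # 0x108 plays the role of a ret whose return address was corrupted.
    program = {0x100: 0x104, 0x104: 0x108, 0x108: 0xDEADBEEF}
    print(hex(execute(program, 0x100)))          # crash PC, says nothing about f
    print(hex(locate_faulting(program, 0x100)))  # address of f
```

The unmonitored run ends with the corrupted target in the PC; the monitored run identifies 0x108 as the instruction that transferred control there.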

Diagnosis: Converting Case-IV Crashes

• Convert Case-IV crashes to other cases

• Idea: re-execution with non-overlapping memory layouts.

• Caveat:
– Applicable to programs that use no more than 1/3 of the available address space, which is between 1 and 2 GB on a 32-bit system. Most network service applications have small memory footprints.

Diagnosis: Re-execution with non-overlapping memory layout

[Figure: three non-overlapping memory layouts (first, second, third), annotated with the instruction sequences "c1, t1, f1", "c1, t1", "c1, t1" and "c2, t2", "c2, t2, f2", "c2, t2".]

At least two of the three instances will be non-Case-IV crashes. These two instances must crash at the same faulting instruction.
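The "at least two of three" argument can be checked with a short sketch. The assumptions are loud here: each layout is modeled as one disjoint third of an address range, a run is treated as Case-IV-like only when the attack's hard-coded address happens to be mapped in that layout, and the names (non_overlapping_layouts, rerun, common_faulting_instruction) are hypothetical. Because the layouts are disjoint, any single address is mapped in at most one of them, so at least two runs agree on the faulting instruction.

```python
from collections import Counter


def non_overlapping_layouts(space_start, space_end, n=3):
    """Split [space_start, space_end) into n disjoint address ranges."""
    size = (space_end - space_start) // n
    return [range(space_start + i * size, space_start + (i + 1) * size)
            for i in range(n)]


def rerun(layout, hardcoded_addr, faulting="f"):
    """Toy re-execution of the attack under one layout.

    If the hard-coded address is mapped, control is hijacked and the crash
    happens at some unrelated, layout-specific site (Case-IV-like).
    Otherwise the run crashes at the faulting instruction.
    """
    if hardcoded_addr in layout:
        return f"late-crash@{layout.start:#x}"
    return faulting


def common_faulting_instruction(layouts, hardcoded_addr):
    """Return the crash site shared by at least two runs, or None."""
    sites = Counter(rerun(layout, hardcoded_addr) for layout in layouts)
    site, count = sites.most_common(1)[0]
    return site if count >= 2 else None


if __name__ == "__main__":
    layouts = non_overlapping_layouts(0x0, 0xC0000000)
    for addr in (0x10, 0x50000000, 0xBFFFFFFF):
        print(hex(addr), common_faulting_instruction(layouts, addr))
```

This also shows where the 1/3 caveat comes from: the program's memory has to fit inside a single third for three disjoint layouts to exist.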

Diagnosis: Tracing the Corrupting Instructions

• We are left with three cases: I, II and III

• We can use the faulting instruction and the malicious network inputs to eliminate the easier cases.

Diagnosis: Tracing the Corrupting Instructions

Case III and IIB

• I don't quite get the solution for tracing Case III and IIB yet.

• The idea:
– The solution cannot guarantee tracing back to the initial corrupting instruction if the data corrupted by this instruction is transformed in arbitrary ways before being used as a faulty address.
– Nevertheless, the current solution works well in the experimental evaluation.

Outline

• System Description:
– Modeling Memory Corruption Attacks
– Diagnosing Memory Corruption Vulnerabilities
– Automatic Response to Attacks

• Evaluation

• Conclusion: strengths, weaknesses, suggestions

Automatic Response To Attacks

• Basic Message Signature:
– The (invalid) address y that the corrupting instruction c tries to write to and the value x that c is writing.
– Use critical byte sequences from the attack for the message filter: x and/or y.

• Correlating Message Signature with Program Execution State:
– Reduces false positives.
– Attacks happen only at some specific server execution states.
– Use the application's call stack trace as an indication of the server protocol state (program counter + return addresses).
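A correlated filter of this kind might look like the following sketch. The Signature type, its field names, and should_block are hypothetical, and the addresses in the example are made up. A message is blocked only when it contains one of the critical byte sequences and the server's current call stack matches the stack recorded at the vulnerable state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Signature:
    # Critical byte sequences taken from the attack (x and/or y).
    byte_patterns: Tuple[bytes, ...]
    # Program counter plus return addresses at the vulnerable state.
    call_stack: Tuple[int, ...]


def should_block(message: bytes, current_stack: Tuple[int, ...],
                 sig: Signature) -> bool:
    """Block only if the execution state matches and a byte pattern occurs."""
    if current_stack != sig.call_stack:
        return False             # different protocol state: let it through
    return any(pattern in message for pattern in sig.byte_patterns)


if __name__ == "__main__":
    sig = Signature(
        byte_patterns=(b"\x00\xf0\xff\xbf",),   # 0xBFFFF000, little-endian
        call_stack=(0x08049A10, 0x08048F33),    # made-up addresses
    )
    attack = b"GET /" + b"A" * 32 + b"\x00\xf0\xff\xbf"
    print(should_block(attack, (0x08049A10, 0x08048F33), sig))
    print(should_block(attack, (0x0804A000,), sig))
```

Checking the call stack first is the design choice that reduces false positives: the same bytes arriving in a different protocol state are not treated as an attack.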

Correlating Message Signature with Program Execution State

Outline


• Evaluation


Experimental Evaluation: Effectiveness of Diagnosis

• Compare the call stack traces from the diagnosis algorithm with those from manual code inspection and debugging.

• The diagnosis correctly identifies ALL the vulnerable functions at the time of corruption.

Experimental Evaluation: Automatic Response

• Complex protocol (OpenSSH): correlated message filtering helps.

• For ghttpd: binary signature vs. plain text URLs??? (no match for signatures)

Outline


• Evaluation


Strengths

• Propose a reactive approach for handling memory corruptions.

• Supposedly much faster (lower overhead) than full program execution monitoring (TaintCheck).

Weaknesses

• No report on performance overhead due to implementation issues with the prototype system.

• False negatives are possible.

• Tracing corrupting instructions is not complete:
– In theory, some are still untraceable after program crashes.
– In practice, current findings seem to work well.

Suggestions

• Obvious to-do list:
– Fine-tune the system prototype. Report performance overhead.
– Work on not-yet-traceable scenarios for corrupting instructions.

Keywords to take home

• Randomized program, memory corruption attacks.

• State transition, memory corruption attack modeling

• Monitored re-execution and programmable breakpoints

• Re-execution with non-overlapping memory layouts

Thank you

Questions?

References

• [N02] Josef Nelißen. Buffer Overflows for Dummies, May 1, 2002.

• [KJB+06] Chongkyung Kil, Jinsuk Jun, Christopher Bookholt, Jun Xu, and Peng Ning. Address Space Layout Permutation (ASLP): Towards Fine-grained Randomization of Commodity Software, ACSAC 06 (Dec 14, 2006) (.pdf and .ppt, plus their short paper in DSN 2006)
