Automatic Diagnosis and Response to Memory Corruption Vulnerabilities
Authors: Jun Xu, Peng Ning, Chongkyung Kil, Yan Zhai, Chris Bookholt
In ACM CCS’05
Presenter: Tai Do
CDA 6938, Spring 2007
Introduction ([KJB+06])
• Memory corruption vulnerability
  – Popular means to take control of a target program
  – 49% of all attacks in 2006*
  – Successful attacks result in remote code execution
  – Attack techniques: stack overruns, heap overflows, etc.
Why Randomization ([KJB+06])
• Most memory corruption attacks rely on absolute memory addresses.
• What is address space randomization?
  – Randomizes the layout of process memory
  – Makes the critical memory addresses unpredictable and breaks the hard-coded address assumption
• With address space randomization, a memory corruption attack will most likely cause a vulnerable program to crash, rather than allow the attacker to take control of the program.
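The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the buffer size, the fixed address, and the randomization range are hypothetical values chosen for illustration, and any miss is treated as a crash, whereas the slide only says a crash is the most likely outcome.

```python
import random

PAGE = 4096
BUFFER_SIZE = 512                  # hypothetical size of the vulnerable buffer
DEFAULT_BUFFER_ADDR = 0xBFFF_F000  # hypothetical fixed address without randomization
RANDOM_RANGE_PAGES = 1 << 12       # hypothetical number of possible page offsets


def buffer_address(rng=None):
    """Return the buffer address; shifted by a random page-aligned offset if rng is given."""
    if rng is None:
        return DEFAULT_BUFFER_ADDR
    return DEFAULT_BUFFER_ADDR - rng.randrange(RANDOM_RANGE_PAGES) * PAGE


def exploit_outcome(hard_coded_addr, actual_buffer_addr):
    """The attack takes over only if the hard-coded address lands inside the buffer.

    Simplification: every miss is modeled as a crash.
    """
    if actual_buffer_addr <= hard_coded_addr < actual_buffer_addr + BUFFER_SIZE:
        return "takeover"
    return "crash"


def count_crashes(trials, seed=0):
    """Run the same hard-coded exploit against freshly randomized layouts."""
    rng = random.Random(seed)
    guess = DEFAULT_BUFFER_ADDR + 16  # address the attacker baked into the exploit
    return sum(
        exploit_outcome(guess, buffer_address(rng)) == "crash" for _ in range(trials)
    )
```

Calling `exploit_outcome(DEFAULT_BUFFER_ADDR + 16, buffer_address())` models the non-randomized process, where the hard-coded address still points into the buffer; `count_crashes(1000)` models repeated attempts against randomized layouts.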
Buffer Overflow Attack ([N02])
[Figure: buffer overflow payload (NOP, attacker's code, retAddr) shown in a normal process layout and in a randomized process layout, with the instruction pointer (IP or PC), stack pointer, and base pointer marked. Adapted from [KJB+06].]
How to identify a security vulnerability from a program crash
• Open source web server ghttpd-1.4.
• Log(): server logging functionalities.
• Buffer overflow vulnerability: temp[]
• Address space randomization causes a working exploit to crash the server process.
• A manual debugging session (gdb) is time-consuming and difficult: the PC register and the stack trace are both corrupted.
What are the goals of this paper?
• The proposed approach:
  – Automatically diagnose memory corruption vulnerabilities (automatic backward tracing)
  – Automatically respond to attacks (automatic signature generation)
Outline
• System Description:
  – Modeling Memory Corruption Attacks
  – Diagnosing Memory Corruption Vulnerabilities (Monitor and Diagnosis Engine)
  – Automatic Response to Attacks (Signature Generator)
• Evaluation
• Conclusion: strengths, weaknesses, suggestions
System Overview
System Architecture
Modeling Memory Corruption Attacks
• Why model memory corruption attacks on a randomized program?
  – It is a useful abstract model to guide the later discussion.
  – The model is just a finite state machine.
  – It shows the possible cases that lead to a program crash.
Modeling: Two Cases with a Buffer Overflow Attack
Dereference a corrupted local variable
The corrupted return address is invalid, crash when the ret instruction is executed
Modeling: State Transition For A Memory Corruption
c: corrupting instruction
t: takeover instruction
f: faulting instruction
• Case 1 (green): Format String
• Cases 2 and 3 (red and blue): Buffer overflow
• Case 4 (purple): not sure
Diagnosing Memory Corruption Vulnerabilities
• Diagnosis means tracing backward to automatically locate the memory corruption vulnerability (the corrupting instruction).
• Similar to an automated debugging process.
Diagnosis = Trace Back
• Tracing back to the initial corrupting instruction consists of two steps:
  – Step 1: Convert Case-IV crashes to one of the three other cases
  – Step 2: Trace the corrupting instructions
• Locating the faulting instruction is critical in both steps. This is the starting point of the backward tracing.
Diagnosis: Locating the Faulting Instruction
[Figure labels: Simple Case, Complex Case]
• PC points to the address of the next instruction to be executed should f complete.
• The process image at the time of crash does not include the specific address of f.
• For the complex case: use monitored re-execution and programmable breakpoints.
Flow chart for identifying the Faulting instruction in the Complex Case
Diagnosis: Converting Case-IV Crashes
• Convert Case-IV crashes to one of the other cases.
• Idea: re-execution with non-overlapping memory layouts.
• Caveat:
  – Applicable to programs that use no more than 1/3 of the available address space, which is between 1 and 2 GB on a 32-bit system. Most network service applications have small memory footprints.
Diagnosis: Re-execution with non-overlapping memory layout
[Figure: the program is re-executed under a first, second, and third memory layout; the instruction sequences (c1, t1, f1) and (c2, t2, f2) are shown across the three layouts.]
• At least two of the three instances will be non-Case-IV crashes. These two instances must crash at the same faulting instruction.
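The counting argument behind the three re-executions can be sketched as follows. The sketch assumes, purely for illustration, that each layout occupies its own disjoint 1 GB region; the region size, the helper names, and the simplification that an address outside a layout's region is unmapped are all hypothetical and not taken from the paper.

```python
REGION_SIZE = 0x4000_0000  # hypothetical: 1 GB reserved for each layout

# Three pairwise disjoint [start, end) regions, one per re-execution.
LAYOUTS = [(i * REGION_SIZE, (i + 1) * REGION_SIZE) for i in range(3)]


def is_mapped(addr, layout):
    """Simplification: an address is valid only inside the layout's own region."""
    start, end = layout
    return start <= addr < end


def layouts_where_valid(addr):
    """Indices of the layouts in which a hard-coded attack address is mapped."""
    return [i for i, layout in enumerate(LAYOUTS) if is_mapped(addr, layout)]


def immediate_fault_count(addr):
    """Number of layouts in which using the address faults right away."""
    return len(LAYOUTS) - len(layouts_where_valid(addr))
```

Because the regions do not overlap, any single absolute address carried by the attack can be valid in at most one layout, so `immediate_fault_count` is always at least 2. That is the sense in which at least two of the three instances are non-Case-IV crashes.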
Diagnosis: Tracing the Corrupting Instructions
• We are left with three cases: I, II and III.
• We can use the faulting instruction and the malicious network inputs to eliminate the easier cases.
Diagnosis: Tracing the Corrupting Instructions
Cases III and IIB
• I do not fully understand the solution for tracing Cases III and IIB yet.
• The idea:
  – The solution cannot guarantee a trace back to the initial corrupting instruction if the data corrupted by this instruction is transformed in arbitrary ways before being used as a faulty address.
  – Nevertheless, the current solution works well in the experimental evaluation.
Automatic Response To Attacks
• Basic Message Signature:
  – The (invalid) address y that the corrupting instruction c tries to write to, and the value x that c is writing.
  – Use critical byte sequences from the attack for the message filter: x and/or y.
• Correlating Message Signature with Program Execution State:
  – Reduces false positives.
  – Attacks happen only at some specific server execution states.
  – Use the application's call stack trace as an indication of the server protocol state (program counter + return addresses).
Correlating Message Signature with Program Execution State
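A correlated filter of this kind can be sketched as below. This is a minimal illustration, not the paper's signature generator: the `Signature` type, the `should_drop` function, and the exact-match comparison of call stacks are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Signature:
    # Critical byte sequence taken from the attack (the value x and/or address y).
    critical_bytes: bytes
    # Call stack at the time of corruption: program counter followed by return addresses.
    call_stack: Tuple[int, ...]


def matches_message(message: bytes, sig: Signature) -> bool:
    """Basic message signature: the critical bytes appear in the incoming message."""
    return sig.critical_bytes in message


def should_drop(message: bytes, current_stack: Tuple[int, ...], sig: Signature) -> bool:
    """Correlated signature: drop only if the bytes match AND the server is in
    the execution state (approximated by its call stack) where the attack applies."""
    return matches_message(message, sig) and current_stack == sig.call_stack
```

The second condition is what limits false positives: a benign message that happens to contain the critical bytes is filtered only if it arrives while the server is in the vulnerable execution state.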
Experimental Evaluation: Effectiveness of Diagnosis
• Compare the call stack traces produced by the diagnosis algorithm with those obtained from manual code inspection and debugging.
• The diagnosis correctly identifies ALL the vulnerable functions at the time of corruption.
Experimental Evaluation: Automatic Response
• Complex protocol (OpenSSH): correlated message filtering helps.
• For ghttpd: binary signature vs. plain text URLs??? (no match for signatures)
Strengths
• Propose a reactive approach for handling memory corruptions.
• Supposedly much faster (lower overhead) than full program execution monitoring (TaintCheck).
Weaknesses
• No report on performance overhead due to implementation issues with the prototype system.
• False negatives are possible.
• Tracing corrupting instructions is not complete:
  – In theory, some are still untraceable after program crashes.
  – In practice, the current findings seem to work well.
Suggestions
• Obvious to-do list:
  – Fine-tune the system prototype and report the performance overhead.
  – Work on the not-yet-traceable scenarios for corrupting instructions.
Keywords to take home
• Randomized program, memory corruption attacks.
• State transition, memory corruption attack modeling
• Monitored re-execution and programmable breakpoints
• Re-execution with non-overlapping memory layouts
Thank you
Questions?
References
• [N02] Josef Nelißen. Buffer Overflows for Dummies, May 1, 2002.
• [KJB+06] Chongkyung Kil, Jinsuk Jun, Christopher Bookholt, Jun Xu, and Peng Ning. Address Space Layout Permutation (ASLP): Towards Fine-Grained Randomization of Commodity Software. ACSAC 2006, December 14, 2006. See also their short paper in DSN 2006.