SMU SRG reading by Tey Chee Meng:

Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications

by David Brumley, Pongsin Poosankam, Dawn Song, Jiang Zheng

What the paper is trying to achieve

Given 2 binaries

• Program P:

    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    ptr = realloc (ptr, s);
    /* use of ptr */

• Program P':

    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {
        /* exit with error */
    }
    ptr = realloc (ptr, s);
    /* use of ptr */
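As a rough sketch of why the added check matters, the size computation shared by both programs can be isolated and tested on its own. The function names below are hypothetical, and the sketch assumes that input and s are unsigned 32-bit values, so that the addition wraps modulo 2^32.

```c
#include <stdint.h>

/* Size computation common to P and P' (name is illustrative only).
   Assumes unsigned 32-bit arithmetic: the addition wraps modulo 2^32. */
uint32_t compute_size(uint32_t input)
{
    uint32_t s;
    if (input % 2 == 0) {
        s = input + 2u;
    } else {
        s = input + 3u;
    }
    return s;
}

/* Returns 1 if the check added in P' would reject this input, else 0.
   P has no such check and would pass the wrapped size on to realloc. */
int patched_rejects(uint32_t input)
{
    uint32_t s = compute_size(input);
    return s <= input;
}
```

Under these assumptions, patched_rejects(4) is 0 (s = 6), while patched_rejects(0xFFFFFFFE) is 1 because s wraps around to 0.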

Create an 'exploit'

• Exploit as defined by the paper:
– input that crashes P
– input causing information leakage
– input that hijacks control flow

• Note: 'exploit' as defined by the paper is not the same as 'exploit' as used in the security community, which assumes:
– something usable
– something that bypasses all countermeasures

• Halvar Flake used the term "vulnerability trigger"

How it was done

Step 1: Compare the binary differences

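• Program P:

    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    ptr = realloc (ptr, s);
    /* use of ptr */

• Program P':

    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {
        /* exit with error */
    }
    ptr = realloc (ptr, s);
    /* use of ptr */

• Difference: P' adds the check (s <= input) before the call to realloc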

Step 2: Determine which is the vulnerable point

• Concerned with input sanitisation that is missing in P but added in P'

• Where there are many changes, use heuristics:
– minimal change => likely to be added input sanitisation
– lots of changes => maybe a new feature


    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {
        /* exit with error */
    }
    ptr = realloc (ptr, s);
    /* use of ptr */

vul point

Step 3: Determine path(s) to the vulnerable point

• Path 1:
– start point
– (input % 2 == 0) is true
– s = input + 2
– (s <= input) is true
– vulnerable point






• Path 2:
– start point
– (input % 2 == 0) is false
– s = input + 3
– (s <= input) is true
– vulnerable point






• Not individual paths, but a graph of many paths






• Single paths can be found via dynamic tracing, i.e. by monitoring the sequence of steps executed on normal input

• Control flow graphs (CFG) are determined via static analysis

• Combination:
– find a single path dynamically
– choose any step in the path
– determine statically the partial CFG from that step to the vulnerable point
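The "partial CFG from a chosen step to the vulnerable point" can be sketched as the set of nodes that are reachable from the chosen step and can also reach the target. The code below is only an illustration on a hand-written CFG of the running example: the node names, the adjacency matrix and the function names are all assumptions made here, not the paper's tooling, and the added check is taken as the target purely for the sake of the example.

```c
/* Hypothetical node numbering for the running example (P'). */
enum {
    ENTRY_BRANCH,  /* if (input % 2 == 0) */
    THEN_ASSIGN,   /* s = input + 2       */
    ELSE_ASSIGN,   /* s = input + 3       */
    NEW_CHECK,     /* if (s <= input)     */
    ERROR_EXIT,    /* exit with error     */
    REALLOC_USE,   /* realloc, use of ptr */
    N_NODES
};

/* cfg_edge[i][j] == 1 means control can flow from node i to node j. */
const int cfg_edge[N_NODES][N_NODES] = {
    /* ENTRY_BRANCH */ {0, 1, 1, 0, 0, 0},
    /* THEN_ASSIGN  */ {0, 0, 0, 1, 0, 0},
    /* ELSE_ASSIGN  */ {0, 0, 0, 1, 0, 0},
    /* NEW_CHECK    */ {0, 0, 0, 0, 1, 1},
    /* ERROR_EXIT   */ {0, 0, 0, 0, 0, 0},
    /* REALLOC_USE  */ {0, 0, 0, 0, 0, 0},
};

/* Depth-first marking of every node reachable from `from`, following
   edges forward (backward == 0) or against their direction (backward == 1). */
void mark_reachable(int from, int backward, int seen[N_NODES])
{
    seen[from] = 1;
    for (int next = 0; next < N_NODES; next++) {
        int connected = backward ? cfg_edge[next][from] : cfg_edge[from][next];
        if (connected && !seen[next]) {
            mark_reachable(next, backward, seen);
        }
    }
}

/* Sets in_slice[n] to 1 iff node n lies on some path from `step` to `target`. */
void partial_cfg(int step, int target, int in_slice[N_NODES])
{
    int fwd[N_NODES] = {0};
    int bwd[N_NODES] = {0};
    mark_reachable(step, 0, fwd);    /* nodes reachable from the chosen step */
    mark_reachable(target, 1, bwd);  /* nodes that can reach the target      */
    for (int n = 0; n < N_NODES; n++) {
        in_slice[n] = fwd[n] && bwd[n];
    }
}
```

Starting from THEN_ASSIGN (a step seen in a dynamic trace of an even input), the slice contains only THEN_ASSIGN and NEW_CHECK; starting from ENTRY_BRANCH it also pulls in ELSE_ASSIGN, i.e. both paths.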

Step 4: Generate constraint formula

• From the start point to the vulnerable point: the sequence of conditions that are met in P', but not in P
– (input % 2 == 0) is true
– s = input + 2
– (s <= input) is true

• Constraint formula:
– (input % 2 == 0) is true AND (s <= input) is true AND s = input + 2

• Possible to generate constraint formula over a CFG
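To make the formulas concrete, each path can be written as a predicate over the input, and a formula over both paths as their disjunction. This is a sketch under the assumption of unsigned 32-bit arithmetic; the function names are made up here, and a plain disjunction of path formulas is only the simplest way to cover a CFG, not necessarily the encoding used in the paper.

```c
#include <stdint.h>

/* Path 1: (input % 2 == 0) AND s = input + 2 AND (s <= input). */
int path1_formula(uint32_t input)
{
    uint32_t s = input + 2u;  /* wraps modulo 2^32 */
    return (input % 2 == 0) && (s <= input);
}

/* Path 2: (input % 2 != 0) AND s = input + 3 AND (s <= input). */
int path2_formula(uint32_t input)
{
    uint32_t s = input + 3u;  /* wraps modulo 2^32 */
    return (input % 2 != 0) && (s <= input);
}

/* Formula over both paths of the example CFG: either path may be taken. */
int cfg_formula(uint32_t input)
{
    return path1_formula(input) || path2_formula(input);
}
```

An input satisfies cfg_formula exactly when it reaches the added check along one of the two paths and fails it.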





Step 5: Give constraint formula to solver for solution

• Constraint solving is NP-hard => the larger the constraint formula, the longer it can take to solve (exponential time in the worst case)

• Solution of example constraint formula:
– (input % 2 == 0) is true AND (s <= input) is true
– where s = input + 2
– addition is mod 2^32
– possible answer: input = 2^32 - 2

• Polymorphic exploit: solve the new constraint formula:
– (input % 2 == 0) is true AND (s <= input) is true AND (input != solutions_we_already_know)
– where s = input + 2
– addition is mod 2^32
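The solving step can be imitated on the toy example by a brute-force search that stands in for a real solver. The sketch below is an assumption-laden illustration: the function names are invented, arithmetic is unsigned 32-bit, the formula covers both branches of the example, and the search is restricted to a caller-supplied range because enumerating all 2^32 inputs is exactly the kind of cost a solver is meant to avoid. The known[] list plays the role of the (input != solutions_we_already_know) clause.

```c
#include <stddef.h>
#include <stdint.h>

/* Example formula over both branches: s wraps modulo 2^32 and (s <= input) holds. */
int triggers_check(uint32_t input)
{
    uint32_t s = (input % 2 == 0) ? input + 2u : input + 3u;
    return s <= input;
}

/* Searches [lo, hi] from the top down for an input that satisfies the formula
   and is not listed in known[]. Requires lo <= hi. Returns 1 and stores the
   input in *out on success, 0 if no such input exists in the range. */
int find_solution(uint32_t lo, uint32_t hi,
                  const uint32_t *known, size_t n_known, uint32_t *out)
{
    uint32_t input = hi;
    for (;;) {
        int excluded = 0;
        for (size_t i = 0; i < n_known; i++) {
            if (known[i] == input) {
                excluded = 1;
            }
        }
        if (!excluded && triggers_check(input)) {
            *out = input;
            return 1;
        }
        if (input == lo) {
            return 0;  /* range exhausted */
        }
        input--;
    }
}
```

In this toy example the even branch wraps only for input = 2^32 - 2, so any further solutions come from the odd branch (2^32 - 3 and 2^32 - 1); once all three are excluded, the search over the top of the range finds nothing.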

Step 6: Verify the 'exploit'

• There exist engines (TEMU) that can verify certain security policies, e.g. whether a return address on the stack is overwritten

• Verification:
– run the software under the engine with the specified policy
– feed the 'exploit' input
– examine the results of the engine
– if negative, and other paths exist, try other paths
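The verification loop can be sketched as trying one candidate input per path until one violates the policy. Everything named below is hypothetical: the callback type stands in for whatever a real engine reports, and size_wrapped is a stand-in policy for the running example (it flags a wrapped allocation size under unsigned 32-bit arithmetic), not an actual engine interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy callback: returns 1 if running the program on `input`
   violated the monitored policy, 0 otherwise. */
typedef int (*policy_violated_fn)(uint32_t input);

/* Tries the candidates in order (e.g. one per path). Returns the index of the
   first candidate that violates the policy, or -1 if none of them does. */
int verify_candidates(const uint32_t *candidates, size_t n,
                      policy_violated_fn violated)
{
    for (size_t i = 0; i < n; i++) {
        if (violated(candidates[i])) {
            return (int)i;
        }
    }
    return -1;
}

/* Stand-in policy for the running example: the allocation size wrapped. */
int size_wrapped(uint32_t input)
{
    uint32_t s = (input % 2 == 0) ? input + 2u : input + 3u;
    return s <= input;
}
```

With candidates {4, 0xFFFFFFFC, 0xFFFFFFFE} and size_wrapped as the policy, the first two are rejected as non-triggering and the third is reported.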

3rd party comments (Robert Graham, Halvar Flake)

• 'Exploit' as stated in the paper is not the same as 'exploit' as used by others

• Able to generate input that triggers a vulnerability

• Not yet a usable exploit that can:
– defeat security mechanisms (chk_esp (), safe_unlink ())
– steal info for info-leakage, or provide the equivalent of shell code for hijacking control flow

• Useful, but not yet ready to generate the equivalent of a worm. The paper overstated the impact

• Practical cases may involve large constraints beyond the capability of the solver

• The automated part is the least time-consuming step in developing usable exploits

My comments

• Output of binary difference: which one is relevant?

• For the GDI vulnerability test case:
– vulnerable procedure: GetEvent ()
– static analysis start point: CopyMetaFileW ()
– remember that the solver cannot solve large constraints quickly, or it may run out of memory
– how to automate finding a suitable start point for the static case?

Conclusion

• Novel approach

• Overstated claims
