![Page 1: Labeling Library Functions in Stripped Binariespages.cs.wisc.edu/~jacobson/pubs/jacobson-paste11-slides.pdf · Building Semantic Descriptors Labeling Library Functions in Stripped](https://reader030.vdocuments.us/reader030/viewer/2022040310/5d32573e88c9937a3b8d9613/html5/thumbnails/1.jpg)
PASTE 2011
Szeged, Hungary
September 5, 2011
Labeling Library Functions
in Stripped Binaries
Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller
Computer Sciences Department
University of Wisconsin - Madison
Why Binary Code?
o Source code isn’t available
o Source code isn’t the right representation
Binary Tools Need Symbol Tables
o Debugging Tools
  o GDB, IDA Pro, …
o Instrumentation Tools
  o PIN, Dyninst, …
o Static Analysis Tools
  o CodeSurfer/x86, …
o Security Analysis Tools
  o IDA Pro, …
Restoring Information
Function locations
Complicated by:
o Missing symbol information
o Variability in function layout (e.g. code sharing, outlined basic blocks)
o High degree of indirect control flow
[Figure: program binary with recovered functions labeled targ80c3bd0, targ80c3df4, targ80c3df4]
Restoring Information
What about semantic information?
o A program’s interaction with the operating system (system calls) is encapsulated by wrapper functions
Library fingerprinting: identify functions based on patterns learned from exemplar libraries
[Figure: program binary with functions labeled targ80c3bd0, targ80c3df4, targ80c3df4]
unstrip: stripped binary parsing + library fingerprinting + binary rewriting
[Figure: program binary in which the functions targ80c3bd0, targ80c3df4, targ80c3df4 receive library names such as getpid and accept]
<accept>:
  mov %ebx,%edx          # save registers
  mov $0x66,%eax         # set up system call arguments
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80              # invoke a system call
  mov %edx,%ebx          # error check and return
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
The same function can be realized in a variety of ways in the binary.
Builds compared: glibc 2.5 on RHEL with GCC 3.4.4, glibc 2.5 on RHEL with GCC 4.1.2, glibc 2.2.4 on RHEL with GCC 2.95.3
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
<accept>:
  cmpl $0x0,%gs:0xc
  jne 80f669c
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  call *0x814e93c
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
  push %esi
  call enable_asynccancel
  mov %eax,%esi
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x8(%esp),%ecx
  call *0x8181578
  mov %edx,%ebx
  xchg %eax,%esi
  call disable_asynccancel
  mov %esi,%eax
  pop %esi
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
<accept>:
  cmpl $0x0,%gs:0xc
  jne 80f669c
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
  push %esi
  call enable_asynccancel
  mov %eax,%esi
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x8(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  xchg %eax,%esi
  call disable_asynccancel
  mov %esi,%eax
  pop %esi
  cmp $0xffffff83,%eax
  jae syscall_error
  ret
Binary-level Code Variations
o Function inlining
o Code reordering
o Minor code changes
o Alternative code sequences
Semantic Descriptors
o Rather than recording byte patterns, we take a semantic approach
o Record information that is likely to be invariant across multiple versions of the function
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  mov %edx,%ebx
  cmp $0xffffff83,%eax
  jae 8048300
  ret
  mov %esi,%esi
Instructions that determine the descriptor: mov $0x66,%eax; mov $0x5,%ebx; int $0x80
Semantic descriptor: {<socketcall, 5>}
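One way to picture a semantic descriptor is as a collection of system call records, each holding a call name and whatever argument values are known. The sketch below is only an illustration of that idea: `SyscallRecord`, `make_record`, and the `SYSCALL_NAMES` excerpt are hypothetical names, not part of unstrip.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical excerpt of the 32-bit x86 Linux system call table
# (number -> name); only calls that appear on the slides are listed.
SYSCALL_NAMES = {0x04: "write", 0x05: "open", 0x06: "close",
                 0x14: "getpid", 0x58: "reboot", 0x66: "socketcall"}


class SyscallRecord(NamedTuple):
    """One <name, arg, ...> entry of a semantic descriptor."""
    name: str
    args: Tuple[Optional[object], ...] = ()  # None stands for an unknown "?"

    def __str__(self) -> str:
        parts = [self.name] + ["?" if a is None else str(a) for a in self.args]
        return "<" + ", ".join(parts) + ">"


def make_record(eax: int, *args: Optional[object]) -> SyscallRecord:
    """Build a record from the %eax value at `int $0x80` plus known arguments."""
    return SyscallRecord(SYSCALL_NAMES[eax], tuple(args))


accept_record = make_record(0x66, 5)
print(accept_record)  # <socketcall, 5>
```

Byte-level details such as register allocation or instruction order do not appear in the record, which is what makes it a candidate for surviving recompilation.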
Building Semantic Descriptors
We parse an input binary, locate system calls and wrapper function calls, and employ dataflow analysis.
reboot:
  push %ebp
  mov %esp,%ebp
  sub $0x10,%esp
  push %edi
  push %ebx
  mov 0x8(%ebp),%edx
  mov $0xfee1dead,%edi
  mov $0x28121969,%ecx
  push %ebx
  mov %edi,%ebx
  mov $0x58,%eax
  int $0x80
  …
Register values at the system call:
  EAX = 0x58 (reboot)
  EBX = %edi = 0xfee1dead
  ECX = 0x28121969
Semantic descriptor: {<reboot, 0xfee1dead, 0x28121969>}
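The register values at the system call can be recovered by tracking constants forward through the instructions. The sketch below is a deliberately simplified stand-in for the dataflow analysis the slide mentions: it handles only straight-line code, and the tuple encoding of instructions and the name `registers_at_syscall` are assumptions made for illustration.

```python
def registers_at_syscall(instructions):
    """Forward constant propagation over straight-line code.

    Each instruction is (opcode, source, destination). An immediate is an
    int; a register is a str such as "%eax". Returns the registers whose
    constant value is known at the first `int $0x80`, or None if absent.
    """
    known = {}
    for opcode, src, dst in instructions:
        if opcode == "int" and src == 0x80:
            return known
        writes_register = isinstance(dst, str) and dst.startswith("%")
        if not writes_register:
            continue  # push or a store to memory: tracked registers unchanged
        if opcode == "mov" and isinstance(src, int):
            known[dst] = src                 # immediate loaded into register
        elif opcode == "mov" and src in known:
            known[dst] = known[src]          # copy of a known constant
        else:
            known.pop(dst, None)             # value no longer a known constant
    return None


# The reboot listing from the slide, in the hypothetical tuple encoding.
REBOOT = [
    ("push", "%ebp", None),
    ("mov", "%esp", "%ebp"),
    ("sub", 0x10, "%esp"),
    ("push", "%edi", None),
    ("push", "%ebx", None),
    ("mov", "0x8(%ebp)", "%edx"),
    ("mov", 0xFEE1DEAD, "%edi"),
    ("mov", 0x28121969, "%ecx"),
    ("push", "%ebx", None),
    ("mov", "%edi", "%ebx"),
    ("mov", 0x58, "%eax"),
    ("int", 0x80, None),
]

regs = registers_at_syscall(REBOOT)
print(hex(regs["%eax"]), hex(regs["%ebx"]), hex(regs["%ecx"]))
```

A real analysis would also have to follow branches and merge values at join points; this sketch ignores control flow entirely.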
Building Semantic Descriptors Recursively
sethostid:
  …
  call open
  …
  call write
  …
  mov $0x6,%eax
  int $0x80
  …
Descriptor from the direct system call: {<close>}
open:
  …
  mov $0x5,%eax
  int $0x80
  …
Descriptor from the call to open: {<open, "/etc/hostid", 577, 420>}
write:
  …
  mov $0x4,%eax
  int $0x80
  …
Descriptor from the call to write: {<write, ?, ?, 4>}
Combined descriptor for sethostid: {<close>, <open, "/etc/hostid", 577, 420>, <write, ?, ?, 4>}
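The recursive construction can be sketched as a walk over the call graph that unions each callee's records into the caller's descriptor. The `FUNCTIONS` table and `build_descriptor` below are hypothetical, and the model assumes a wrapper passes its incoming arguments straight to its system call, which is a simplification.

```python
# Hypothetical model of the three functions on the slide.
# "syscalls": system calls invoked directly in the function body.
# "calls": (callee, argument values known at the call site); None = unknown.
FUNCTIONS = {
    "open": {"syscalls": ["open"], "calls": []},
    "write": {"syscalls": ["write"], "calls": []},
    "sethostid": {
        "syscalls": ["close"],
        "calls": [
            ("open", ("/etc/hostid", 577, 420)),
            ("write", (None, None, 4)),
        ],
    },
}


def build_descriptor(name, args=(), visiting=frozenset()):
    """Return the set of (syscall, arg, ...) tuples reachable from `name`."""
    if name in visiting:  # stop if the call graph contains a cycle
        return set()
    info = FUNCTIONS[name]
    # Direct system calls are recorded with the function's incoming arguments.
    descriptor = {(syscall,) + tuple(args) for syscall in info["syscalls"]}
    for callee, call_args in info["calls"]:
        descriptor |= build_descriptor(callee, call_args, visiting | {name})
    return descriptor


sethostid_descriptor = build_descriptor("sethostid")
```

For `sethostid` this yields the three records shown on the slide, with `None` in place of each `?`.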
Building a Descriptor Database
unstrip processes a glibc reference library in two steps:
1. Locate wrapper functions
2. Build semantic descriptors
<accept>:
  mov %ebx,%edx
  mov $0x66,%eax
  mov $0x5,%ebx
  lea 0x4(%esp),%ecx
  int $0x80
  …
The resulting Descriptor Database maps descriptors to function names:
{<socketcall, 5>}: accept
{<socketcall, 4>}: listen
{<getpid>}: getpid
…
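The database can be thought of as a map from descriptors to the function names that produce them in a reference library. The sketch below assumes the descriptors have already been built; `build_database` and the dictionary layout are illustrative choices, not the tool's actual storage format.

```python
from collections import defaultdict


def build_database(reference_library):
    """Map each semantic descriptor to the names of functions that produce it.

    `reference_library` maps a function name (taken from the symbol table of
    an unstripped reference library) to its descriptor, given as an iterable
    of system call tuples.
    """
    database = defaultdict(set)
    for function_name, descriptor in reference_library.items():
        # Sort so that the key is hashable and independent of record order.
        key = tuple(sorted(descriptor, key=repr))
        database[key].add(function_name)
    return dict(database)


# Entries taken from the slide.
glibc_reference = {
    "accept": [("socketcall", 5)],
    "listen": [("socketcall", 4)],
    "getpid": [("getpid",)],
}
ddb = build_database(glibc_reference)
```

Storing a set of names per descriptor leaves room for functions that turn out to be indistinguishable.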
Building a Descriptor Database
The same two steps (locate wrapper functions, build semantic descriptors) are applied to each of several glibc reference libraries, and the descriptors from every library are added to the Descriptor Database:
{<socketcall, 5>}: accept
{<socketcall, 4>}: listen
{<getpid>}: getpid
…
Pattern Matching Criteria
o Two stages
  1) Exact matches
  2) Best match based on coverage criterion
o Handle minor code variations by allowing flexible matches
Pattern Matching Criteria
$$\mathrm{coverage}(A, B) = \frac{|A \cap B|}{|B|}, \qquad A \cap B = \{\, b \in B \mid b \in A \,\}$$
A (fingerprint from the database): {<socketcall,5>}
B (semantic descriptor from the code): {<socketcall,5>, <socketcall,5>, <futex>}
A ∩ B = {<socketcall,5>, <socketcall,5>}; <futex> is not covered.
$$\mathrm{coverage}(A, B) = \frac{2}{3}$$
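The coverage formula can be checked with a few lines of Python. In this sketch B is kept as a list so that repeated records count each time, which is what the 2/3 result in the example requires; the tuple encoding of records is an assumption.

```python
from fractions import Fraction


def coverage(a, b):
    """coverage(A, B) = |A ∩ B| / |B|, with A ∩ B = { b in B | b in A }.

    `b` is treated as a multiset (a list): repeated entries count each time.
    """
    if not b:
        raise ValueError("B must not be empty")
    shared = [entry for entry in b if entry in a]
    return Fraction(len(shared), len(b))


A = [("socketcall", 5)]                                   # fingerprint
B = [("socketcall", 5), ("socketcall", 5), ("futex",)]    # descriptor from code
print(coverage(A, B))  # 2/3
```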
Multiple Matches
o It’s possible that two or more functions are
indistinguishable
o Policy decision: return set of potential matches
o In practice, we’ve observed 8% of functions have
multiple matches, but the size of the match set is
small (≤ 3)
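The two matching stages and the set-valued result could be combined roughly as follows. This is a sketch under assumptions: the slides do not say how ties or low coverage scores are handled, so the tie handling here is a guess, and `match`, `EXAMPLE_DB`, `wrapper_a`, and `wrapper_b` are hypothetical names.

```python
def coverage(a, b):
    """|{ b in B | b in A }| / |B|, with B treated as a multiset."""
    return sum(1 for entry in b if entry in a) / len(b)


def match(descriptor, database):
    """Return the set of candidate function names for `descriptor`.

    `database` maps fingerprints (sorted tuples of system call tuples) to
    sets of names. Stage 1 looks for an exact match. Stage 2 returns the
    names of the fingerprints with the highest non-zero coverage.
    """
    key = tuple(sorted(descriptor, key=repr))
    if key in database:                        # stage 1: exact match
        return set(database[key])
    best, candidates = 0.0, set()
    for fingerprint, names in database.items():  # stage 2: best coverage
        score = coverage(fingerprint, descriptor)
        if score > best:
            best, candidates = score, set(names)
        elif score == best and score > 0:
            candidates |= names                # assumed: ties are all returned
    return candidates


EXAMPLE_DB = {
    (("socketcall", 5),): {"accept"},
    (("socketcall", 4),): {"listen"},
    (("getpid",),): {"getpid"},
    (("futex",), ("gettid",)): {"wrapper_a", "wrapper_b"},  # hypothetical pair
}

exact = match([("socketcall", 4)], EXAMPLE_DB)
flexible = match([("socketcall", 5), ("socketcall", 5), ("clone",)], EXAMPLE_DB)
ambiguous = match([("gettid",), ("futex",)], EXAMPLE_DB)
```

Returning a set keeps the policy decision from the slide visible to the caller: an ambiguous function gets every plausible name rather than an arbitrary one.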
Identifying Functions in a Stripped Binary
[Figure: unstrip takes a stripped binary and the Descriptor Database and produces an unstripped binary]
For each wrapper function
{
1. Build the semantic descriptor.
2. Search the database for a match (apply two-stage matching process).
3. Add label to symbol table.
}
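The three steps of the loop can be sketched as a small driver. The function `label_wrapper_functions` and its callback parameters are hypothetical; the addresses come from the `targ80c3bd0` and `targ80c3df4` placeholders on the earlier slides, and which address maps to which name is assumed for the example.

```python
def label_wrapper_functions(wrappers, build_descriptor, find_matches, symbol_table):
    """Apply the three steps to every wrapper function in a stripped binary.

    wrappers:         iterable of function entry addresses
    build_descriptor: callable(address) -> semantic descriptor
    find_matches:     callable(descriptor) -> set of candidate names
    symbol_table:     dict updated in place, address -> sorted list of names
    """
    for address in wrappers:
        descriptor = build_descriptor(address)   # 1. build the descriptor
        names = find_matches(descriptor)         # 2. search the database
        if names:                                # 3. add label(s)
            symbol_table[address] = sorted(names)
    return symbol_table


# Stand-ins for the parser and the database (assumed values).
descriptors = {0x80C3BD0: (("getpid",),), 0x80C3DF4: (("socketcall", 5),)}
names_by_descriptor = {(("getpid",),): {"getpid"}, (("socketcall", 5),): {"accept"}}

table = label_wrapper_functions(
    descriptors,
    descriptors.get,
    lambda d: names_by_descriptor.get(d, set()),
    {},
)
```

Functions with no match are simply left unlabeled in this sketch.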
Implementation
unstrip: stripped binary parsing + library fingerprinting + binary rewriting
Evaluation
o To evaluate across three dimensions of variation, we constructed three data sets:
  o GCC version
  o glibc version
  o distribution vendor
o In each set, compile statically-linked binaries, build a descriptor database (DDB), and compare unstrip to IDA Pro’s FLIRT
o Evaluation measure is accuracy
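The slides do not spell out how accuracy is computed. One plausible reading, assumed here, is the fraction of functions whose predicted label set contains the true name; the function `accuracy` and the sample values are purely illustrative and are not results from the study.

```python
def accuracy(predicted, truth):
    """Fraction of functions whose predicted label set contains the true name.

    predicted: dict, address -> set of candidate names
    truth:     dict, address -> true function name
    (Assumed definition; the slides only name the measure.)
    """
    if not truth:
        raise ValueError("no ground-truth labels")
    correct = sum(
        1 for address, name in truth.items()
        if name in predicted.get(address, ())
    )
    return correct / len(truth)


# Made-up example: one of two functions is labeled correctly.
example = accuracy({1: {"accept"}, 2: {"listen"}}, {1: "accept", 2: "getpid"})
```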
Evaluation Results: GCC Version Study
[Bar chart, "GCC 3.4.4 Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for GCC 3.4.4, 4.0.2, 4.1.2, and 4.2.1]
Evaluation Results: glibc Version Study
[Bar chart, "glibc 2.2.4 Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for glibc 2.2.4, 2.3.2, 2.3.4, 2.5, and 2.11.1]
Evaluation Results: Distribution Study
[Bar chart, "Fedora Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for Fedora, Mandriva, openSUSE, and Ubuntu]
unstrip is available at
http://www.paradyn.org/html/tools/unstrip.html
Backup slides follow
Evaluation Results: GCC Version Study (Temporal: backwards)
[Bar chart, "GCC 4.2.1 Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for GCC 3.4.4, 4.0.2, 4.1.2, and 4.2.1]
Evaluation Results: glibc Version Study (Temporal: backwards)
[Bar chart, "glibc 2.11.1 Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for glibc 2.2.4, 2.3.2, 2.3.4, 2.5, and 2.11.1]
Evaluation Results: Distribution Study (one predicts the rest)
[Bar chart, "Mandriva Patterns Predicting Each Library": accuracy (0 to 1) of unstrip and IDA Pro for Fedora, Mandriva, openSUSE, and Ubuntu]
Evaluation Results: GCC Version Study (one predicts the rest)
[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro; x-axis: GNU C Compiler Version (3.4.4, 4.0.2, 4.1.2, 4.2.1)]
Evaluation Results: glibc Version Study (one predicts the rest)
[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro; x-axis: glibc version (2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1)]
Evaluation Results: Distribution Study (one predicts the rest)
[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro; x-axis: Distribution Vendor (Fedora, Mandriva, openSUSE, Ubuntu)]