23 hack in sight 2014
TRANSCRIPT
informative purposes.
Dear Readers,
Yurichev we are able to give you the 23rd
release of Hack Insight Mag that will
introduce to you a wide topic, that is Reverse
Engineering.
experienced Reverse Engineer and
will be able to understand the process of
discovering the technological principles of
a device, object, or system through analysis of
its structure, function and operation.
As it turns out, (technical) writing takes a lot
of effort and work. This book is free and
available in source code form (LaTeX), and
it will be so forever.
If you want us to continue publishing on all
these topics you may consider donation for
the author's work and subscribing to Hack
Insight Mag.
pages. There are ≈ 300 TEX - files, ≈ 90 C/C++
source codes, ≈ 350 various listings. Keep in
mind that the price of books on the same topic
varies between $25 and $50.
Ways to donate are available on this page.
Ways to subscribe to Hack Insight are
available here.
release and you will share it with your
colleagues and friends.
Hack Insight Team
II Important fundamentals
III Finding important/interesting stuff in the code
IV OS-specific
VII Other things
IX Exercises
Appendix
0.1 Preface
Here are some of my notes about reverse engineering in English for those beginners who would like to learn to
understand x86 (which accounts for almost all executable software in the world) and ARM code created by C/C++ compilers. There are several popular meanings of the term “reverse engineering”: 1) reverse engineering of software: researching compiled programs; 2) 3D model scanning and reworking in order to make a copy of it; 3) recreating DBMS 3 structure. These
notes are related to the first meaning.
Topics discussed
x86, ARM.
Topics touched
Oracle RDBMS (58), Itanium (66), copy-protection dongles (55), LD_PRELOAD (47.2), stack overflow, ELF 4, win32 PE file format (48.2),
x86-64 (23.1), critical sections (48.4), syscalls (46), TLS 5, position-independent code (PIC 6) (47.1), profile-guided
optimization (68.1), C++ STL (29.4), OpenMP (65), SEH ().
Mini-FAQ 7
· Q: Should one learn to understand assembly language these days?
A: Yes: in order to have a deeper understanding of the internals and to debug your software better and faster.
· Q: Should one learn to write in assembly language these days?
A: Unless one writes low-level OS 8 code, probably no.
· Q: But what about writing highly optimized routines?
A: No, modern C/C++ compilers do this job better.
· Q: Should I learn microprocessor internals?
A: Modern CPU 9-s are very complex. If you do not plan to write highly optimized code or if you do not work on a compiler's code
generator, then you may still learn internals in bare outlines 10. At the same time, in order to understand and analyze
compiled code it is enough to know only the ISA 11 and the register descriptions, i.e., the “outside” part of a CPU that is
available to an application programmer.
· Q: So why should I learn assembly language anyway?
A: Mostly to better understand what is going on while debugging and for reverse engineering without source code,
including, but not limited to, malware.
· Q: How would I search for a reverse engineering job?
A: There are hiring threads that appear from time to time on reddit devoted to RE 12 (2013 Q3, 2014). Try to take a look there.
3 Database management systems
4 Executable file format widely used in *NIX systems including Linux
5 Thread Local Storage
6 Position Independent Code: 47.1
7 Frequently Asked Questions
8 Operating System
9 Central processing unit
10 Very good text about it: [10]
11 Instruction Set Architecture
12 http://www.reddit.com/r/ReverseEngineering/
About the author
Dennis Yurichev is an experienced reverse engineer and programmer. His CV is available on his website 13.
Thanks
Andrey herm1t Baranovich, Slava Avid Kazakov, Stanislav Beaver Bobrytskyy, Alexander Lysenko, Alexander Lstar
Chernenkiy, Andrew Zubinski, Vladimir Botov, Mark Logxen Cooper, Shell Rocket, Yuan Jochen Kang, Arnaud Patard (rtp
on #debian-arm IRC), and all the folks on github.com who have contributed notes and corrections.
A lot of LaTeX packages were used: I would thank their authors as well.
Praise for Reverse Engineering for Beginners
· It's very well done .. and for free .. amazing. 14 Daniel Bilar, Siege Technologies, LLC.
· ...excellent and free 15 Pete Finnigan, Oracle RDBMS security guru.
· ... book is interesting, great job! Michael Sikorski, author of Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software.
· ... my compliments for the very nice tutorial! Herbert Bos, full professor at the Vrije Universiteit Amsterdam.
· ... It is amazing and unbelievable. Luis Rocha, CISSP / ISSAP, Technical Manager, Network & Information Security at Verizon Business.
· Thanks for the great work and your book. Joris van de Vis, SAP Netweaver & Security specialist.
· ... reasonable intro to some of the techniques. 16 (Mike Stay, teacher at the Federal Law Enforcement Training Center, Georgia, US.)
Donate
As it turns out, (technical) writing takes a lot of effort and work.
This book is free, available freely and available in source code form 17 (LaTeX), and it will be so forever.
My current plan for this book is to add lots of information about: PLANS 18. If you want me to continue writing on all these topics you may consider donating.
I worked more than a year on this book 19, there are more than 500 pages. There are ≈ 300 TeX-files, ≈ 90 C/C++ source
codes, ≈ 350 various listings. The price of other books on the same subject varies between $20 and $50 on amazon.com.
Ways to donate are available on the page: http://yurichev.com/donate.html
Every donor's name will be included in the book! Donors also have a right to ask me to rearrange items in my writing plan.
Why not try to publish? Because it's technical literature which, as I believe, cannot be finished or frozen in paper state.
Such technical references are akin to Wikipedia or the MSDN 20 library. They can evolve and grow indefinitely. Someone can sit down
13 http://yurichev.com/Dennis_Yurichev.pdf
14 https://twitter.com/daniel_bilar/status/436578617221742593
15 https://twitter.com/petefinnigan/status/400551705797869568
16 http://www.reddit.com/r/IAmA/comments/24nb6f/i_was_a_professional_password_cracker_who_taught/
17 https://github.com/dennis714/RE-for-beginners
18 https://github.com/dennis714/RE-for-beginners/blob/master/PLANS
19 Initial git commit from March 2013: https://github.com/dennis714/RE-for-beginners/tree/1e57ef540d827c7f7a92fcb3a4626af3e13c7ee4
20 Microsoft Developer Network
and write everything from the beginning to the end, publish it and forget about it. As it turns out, it's not me. I have everyday
thoughts like “that was written badly and can be rewritten better”, “that was a bad example, I know a better one”, “that is
also a thing I can explain better and shorter”, etc. As you may see in the commit history of this book's source code, I make a lot of
small changes almost every day: https://github.com/dennis714/RE-for-beginners/commits/master.
So the book will probably be a “rolling release”, as they say about Linux distros like Gentoo. No fixed releases (and deadlines)
at all, but continuous development. I don't know how long it will take to write all I know. Maybe 10 years or more. Of
course, it is not very convenient for readers who want something stable, but all I can offer is a ChangeLog 21 file serving as a
“what's new” section. Those who are interested may check it from time to time, or my blog/twitter 22.
Donor s
9 * anonymous, 2 * Oleg Vygovsky, Daniel Bilar, James Truscott, Luis Rocha, Joris van de Vis, Richard S Shultz, Jang Minchang, Shade Atlas, Yao Xiao, Pawel Szczur, Justin Simms, Shawn the R0ck, Ki Chan Ahn, Triop AB, Ange Albertini, Sergey Lukianov, Ludvig Gislason, Gérard Labadie.
About illustrations
Those readers who are used to reading a lot on the Internet expect to see illustrations at the places where they should be. It's
because there are no pages at all, only a single one. It is not possible to place illustrations in the book in the suitable context. So, in this book, illustrations can be at the end of a section, and references in the text may be present, like fig.1.1.
21 https://github.com/dennis714/RE-for-beginners/blob/master/ChangeLog
22 http://blog.yurichev.com/ https://twitter.com/yurichev
When I first learned C and then C++, I wrote small pieces of code, compiled them, and saw what was produced in the
assembly language. This was easy for me. I did it many times and the relation between the C/C++ code and what the compiler
produced was imprinted in my mind so deep that I can quickly understand what was in the original C code when I look at the produced x86 code. Perhaps this technique may be helpful for someone else, so I will try to describe some examples here.
There are a lot of examples for both x86/x64 and ARM. Those who are already familiar with one of the architectures may freely
skim over pages.
Chapter 1
Short introduction to the CPU
The CPU is the unit which executes all of the programs.
Short glossary:
Instruction : a primitive command to the CPU. Simplest examples: moving data between registers, working with memory,
arithmetic primitives. As a rule, each CPU has its own instruction set architecture (ISA).
Machine code : code for the CPU. Each instruction is usually encoded by several bytes.
Assembly language : mnemonic code and some extensions like macros which are intended to make a programmer's life
easier.
CPU register : Each CPU has a fixed set of general purpose registers (GPR 1). ≈ 8 in x86, ≈ 16 in x86-64, ≈ 16 in ARM. The
easiest way to understand a register is to think of it as an untyped temporary variable. Imagine you are working with a
high-level PL 2 and you have only eight 32-bit variables. A lot of things can be done using only these!
What is the difference between machine code and a PL? It is much easier for humans to use a high-level PL like C/C++, Java, Python, etc., but it is easier for a CPU to use a much lower level of abstraction. Perhaps it would be possible to invent
a CPU which can execute high-level PL code, but it would be much more complex. Conversely, it is very inconvenient for
humans to use assembly language due to its low-levelness. Besides, it is very hard to do it without making a huge amount of
annoying mistakes. The program which converts high-level PL code into assembly is called a compiler.
1 General Purpose Registers
2 Programming language
Chapter 2
Hello, world!
Let's start with the famous example from the book The C Programming Language [17]:
#include <stdio.h>
return 0;
cl 1.cpp /Fa1.asm
Listing 2.1: MSVC 2010
_TEXT SEGMENT
xor eax, eax
_TEXT ENDS
MSVC produces assembly listings in Intel-syntax. The difference between Intel-syntax and AT&T-syntax will be discussed
hereafter.
The compiler generated 1.obj file will be linked into 1.exe.
In our case, the file contains two segments: CONST (for data constants) and _TEXT (for code).
The string hello, world in C/C++ has type const char *, however it does not have its own name.
The compiler needs to deal with the string somehow, so it defines the internal name $SG3830 for it.
So the example may be rewritten as:
#include <stdio.h>
int main()
};
Let's go back to the assembly listing. As we can see, the string is terminated by a zero byte, which is standard for C/C++ strings. More about C strings: 36.1.
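The zero terminator can also be observed from the C side. The following is only a minimal sketch: the array name greeting and both helper functions are made up for illustration and do not appear in the book's listings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical named copy of the literal. The compiler stores the
   12 visible characters plus one terminating zero byte. */
static const char greeting[] = "hello, world";

/* Number of bytes the array occupies, terminator included. */
size_t greeting_storage_size(void)
{
    return sizeof(greeting);
}

/* The last stored byte of the array, i.e. the terminator. */
char greeting_last_byte(void)
{
    return greeting[sizeof(greeting) - 1];
}
```

strlen(greeting) stops at the zero byte and does not count it, which is why it reports one byte less than sizeof.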
In the code segment, _TEXT, there is only one function so far: main().
The function main() starts with prologue code and ends with epilogue code (like almost any function) 1.
After the function prologue we see the call to the printf() function: CALL _printf.
Before the call, the string address (or a pointer to it) containing our greeting is placed on the stack with the help of the
PUSH instruction.
When the printf() function returns flow control to the main() function, the string address (or pointer to it) is still on the stack. Since we do not need it anymore, the stack pointer (the ESP register) needs to be corrected.
ADD ESP, 4 means add 4 to the value in the ESP register.
Why 4? Since it is 32-bit code, we need exactly 4 bytes for passing an address through the stack. It is 8 bytes in x64-code.
ADD ESP, 4 is effectively equivalent to POP register but without using any register 2.
Some compilers (like the Intel C++ Compiler) in the same situation may emit POP ECX instead of ADD (e.g. such a pattern can
be observed in the Oracle RDBMS code as it is compiled by the Intel C++ compiler). This instruction has almost the same effect, but the ECX register contents will be overwritten.
The Intel C++ compiler probably uses POP ECX since this instruction's opcode is shorter than ADD ESP, x (1 byte against
3).
Read more about the stack in section (4).
After the call to printf(), the original C/C++ code has return 0: return 0 as the result of the main() function.
In the generated code this is implemented by the instruction XOR EAX, EAX.
XOR is, in fact, just “eXclusive OR”, but compilers often use it instead of MOV EAX, 0, again because it is a slightly shorter
opcode (2 bytes against 5).
Some compilers emit SUB EAX, EAX, which means SUBtract the value in EAX from the value in EAX, which in any case
will result in zero.
The last instruction RET returns control flow to the caller. Usually, it is C/C++ CRT 4 code which in turn returns control to
the OS.
2.1.2 GCC—x86
Now let's try to compile the same C/C++ code in the GCC 4.4.1 compiler in Linux: gcc 1.c -o 1
Next, with the assistance of the IDA 5 disassembler, let's see how the main() function was created. (IDA, like MSVC, shows code in Intel-syntax.)
N.B. We could also have GCC produce assembly listings in Intel-syntax by applying the options -S -masm=intel
Listing 2.2: GCC
mov [esp+10h+var_10], eax
call _printf
1 Read more about it in the section about function prolog and epilog (3).
2 CPU flags, however, are modified
3 http://en.wikipedia.org/wiki/Exclusive_or
4 C runtime library: sec:CRT
5 Interactive Disassembler
mov eax, 0
retn
main endp
The result is almost the same. The address of the hello, world string (stored in the data segment) is saved in the EAX
register first and then it is stored on the stack. Also in the function prologue we see AND ESP, 0FFFFFFF0h: this instruction
aligns the value in the ESP register on a 16-byte boundary. This results in all values in the stack being aligned. (The CPU
performs better if the values it is dealing with are located in memory at addresses aligned on a 4- or 16-byte boundary) 6.
SUB ESP, 10h allocates 16 bytes on the stack, although, as we can see hereafter, only 4 are necessary here.
This is because the size of the allocated stack is also aligned on a 16-byte boundary.
The string address (or a pointer to the string) is then written directly onto the stack space without using the PUSH instruction.
var_10 is a local variable and is also an argument for printf(). Read about it below.
Then the printf() function is called.
Unlike MSVC, when GCC is compiling without optimization turned on, it emits MOV EAX, 0 instead of a shorter opcode.
The last instruction, LEAVE, is the equivalent of the MOV ESP, EBP and POP EBP instruction pair. In other words, this
instruction sets the stack pointer (ESP) back and restores the EBP register to its initial state.
This is necessary since we modified these register values (ESP and EBP) at the beginning of the function (executing MOV
EBP, ESP / AND ESP, ...).
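The effect of AND ESP, 0FFFFFFF0h and the rounding of the allocated size can be modelled with plain integer arithmetic. This is a rough sketch, not compiler output: the function names are invented here, and an ordinary 32-bit integer stands in for the ESP register.

```c
#include <stdint.h>

/* Clears the low four bits, so the result is the nearest multiple
   of 16 that is not greater than the input. This is what
   AND ESP, 0FFFFFFF0h does to the stack pointer. */
uint32_t align_down_16(uint32_t address)
{
    return address & 0xFFFFFFF0u;
}

/* Rounds a byte count up to the next multiple of 16, which models
   why 16 bytes get reserved when only 4 are needed. */
uint32_t round_up_16(uint32_t size)
{
    return (size + 15u) & ~15u;
}
```

Under these assumptions, a stack pointer of 0x0012FF7C becomes 0x0012FF70, and a request for 4 bytes turns into a reservation of 16.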
2.1.3 GCC: AT&T syntax
Let's see how this can be represented in the AT&T syntax of assembly language. This syntax is much more popular in the
UNIX-world.
gcc -S 1_1.c
We get this:
.text
pushl %ebp
.cfi_offset 5, -8
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $.LC0, (%esp)
leave
.cfi_def_cfa 4, 4
ret
.section .note.GNU-stack,"",@progbits
There are a lot of macros (beginning with dot). These are not very interesting to us so far. For now, for the sake of simplification,
we can ignore them (except the .string macro which encodes a null-terminated character sequence just like a
C-string). Then we'll see this 7:
Listing 2.5: GCC 4.7.3
main:
leave
ret
Some of the major differences between Intel and AT&T syntax are:
· Operands are written backwards.
In Intel-syntax: <instruction> <destination operand> <source operand>.
In AT&T syntax: <instruction> <source operand> <destination operand>.
Here is a way to think about them: when you deal with Intel-syntax, you can put an equality sign (=) in your mind
between operands, and when you deal with AT&T-syntax put in a right arrow (→) 8.
· AT&T: Before register names a percent sign must be written (%) and before numbers a dollar sign ($). Parentheses are
used instead of brackets.
· AT&T: A special symbol is to be added to each instr uction def ining the type of data:
– l — long (32 bits)
– b — byte (8 bits)
Let's go back to the compiled result: it is identical to what we saw in IDA, with one subtle difference: 0FFFFFFF0h is
written as $-16. It is the same: 16 in the decimal system is 0x10 in hexadecimal. -0x10 is equal to 0xFFFFFFF0 (for a 32-bit data type).
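The equality of -16 and 0xFFFFFFF0 for a 32-bit data type can be checked directly in C. A minimal sketch follows; the helper name as_unsigned32 is made up for this illustration.

```c
#include <stdint.h>

/* Converts a signed 32-bit value to unsigned. In C this conversion
   is defined modulo 2^32, so -16 maps to 2^32 - 16. */
uint32_t as_unsigned32(int32_t value)
{
    return (uint32_t)value;
}
```

Since 2^32 - 16 = 4294967280 = 0xFFFFFFF0, as_unsigned32(-16) yields the same constant that IDA shows as 0FFFFFFF0h.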
One more thing: the return value is set to 0 by using the usual MOV, not XOR. MOV just loads a value to a register. Its name
is not intuitive (data is not moved). In other architectures, this instruction has the name load or something like that.
2.2 x86-64
Listing 2.6: MSVC 2012 x64
$SG2989 DB 'hello, world', 00H
add rsp, 40
main ENDP
7 This GCC option can be used to eliminate unnecessary macros: -fno-asynchronous-unwind-tables
8 By the way, in some C standard functions (e.g., memcpy(), strcpy()) arguments are listed in the same way as in Intel-syntax: pointer to destination
memory block at the beginning and then pointer to source memory block.
As of x86-64, all registers were extended to 64-bit and now have an R- prefix. In order to use the stack less often (in other
words, to access external memory less often), there exists a popular way to pass function arguments via registers (fastcall:
44.3). I.e., one part of the function arguments is passed in registers, the other part via the stack. In Win64, 4 function arguments are
passed in the RCX, RDX, R8, R9 registers. That is what we see here: a pointer to the string for printf() is now passed not on the stack, but in the RCX register.
Pointers are 64-bit now, so they are passed in the 64-bit part of registers (which have the R- prefix). But for backward
compatibility, it is still possible to access 32-bit parts, using the E- prefix.
This is how RAX/EAX/AX/AL looks in 64-bit x86-compatible CPUs:
(byte number) 7th 6th 5th 4th 3rd 2nd 1st 0th
RAX (x64): bytes 7 to 0
EAX: bytes 3 to 0
AX: bytes 1 to 0
AH: byte 1; AL: byte 0
The main() function returns an int-typed value, which is, in the C PL, for better backward compatibility and portability, still 32-bit, so that is why the EAX register is cleared at the function end (i.e., the 32-bit part of the register) instead of RAX.
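The RAX/EAX/AX/AH/AL relation can be modelled in C by masking and shifting a 64-bit integer. This is only an illustration of which bytes each name covers: the function names are hypothetical, and nothing here touches real CPU registers.

```c
#include <stdint.h>

/* EAX: the low 32 bits of RAX (bytes 3 to 0). */
uint32_t eax_of(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); }

/* AX: the low 16 bits of RAX (bytes 1 to 0). */
uint16_t ax_of(uint64_t rax) { return (uint16_t)(rax & 0xFFFFu); }

/* AH: byte 1 of RAX, the high half of AX. */
uint8_t ah_of(uint64_t rax) { return (uint8_t)((rax >> 8) & 0xFFu); }

/* AL: byte 0 of RAX, the low half of AX. */
uint8_t al_of(uint64_t rax) { return (uint8_t)(rax & 0xFFu); }
```

For a model value of 0x1122334455667788, the four parts are 0x55667788, 0x7788, 0x77 and 0x88.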
2.2.2 GCC—x86-64
Listing 2.7: GCC 4.4.6 x64
.string "hello, world"
main:
xor eax, eax ; number of vector registers passed
call printf
xor eax, eax
add rsp, 8
ret
A method to pass function arguments in registers is also used in Linux, *BSD and Mac OS X [21]. The first 6 arguments are
passed in the RDI, RSI, RDX, RCX, R8, R9 registers, and the others via the stack.
So the pointer to the string is passed in EDI (the 32-bit part of the register). But why not use the 64-bit part, RDI?
It is important to keep in mind that all MOV instructions in 64-bit mode writing something into the lower 32-bit register
part also clear the higher 32 bits [14]. I.e., MOV EAX, 011223344h will write a value correctly into RAX, since the higher
bits will be cleared.
If we open the compiled object file (.o), we will also see all the instructions' opcodes 9:
Listing 2.8: GCC 4.4.6 x64
.text:00000000004004D0 main proc near
.text:00000000004004D0 48 83 EC 08 sub rsp, 8
.text:00000000004004D4 BF E8 05 40 00 mov edi, offset format ; "hello, world"
.text:00000000004004D9 31 C0 xor eax, eax
.text:00000000004004DB E8 D8 FE FF FF call _printf
.text:00000000004004E0 31 C0 xor eax, eax
.text:00000000004004E2 48 83 C4 08 add rsp, 8
.text:00000000004004E6 C3 retn
.text:00000000004004E6 main endp
As we can see, the instruction writing into EDI at 0x4004D4 occupies 5 bytes. The same instruction writing a 64-bit value
into RDI will occupy 7 bytes. Apparently, GCC is trying to save some space. Besides, it can be sure that the data segment
containing the string will not be allocated at addresses higher than 4GiB.
We also see the EAX register being cleared before the printf() function call. This is done because the number of used vector registers
is passed in EAX by standard: “with variable arguments passes information about the number of vector registers used” [21].
9 This should be enabled in Options → Disassembly → Number of opcode bytes
2.3. ARM CHAPTER 2. HELLO, WORLD!
are swapped (for thumb and thumb-2 modes). For instructions in ARM mode, the order is the fourth byte, then the third, then
the second and finally the first (due to different endianness). So as we can see, the MOVW, MOVT.W and BLX instructions begin
with 0xFx.
One of the thumb-2 instructions is MOVW R0, #0x13D8: it writes a 16-bit value into the lower part of the R0 register.
Also, MOVT.W R0, #0 works just like MOVT from the previous example, but it works in thumb-2.
Among other differences, here the BLX instruction is used instead of BL. The difference is that, besides saving the RA 27
in the LR register and passing control to the puts() function, the processor also switches from thumb mode to ARM (or
back). This instruction is placed here since the instruction to which control is passed looks like this (it is encoded in ARM mode):
__symbolstub1:00003FEC _puts ; CODE XREF: _hello_world+E
__symbolstub1:00003FEC 44 F0 9F E5 LDR PC, =__imp__puts
So, the observant reader may ask: why not call puts() right at the point in the code where it is needed?
Because it is not very space-efficient.
Almost any program uses external dynamic libraries (like DLL in Windows, .so in *NIX or .dylib in Mac OS X). Often-used
library functions are stored in dynamic libraries, including the standard C-function puts().
In an executable binary file (Windows PE .exe, ELF or Mach-O) an import section is present. This is a list of symbols (functions
or global variables) being imported from external modules along with the names of these modules.
The OS loader loads all modules it needs and, while enumerating import symbols in the primary module, determines the
correct addresses of each symbol.
In our case, __imp__puts is a 32-bit variable where the OS loader will write the correct address of the function in an external library. Then the LDR instruction just takes the 32-bit value from this variable and writes it into the PC register, passing control
to it.
So, in order to reduce the time that an OS loader needs for doing this procedure, it is a good idea for it to write the address
of each symbol only once to a specially-allocated place just for it.
Besides, as we have already figured out, it is impossible to load a 32-bit value into a register while using only one instruction
without a memory access. So, it is optimal to allocate a separate function working in ARM mode with only one goal: to pass control to the dynamic library, and then to jump to this short one-instruction function (the so-called thunk function) from thumb-code.
By the way, in the previous example (compiled for ARM mode) control passed by the BL instruction goes to the same thunk
function. However the processor mode is not switched (hence the absence of an “X” in the instruction mnemonic).
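The import slot and thunk arrangement can be imitated in plain C with a function pointer. This is a loose analogy under stated assumptions, not how any real loader is written: external_length stands in for a function living in an external library, imp_external_length plays the role of a variable like __imp__puts, and loader_bind imitates the single write done by the OS loader. All of these names are invented for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Signature shared by the imported function and its thunk. */
typedef int (*length_fn)(const char *);

/* Hypothetical stand-in for a function in an external library. */
int external_length(const char *s)
{
    return (int)strlen(s);
}

/* Hypothetical import slot: one pointer-sized variable that is
   written once, at "load time". */
static length_fn imp_external_length = NULL;

/* Imitates the loader: resolve the symbol, store its address once. */
void loader_bind(length_fn resolved)
{
    imp_external_length = resolved;
}

/* Thunk: its only job is to pass control through the slot.
   Every call site in the program can call this one function. */
int external_length_thunk(const char *s)
{
    return imp_external_length(s);
}
```

After loader_bind(external_length), a call such as external_length_thunk("hello, world") is forwarded through the slot, so only one place in the program has to know the real address.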
27 Return Address
Chapter 3
Function prologue and epilogue
A function prologue is a sequence of instructions at the start of a function. It often looks something like the following code
fragment:
push ebp
mov ebp, esp
sub esp, X
What these instructions do: save the value in the EBP register, set the value of the EBP register to the value of ESP,
and then allocate space on the stack for local variables.
The value in EBP is fixed over the period of function execution and it is to be used for access to local variables and arguments.
One can use ESP, but it changes over time and that is not convenient.
The function epilogue frees the allocated space on the stack, returns the value in the EBP register back to its initial state and
returns the control flow to the caller:
mov esp, ebp
ret 0
Function prologues and epilogues are usually detected in disassemblers to delimit functions from each other.
3.1 Recursion
Epilogues and prologues can make recursion performance worse.
For example, once upon a time I wrote a function to seek the correct node in a binary tree. As a recursive function it would
look stylish, but since additional time is spent at each function call for the prologue/epilogue, it was working a couple
of times slower than an iterative (recursion-free) implementation.
By the way, that is the reason compilers use tail call.
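The two styles of binary tree search mentioned above can be sketched as follows. This is not the author's original function: the struct node layout and both function names are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical binary search tree node. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Recursive search: each level costs one call, and therefore one
   prologue/epilogue, unless the compiler turns the call in tail
   position into a jump. */
const struct node *find_recursive(const struct node *root, int key)
{
    if (root == NULL || root->key == key)
        return root;
    return find_recursive(key < root->key ? root->left : root->right, key);
}

/* Iterative search: the same walk as a loop in a single frame. */
const struct node *find_iterative(const struct node *root, int key)
{
    while (root != NULL && root->key != key)
        root = key < root->key ? root->left : root->right;
    return root;
}
```

The recursive call is the last thing find_recursive does, so an optimizing compiler may apply a tail call and produce code close to the loop version. Whether it does depends on the compiler and its options.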
Chapter 4
Stack
A stack is one of the most fundamental data structures in computer science 1.
Technically, it is just a block of memory in process memory along with the ESP or RSP register in x86 or x64, or the SP
register in ARM, as a pointer within the block.
The most frequently used stack access instructions are PUSH and POP (in both x86 and ARM thumb-mode). PUSH subtracts
4 in 32-bit mode (or 8 in 64-bit mode) from ESP/RSP/SP and then writes the contents of its sole operand to the memory address
pointed to by ESP/RSP/SP.
POP is the reverse operation: get the data from memory pointed to by SP, put it in the operand (often a register) and then
add 4 (or 8) to the stack pointer.
After stack allocation the stack pointer points to the end of the stack. PUSH decreases the stack pointer and POP increases it.
The end of the stack is actually at the beginning of the memory allocated for the stack block. It seems strange, but that's the
way it is.
Nevertheless ARM not only has instructions supporting descending stacks but also ascending stacks.
For example the STMFD 2/LDMFD 3, STMED 4/LDMED 5 instructions are intended to deal with a descending stack. The
STMFA 6/LDMFA 7, STMEA 8/LDMEA 9 instructions are intended to deal with an ascending stack.
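The PUSH and POP behaviour described above for a descending stack in 32-bit mode can be modelled in C. This is a toy model only: the struct machine type, the 64-byte size and the function names are all invented here, and a byte offset stands in for ESP.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Toy machine: a block of memory plus a stack pointer kept as an
   offset into that block. */
struct machine {
    uint8_t memory[STACK_BYTES];
    uint32_t sp;
};

/* After allocation the stack pointer points to the end of the block. */
void machine_init(struct machine *m)
{
    memset(m->memory, 0, sizeof m->memory);
    m->sp = STACK_BYTES;
}

/* PUSH: subtract 4 from the stack pointer, then store the operand. */
void push32(struct machine *m, uint32_t value)
{
    assert(m->sp >= 4u);                  /* room left in the block */
    m->sp -= 4u;
    memcpy(&m->memory[m->sp], &value, sizeof value);
}

/* POP: load the value at the stack pointer, then add 4 to it. */
uint32_t pop32(struct machine *m)
{
    uint32_t value;
    assert(m->sp + 4u <= STACK_BYTES);    /* something was pushed */
    memcpy(&value, &m->memory[m->sp], sizeof value);
    m->sp += 4u;
    return value;
}
```

Pushing two values moves sp from 64 down to 56, and popping returns them in reverse order while sp climbs back to 64.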
4.1 Why does the stack grow backward?
Intuitively, we might think that, like any other data structure, the stack may grow upward, i.e., towards higher addresses.
The reason the stack grows backward is probably historical. When computers were big and occupied a whole room, it
was easy to divide memory into two parts, one for the heap and one for the stack. Of course, it was unknown how big the
heap and the stack would be during program execution, so this solution was the simplest possible.
[Figure: memory layout; labels: Start of heap, Heap, Start of stack, Stack]
In [26] we can r ead:
The user-core part of an image is divided into three logical segments. The program text segment begins
at location 0 in the virtual address space. During execution, this segment is write-protected and a single
copy of it is shared among all processes executing the same program. At the first 8K byte boundary above
1 http://en.wikipedia.org/wiki/Call_stack
2 Store Multiple Full Descending
3 Load Multiple Full Descending
4 Store Multiple Empty Descending
5 Load Multiple Empty Descending
6 Store Multiple Full Ascending
7 Load Multiple Full Ascending
8 Store Multiple Empty Ascending
9 Load Multiple Empty Ascending
the program text segment in the virtual address space begins a nonshared, writable data segment, the size
of which may be extended by a system call. Starting at the highest address in the virtual address space is a
stack segment, which automatically grows downward as the hardware's stack pointer fluctuates.
4.2 What is the stack used for?
4.2.1 Save the return address where a function must return control after execution
x86
While calling another function with a CALL instruction, the address of the point exactly after the CALL instruction is saved to
the stack and then an unconditional jump to the address in the CALL operand is executed.
The CALL instruction is equivalent to a PUSH address_after_call / JMP operand instruction pair.
RET fetches a value from the stack and jumps to it: it is equivalent to a POP tmp / JMP tmp instruction pair.
Overflowing the stack is straightforward. Just run eternal recursion:
void f()
c:\tmp6>cl ss.cpp /Fass.asm
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
ss.cpp
c:\tmp6\ss.cpp(4) : warning C4717: 'f' : recursive on all control paths, function will cause runtime stack overflow
. . .but generates the right code anyway:
?f@@YAXXZ PROC ; f
; File c:\tmp6\ss.cpp
?f@@YAXXZ ENDP ; f
. . . Also if we turn on optimization (/Ox option) the optimized code will not overflow the stack but instead will work correctly 10:
?f@@YAXXZ PROC ; f
; File c:\tmp6\ss.cpp
jmp SHORT $LL3@f
?f@@YAXXZ ENDP ; f
GCC 4.4.1 generates similar code in both cases, although without issuing any warning about the problem.
10 irony here
4.2.3 Local variable storage
A function could allocate space in the stack for its local variables just by shifting the stack pointer towards the stack bottom.
It is also not a requirement. You could store local variables wherever you like, but traditionally this is how it's done.
4.2.4 x86: alloca() function
It is worth noting the alloca() function 14.
This function works like malloc() but allocates memory just on the stack.
The allocated memory chunk does not need to be freed via a free() function call since the function epilogue (3) will
return ESP back to its initial state and the allocated memory will be just annulled.
It is worth noting how alloca() is implemented.
In simple terms, this function just shifts ESP downwards toward the stack bottom by the number of bytes you need and
sets ESP as a pointer to the allocated block. Let's try:
#ifdef __GNUC__
...
#ifdef __GNUC__
    snprintf (buf, 600, "hi! %d, %d, %d\n", 1, 2, 3); // GCC
#else
    _snprintf (buf, 600, "hi! %d, %d, %d\n", 1, 2, 3); // MSVC
#endif
...
};
(The _snprintf() function works just like printf(), but instead of dumping the result into stdout (e.g., to a terminal or console), it writes it to the buf buffer. puts() copies the contents of buf to stdout. Of course, these two function calls could be replaced by one printf() call, but I would like to illustrate small buffer usage.)
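A complete variant of this experiment might look like the sketch below. It is only an illustration under assumptions: the helper name format_on_stack and the extra output parameter are hypothetical and not part of the original listing. They are there to show that the block obtained from alloca() is only usable until the function returns, so the result has to be copied out first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef __GNUC__
#include <alloca.h>   /* GCC */
#else
#include <malloc.h>   /* MSVC */
#endif

/* Hypothetical helper (not from the text): formats into a 600-byte block
   obtained from alloca(), then copies the result into the caller's buffer,
   because the block is abandoned as soon as this function returns. */
int format_on_stack(char *out, size_t out_size)
{
    char *buf = (char *)alloca(600);   /* lives in this function's stack frame */
    int n;
#ifdef __GNUC__
    n = snprintf(buf, 600, "hi! %d, %d, %d\n", 1, 2, 3);   /* GCC */
#else
    n = _snprintf(buf, 600, "hi! %d, %d, %d\n", 1, 2, 3);  /* MSVC */
#endif
    /* the formatted text plus its zero byte must fit into the caller's buffer */
    assert(n > 0 && (size_t)n < out_size);
    memcpy(out, buf, (size_t)n + 1);
    return n;   /* number of characters written, without the zero byte */
}
```

Note that there is no free() call anywhere: returning from format_on_stack() is what releases the 600 bytes.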
MSVC
Listing 4.1: MSVC 2010
add esp, 28 ; 0000001cH
¹⁴ In MSVC, the function implementation can be found in alloca16.asm and chkstk.asm in C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\crt\src\intel
...
The sole alloca() argument is passed via EAX (instead of pushing it onto the stack)¹⁵. After the alloca() call, ESP points to the block of 600 bytes and we can use it as memory for the buf array.
GCC + Intel syntax
GCC 4.4.1 can do the same without calling exter nal f unctions:
Listing 4.2: GCC 4.7.3
f:
lea ebx, [esp+39]
and ebx, -16 ; align pointer by 16-byte border
mov DWORD PTR [esp], ebx ; s
mov DWORD PTR [esp+20], 3
mov DWORD PTR [esp+16], 2
mov DWORD PTR [esp+12], 1
mov DWORD PTR [esp+8], OFFSET FLAT:.LC0 ; "hi! %d, %d, %d\n"
mov DWORD PTR [esp+4], 600 ; maxlen
call _snprintf
mov DWORD PTR [esp], ebx ; s
call puts
leave
GCC + AT&T syntax
Let's see the same code, but in AT&T syntax:
Listing 4.3: GCC 4.7.3
f:
movl $1, 12(%esp)
movl $.LC0, 8(%esp)
movl $600, 4(%esp)
call puts
leave
ret
The code is the same as in the previous listing.
N.B. For example, movl $3, 20(%esp) is analogous to mov DWORD PTR [esp+20], 3 in Intel syntax: when addressing memory in the form register+offset, it is written in AT&T syntax as offset(%register).
¹⁵ This is because alloca() is a compiler intrinsic (63) rather than a usual function. One of the reasons there is a separate function instead of a couple of instructions right in the code is that the MSVC¹⁶ implementation of the alloca() function also has code which reads from the memory just allocated, in order to let the OS map physical memory to this VM¹⁷ region.
4.2.5 (Windows) SEH
SEH¹⁸ records are also stored on the stack (if they are present). Read more about it: (48.3).
4.2.6 Buffer overflow protection
More about it here (16.2).
4.3 Typical stack layout
. . . . . .
ESP-0xC local variable #2, marked in IDA as var_8
ESP-8 local variable #1, marked in IDA as var_4
ESP-4 saved value of EBP
ESP r etur n addr ess
. . . . . .
Chapter 5
printf() with several arguments
Now let's extend the Hello, world! (2) example, replacing printf() in the main() function body by this:
#include <stdio.h>
int main()
{
    printf("a=%d; b=%d; c=%d", 1, 2, 3);
    return 0;
};
5.1.1 MSVC
Let's compile it with MSVC 2010 Express, and we get:
$SG3830 DB 'a=%d; b=%d; c=%d', 00H
...
call _printf
add esp, 16 ; 00000010H
Almost the same, but now we can see that the printf() arguments are pushed onto the stack in reverse order. The first argument is pushed last.
By the way, variables of int type in a 32-bit environment have 32-bit width, that is, 4 bytes. So, we have 4 arguments here. 4 * 4 = 16: they occupy exactly 16 bytes in the stack: a 32-bit pointer to a string and 3 numbers of type int.
When the stack pointer (ESP register) is corrected by an ADD ESP, X instruction after a function call, the number of function arguments can often be deduced from it: just divide X by 4.
Of course, this applies only to the cdecl calling convention. See also the section about calling conventions (44).
It is also possible for the compiler to merge several ADD ESP, X instructions into one, after the last call:
push a1
push a2
5.1. X86: 3 ARGUMENTS CHAPTER 5. PRINTF() WITH SEVERAL ARGUMENTS
push a2
push a3
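The "divide X by 4" rule of thumb from this section can be written down as a tiny C sketch. The helper name cdecl_arg_count is hypothetical, and the sketch assumes 32-bit cdecl code in which every argument occupies exactly one 4-byte stack slot. It would miscount 64-bit values or structures passed by value, and it cannot tell how many calls a merged ADD ESP, X covers.

```c
#include <assert.h>

/* Hypothetical helper: estimates how many 4-byte cdecl arguments were
   pushed, given the X from the "ADD ESP, X" that follows the call. */
unsigned cdecl_arg_count(unsigned stack_adjust_bytes)
{
    /* every 32-bit argument slot is 4 bytes wide */
    assert(stack_adjust_bytes % 4 == 0);
    return stack_adjust_bytes / 4;
}
```

For the printf() call above, ADD ESP, 16 gives 16 / 4 = 4 arguments: the format string pointer and three int values.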
5.1.2 MSVC and OllyDbg
Now let's try to load this example in OllyDbg. It is one of the most popular user-land win32 debuggers. We can compile our example in MSVC 2012 with the /MD option, which means linking against MSVCR*.DLL, so that we will be able to see the imported functions clearly in the debugger.
Then load the executable in OllyDbg. The very first breakpoint is in ntdll.dll; press F9 (run). The second breakpoint is in the CRT code. Now we should find the main() function.
Find this code by scrolling the code to the very top (MSVC allocates the main() function at the very beginning of the code section): fig. 5.3.
Click on the PUSH EBP instruction, press F2 (set breakpoint) and press F9 (run). We need these manipulations in order to skip the CRT code, because we are not really interested in it yet.
Press F8 (step over) 6 times, i.e., skip 6 instructions: fig. 5.4.
Now the PC points to the CALL printf instruction. OllyDbg, like other debuggers, highlights the values of registers which were changed. So each time you press F8, EIP changes and its value is shown in red. ESP changes as well, because values are pushed onto the stack.
Where are the values in the stack? Take a look at the bottom right window of the debugger:
Figure 5.1: OllyDbg: stack after values pushed (I made the round red mark here in a graphics editor)
So we can see 3 columns there: address in the stack, value in the stack and some additional OllyDbg comments. OllyDbg understands printf()-like strings, so it reports the string here and the 3 values attached to it.
It is possible to right-click on the format string, click on "Follow in dump", and the format string will appear in the bottom left window, where some part of memory is always shown. These memory values can be edited. It is possible to change the format string, and then the result of our example will be different. It is probably not very useful now, but it is a very good exercise for getting a feeling of how everything works here.
Press F8 (step over). In the console we'll see the output:
Figure 5.2: printf() function executed
Let's see how the registers and the stack state have changed: fig. 5.5.
The EAX register now contains 0xD (13). That's correct: printf() returns the number of characters printed. The EIP value has changed: indeed, it now holds the address of the instruction after CALL printf. The ECX and EDX values have changed as well. Apparently, the printf() function's hidden machinery used them for its own needs.
A very important thing is that the ESP value has not changed, and neither has the stack state! We clearly see that the format string and the corresponding 3 values are still there. Indeed, that is the cdecl calling convention: the called function does not clear the arguments from the stack. It is the caller's duty to do so.
Press F8 again to execute the ADD ESP, 10 instruction (OllyDbg shows the operand in hexadecimal, so this is the same ADD ESP, 16 as in the MSVC listing): fig. 5.6.
ESP has changed, but the values are still in the stack! Yes, of course: no one needs to fill these values with zeroes or anything like that. Everything above the stack pointer (SP), i.e., at addresses lower than the one in ESP, is noise or garbage and has no value at all. It would be time-consuming to clear unused stack entries, and besides, no one really needs to.
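The value 0xD (13) seen in EAX can be cross-checked without a debugger. The sketch below is my own addition, not part of the original walkthrough, and the function name is hypothetical. It relies on the C99 rule that snprintf() with a NULL buffer and a size of 0 writes nothing and only returns the number of characters that would have been produced.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns how many characters the chapter's
   printf() call prints, i.e., the value printf() leaves in EAX. */
int chars_printf_would_print(void)
{
    /* NULL buffer and size 0: count only, write nothing (C99) */
    int n = snprintf(NULL, 0, "a=%d; b=%d; c=%d", 1, 2, 3);
    assert(n >= 0);   /* a negative value would mean an encoding error */
    return n;
}
```

The string "a=1; b=2; c=3" is 13 characters long, which matches what OllyDbg shows in EAX.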
Figure 5.3: OllyDbg: the very start of the main() function
Figure 5.4: OllyDbg: before printf() execution
Figure 5.5: OllyDbg: after printf() execution
Figure 5.6: OllyDbg: after ADD ESP, 10 instruction execution
5.1.3 GCC
Now let's compile the same program in Linux using GCC 4.4.1 and take a look in IDA at what we got:
main proc near
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aADBDCD ; "a=%d; b=%d; c=%d"
mov [esp+10h+var_4], 3
mov [esp+10h+var_8], 2
mov [esp+10h+var_C], 1
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
It can be said that the only difference between the code from MSVC and the code from GCC is the method of placing arguments on the stack. Here GCC works directly with the stack, without PUSH/POP.
5.1.4 GCC and GDB
Let's try this example also in GDB¹ in Linux. The -g option means that debug information is to be produced in the executable file.
$ gcc 1.c -g -o 1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Listing 5.1: let's set breakpoint on printf()
(gdb) b printf
Breakpoint 1 at 0x80482f0
Run. There is no printf() function source code here, so GDB can't show its source, although it can do so when the source is available.
(gdb) run
Starting program: /home/dennis/polygon/1
Breakpoint 1, __printf (format=0x80484f0 "a=%d; b=%d; c=%d") at printf.c:29
29 printf.c: No such file or directory.
Print 10 stack elements. The left column is an address in the stack.
(gdb) x/10w $esp
0xbffff12c: 0x00000003 0x08048460 0x00000000 0x00000000
0xbffff13c: 0xb7e29905 0x00000001
The very first element is the RA (0x0804844a). We can be sure of it by disassembling the memory at this address:
(gdb) x/5i 0x0804844a
0x804844f <main+50>: leave
0x8048451: xchg %ax,%ax
0x8048453: xchg %ax,%ax
The two XCHG instructions are apparently some random garbage, which we can ignore so far.
The second element (0x080484f0) is the address of the format string:
(gdb) x/s 0x080484f0
0x80484f0: "a=%d; b=%d; c=%d"
The other 3 elements (1, 2, 3) are the printf() arguments. The remaining elements may be just garbage present in the stack, but may also be values from other functions, their local variables, etc. We can ignore them for now.
Execute finish. This means: execute until the function ends. Here it means: execute until printf() finishes.
(gdb) finish
Run till exit from #0 __printf (format=0x80484f0 "a=%d; b=%d; c=%d") at printf.c:29
main () at 1.c:6
Value returned is $2 = 13
¹ GNU debugger
GDB shows what printf() returned in EAX (13). This is the number of characters printed, just like in the example with OllyDbg.
We also see return 0; and the information that this expression is in the 1.c file at line 6. Indeed, the 1.c file is located in the current directory, and GDB finds the string there. How does GDB know which C code line is being executed now? This is related to the fact that the compiler, while generating debugging information, also saves a table of relations between source code line numbers and instruction addresses. GDB is a source-level debugger, after all.
Let's examine the registers. 13 in EAX:
(gdb) info registers
eax 0xd 13
ecx 0x0 0
edx 0x0 0
esi 0x0 0
edi 0x0 0
...
Let's disassemble the current instructions. The arrow points to the instruction to be executed next.
(gdb) disas
0x0804841d <+0>: push %ebp
0x08048420 <+3>: and $0xfffffff0,%esp
0x08048423 <+6>: sub $0x10,%esp
0x08048426 <+9>: movl $0x3,0xc(%esp)
0x0804842e <+17>: movl $0x2,0x8(%esp)
0x08048436 <+25>: movl $0x1,0x4(%esp)
0x0804843e <+33>: movl $0x80484f0,(%esp)
0x08048445 <+40>: call 0x80482f0 <printf@plt>
=> 0x0804844a <+45>: mov $0x0,%eax
0x0804844f <+50>: leave
0x08048450 <+51>: ret
End of assembler dump.
GDB shows disassembly in AT&T syntax by default. It's possible to switch to Intel syntax:
(gdb) set disassembly-flavor intel
(gdb) disas
Dump of assembler code for function main:
0x0804841d <+0>: push ebp
0x0804841e <+1>: mov ebp,esp
0x08048423 <+6>: sub esp,0x10
=> 0x0804844a <+45>: mov eax,0x0
0x08048450 <+51>: ret
End of assembler dump.
Execute the next instruction. GDB shows the closing bracket, meaning that this is the end of the function's block.
(gdb) step
7 };
Let's see the registers after the MOV EAX, 0 instruction execution. EAX here is zero indeed.
(gdb) info registers
eax 0x0 0
ecx 0x0 0
edx 0x0 0
esi 0x0 0
edi 0x0 0
...
5.2 x64: 8 arguments
To see how other arguments are passed via the stack, let's change our example again by increasing the number of arguments to be passed to 9 (printf() format string + 8 int variables):
#include <stdio.h>
int main()
{
    printf("a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n", 1, 2, 3, 4, 5, 6, 7, 8);
    return 0;
};
5.2.1 MSVC
As we saw before, the first 4 arguments are passed in the RCX, RDX, R8, R9 registers in Win64, while all the rest are passed via the stack. That is what we see here. However, the MOV instruction, instead of PUSH, is used for preparing the stack, so the values are written to the stack in a straightforward manner.
Listing 5.2: MSVC 2012 x64
mov r9d, 3
mov r8d, 2
call printf
5.2.2 GCC
In *NIX OS-es, it's the same story for x86-64, except that the first 6 arguments are passed in the RDI, RSI, RDX, RCX, R8, R9 registers. All the rest are passed via the stack. GCC generates code that writes the string pointer into EDI instead of RDI; we saw this before: 2.2.2.
We also saw before that the EAX register is cleared before a printf() call: 2.2.2.
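The two register assignments described in 5.2.1 and 5.2.2 can be put side by side in a small C sketch. This is a simplification under stated assumptions: all names here are hypothetical, and only integer and pointer arguments are modelled (floating-point arguments travel in other registers and are ignored).

```c
#include <assert.h>

/* Registers for the first integer/pointer arguments, in order. */
static const char *const sysv_regs[]  = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
static const char *const win64_regs[] = { "RCX", "RDX", "R8", "R9" };

/* Where does integer argument number `index` (0-based) go on *NIX x86-64? */
const char *sysv_arg_location(int index)
{
    assert(index >= 0);
    return index < 6 ? sysv_regs[index] : "stack";
}

/* The same question for Win64. */
const char *win64_arg_location(int index)
{
    assert(index >= 0);
    return index < 4 ? win64_regs[index] : "stack";
}

/* How many of `total` integer arguments do not fit into registers? */
int stack_arg_count(int total, int regs_available)
{
    assert(total >= 0 && regs_available >= 0);
    return total > regs_available ? total - regs_available : 0;
}
```

For our 9 arguments (format string + 8 int values), stack_arg_count(9, 6) gives 3 for GCC on Linux (the values 6, 7, 8), while stack_arg_count(9, 4) gives 5 for MSVC.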
Listing 5.3: GCC 4.4.6 -O3 x64
.LC0:
.string "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n"
main:
xor eax, eax ; number of vector registers passed
mov DWORD PTR [rsp+16], 8
mov DWORD PTR [rsp+8], 7
mov DWORD PTR [rsp], 6
call printf
$ gcc -g 2.c -o 2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/dennis/polygon/2...done.
Listing 5.4: let's set breakpoint on printf(), and run
(gdb) b printf
Breakpoint 1 at 0x400410
(gdb) run
Starting program: /home/dennis/polygon/2
Breakpoint 1, __printf (format=0x400628 "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n") at printf.c:29
29 printf.c: No such file or directory.
Registers RSI/RDX/RCX/R8/R9 have the values they should have. RIP holds the address of the very first instruction of the printf() function.
(gdb) info registers
rax 0x0 0
rbx 0x0 0
rcx 0x3 3
rdx 0x2 2
rsi 0x1 1
rdi 0x400628 4195880
r8 0x4 4
r9 0x5 5
r11 0x7ffff7a65f60 140737348263776
r12 0x400440 4195392
r14 0x0 0
r15 0x0 0
...
Listing 5.5: let's inspect the format string
(gdb) x/s $rdi
0x400628: "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n"
Let's dump the stack with the x/g command this time; g means giant words, i.e., 64-bit words.
(gdb) x/10g $rsp
0x7fffffffdf48: 0x0000000000000007 0x00007fff00000008
0x7fffffffdf58: 0x0000000000000000 0x0000000000000000
0x7fffffffdf68: 0x00007ffff7a33de5 0x0000000000000000
0x7fffffffdf78: 0x00007fffffffe048 0x0000000100000000
The very first stack element, just like in the previous case, is the RA. 3 values are also passed in the stack: 6, 7, 8. We also see that 8 is passed with the high 32 bits not cleared: 0x00007fff00000008. That's OK, because the values have int type, which is a 32-bit type. So, the high part of a register or stack element may contain random garbage.
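Why the garbage in the high half is harmless can be shown with a short C sketch. The function names are hypothetical, and the sketch assumes the usual two's complement conversion from a 32-bit unsigned value to a 32-bit signed one. It models what the callee does when it reads an int from a 64-bit stack slot: only the low 32 bits take part.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the int value a callee reads from a 64-bit slot. */
int32_t int_from_stack_slot(uint64_t slot)
{
    uint32_t low = (uint32_t)slot;   /* bits 0..31: the actual int */
    /* bits 32..63 are simply never looked at */
    return (int32_t)low;
}

/* Checks the claim against the value from the GDB dump. */
void check_stack_slot_example(void)
{
    assert(int_from_stack_slot(0x00007fff00000008ULL) == 8);
    assert(int_from_stack_slot(0x0000000000000007ULL) == 7);
}
```

Both slots from the dump decode to the expected arguments, although only one of them has a clean high half.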
If we take a look at where control flow will return after printf() execution, GDB will show the whole main() function:
(gdb) set disassembly-flavor intel
(gdb) disas 0x0000000000400576
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x20
0x0000000000400535 <+8>: mov DWORD PTR [rsp+0x10],0x8
0x000000000040053d <+16>: mov DWORD PTR [rsp+0x8],0x7
0x0000000000400545 <+24>: mov DWORD PTR [rsp],0x6
0x000000000040054c <+31>: mov r9d,0x5
0x0000000000400552 <+37>: mov r8d,0x4
0x0000000000400558 <+43>: mov ecx,0x3
0x000000000040055d <+48>: mov edx,0x2
0x0000000000400562 <+53>: mov esi,0x1
0x0000000000400567 <+58>: mov edi,0x400628
0x000000000040056c <+63>: mov eax,0x0
0x0000000000400576 <+73>: mov eax,0x0
5.3. ARM: 3 ARGUMENTS CH APTER 5. PRINTF() WITH SEVERAL ARGUMENTS
Let's finish the printf() execution, execute the instruction zeroing EAX, and take notice that the EAX register has exactly zero. RIP now points to the LEAVE instruction, i.e., the penultimate one in the main() function.
(gdb) finish
Run till exit from #0 __printf (format=0x400628 "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n") at printf.c:29
a=1; b=2; c=3; d=4; e=5; f=6; g=7; h=8
main () at 2.c:6
Value returned is $1 = 39
rax 0x0 0
rbx 0x0 0
rcx 0x26 38
rsi 0x7fffffd9 2147483609
rdi 0x0 0
r8 0x7ffff7dd26a0 140737351853728
r9 0x7ffff7a60134 140737348239668
r10 0x7fffffffd5b0 140737488344496
r11 0x7ffff7a95900 140737348458752
r12 0x400440 4195392
r14 0x0 0
IV OS-specific
VII Other things
IX Exercises
Appendix
0.1 Pr ef ace
Here are some of my notes about reverse engineering, in English, for those beginners who would like to learn to understand x86 (which accounts for almost all executable software in the world) and ARM code created by C/C++ compilers.
There are several popular meanings of the term "reverse engineering": 1) reverse engineering of software: researching compiled programs; 2) 3D model scanning and reworking in order to make a copy of it; 3) recreating DBMS³ structure. These notes are related to the first meaning.
Topics discussed
x86, ARM.
Topics touched
Oracle RDBMS (58), Itanium (66), copy-protection dongles (55), LD_PRELOAD (47.2), stack overflow, ELF⁴, win32 PE file format (48.2), x86-64 (23.1), critical sections (48.4), syscalls (46), TLS⁵, position-independent code (PIC⁶) (47.1), profile-guided optimization (68.1), C++ STL (29.4), OpenMP (65), SEH ().
Mini-FAQ⁷
· Q: Should one learn to understand assembly language these days?
A: Yes: in order to have a deeper understanding of the internals and to debug your software better and faster.
· Q: Should one learn to write in assembly language these days?
A: Unless one writes low-level OS⁸ code, probably no.
· Q: But what about writing highly optimized routines?
A: No, modern C/C++ compilers do this job better.
· Q: Should I learn microprocessor internals?
A: Modern CPU⁹-s are very complex. If you do not plan to write highly optimized code, or if you do not work on a compiler's code generator, then you may still learn the internals in bare outline¹⁰. At the same time, in order to understand and analyze compiled code it is enough to know only the ISA¹¹ and the register descriptions, i.e., the "outside" part of a CPU that is available to an application programmer.
· Q: So why should I learn assembly language anyway?
A: Mostly to better understand what is going on while debugging, and for reverse engineering without source code, including, but not limited to, malware.
· Q: How would I search for a reverse engineering job?
A: There are hiring threads that appear from time to time on reddit devoted to RE¹² (2013 Q3, 2014). Try to take a look there.
³ Database management systems
⁴ Executable file format widely used in *NIX systems, including Linux
⁵ Thread Local Storage
⁶ Position Independent Code: 47.1
⁷ Frequently Asked Questions
⁸ Operating System
⁹ Central processing unit
¹⁰ Very good text about it: [10]
¹¹ Instruction Set Architecture
¹² http://www.reddit.com/r/ReverseEngineering/
About the author
Dennis Yurichev is an experienced reverse engineer and programmer. His CV is available on his website¹³.
Thanks
Andrey herm1t Baranovich, Slava Avid Kazakov, Stanislav Beaver Bobrytskyy, Alexander Lysenko, Alexander Lstar Chernenkiy, Andrew Zubinski, Vladimir Botov, Mark Logxen Cooper, Shell Rocket, Yuan Jochen Kang, Arnaud Patard (rtp on #debian-arm IRC), and all the folks on github.com who have contributed notes and corrections.
A lot of LaTeX packages were used: I would like to thank their authors as well.
Praise for Reverse Engineering for Beginners
· It's very well done .. and for free .. amazing.¹⁴ Daniel Bilar, Siege Technologies, LLC.
· ...excellent and free¹⁵ Pete Finnigan, Oracle RDBMS security guru.
· ... book is interesting, great job! Michael Sikorski, author of Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software.
· ... my compliments for the very nice tutorial! Herbert Bos, full professor at the Vrije Universiteit Amsterdam.
· ... It is amazing and unbelievable. Luis Rocha, CISSP / ISSAP, Technical Manager, Network & Information Security at Verizon Business.
· Thanks for the great work and your book. Joris van de Vis, SAP Netweaver & Security specialist.
· ... reasonable intro to some of the techniques.¹⁶ (Mike Stay, teacher at the Federal Law Enforcement Training Center, Georgia, US.)
Donate
As it turns out, (technical) writing takes a lot of effort and work.
This book is free, available freely and available in source code form¹⁷ (LaTeX), and it will be so forever.
My current plan for this book is to add lots of information about: PLANS¹⁸.
If you want me to continue writing on all these topics you may consider donating.
I worked more than a year on this book¹⁹; there are more than 500 pages. There are ≈ 300 TeX files, ≈ 90 C/C++ source codes, ≈ 350 various listings.
The price of other books on the same subject varies between $20 and $50 on amazon.com.
Ways to donate are available on the page: http://yurichev.com/donate.html
Every donor's name will be included in the book! Donors also have the right to ask me to rearrange items in my writing plan.
Why not try to publish? Because it's technical literature which, as I believe, cannot be finished or frozen in paper state. Such technical references are akin to Wikipedia or the MSDN²⁰ library. They can evolve and grow indefinitely. Someone can sit down and write everything from the beginning to the end, publish it and forget about it. As it turns out, it's not me. I have everyday thoughts like "that was written badly and can be rewritten better", "that was a bad example, I know a better one", "that is also a thing I can explain better and shorter", etc. As you may see in the commit history of this book's source code, I make a lot of small changes almost every day: https://github.com/dennis714/RE-for-beginners/commits/master.
So the book will probably be a "rolling release", as they say about Linux distros like Gentoo. No fixed releases (and deadlines) at all, but continuous development. I don't know how long it will take to write all I know. Maybe 10 years or more. Of course, it is not very convenient for readers who want something stable, but all I can offer is a ChangeLog²¹ file serving as a "what's new" section. Those who are interested may check it from time to time, or my blog/twitter²².
¹³ http://yurichev.com/Dennis_Yurichev.pdf
¹⁴ https://twitter.com/daniel_bilar/status/436578617221742593
¹⁵ https://twitter.com/petefinnigan/status/400551705797869568
¹⁶ http://www.reddit.com/r/IAmA/comments/24nb6f/i_was_a_professional_password_cracker_who_taught/
¹⁷ https://github.com/dennis714/RE-for-beginners
¹⁸ https://github.com/dennis714/RE-for-beginners/blob/master/PLANS
¹⁹ Initial git commit from March 2013: https://github.com/dennis714/RE-for-beginners/tree/1e57ef540d827c7f7a92fcb3a4626af3e13c7ee4
²⁰ Microsoft Developer Network
Donor s
9 * anonymous, 2 * Oleg Vygovsky, Daniel Bilar, James Truscott, Luis Rocha, Joris van de Vis, Richard S Shultz, Jang Minchang, Shade Atlas, Yao Xiao, Pawel Szczur, Justin Simms, Shawn the R0ck, Ki Chan Ahn, Triop AB, Ange Albertini, Sergey Lukianov, Ludvig Gislason, Gérard Labadie.
About illustr ations
Readers who are used to reading a lot on the Internet expect to see illustrations at the places where they should be. That is because there are no pages there at all, only a single one. In this book it is not possible to place illustrations in the suitable context. So, in this book, illustrations can be at the end of a section, and references to them may be present in the text, like fig. 1.1.
²¹ https://github.com/dennis714/RE-for-beginners/blob/master/ChangeLog
²² http://blog.yurichev.com/ https://twitter.com/yurichev
When I first learned C and then C++, I wrote small pieces of code, compiled them, and saw what was produced in assembly language. This was easy for me. I did it many times, and the relation between the C/C++ code and what the compiler produced was imprinted in my mind so deeply that I can quickly understand what was in the original C code when I look at the produced x86 code. Perhaps this technique may be helpful for someone else, so I will try to describe some examples here.
There are a lot of examples for both x86/x64 and ARM. Those who are already familiar with one of the architectures may freely skim over pages.
Chapter 1
Short introduction to the CPU
The CPU is the unit which executes all of the programs.
Short glossary:
Instruction: a primitive command to the CPU. Simplest examples: moving data between registers, working with memory, arithmetic primitives. As a rule, each CPU has its own instruction set architecture (ISA).
Machine code: code for the CPU. Each instruction is usually encoded by several bytes.
Assembly language: mnemonic code and some extensions like macros which are intended to make a programmer's life easier.
CPU register: Each CPU has a fixed set of general purpose registers (GPR¹). ≈ 8 in x86, ≈ 16 in x86-64, ≈ 16 in ARM. The easiest way to understand a register is to think of it as an untyped temporary variable. Imagine you are working with a high-level PL² and you have only 8 32-bit variables. A lot of things can be done using only these!
What is the difference between machine code and a PL? It is much easier for humans to use a high-level PL like C/C++, Java, Python, etc., but it is easier for a CPU to use a much lower level of abstraction. Perhaps it would be possible to invent a CPU which can execute high-level PL code, but it would be much more complex. Conversely, it is very inconvenient for humans to use assembly language due to its low-levelness. Besides, it is very hard to do it without making a huge amount of annoying mistakes. The program which converts high-level PL code into assembly is called a compiler.
¹ General Purpose Registers
² Programming language
Chapter 2
Hello, world!
Let's start with the famous example from the book "The C programming Language" [17]:
#include <stdio.h>
int main() {
    printf("hello, world"); return 0;
};
cl 1.cpp /Fa1.asm
Listing 2.1: MSVC 2010
_TEXT SEGMENT
xor eax, eax
_TEXT ENDS
MSVC produces assembly listings in Intel syntax. The difference between Intel syntax and AT&T syntax will be discussed hereafter.
The compiler generated the 1.obj file, which will be linked into 1.exe.
In our case, the file contains two segments: CONST (for data constants) and _TEXT (for code).
The string "hello, world" in C/C++ has type const char*, however it does not have its own name.
The compiler needs to deal with the string somehow, so it defines the internal name $SG3830 for it.
So the example may be rewritten as:
#include <stdio.h>
const char $SG3830[]="hello, world";
int main()
{
    printf($SG3830); return 0;
};
Let's go back to the assembly listing. As we can see, the string is terminated by a zero byte, which is standard for C/C++ strings. More about C strings: 36.1.
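The terminating zero byte can also be observed from C itself. The following is a minimal sketch, not taken from the book: the names greeting, length_until_zero and greeting_storage_size are hypothetical, and the array stands in for what the compiler stores under its internal name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the string the compiler keeps under an internal name. */
static const char greeting[] = "hello, world";

/* Counts bytes up to, but not including, the zero byte, like strlen(). */
size_t length_until_zero(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Size of the array as stored: the characters plus the zero byte. */
size_t greeting_storage_size(void)
{
    /* the last stored byte is the terminator the listing shows as 00H */
    assert(greeting[sizeof(greeting) - 1] == 0);
    return sizeof(greeting);
}
```

The text itself is 12 characters long, but 13 bytes are stored: the extra byte is the 00H seen at the end of the DB line in the listing.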
In the code segment, _TEXT, there is only one function so far: main().
The function main() starts with prologue code and ends with epilogue code (like almost any function)¹.
After the function prologue we see the call to the printf() function: CALL _printf.
Before the call, the string address (or a pointer to it) containing our greeting is placed on the stack with the help of the PUSH instruction.
When the printf() function returns control flow to the main() function, the string address (or pointer to it) is still on the stack. Since we do not need it anymore, the stack pointer (the ESP register) needs to be corrected.
ADD ESP, 4 means add 4 to the value in the ESP register.
Why 4? Since it is 32-bit code, we need exactly 4 bytes for passing the address through the stack. It is 8 bytes in x64 code.
ADD ESP, 4 is effectively equivalent to POP register but without using any register².
Some compilers (like the Intel C++ Compiler) in the same situation may emit POP ECX instead of ADD (e.g., such a pattern can be observed in the Oracle RDBMS code as it is compiled by the Intel C++ compiler). This instruction has almost the same effect, but the ECX register contents will be overwritten.
The Intel C++ compiler probably uses POP ECX since this instruction's opcode is shorter than ADD ESP, x (1 byte against 3).
Read more about the stack in section (4).
After the call to printf(), the original C/C++ code has return 0: return 0 as the result of the main() function.
In the generated code this is implemented by the instruction XOR EAX, EAX.
XOR is, in fact, just "eXclusive OR"³, but compilers often use it instead of MOV EAX, 0, again because it is a slightly shorter opcode (2 bytes against 5).
Some compilers emit SUB EAX, EAX, which means SUBtract the value in EAX from the value in EAX, which in any case results in zero.
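Both zeroing idioms rest on simple identities, x XOR x = 0 and x - x = 0, which hold for any x. A minimal C sketch of this follows. The function names are hypothetical, and the uint32_t parameter merely plays the role of the 32-bit EAX register.

```c
#include <assert.h>
#include <stdint.h>

/* What XOR EAX, EAX computes: every bit XORed with itself gives 0. */
uint32_t zero_by_xor(uint32_t eax) { return eax ^ eax; }

/* What SUB EAX, EAX computes: a value minus itself is 0. */
uint32_t zero_by_sub(uint32_t eax) { return eax - eax; }

/* Whatever EAX held before, both idioms leave zero in it. */
void check_zeroing_idioms(uint32_t any_value)
{
    assert(zero_by_xor(any_value) == 0);
    assert(zero_by_sub(any_value) == 0);
}
```

That is why a compiler is free to pick whichever encoding is shortest: the previous contents of the register do not matter.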
The last instruction RET returns control flow to the caller. Usually, it is C/C++ CRT⁴ code, which in turn returns control to the OS.
2.1.2 GCC—x86
Now let's try to compile the same C/C++ code in the GCC 4.4.1 compiler in Linux: gcc 1.c -o 1
Next, with the assistance of the IDA⁵ disassembler, let's see how the main() function was created. (IDA, like MSVC, shows code in Intel-syntax). N.B. We could also have GCC produce assembly listings in Intel-syntax by applying the options -S -masm=intel
Listing 2.2: GCC
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
retn
main endp
¹ Read more about it in the section about function prologue and epilogue (3).
² CPU flags, however, are modified.
³ http://en.wikipedia.org/wiki/Exclusive_or
⁴ C runtime library: sec:CRT
⁵ Interactive Disassembler
The result is almost the same. The address of the hello, world string (stored in the data segment) is saved in the EAX register first and then it is stored on the stack. Also, in the function prologue we see AND ESP, 0FFFFFFF0h: this instruction aligns the value in the ESP register on a 16-byte boundary. This results in all values in the stack being aligned. (The CPU performs better if the values it is dealing with are located in memory at addresses aligned on a 4- or 16-byte boundary)⁶.
SUB ESP, 10h allocates 16 bytes on the stack, although, as we can see hereafter, only 4 are necessary here. This is because the size of the allocated stack is also aligned on a 16-byte boundary.
The string address (or a pointer to the string) is then written directly onto the stack space without using the PUSH instruction. var_10 is a local variable and is also an argument for printf(). Read about it below.
Then the printf() function is called.
Unlike MSVC, when GCC is compiling without optimization turned on, it emits MOV EAX, 0 instead of a shorter opcode.
The last instruction, LEAVE, is the equivalent of the MOV ESP, EBP and POP EBP instruction pair. In other words, this instruction sets the stack pointer (ESP) back and restores the EBP register to its initial state. This is necessary since we modified these register values (ESP and EBP) at the beginning of the function (executing MOV EBP, ESP / AND ESP, ...).
2.1.3 GCC: AT&T syntax
Let's see how this can be represented in the AT&T syntax of assembly language. This syntax is much more popular in the UNIX world.
gcc -S 1_1.c
We get this:
.text
pushl %ebp
.cfi_offset 5, -8
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $.LC0, (%esp)
leave
.cfi_def_cfa 4, 4
ret
.section .note.GNU-stack,"",@progbits
There are a lot of macros (beginning with a dot). These are not very interesting to us so far. For now, for the sake of simplification, we can ignore them (except the .string macro, which encodes a null-terminated character sequence just like a C-string). Then we'll see this⁷:
Listing 2.5: GCC 4.7.3
main:
leave
ret
Some of the major differences between Intel and AT&T syntax are:
· Operands are written backwards.
In Intel-syntax: <instruction> <destination operand> <source operand>.
In AT&T syntax: <instruction> <source operand> <destination operand>.
Here is a way to think about them: when you deal with Intel-syntax, you can put an equality sign (=) in your mind between the operands, and when you deal with AT&T-syntax put in a right arrow (→)⁸.
· AT&T: Before register names a percent sign must be written (%) and before numbers a dollar sign ($). Parentheses are used instead of brackets.
· AT&T: A special symbol is to be added to each instruction defining the type of data:
– l — long (32 bits)
– b — byte (8 bits)
Let's go back to the compiled result: it is identical to what we saw in IDA, with one subtle difference: 0FFFFFFF0h is written as $-16. It is the same: 16 in the decimal system is 0x10 in hexadecimal, and -0x10 is equal to 0xFFFFFFF0 (for a 32-bit data type).
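The equivalence of -16 and 0xFFFFFFF0, and the effect of ANDing ESP with it, can be sketched in C. The helper name and the sample address are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring AND ESP, 0FFFFFFF0h (andl $-16, %esp):
   clears the low 4 bits, rounding the value down to a multiple of 16. */
uint32_t align_down_16(uint32_t esp)
{
    uint32_t mask = (uint32_t)-16;   /* two's complement: 0xFFFFFFF0 */
    assert(mask == 0xFFFFFFF0u);
    return esp & mask;
}
```

For instance, under these assumptions align_down_16(0xBFFFF12C) gives 0xBFFFF120, while a value that is already a multiple of 16 is left unchanged.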
One more thing: the return value is set to 0 by using the usual MOV, not XOR. MOV just loads a value into a register. Its name is not intuitive (data is not moved; rather, it is copied). In other architectures, this instruction has the name load or something like that.
2.2 x86-64
Listing 2.6: MSVC 2012 x64
$SG2989 DB 'hello, world', 00H
add rsp, 40
main ENDP
⁷ This GCC option can be used to eliminate unnecessary macros: -fno-asynchronous-unwind-tables
⁸ By the way, in some C standard functions (e.g., memcpy(), strcpy()) arguments are listed in the same way as in Intel-syntax: pointer to destination memory block at the beginning and then pointer to source memory block.
As of x86-64, all registers were extended to 64-bit and now have an R- prefix. In order to use the stack less often (in other words, to access external memory less often), there exists a popular way to pass function arguments via registers (fastcall: 44.3). I.e., one part of the function arguments is passed in registers, the other part via the stack. In Win64, 4 function arguments are passed in the RCX, RDX, R8, R9 registers. That is what we see here: a pointer to the string for printf() is now passed not on the stack, but in the RCX register.
Pointers are 64-bit now, so they are passed in the 64-bit part of registers (which have the R- prefix). But for backward compatibility, it is still possible to access the 32-bit parts, using the E- prefix.
This is how RAX/EAX/AX/AL look in 64-bit x86-compatible CPUs:
Byte number:  7th 6th 5th 4th 3rd 2nd 1st 0th
RAX (x64):    bytes 7 to 0
EAX:          bytes 3 to 0
AX:           bytes 1 to 0
AH / AL:      byte 1 / byte 0
The main() function returns an int-typed value, which in the C programming language, for better backward compatibility and portability, is still 32-bit; that is why the EAX register (i.e., the 32-bit part of the register) is cleared at the end of the function instead of RAX.
2.2.2 GCC—x86-64
Listing 2.7: GCC 4.4.6 x64
.string "hello, world"
main:
xor eax, eax ; number of vector registers passed
call printf
xor eax, eax
add rsp, 8
ret
A method to pass function arguments in registers is also used in Linux, *BSD and Mac OS X [21]. The first 6 arguments are passed in the RDI, RSI, RDX, RCX, R8, R9 registers, and the others via the stack.
So the pointer to the string is passed in EDI (the 32-bit part of the register). But why not use the 64-bit part, RDI?
It is important to keep in mind that all MOV instructions in 64-bit mode that write something into the lower 32-bit register part also clear the higher 32 bits [14]. I.e., MOV EAX, 011223344h will write the value correctly into RAX, since the higher bits will be cleared.
If we open the compiled object file (.o), we will also see all the instructions' opcodes⁹:
Listing 2.8: GCC 4.4.6 x64
.text:00000000004004D0 main proc near
.text:00000000004004D0 48 83 EC 08 sub rsp, 8
.text:00000000004004D4 BF E8 05 40 00 mov edi, offset format ; "hello, world"
.text:00000000004004D9 31 C0 xor eax, eax
.text:00000000004004DB E8 D8 FE FF FF call _printf
.text:00000000004004E0 31 C0 xor eax, eax
.text:00000000004004E2 48 83 C4 08 add rsp, 8
.text:00000000004004E6 C3 retn
.text:00000000004004E6 main endp
As we can see, the instruction writing into EDI at 0x4004D4 occupies 5 bytes. The same instruction writing a 64-bit value into RDI would occupy 7 bytes. Apparently, GCC is trying to save some space. Besides, it can be sure that the data segment containing the string will not be allocated at addresses higher than 4GiB.
We also see the EAX register being cleared before the printf() function call. This is done because, by the standard, the number of vector registers used is passed in EAX when calling functions with variable arguments [21].
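The rule that a 32-bit register write clears the upper half of the 64-bit register can be modelled in C. This is a sketch with hypothetical helper names; the 16-bit case is not discussed in the text and is added here only for contrast (on x86-64, 16-bit writes leave the upper bits as they were).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit register such as RAX. */

/* A write to the 32-bit part (e.g. MOV EAX, imm32) zero-extends:
   the upper 32 bits of the full register become 0. */
uint64_t write_low32(uint64_t reg, uint32_t value)
{
    (void)reg;                 /* the old contents are discarded entirely */
    return (uint64_t)value;
}

/* For contrast: a write to the 16-bit part (e.g. MOV AX, imm16)
   leaves the upper 48 bits untouched. */
uint64_t write_low16(uint64_t reg, uint16_t value)
{
    return (reg & ~(uint64_t)0xFFFF) | value;
}

/* Usage sketch with a register that starts out full of ones. */
void check_register_model(void)
{
    uint64_t rax = 0xFFFFFFFFFFFFFFFFull;
    assert(write_low32(rax, 0x11223344u) == 0x0000000011223344ull);
    assert(write_low16(rax, 0x3344u) == 0xFFFFFFFFFFFF3344ull);
}
```

This is why a 5-byte MOV EDI, imm32 is enough to load a full 64-bit pointer, as long as the address fits in 32 bits.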
⁹ This should be enabled in Options → Disassembly → Number of opcode bytes
2.3. ARM CH APTER 2. HELLO, WORLD!
are swapped (for thumb and thumb-2 modes). For instructions in ARM mode, the order is the fourth byte, then the third, then the second and finally the first (due to different endianness). So as we can see, the MOVW, MOVT.W and BLX instructions begin with 0xFx.
One of the thumb-2 instructions is MOVW R0, #0x13D8: it writes a 16-bit value into the lower part of the R0 register.
Also, MOVT.W R0, #0 works just like MOVT from the previous example, but it works in thumb-2.
Among other differences, here the BLX instruction is used instead of BL. The difference is that, besides saving the RA²⁷ in the LR register and passing control to the puts() function, the processor also switches from thumb mode to ARM (or back). This instruction is placed here since the instruction to which control is passed looks like this (it is encoded in ARM mode):
__symbolstub1:00003FEC _puts ; CODE XREF: _hello_world+E
__symbolstub1:00003FEC 44 F0 9F E5 LDR PC, =__imp__puts
So, the observant reader may ask: why not call puts() right at the point in the code where it is needed?
Because it is not very space-efficient.
Almost any program uses external dynamic libraries (like DLL in Windows, .so in *NIX or .dylib in Mac OS X). Often-used library functions are stored in dynamic libraries, including the standard C function puts().
In an executable binary file (Windows PE .exe, ELF or Mach-O) an import section is present. This is a list of symbols (functions or global variables) being imported from external modules along with the names of these modules.
The OS loader loads all the modules it needs and, while enumerating import symbols in the primary module, determines the correct address of each symbol.
In our case, __imp__puts is a 32-bit variable where the OS loader will write the correct address of the function in an external library. Then the LDR instruction just takes the 32-bit value from this variable and writes it into the PC register, passing control to it.
So, in order to reduce the time that an OS loader needs for doing this procedure, it is a good idea for it to write the address of each symbol only once, to a specially allocated place just for it.
Besides, as we have already figured out, it is impossible to load a 32-bit value into a register using only one instruction without a memory access. So, it is optimal to allocate a separate function working in ARM mode with only one goal, to pass control to the dynamic library, and then to jump to this short one-instruction function (the so-called thunk function) from thumb code.
By the way, in the previous example (compiled for ARM mode) control passed by the BL instruction goes to the same thunk function. However, the processor mode is not switched (hence the absence of an X in the instruction mnemonic).
²⁷ Return Address
Chapter 3
Function prologue and epilogue
A function prologue is a sequence of instructions at the start of a function. It often looks something like the following code fragment:
push ebp
mov ebp, esp
sub esp, X
What these instructions do: save the value in the EBP register, set the value of the EBP register to the value of ESP and then allocate space on the stack for local variables.
The value in EBP stays fixed over the period of function execution and is used for accessing local variables and arguments. One can use ESP for this, but it changes over time, which is not convenient.
The function epilogue frees the allocated space on the stack, returns the value in the EBP register back to its initial state and returns the control flow to the caller:
mov esp, ebp
pop ebp
ret 0
Function prologues and epilogues are usually detected in disassemblers for delimiting functions from each other.
3.1 Recursion
Epilogues and prologues can make recursion performance worse.
For example, once upon a time I wrote a function to seek the correct node in a binary tree. As a recursive function it would look stylish, but since additional time is spent at each function call for the prologue/epilogue, it was working a couple of times slower than an iterative (recursion-free) implementation.
By the way, that is the reason compilers use tail call.
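To make the comparison concrete, here is a minimal sketch of both styles of binary tree lookup. It is not the author's original function: the node layout and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout for a binary search tree. */
struct node {
    int key;
    struct node *left, *right;
};

/* Recursive lookup: each level costs a call (prologue/epilogue included),
   unless the compiler turns the tail call into a jump. */
const struct node *find_recursive(const struct node *n, int key)
{
    if (n == NULL || n->key == key)
        return n;
    return find_recursive(key < n->key ? n->left : n->right, key);
}

/* Iterative lookup: one stack frame for the whole search. */
const struct node *find_iterative(const struct node *n, int key)
{
    while (n != NULL && n->key != key)
        n = key < n->key ? n->left : n->right;
    return n;
}

/* Both versions must agree on every lookup. */
void check_same_result(const struct node *root, int key)
{
    assert(find_recursive(root, key) == find_iterative(root, key));
}
```

Because the recursive call is the last action of find_recursive(), an optimizing compiler may replace it with a jump, which makes the two versions behave much alike.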
Chapter 4
Stack
A stack is one of the most fundamental data structures in computer science¹.
Technically, it is just a block of memory in process memory along with the ESP or RSP register in x86 or x64, or the SP register in ARM, as a pointer within the block.
The most frequently used stack access instructions are PUSH and POP (in both x86 and ARM thumb-mode). PUSH subtracts 4 in 32-bit mode (or 8 in 64-bit mode) from ESP/RSP/SP and then writes the contents of its sole operand to the memory address pointed to by ESP/RSP/SP.
POP is the reverse operation: get the data from memory pointed to by SP, put it in the operand (often a register) and then add 4 (or 8) to the stack pointer.
After stack allocation the stack pointer points to the end of the stack. PUSH decreases the stack pointer and POP increases it. The end of the stack is actually at the beginning of the memory allocated for the stack block. It seems strange, but that's the way it is.
Nevertheless, ARM not only has instructions supporting descending stacks but also ascending stacks.
For example, the STMFD²/LDMFD³, STMED⁴/LDMED⁵ instructions are intended to deal with a descending stack. The STMFA⁶/LDMFA⁷, STMEA⁸/LDMEA⁹ instructions are intended to deal with an ascending stack.
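The PUSH and POP semantics described above for a descending stack can be imitated in plain C. This is a toy model only: the structure, its size and the function names are assumptions, and the stack pointer is kept as a byte offset rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical model: a small memory block plus a stack pointer that
   holds a byte offset into it, as ESP holds an address. */
struct stack_model {
    uint8_t  mem[STACK_BYTES];
    uint32_t sp;
};

void stack_init(struct stack_model *s)
{
    s->sp = STACK_BYTES;              /* empty: SP is just past the block */
}

/* PUSH: first subtract 4 from SP, then store the operand at [SP]. */
void push32(struct stack_model *s, uint32_t value)
{
    assert(s->sp >= 4);               /* otherwise the stack overflows */
    s->sp -= 4;
    memcpy(&s->mem[s->sp], &value, 4);
}

/* POP: first load the value at [SP], then add 4 to SP. */
uint32_t pop32(struct stack_model *s)
{
    uint32_t value;
    assert(s->sp + 4 <= STACK_BYTES); /* otherwise the stack is empty */
    memcpy(&value, &s->mem[s->sp], 4);
    s->sp += 4;
    return value;
}
```

After two pushes the offset has dropped from 64 to 56, and the values come back in reverse order, which is the last-in, first-out behaviour the hardware stack provides.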
4.1 Why does the stack grow backward?
Intuitively, we might think that, like any other data structure, the stack may grow upward, i.e., towards higher addresses.
The reason the stack grows backward is probably historical. When computers were big and occupied a whole room, it was easy to divide memory into two parts, one for the heap and one for the stack. Of course, it was unknown how big the heap and the stack would be during program execution, so this solution was the simplest possible.
[Figure: memory block with the heap beginning at "Start of heap" and the stack beginning at "Start of stack"]
In [26] we can read:
The user-core part of an image is divided into three logical segments. The program text segment begins at location 0 in the virtual address space. During execution, this segment is write-protected and a single copy of it is shared among all processes executing the same program. At the first 8K byte boundary above the program text segment in the virtual address space begins a nonshared, writable data segment, the size of which may be extended by a system call. Starting at the highest address in the virtual address space is a stack segment, which automatically grows downward as the hardware's stack pointer fluctuates.
¹ http://en.wikipedia.org/wiki/Call_stack
² Store Multiple Full Descending
³ Load Multiple Full Descending
⁴ Store Multiple Empty Descending
⁵ Load Multiple Empty Descending
⁶ Store Multiple Full Ascending
⁷ Load Multiple Full Ascending
⁸ Store Multiple Empty Ascending
⁹ Load Multiple Empty Ascending
4.2 What is the stack used for?
4.2.1 Save the return address where a function must return control after execution
x86
When calling another function with a CALL instruction, the address of the point exactly after the CALL instruction is saved to the stack and then an unconditional jump to the address in the CALL operand is executed.
The CALL instruction is equivalent to a PUSH address_after_call / JMP operand instruction pair.
RET fetches a value from the stack and jumps to it; it is equivalent to a POP tmp / JMP tmp instruction pair.
Overflowing the stack is straightforward. Just run eternal recursion:
void f ()
c:\tmp6>cl ss.cpp /Fass.asm
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
ss.cpp
c:\tmp6\ss.cpp(4) : warning C4717: 'f' : recursive on all control paths, function will cause runtime stack overflow
... but generates the right code anyway:
?f@@YAXXZ PROC ; f
; File c:\tmp6\ss.cpp
?f@@YAXXZ ENDP ; f
... Also if we turn on optimization (/Ox option), the optimized code will not overflow the stack but instead will work correctly¹⁰:
?f@@YAXXZ PROC ; f
; File c:\tmp6\ss.cpp
jmp SHORT $LL3@f
?f@@YAXXZ ENDP ; f
GCC 4.4.1 generates similar code in both cases, although without issuing any warning about the problem.
¹⁰ irony here
4.2.3 Local variable storage
A function could allocate space in the stack for its local variables just by shifting the stack pointer towards the stack bottom.
It is also not a requirement. You could store local variables wherever you like, but traditionally this is how it's done.
4.2.4 x86: alloca() function
It is worth noting the alloca() function¹⁴.
This function works like malloc() but allocates memory just on the stack.
The allocated memory chunk does not need to be freed via a free() function call, since the function epilogue (3) will return ESP back to its initial state and the allocated memory will simply be discarded.
It is worth noting how alloca() is implemented.
In simple terms, this function just shifts ESP downwards toward the stack bottom by the number of bytes you need and sets ESP as a pointer to the allocated block. Let's try:
#ifdef __GNUC__
#ifdef __GNUC__
snprintf (buf, 600, "hi! %d, %d, %d\n", 1, 2, 3); // GCC
#else
_snprintf (buf, 600, "hi! %d, %d, %d\n", 1, 2, 3); // MSVC
#endif
};
(The _snprintf() function works just like printf(), but instead of dumping the result into stdout (e.g., to a terminal or console), it writes to the buf buffer. puts() copies the contents of buf to stdout. Of course, these two function calls might be replaced by one printf() call, but I would like to illustrate small buffer usage.)
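As a self-contained variation on the same idea, the sketch below formats into an alloca() buffer and copies the text out before returning. It assumes a glibc-style system where alloca() is declared in <alloca.h>; the function name format_on_stack() and the copy-out step are hypothetical additions, needed because the buffer stops being valid once the function returns.

```c
#include <alloca.h>   /* assumption: glibc-style system; MSVC declares alloca() in <malloc.h> */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats into a 600-byte alloca() buffer, copies the
   text to out and returns its length. No free() is needed: the buffer
   disappears when this function returns. */
int format_on_stack(char *out, size_t out_size)
{
    char *buf = (char *)alloca(600);
    int len = snprintf(buf, 600, "hi! %d, %d, %d\n", 1, 2, 3);

    assert(len >= 0 && (size_t)len < out_size);
    memcpy(out, buf, (size_t)len + 1);   /* include the terminating zero */
    return len;
}
```

Returning buf itself would be a bug: the pointer refers to stack space that the epilogue releases.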
MSVC
Listing 4.1: MSVC 2010
add esp, 28 ; 0000001cH
¹⁴ In MSVC, the function implementation can be found in alloca16.asm and chkstk.asm in C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\crt\src\intel
...
The sole alloca() argument is passed via EAX (instead of being pushed onto the stack)¹⁵. After the alloca() call, ESP points to the block of 600 bytes and we can use it as memory for the buf array.
GCC + Intel syntax
GCC 4.4.1 can do the same without calling external functions:
Listing 4.2: GCC 4.7.3
f:
lea ebx, [esp+39]
and ebx, -16 ; align pointer by 16-byte border
mov DWORD PTR [esp], ebx ; s
mov DWORD PTR [esp+20], 3
mov DWORD PTR [esp+16], 2
mov DWORD PTR [esp+12], 1
mov DWORD PTR [esp+8], OFFSET FLAT:.LC0 ; "hi! %d, %d, %d\n"
mov DWORD PTR [esp+4], 600 ; maxlen
call _snprintf
mov DWORD PTR [esp], ebx ; s
call puts
leave
GCC + AT&T syntax
Let's see the same code, but in AT&T syntax:
Listing 4.3: GCC 4.7.3
f:
movl $1, 12(%esp)
movl $.LC0, 8(%esp)
movl $600, 4(%esp)
call puts
leave
ret
The code is the same as in the previous listing.
N.B. E.g. movl $3, 20(%esp) is analogous to mov DWORD PTR [esp+20], 3 in Intel-syntax: when addressing memory in the form register+offset, it is written in AT&T syntax as offset(%register).
¹⁵ It is because alloca() is a compiler intrinsic (63) rather than a usual function.
One of the reasons there is a separate function instead of a couple of instructions just in the code is that the MSVC¹⁶ implementation of the alloca() function also has code which reads from the memory just allocated, in order to let the OS map physical memory to this VM¹⁷ region.
4.2.5 (Windows) SEH
SEH¹⁸ records are also stored on the stack (if they are present). Read more about it: (48.3).
4.2.6 Buffer overflow protection
More about it here (16.2).
4.3 Typical stack layout
. . .      . . .
ESP-0xC    local variable #2, marked in IDA as var_8
ESP-8      local variable #1, marked in IDA as var_4
ESP-4      saved value of EBP
ESP        return address
. . .      . . .
Chapter 5
printf() with several arguments
Now let's extend the Hello, world! (2) example, replacing printf() in the main() function body by this:
#include <stdio.h>

int main()
{
printf("a=%d; b=%d; c=%d", 1, 2, 3);
return 0;
};
5.1.1 MSVC
Let's compile it with MSVC 2010 Express, and we get:
$SG3830 DB 'a=%d; b=%d; c=%d', 00H
...
call _printf
add esp, 16 ; 00000010H
Almost the same, but now we can see the printf() arguments are pushed onto the stack in reverse order. The first argument is pushed last.
By the way, variables of int type in a 32-bit environment are 32 bits wide, that is 4 bytes.
So, we have here 4 arguments. 4 ∗ 4 = 16: they occupy exactly 16 bytes on the stack: a 32-bit pointer to a string and 3 numbers of type int.
When the stack pointer (ESP register) is corrected by an ADD ESP, X instruction after a function call, the number of function arguments can often be deduced here: just divide X by 4.
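The arithmetic can be written down as two small C helpers. This is a sketch under the assumption that every argument occupies exactly one 4-byte stack slot, as int and pointer arguments do in 32-bit code; the names are hypothetical.

```c
#include <assert.h>

/* Assumption: 32-bit cdecl code where every argument occupies one
   4-byte stack slot (true for int and pointer arguments). */
enum { SLOT_BYTES = 4 };

/* Bytes the caller must remove after the call: the X in ADD ESP, X. */
unsigned cleanup_bytes(unsigned arg_count)
{
    return arg_count * SLOT_BYTES;
}

/* The reverse direction: guess the argument count from X. */
unsigned args_from_cleanup(unsigned x)
{
    assert(x % SLOT_BYTES == 0);      /* X should be a whole number of slots */
    return x / SLOT_BYTES;
}
```

For the printf() call above, cleanup_bytes(4) gives the 16 seen in ADD ESP, 16. The guess breaks down for arguments wider than one slot, such as double.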
Of course, this is related only to the cdecl calling convention.
See also the section about calling conventions (44).
It is also possible for the compiler to merge several ADD ESP, X instructions into one, after the last call:
push a1
push a2
push a2
push a3
5.1.2 MSVC and OllyDbg
Now let's try to load this example in OllyDbg. It is one of the most popular user-land win32 debuggers. We can try to compile our example in MSVC 2012 with the /MD option, meaning to link against MSVCR*.DLL, so we will be able to see the imported functions clearly in the debugger.
Then load the executable in OllyDbg. The very first breakpoint is in ntdll.dll; press F9 (run). The second breakpoint is in the CRT code. Now we should find the main() function.
Find this code by scrolling the code to the very top (MSVC allocates the main() function at the very beginning of the code section): fig. 5.3.
Click on the PUSH EBP instruction, press F2 (set breakpoint) and press F9 (run). We need to do these manipulations in order to skip the CRT code, because we aren't really interested in it yet.
Press F8 (step over) 6 times, i.e., skip 6 instructions: fig. 5.4.
Now the PC points to the CALL printf instruction. OllyDbg, like other debuggers, highlights the values of registers which were changed. So each time you press F8, EIP changes and its value looks red. ESP changes as well, because values are pushed onto the stack.
Where are the values in the stack? Take a look at the bottom right window of the debugger:
Figure 5.1: OllyDbg: stack after values pushed (I made a round red mark here in a graphics editor)
So we can see 3 columns there: the address in the stack, the value in the stack and some additional OllyDbg comments. OllyDbg understands printf()-like strings, so it reports the string here and the 3 values attached to it.
It is possible to right-click on the format string, click on Follow in dump, and the format string will appear in the window at the bottom left, where some part of memory is always shown. These memory values can be edited. It is possible to change the format string, and then the result of our example will be different. It is probably not very useful now, but it's a very good idea to do it as an exercise, to get a feeling for how everything works here.
Press F8 (step over).
In the console we'll see the output:
Figure 5.2: printf() function executed
Let's see how the registers and stack state have changed: fig. 5.5.
The EAX register now contains 0xD (13). That's correct: printf() returns the number of characters printed. The EIP value has changed: indeed, now it holds the address of the instruction after CALL printf. The ECX and EDX values have changed as well. Apparently, the printf() function's hidden machinery used them for its own needs.
A very important thing is that the ESP value has not changed. And neither has the stack state! We clearly see that the format string and the corresponding 3 values are still there. Indeed, that's the cdecl calling convention: the called function doesn't clear the arguments in the stack. It's the caller's duty to do so.
Press F8 again to execute the ADD ESP, 10 instruction: fig. 5.6.
ESP has changed, but the values are still in the stack! Yes, of course, no one needs to fill these values with zeros or something like that, because everything above the stack pointer (SP) is noise or garbage and has no value at all. It would be time consuming to clear unused stack entries, and besides, no one really needs to.
Figure 5.3: OllyDbg: the very start of the main() function
Figure 5.4: OllyDbg: before printf() execution
Figure 5.5: OllyDbg: after printf() execution
Figure 5.6: OllyDbg: after ADD ESP, 10 instruction execution
5.1.3 GCC
Now let's compile the same program in Linux using GCC 4.4.1 and take a look in IDA at what we got:
main proc near
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aADBDCD ; "a=%d; b=%d; c=%d"
mov [esp+10h+var_4], 3
mov [esp+10h+var_8], 2
mov [esp+10h+var_C], 1
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
It can be said that the difference between the code from MSVC and the code from GCC is only in the method of placing arguments on the stack. Here GCC is working directly with the stack without PUSH/POP.
5.1.4 GCC and GDB
Let's try this example also in GDB¹ in Linux. -g means produce debug information in the executable file.
$ gcc 1.c -g -o 1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Listing 5.1: let's set breakpoint on printf()
(gdb) b printf
Breakpoint 1 at 0x80482f0
Run. There is no printf() function source code here, so GDB can't show its source, but it may do so.
(gdb) run
Starting program: /home/dennis/polygon/1
Breakpoint 1, __printf (format=0x80484f0 "a=%d; b=%d; c=%d") at printf.c:29
29 printf.c: No such file or directory.
Print 10 stack elements. The left column is an address in the stack.
(gdb) x/10w $esp
0xbffff12c: 0x00000003 0x08048460 0x00000000 0x00000000
0xbffff13c: 0xb7e29905 0x00000001
The very first element is the RA (0x0804844a). We can make sure of it by disassembling the memory at this address:
(gdb) x/5i 0x0804844a
0x804844f <main+50>: leave
0x8048451: xchg %ax,%ax
0x8048453: xchg %ax,%ax
The two XCHG instructions are idle padding (xchg %ax,%ax does nothing), which we can ignore so far.
The second element (0x080484f0) is the address of the format string:
(gdb) x/s 0x080484f0
0x80484f0: "a=%d; b=%d; c=%d"
The other 3 elements (1, 2, 3) are printf() arguments. The other elements may be just garbage present in the stack, but also may be values from other functions, their local variables, etc. We can ignore them for now.
Execute finish. This means: execute till the function end. Here it means: execute till the finish of printf().
(gdb) finish
Run till exit from #0 __printf (format=0x80484f0 "a=%d; b=%d; c=%d") at printf.c:29
main () at 1.c:6
Value returned is $2 = 13
GDB shows what printf() returned in EAX (13). This is the number of characters printed, just like in the example with OllyDbg.
We also see return 0; and the information that this expression is in the 1.c file at line 6. Indeed, the 1.c file is located in the current directory, and GDB finds the string there. How does GDB know which C-code line is being executed now? This is related to the fact that the compiler, while generating debugging information, also saves a table of relations between source code line numbers and instruction addresses. GDB is a source-level debugger, after all.
¹ GNU debugger
Let's examine the registers. 13 in EAX:
(gdb) info registers
eax 0xd 13
ecx 0x0 0
edx 0x0 0
esi 0x0 0
edi 0x0 0
...
Let's disassemble the current instructions. The arrow points to the instruction to be executed next.
(gdb) disas
0x0804841d <+0>: push %ebp
0x08048420 <+3>: and $0xfffffff0,%esp
0x08048423 <+6>: sub $0x10,%esp
0x08048426 <+9>: movl $0x3,0xc(%esp)
0x0804842e <+17>: movl $0x2,0x8(%esp)
0x08048436 <+25>: movl $0x1,0x4(%esp)
0x0804843e <+33>: movl $0x80484f0,(%esp)
0x08048445 <+40>: call 0x80482f0 <printf@plt>
=> 0x0804844a <+45>: mov $0x0,%eax
0x0804844f <+50>: leave
0x08048450 <+51>: ret
End of assembler dump.
GDB shows disassembly in AT&T syntax by default. It's possible to switch to Intel syntax:
(gdb) set disassembly-flavor intel
(gdb) disas
Dump of assembler code for function main:
0x0804841d <+0>: push ebp
0x0804841e <+1>: mov ebp,esp
0x08048423 <+6>: sub esp,0x10
=> 0x0804844a <+45>: mov eax,0x0
0x08048450 <+51>: ret
End of assembler dump.
Execute the next instruction. GDB shows the ending bracket, meaning that this is the end of the function block.
(gdb) step
7 };
Let's see the registers after the MOV EAX, 0 instruction execution. EAX here is zero indeed.
(gdb) info registers
eax 0x0 0
ecx 0x0 0
edx 0x0 0
esi 0x0 0
edi 0x0 0
...
5.2 x64: 8 arguments
To see how other arguments will be passed via the stack, let's change our example again by increasing the number of arguments to be passed to 9 (printf() format string + 8 int variables):
#include <stdio.h>

int main()
{
printf("a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n", 1, 2, 3, 4, 5, 6, 7, 8);
return 0;
};
5.2.1 MSVC
As we saw before, the first 4 arguments are passed in the RCX, RDX, R8, R9 registers in Win64, while all the rest go via the stack. That is what we see here. However, the MOV instruction, instead of PUSH, is used for preparing the stack, so the values are written to the stack in a straightforward manner.
Listing 5.2: MSVC 2012 x64
mov r9d, 3
mov r8d, 2
call printf
5.2.2 GCC
In *NIX OS-es, it's the same story for x86-64, except that the first 6 arguments are passed in the RDI, RSI, RDX, RCX, R8, R9 registers. All the rest go via the stack. GCC generates the code writing the string pointer into EDI instead of RDI; we saw this thing before: 2.2.2.
We also saw before the EAX register being cleared before a printf() call: 2.2.2.
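The two register assignments can be summarized as a small lookup sketch in C. The helper names are hypothetical, index 0 denotes the first argument, and only integer and pointer arguments are modelled (floating point arguments travel in vector registers and are not covered here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup helpers; index 0 is the first argument.
   Only integer and pointer arguments are modelled. */

/* Linux, *BSD, Mac OS X: the first 6 arguments go in registers. */
const char *sysv_arg_location(size_t index)
{
    static const char *const regs[] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
    return index < 6 ? regs[index] : "stack";
}

/* Win64: the first 4 arguments go in registers. */
const char *win64_arg_location(size_t index)
{
    static const char *const regs[] = { "RCX", "RDX", "R8", "R9" };
    return index < 4 ? regs[index] : "stack";
}

/* For the 9-argument printf() call: the format string is in RDI, the
   values 1..5 are in registers and 6, 7, 8 (indexes 6..8) on the stack. */
void check_printf_example(void)
{
    assert(strcmp(sysv_arg_location(0), "RDI") == 0);
    assert(strcmp(sysv_arg_location(5), "R9") == 0);
    assert(strcmp(sysv_arg_location(6), "stack") == 0);
}
```

With the same 9 arguments under Win64, the format string and the values 1, 2, 3 would travel in registers and the values 4 to 8 on the stack.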
Listing 5.3: GCC 4.4.6 -O3 x64
.LC0:
.string "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n"
main:
xor eax, eax ; number of vector registers passed
mov DWORD PTR [rsp+16], 8
mov DWORD PTR [rsp+8], 7
mov DWORD PTR [rsp], 6
call printf
$ gcc -g 2.c -o 2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/dennis/polygon/2...done.
Listing 5.4: let's set breakpoint to printf(), and run
(gdb) b printf
Breakpoint 1 at 0x400410
(gdb) run
Starting program: /home/dennis/polygon/2
Breakpoint 1, __printf (format=0x400628 "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n") at printf.c:29
29 printf.c: No such file or directory.
Registers RSI/RDX/RCX/R8/R9 hold the expected values. RIP holds the address of the very first instruction of
the printf() function.
(gdb) info registers
rax 0x0 0
rbx 0x0 0
rcx 0x3 3
rdx 0x2 2
rsi 0x1 1
rdi 0x400628 4195880
r8 0x4 4
r9 0x5 5
r11 0x7ffff7a65f60 140737348263776
r12 0x400440 4195392
r14 0x0 0
r15 0x0 0
...
Listing 5.5: let's inspect the format string
(gdb) x/s $rdi
0x400628: "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n"
Let's dump the stack with the x/g command this time; g means giant words, i.e., 64-bit words.
(gdb) x/10g $rsp
0x7fffffffdf48: 0x0000000000000007 0x00007fff00000008
0x7fffffffdf58: 0x0000000000000000 0x0000000000000000
0x7fffffffdf68: 0x00007ffff7a33de5 0x0000000000000000
0x7fffffffdf78: 0x00007fffffffe048 0x0000000100000000
The very first stack element, just like in the previous case, is the RA. 3 values are also passed on the stack: 6, 7, 8. We also see that 8
is passed with the high 32 bits not cleared: 0x00007fff00000008. That's OK, because the values have int type, which is a 32-bit type. So the high part of a register or stack element may contain random garbage.
If we look at where control flow will return after printf() executes, GDB will show the whole main() function:
(gdb) set disassembly-flavor intel
(gdb) disas 0x0000000000400576
Dump of assembler code for function main:
0x000000000040052d <+0>: push rbp
0x000000000040052e <+1>: mov rbp,rsp
0x0000000000400531 <+4>: sub rsp,0x20
0x0000000000400535 <+8>: mov DWORD PTR [rsp+0x10],0x8
0x000000000040053d <+16>: mov DWORD PTR [rsp+0x8],0x7
0x0000000000400545 <+24>: mov DWORD PTR [rsp],0x6
0x000000000040054c <+31>: mov r9d,0x5
0x0000000000400552 <+37>: mov r8d,0x4
0x0000000000400558 <+43>: mov ecx,0x3
0x000000000040055d <+48>: mov edx,0x2
0x0000000000400562 <+53>: mov esi,0x1
0x0000000000400567 <+58>: mov edi,0x400628
0x000000000040056c <+63>: mov eax,0x0
0x0000000000400576 <+73>: mov eax,0x0
5.3. ARM: 3 ARGUMENTS CHAPTER 5. PRINTF() WITH SEVERAL ARGUMENTS
Let's finish executing printf(), execute the instruction that zeroes EAX, and note that the EAX register contains exactly zero. RIP
now points to the LEAVE instruction, i.e., the penultimate one in the main() function.
(gdb) finish
Run till exit from #0 __printf (format=0x400628 "a=%d; b=%d; c=%d; d=%d; e=%d; f=%d; g=%d; h=%d\n") at printf.c:29
a=1; b=2; c=3; d=4; e=5; f=6; g=7; h=8
main () at 2.c:6
Value returned is $1 = 39
rax 0x0 0
rbx 0x0 0
rcx 0x26 38
rsi 0x7fffffd9 2147483609
rdi 0x0 0
r8 0x7ffff7dd26a0 140737351853728
r9 0x7ffff7a60134 140737348239668
r10 0x7fffffffd5b0 140737488344496
r11 0x7ffff7a95900 140737348458752
r12 0x400440 4195392
r14 0x0 0