CLARKSON UNIVERSITY

An Attack-Resistant and Rapid Recovery Desktop System

A Dissertation by

Todd Deshane

Coulter School of Engineering

Submitted in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

Engineering Science

August 2010

© Todd Deshane 2010

Accepted by the Graduate School

Date DEAN


UMI Number: 3428987

All rights reserved

INFORMATION TO ALL USERS

The quality of this reproduction is dependent upon the quality of the copy submitted.

In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed,

a note will indicate the deletion.

UMI 3428987

Copyright 2010 by ProQuest LLC.

All rights reserved. This edition of the work is protected against unauthorized copying under Title 17, United States Code.

ProQuest LLC
789 East Eisenhower Parkway
P.O. Box 1346
Ann Arbor, MI 48106-1346


The undersigned have examined the dissertation entitled An Attack-Resistant and

Rapid Recovery Desktop System presented by Todd Deshane, a candidate for the

degree of Doctor of Philosophy, Engineering Science and hereby certify that it is worthy of

acceptance.

Date

EXAMINING COMMITTEE

Dr. Susan Conry

Dr. Daqing Hou

Dr. Robert Meyer

Dr. Joachim Stahl

ADVISOR Dr. Jeanna Matthews

CLARKSON UNIVERSITY

An Attack-Resistant and Rapid Recovery Desktop System

By: Todd Deshane

Advisor: Jeanna Matthews

Abstract

General-purpose computing devices, such as personal computers (PCs), and the operating

systems that run on them provide more functionality and capabilities than most users will

ever want or need. Too much of the burden of keeping these computer systems secure is

placed on the end users. Users are often required to keep the operating system, applications,

security software, and anti-virus definitions up-to-date. Even with the latest security

updates, users are still susceptible to the newest exploits. When a system does become compromised,

the process of then restoring it to a usable state can frequently result in the loss

of personal data stored on the system. Personal data can often only be recovered through

repeated effort and in some cases can never be recovered. Malicious software (malware)

is not the only source of problems on a computer system. Software bugs and conflicting

software packages can also cause system instability as well as data corruption.

In this dissertation, we present a unique desktop system architecture solution to the

pervasive problem of recovering from malware attacks. We demonstrate our architecture

with an open source implementation of our Rapid Recovery Desktop system that provides

resistance against attack and rapid recovery from broken system state and malware infestation.

Our system combines a file server virtual machine (FS-VM), a network virtual

machine (NET-VM), a virtual machine contract system, and a virtualization security framework

(OSCKAR) to isolate applications, control their access, and limit their privileges.

We measured the system's performance overhead and evaluated the security and recovery

benefits.

Acknowledgements

I'd like to thank God for giving me the opportunity to get a PhD. His plans are always so

much greater than even my wildest dreams. Randy Pausch, in his Last Lecture, said, "luck

is where preparation meets opportunity." This is a very insightful quote. I like to think

and live my life by an analogous definition, that of grace. Grace is where faith meets God's

blessings. Grace is also defined as getting what you don't deserve. I feel that I am given

tremendous blessings every day and thank God that He is so gracious to me.

I wouldn't have made it this far in my PhD endeavors if it wasn't for my wife, Patty.

She is my everything and helps me so much each day. She has been by my side helping me

with all of the little details of the whole process. It truly was a journey, one that I am so

grateful that we have been able to take together. Not only is this the end of one chapter of

our lives, but it is the beginning of the next.

I have a wonderful family and great friends. They have been so supportive of me in

everything that I do. I know that they are so proud of me and my accomplishments, but

I'm glad that I have their love and support. They make changing the world worth all of the

work.

My advisor, Jeanna Matthews, has been a true inspiration to me. Her passion for

making a difference, helping people, and her philosophy of leaving things better than you

found them have made a significant impact on my life in many ways. I am thankful for

her endless patience, guidance, and wisdom. I also thank her for helping me develop such

a strong interest in networking, systems, and open source.

The Applied Computer Science Labs at Clarkson, specifically the Clarkson Open Source

Institute (COSI) and the Internet Teaching Laboratory (ITL), have also been another wonderful

inspiration and help to me. I have met so many people in the labs and developed so

many great relationships over the years. I appreciate the many conversations and feedback

that I have received from them over the years on my research, life, and the world in general.

I would like to thank my PhD committee for their thorough and insightful feedback

on my dissertation. Also thanks to the many reviewers that read early drafts of chapters,

paragraphs, and ideas. The conversations and feedback that I have received have turned

this work into something that I could not have come up with all by myself.

I'd like to thank IBM for their generosity to higher education and specifically for funding

me for two years with consecutive IBM PhD Fellowships. My mentors during the program,

Sean Dague and Rick Harper, are still influential and impressive to me today. I'm so

fortunate to have met both of them.

I know that by listing specific people I will inevitably leave someone out. However, I

would like to acknowledge by name several close friends and colleagues who I have met

during my PhD career. These guys have worked with me in one way or another over the

years and they deserve recognition. A big thanks to Eli M. Dow, Wenjin Hu, Patrick F.

Wilbur, Jim Owens, and Tao Yang. You guys are next for getting your PhDs. I pass that

torch onto you. Best of luck.

Last, but certainly not least, I would be remiss to forget to thank the open source

community for producing great free software, helping me understand things, and having

patience to help answer even the most trivial of questions. I hope that I can somehow give

back to a free and open source software (FOSS) community that is very deserving of my

efforts and thanks.

Contents

1 Introduction
  1.1 Motivation
    1.1.1 The Current State of Malware
    1.1.2 Challenges to Change
    1.1.3 An Overview of Our Approach
  1.2 Contribution
  1.3 Organization of the Dissertation

2 Related Work
  2.1 Virtualization
    2.1.1 Virtualization Types
    2.1.2 History and Evolution
    2.1.3 Virtual Appliances
    2.1.4 Virtual Machine Contracts
  2.2 Security
    2.2.1 The Principle of Least Privilege
    2.2.2 Isolation
    2.2.3 Access Control
  2.3 Virtualization and Security
    2.3.1 Virtualization and Isolation
    2.3.2 Virtualization and Access Control
  2.4 Backup and Recovery
  2.5 Network Security and Intrusion Detection
  2.6 Anti-virus Software and Host-based Intrusion Detection Systems

3 Architecture
  3.1 Design Principles
    3.1.1 Virtualization Scope
    3.1.2 Security Scope
    3.1.3 Environment Scope
    3.1.4 Open Source and Open Standards
    3.1.5 Threat Model and Assumptions
  3.2 Virtualization
  3.3 Virtual Appliances
    3.3.1 Whole Desktop in a Single Appliance
    3.3.2 Grouping Applications Based on Access Needs
    3.3.3 One Application Per Virtual Appliance
  3.4 Virtual Machine Contracts
  3.5 Virtualization Security Framework
    3.5.1 Virtualization Security Framework and Virtual Machine Contracts
  3.6 File System Level Architecture
  3.7 Network Level Architecture
    3.7.1 Hardening the Overall System

4 Implementation
  4.1 Virtualization Components
    4.1.1 Hypervisor Component
    4.1.2 VMM Interface
    4.1.3 Builder Interface
  4.2 OSCKAR
  4.3 Virtual Machine Contracts
  4.4 FS-VM
  4.5 NET-VM
  4.6 Example Virtual Appliance
    4.6.1 Browser Virtual Appliance

5 Evaluation
  5.1 Performance
    5.1.1 Virtualization Overhead
    5.1.2 Enforcement Overhead
  5.2 Effectiveness
    5.2.1 Malware Classification Analysis
    5.2.2 Evaluation of Recovery Properties

6 Conclusion
  6.1 Future Work
    6.1.1 HCI-SEC
    6.1.2 Malware Collection and Analysis
    6.1.3 Implementation-related Improvements
    6.1.4 Application to Other Environments

A Performance Results
  A.1 Details of Performance Results for this Dissertation
  A.2 Other Related Performance Evaluation

List of Figures

3.1 The architecture of our Rapid Recovery Desktop from a simplified network view
3.2 The Progression of Virtual Appliance Decomposition
3.3 The Architecture of our OSCKAR Virtualization Security Framework
3.4 The architecture of our Rapid Recovery Desktop from a file system view
3.5 The architecture of our Rapid Recovery Desktop from a network view

4.1 Our Architecture on an Integrated Hypervisor (such as KVM)
4.2 Our Architecture on a Stand-alone Hypervisor (such as Xen)
4.3 VMM rule set that uses the generic vmm backend chosen by the VMM interface
4.4 VMM rule set that uses the specific qemu-spice backend
4.5 Builder rule set
4.6 A Sample Policy Manager Global Contract. Note that the $ARG is simply the argument to the event. $ARG is replaced with the argument passed during the event at runtime
4.7 Process of importing virtual machine contract (VMC) and starting VM
4.8 Overview of contract types, rule sets, and events supported by our contract system
4.9 General Virtual Machine Contract (VMC) Format
4.10 FS-VM example rule set
4.11 NET-VM example rule set
4.12 Rate-limiting NET-VM example rule set
4.13 Browser Appliance Virtual Machine Contract (VMC) (1 of 3)
4.14 Browser Appliance Virtual Machine Contract (VMC) Continued (2 of 3)
4.15 Browser Appliance Virtual Machine Contract (VMC) Continued (3 of 3)

5.1 Linux guest read performance
5.2 Linux guest write performance
5.3 Windows guest read performance
5.4 Windows guest write performance
5.5 Windows guest read performance using a variety of virtual disk backends
5.6 Windows guest write performance using a variety of virtual disk backends
5.7 Linux guest networking performance
5.8 Windows guest networking performance
5.9 Linux guest to FS-VM read performance
5.10 Linux guest to FS-VM write performance
5.11 Linux guest networking performance

Chapter 1

Introduction

1.1 Motivation

General purpose computing devices, such as personal computers (PCs), and the operating

systems that run on them provide more functionality and capabilities than most users will

ever want or need. For example, these computing devices can send large quantities of emails

in seconds (on a scale proportional to the network bandwidth and computer power). A user

is unlikely to ever need to send as many emails in a lifetime as their computing device could

send in a day. However, malicious software (malware) seeks to take advantage of any spare

computing power that it can control, making full use of the frequently spare functionality

that general purpose computing devices and operating systems provide. One clear example

of this phenomenon is the commonly accepted and reported fact that over 90% of all email

is spam [78]. Although it is difficult to determine how much of this spam is sent by home

user PCs, it is estimated that 95% of all spam is sent by botnets [30], which are composed

of a variety of zombie computers including many home user PCs.

Too much of the burden of keeping a computer system secure is placed on the end users.

Users are often required to keep the operating system, applications, security software, and

anti-virus definitions up-to-date. Non-malicious or accidental incidents, such as system or

software updates, can cause more noticeable problems to users, since, unlike malware, these

incidents are not trying to stay hidden on a user's computer. These incidents can cause

system instability and, in the worst case, make the system unusable [33, 77, 108, 160]. As

a result, users often disable or refuse to perform updates [46, 56, 158]. Even with all the

latest security updates, users are still susceptible to zero-day exploits, which are exploits

that have not been seen before and thus are not detected by traditional signature-based

security software.

When an end user falls victim to any sort of malware, such as a virus, a commonly

recommended course of action is to make backups of any critical data and then to wipe

the system completely and re-install. Throwing the computer away and buying a new one

is considered by some to be easier than getting rid of the malware through conventional

means [137, 147].

Not only can malware take down the system, but it can cause the user to lose personal

data, such as pictures or documents. The most diligent of users will make sure the latest

updates are installed, keep backups of their personal data, and be careful not to click on

anything suspicious. Taking these defensive measures can reduce the chance of system

downtime and data loss, but requires significant effort on the part of the user. Several

recent studies indicate that most users are unwilling to perform updates or back up their

systems [2, 56, 158]. Other studies indicate that many users are unable to adequately assess

risk and will make poor security decisions to attain their goals [107, 153].

Fully restoring a compromised system can be an agonizing process often involving

re-installing the operating system and user applications. This can take hours or days even with

all the proper materials readily on hand. For average users, even assembling the installation

materials (for example, CDs, manuals, and configuration settings) may be an overwhelming

task, not to mention correctly installing and configuring each piece of software. Hiring

a professional to restore the system and applications can be expensive and may require

purchasing new software licenses.

To make matters worse, the process of restoring a compromised system to a usable

state can frequently result in the loss of any personal data stored on the system. From

the user's perspective, this is often the worst outcome of an attack. System data may be

challenging to restore, but it can be restored from public sources. Personal data, however,

can only be restored from private backups and the vast majority of personal computer users

do not routinely back up their data. Once lost, personal data can only be recovered through

repeated effort (for example, rewriting a report) and in some cases can never be recovered

(for example, digital photos of a one-time event).

1.1.1 The Current State of Malware

A trusting, naive design of the Internet and powerful, general purpose, commodity computer

systems have led to widespread security problems. The Internet was originally developed

by and for the government and universities and it was used in a trusting manner to share

information. The explosive growth of the World Wide Web, starting in the 1990s [118],

brought with it millions of Internet users, not all of whom had benign intentions. Malicious

hackers¹ exploited an Internet that wasn't built with security in mind. To make matters

worse, the default configuration of the most popular operating system of the time was for

users to run with full administrative privileges. Thus, a virus that ran as the user had

full access to the system. Since that time, many security measures, such as public-key

cryptography, firewalls, and intrusion detection systems, have been added to the Internet

infrastructure. Commodity operating systems have also added security features, such as

built-in firewalls, user access control, and system restore.

Despite these efforts, global scale security problems, such as widespread malware and

botnet activity [13, 62, 71, 90, 102, 110], still exist. The first well-documented computer worm

of 1988 [149] exploited several widely used programs. Due to limited system security and

an overall trusting Internet infrastructure, it was able to spread quickly across much of

the Internet causing much disruption. Computer malware in the early days was often the

work of curious or playful individuals seeking to exploit for experimentation, exploration,

prank, or vandalism. This early malware generally led to minor disruption and annoyance,

but rarely led to much damage or loss to individuals or organizations. However, modern

malware, particularly over the past decade or so, has been primarily used by organized

crime to exploit and profit from users and all kinds of organizations [79, 120].

¹ Malicious hackers are more accurately defined as crackers (http://catb.org/jargon/html/C/cracker.html),

but the term hacker is commonly (mis)used.

Organized crime has set up shop all across the Internet, often in the form of botnets.

A botnet is a distributed network of computers, controlled remotely by malicious hackers,

that can be put into action on demand to perform distributed denial-of-service (DDoS)

attacks on targeted websites, engage in mass e-mail spam campaigns to sell pharmaceuticals,

or promote the page rank of other hijacked sites. All of these actions can be taken by

malicious hackers in an effort to exploit more users and systems in order to increase profits

and the size of their botnets.

There is a growing black market of professional malware and exploit kits, which may even

come with tech support [120]! These exploit kits consist of various automated tools that can

be used to trick and exploit users. An example exploit kit might include a tool that automates

account creation and simulates legitimate use of a popular social networking website,

such as Facebook or Twitter. The tool could also support control of fake or

stolen accounts by a botnet. These accounts could then be used carefully and deceitfully to

gain trust among real social networking users in order to serve them targeted spam links or

hijack their personal information with phishing techniques. Profit can then be made by, for

example, using the personal information for identity theft, link-referral affiliate programs,

and tricking users into clicking on the various spam links [120].

Another common method used to spread malware is the drive-by download. A drive-by

download attack tricks users into visiting sites by using spam

links or typo-squatting (registering commonly misspelled domains) and then automatically

installs malicious binaries. A related attack method is tricking users into clicking to

install fake plugins or fake anti-virus software that is actually malware [131]. These

malware-infected computers are then used in the botnets to exploit more users and systems.

Over time, security of software improves and users are trained to be on the lookout, thus

forcing malicious hackers to find new ways to spread their malware. For instance, a more

recent trend is to use search engine optimization (SEO) tricks to promote malware sites to

the top of the search results for trendy and popular search terms. A user searching for "Bill

Clinton" to find out about his recent heart operation would likely have found themselves

downloading fake anti-virus software that is actually malware [63]. In a recent study [133], fake

anti-virus software was found to account for 15% of all malware detected on the web using Google's

malware detection infrastructure. The specific details of these various types of attacks have

changed, but the general nature of the attacks has not. Attackers rely on exploiting systems

or tricking users to spread their malware. The attacks will target whatever is popular, which

might mean Facebook or Twitter today, but could mean other popular technologies, such

as smart phones or new web-based applications, in the near future. These attacks are not

slowing down and are likely to only get more sophisticated [66, 114, 179].

1.1.2 Challenges to Change

The same basic exploit techniques have been used by malicious hackers for quite some time.

One of the main reasons that these techniques still work in practice is that general

purpose operating systems are designed to allow applications to run with the full privileges

of the user. For instance, if a user has access to read, write, or delete a file, then any

application run by that user has access to read, write, or delete that file. This is not a new

problem. Researchers recognized this problem over 20 years ago [79], yet the vast majority of

users still don't have their applications restricted in an effective and usable way.

One of the main reasons that no solution to this problem is used in practice

is that such a solution does not seem to fit anybody's business model. Fixing

the problem does not sell new computers, it does not sell new versions of operating systems,

and it definitely does not sell new versions of security software, such as anti-virus and

anti-malware. Software companies, such as Microsoft, publish studies that recommend the use

of automatic updates as one of the most effective things that an organization can use to

help prevent the spread of malware [46]. Some companies in the malware defense business

recommend that users follow the same old security best practices [35] even in the face of

new and more subtle threats. However, it is well known that threats can affect even people

visiting legitimate websites [163]. Other anti-virus companies tend to take a band-aid

type approach, promoting their products as an effective way to keep ahead of the attackers

with automated security updates and by employing new technologies [55].

Another reason that there is still a security problem on desktop computers is that even

when reasonable solutions exist, they are rarely used in common practice. For example,

mandatory access control (MAC) systems, such as SELinux [87], AppArmor [12], and

Windows Mandatory Integrity Control (MIC) [97], go a long way toward solving the problem,

but are not commonly used in practice. These protections are hard to use [82] and tend

to produce too many false positives, which often leads to them being disabled. A study

by Sunshine et al. showed that users often misunderstand the risk involved

with SSL warnings in the browser. Another study, by Motiee et al., showed that users made

incorrect security decisions when using Windows access control protections [107]. These two

studies demonstrate users' tendency to do whatever it takes (despite

the security risk) to complete their task [153].

Another challenge to adoption of viable solutions is that the solution must reach a critical

mass of users to be effective. As we will show in Chapter 2 on related work, there

are many proposed solutions in research that are unlikely to ever be used in practice. The

critical issue is that the software needs to be usable for a wide variety of users. Even if

the software both solves the problem and is usable, that does not imply that it is easily

distributed. An effective means of distribution may require original equipment manufacturer

(OEM) agreements or resource and time investment in infrastructure and staff to develop

and foster user and developer communities. In any case, implementing ideas that fundamentally

change and improve how a large majority of computer users work is a significant

undertaking.

Despite these challenges to change, we hope that the approach described in this dissertation

changes the way users, developers, and security professionals think about the security

of computer systems. The solution proposed in this dissertation uses well-understood security

practices and makes use of some of the latest innovations in virtualization technology

on commodity desktop systems. We demonstrate a desktop system that resists

attack, recovers quickly from exploits, and minimizes the impact that any single

exploited application can have on the system and user-specific data.

1.1.3 An Overview of Our Approach

Our solution is based on separating user data into a file server virtual machine (FS-VM)

and accessing that data with virtual machine appliances, or simply virtual appliances, which

encapsulate one or more applications. Furthermore, we associate a contract with each

virtual appliance that describes its specific behavior in terms of basic resource requirements,

user data access needs, and network access specifications. Contracts restrict the virtual

appliances to the task that they were designed to do and all other access is denied by

default.
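As an illustration only, the deny-by-default behavior of a contract can be sketched in Python. The Contract class, its field names, and the is_allowed helper are hypothetical names chosen for this sketch; they are not the virtual machine contract format or the enforcement code used by our system. The sketch models only user data access, and it does not normalize paths.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """Hypothetical, simplified contract (not the actual VMC format)."""
    appliance: str
    memory_mb: int  # basic resource requirement
    # Maps a directory on the file server to the access modes granted for it.
    data_access: dict = field(default_factory=dict)


def is_allowed(contract, path, mode):
    """Return True only if the contract explicitly grants `mode` on `path`."""
    for directory, modes in contract.data_access.items():
        root = directory.rstrip("/")
        inside = path == root or path.startswith(root + "/")
        if inside and mode in modes:
            return True
    return False  # anything not listed in the contract is denied


# Assumed example: a photo viewer that may only read the photo directory.
photo_viewer = Contract(
    appliance="photo-viewer",
    memory_mb=256,
    data_access={"/home/user/photos": {"read"}},
)
```

Under these assumptions, the photo viewer can read files under /home/user/photos, while writes to that directory and any access to other directories are refused because no rule grants them.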

This architecture creates a situation in which a virtual appliance that is infected with

malware is not able to take over the whole system. The malware will only be able to access

a very limited set of the user's personal data and only in the manner specified by the

virtual appliance's contract. Because applications are placed in virtual appliances, recovery from

various system problems, such as malware or malfunctioning applications, is a straightforward

process. For example, it is safe to roll back the disk image of a virtual appliance

without affecting the user's data, since user data is stored in the FS-VM.

From a network perspective, our architecture has a set of virtual switches that isolate

virtual appliances and the FS-VM from internal and external attacks. A network virtual

machine (NET-VM) component manages the virtual switches to enforce network policy.

Just as the FS-VM only allows access to particular data, the NET-VM only allows access

to specific network segments and only allows traffic flows that are explicitly specified in the

virtual appliances' contracts. All other traffic is denied by default at the virtual switch level,

which reduces the amount of network processing done on the individual virtual appliances.

The real benefit with this type of network architecture is that any incoming connection

attempts or outgoing connections that are not explicitly allowed by contract rules are denied

by the NET-VM at the virtual switch level. This means that even if malware compromises

a virtual appliance and opens up a port not specified in the virtual appliance contract,

the NET-VM will not allow any incoming traffic to flow to that port. This is a significant

improvement compared to firewall-based protection, since the firewall inside the virtual

appliance can be disabled, and yet the virtual appliance's networking remains protected by

the NET-VM that is controlling the virtual switch or switches that the virtual appliance is

connected to. Having this NET-VM enforcement outside of the virtual appliance presents

a significant deterrent to traditional attacks.
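The filtering decision described above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions: the Flow tuple, the switch_permits function, and the browser rule set are hypothetical, and the real NET-VM enforces policy at the virtual switch rather than in application code like this.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical description of a traffic flow for one virtual appliance."""
    direction: str  # "in" or "out", relative to the virtual appliance
    protocol: str   # "tcp" or "udp"
    port: int


def switch_permits(allowed_flows, flow):
    """Forward a flow only if the contract lists it; deny everything else."""
    return flow in allowed_flows


# Assumed contract rules for a browser appliance: outgoing web and DNS only.
browser_flows = frozenset({
    Flow("out", "tcp", 80),
    Flow("out", "tcp", 443),
    Flow("out", "udp", 53),
})

# Malware inside the appliance opens a listening port that the contract
# does not mention. The decision is made outside the appliance, so the
# incoming connection is still refused.
backdoor = Flow("in", "tcp", 4444)
```

Because the check runs outside the virtual appliance, disabling the firewall inside the appliance has no effect on the outcome for the backdoor flow.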

We tie together our architecture with a virtualization security framework that we developed,

called OSCKAR. OSCKAR is used to manage the interactions between virtual

appliances and the enforcement elements (the FS-VM and NET-VM) based on virtual machine

contracts and global policy. Even if a malicious entity is able to gain control of a

virtual appliance, OSCKAR policy enforcement, from outside of the virtual appliance, can

protect the rest of the system. Response to contract violations can lead to restarting a

virtual appliance or restoring it to a known good state. OSCKAR provides the framework

to provide effective interaction between the underlying virtualization technologies and the

higher level file server and network components of our architecture.

1.2 Contribution

This dissertation presents a unique desktop system architecture solution to the pervasive

problem of recovering from malware attacks. We borrow concepts from local area network

(LAN) and data center environments and apply them in a novel way to a single desktop

system. We contribute the design and implementation of several techniques that are not

available in common practice. These techniques include: a novel way of associating meta-

data with virtual machines, separating system and user data for the purposes of recovery,

supporting the rapid rollback of system state for a system under attack, and preserving user

data during the recovery process.


In this dissertation, we show that the current best practices do not solve the problem

addressed by our solution. We demonstrate the feasibility of our approach with the design,

implementation, and evaluation of our open source Rapid Recovery Desktop system. As

a consequence of its design, our system is also effective for recovering from non-malicious

incidents (such as system updates) that cause system instability or otherwise make the

software system unusable.

We restructure the desktop as a set of virtual machine appliances (virtual appliances)

and associate contracts with each. At the heart of our system is a virtual machine contract

(VMC) system and a virtualization security framework (OSCKAR). We construct and in-

tegrate a file server virtual machine (FS-VM) to store and protect the user personal data

store. We also construct and integrate a network virtual machine (NET-VM) to create

internal private network segments and to protect the system from external and internal

network attacks. This architecture brings many of the advantages of a well-managed local

area network (LAN) to end user desktop systems.

Using an architecture based on virtualization extends the capabilities of a LAN, since

it allows us to attach metadata (contracts) to virtual machines to manage and administer

them in ways that are not possible with an isolated desktop system. For example, virtual

appliances can be rolled back to a known good state nearly instantaneously, while at the

same time preserving the user's personal data in the FS-VM. On a physical desktop, rolling

back a system would require the operating system and applications to be re-installed or the

hard drive to be re-imaged. Also, with a physical desktop, protecting the

user's personal data requires storing the data in a distinct physical location (for example, a

spare internal or external drive). Virtual machines, as digital objects, are particularly well-suited

for managing, segregating, and protecting a desktop system. We demonstrate the

feasibility of our approach with an open source implementation and evaluate our prototype

in terms of performance and effectiveness.
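
The separation of system state from user data that makes this rollback safe can be sketched as follows. This is an illustrative model only, not our implementation: the FileServerVM and VirtualAppliance classes, their methods, and the file names are all hypothetical, and a real rollback would restore a disk image rather than copy a dictionary.

```python
import copy


class FileServerVM:
    """Hypothetical stand-in for the FS-VM: holds user data outside any appliance."""

    def __init__(self):
        self.user_files = {}


class VirtualAppliance:
    """Hypothetical appliance whose disk image holds only system state."""

    def __init__(self, system_image):
        self.system_image = dict(system_image)
        # Remember the initial image as the known good state.
        self._known_good = copy.deepcopy(self.system_image)

    def rollback(self):
        # Discard all changes to the system image since the known good state.
        self.system_image = copy.deepcopy(self._known_good)


fs_vm = FileServerVM()
browser = VirtualAppliance({"/usr/bin/browser": "v1.0"})

# Normal use: user data is written to the FS-VM, not to the appliance image.
fs_vm.user_files["notes.txt"] = "draft"

# An attack modifies the appliance's system image.
browser.system_image["/usr/bin/browser"] = "infected"
browser.system_image["/tmp/dropper"] = "malware"

# Rolling back the appliance removes the damage and leaves user data untouched.
browser.rollback()
print(browser.system_image)  # {'/usr/bin/browser': 'v1.0'}
print(fs_vm.user_files)      # {'notes.txt': 'draft'}
```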

An initial prototype implementation of our Rapid Recovery Desktop system and an

initial FS-VM can be found in our previous work [93]. In this dissertation, we do not


seek to innovate on the implementation of the FS-VM component itself, but do contribute

some alternative deployment strategies and provide an effective integration of an FS-VM

component into our Rapid Recovery Desktop system. In section 6.1, we describe some

potential directions that a more advanced FS-VM component could take. The NET-VM

component is a new Rapid Recovery Desktop component contributed by this dissertation.

It makes use of some of the latest advances in open source virtual switch technology. Also,

we show how the NET-VM can be integrated into our Rapid Recovery Desktop system.

Further, we have developed a prototype implementation of a generic virtualization security

framework, called OSCKAR, that supports our virtual machine appliance contract system.

We describe the generic design of the OSCKAR framework and show how it applies to the

specific case of our Rapid Recovery Desktop system. Finally, we evaluate our current

rapid recovery desktop system prototype in terms of performance and effectiveness against

attack.

1.3 Organization of the Dissertation

The rest of this dissertation is organized as follows. In Chapter 2, we discuss related work

in the areas of virtualization and security. Then, we discuss the design of our system in

Chapter 3, followed by the implementation details in Chapter 4. Next, in Chapter 5, we

present evaluation in terms of performance and effectiveness. Finally, in Chapter 6, we

conclude and present future work.


Chapter 2

Related Work

In this chapter, we build upon and complement a rich array of related work. First, we

consider the long history of virtualization, both on server-class systems, such as the IBM

mainframe, and on commodity desktop systems. Lessons learned and concepts applied over

the years have provided a great foundation on which this work lies. Next, we consider the

long history of the field of computer security, primarily from a network and information

security standpoint, and further limiting our scope to focus primarily on the factors that

have a direct impact on commodity desktop computing. Then, we describe the broad spec-

trum of work that has combined virtualization and security techniques to address various

security issues.

There is also a substantial amount of related work that shares some of the goals of

our work. First, there has been a lot of effort in the network security and network intrusion

detection system (NIDS) space that is often complementary to our NET-VM. Second, there

is a body of work in the backup and recovery space that is often complementary to our FS-

VM. Finally, there is a set of related work that deals with malware and host-based intrusion

detection systems (HIDS) that is often complementary to our system design as a whole.

The related work and concepts with respect to malware classification will be covered

in section 5.2, since it will be more helpful to have that discussion available to evaluate

effectiveness against attacks. Similarly, some of the more advanced concepts in the FS-VM


related work will be included in section 6.1. Finally, there is a growing body of work in the

area of human-computer interaction (HCI) as it relates to security (SEC) in the specific

field of HCI-SEC that will also be addressed in section 6.1.

2.1 Virtualization

2.1.1 Virtualization Types

There are many different types of virtualization options. A common breakdown is into

the categories of emulation, full virtualization, paravirtualization, operating system level

virtualization, library virtualization, and application virtualization. Emulation is when

a different architecture is being created (virtualized), often in order to simulate hardware

that is not available (for example, some legacy applications require old hardware) or for

development on new platforms for which hardware is still being developed (for example, mobile

platform emulators). Emulators typically run much slower than other types of virtualization,

since all of the virtualization/emulation is done in software.

Full virtualization is creating a virtualized version of the same platform (for example,

x86 on x86). Full virtualization is one of the most common types of virtualization. It

performs relatively well, since most of (or sometimes all of) the operation can be performed

on the platform being virtualized. We rely on hardware support for virtualization in order to

provide full virtualization support on Xen and KVM.

Paravirtualization is when the guest operating system kernel is modified in order to

support virtualization. The virtualized platform is the same and the performance of this

type of virtualization is fast, since the guest is able to be made virtualization-aware. The

limitation to this type of virtualization is that the operating system kernel source must be

available. We use this type of virtualization for open source operating systems running on

Xen.

Operating system level virtualization (also commonly referred to as container-based virtualization)

is when the virtual guests share the kernel of the host system. This is a very fast option,


but limits the guest type to be the same as the base system (for example, Linux guests

must run on a Linux base). Also, providing performance isolation (one guest consuming

lots of resources without affecting other guests) has traditionally been difficult to implement with

operating system level virtualization.

The last two types of virtualization that we mention, library and application, are not used

to virtualize operating system instances, but instead run at the application layer. Library

virtualization is typically done to emulate an operating system or subsystem (for example,

Wine provides a subset of the Win32 API to allow Windows applications to run on Linux).

Finally, application virtualization provides a managed runtime environment in order to have

cross platform application mobility (for example, the Java runtime environment).

2.1.2 History and Evolution

Virtual machine technology, including the virtual machine monitor (VMM) or hypervisor,

was pioneered by IBM in the 1960s [34]. The original IBM VMM was designed for IBM

System/370 machines and has co-evolved over the years with the IBM hardware on which

it runs. The current iteration of the IBM VMM is now known as the z/VM hypervisor

and runs on IBM System z (zSeries) server hardware. Other VMM software and hardware

co-evolutions include IBM's pSeries hypervisors for pSeries hardware [4, 5] and Sun Microsystems'

Logical Domains (now Oracle VM for SPARC) for SPARC hardware [100, 101].

There also exists other hardware, such as the Alpha processor, that was specifically designed

to support virtualization [70].

The history of virtual machine technology for mainstream PC platforms is an interesting

one. Since x86 hardware was not originally designed to be virtualizable [128], this introduced

additional overhead and complexity in virtualization. Also, early personal computers were

not typically configured with sufficient memory to support multiple, simultaneously running

VMs. As PCs increased in power and memory prices fell, virtualization became more

feasible for commodity platforms and a number of commercial and open source virtualization

products were introduced. The Disco project [17] was the first to create a VMM that


ran on experimental commodity ccNUMA hardware. Members of the Disco team later

founded VMware, which is the commercial pioneer of virtual machine technology on x86

hardware [3, 168].

In this dissertation, we focus on two types of virtual machine technology: (1) Paravirtu-

alization, which requires minor modifications to an operating system, making it aware that

there is an underlying VMM and (2) Hardware-assisted full virtualization, which allows

running unmodified operating systems on top of the VMM.

The first approach, paravirtualization, a term first coined by the developers of De-

nali [175] and then popularized by Xen [10], brought with it evidence that virtualization

benefits could be achieved with low overhead. With full virtualization, the guest VM is un-

aware that it is running on simulated hardware because the interface presented is the same

as the physical hardware. With paravirtualization, however, the guest VM is aware that

it is being virtualized since it is modified to make system (or hyper) calls directly into the

hypervisor. The paravirtual modifications are usually small and are intended to improve

performance by avoiding the use of the non-virtualizable instructions [128] and optimizing

expensive operations. The paravirtualization approach has the advantage of better perfor-

mance, but since some modification to the guest is required, it is ill-suited for use with

closed-source operating systems. When the Xen developers released their performance numbers at SOSP

2003, a team of us at Clarkson University published independent verification of these results

and extended the comparison to Linux running on an IBM zServer. We demonstrated

that virtualization benefits could be realized on older hardware with low performance overhead [26].

The second approach, hardware-assisted full virtualization, first showed up for commodity

hardware in 2005 in the form of Intel's VT-x virtualization extension, which

was followed shortly after by AMD's AMD-V virtualization extensions [162]. This marked

the beginning of a new era of virtualization software and hardware co-evolution. This co-

evolution era, which we are still in the midst of, involves the cooperation of commodity

market virtualization players such as the developers of software hypervisors (such as Xen,


KVM, and VMware) and hardware vendors (such as Intel and AMD). These virtualization

hardware extensions allow for unmodified guest operating systems to run more effectively

on a wider variety of virtualization platforms, such as Xen and KVM. The hardware extensions

are required for full virtualization support (for example, Windows guests) on Xen

and are a requirement for using the Linux Kernel-based Virtual Machine (KVM) [74].

First generation hardware support for virtualization (VT-x and AMD-V) made proper

virtualization [128] of the x86 hardware possible, but it did not always achieve performance

gains compared to existing software approaches (such as binary re-writing) to virtualize the

x86 architecture [3]. The virtualization software and hardware co-evolution had only just

begun. In an effort to shed some light on the initial mediocre performance of x86 hardware

virtualization, Karger [70] described some performance and security lessons learned from

virtualizing the Alpha processor and compared that architecture to x86 virtualization

hardware. The Alpha processor, which is based on a reduced instruction set computing

(RISC) architecture, was specifically designed to support virtualization. This architecture

had advantages, such as the way it handled sensitive instructions, page tables, and trans-

lation lookaside buffer (TLB) misses, which made it easier to implement high performance

support for virtualization. Karger suggests that Intel and AMD should learn from the

lessons of this and other architectures that were designed to support virtualization. The

Karger paper and the general history of virtualization suggest that the co-evolution of hard-

ware and software virtualization is a process that often needs to be refined over time (like,

for example, the IBM z/VM and z Series) and that it is difficult for the hardware to be

designed to support high performance virtualization from the beginning.

The software hypervisors, such as Xen and KVM, are evolving to make better use of the

x86 hardware virtualization extensions. At the same time, the x86 hardware is evolving and

vendors, such as Intel and AMD, have released second and third generation virtualization

hardware extensions to add performance and security benefits. Second generation hardware

extensions target performance improvements for switching between guest operating systems

by adding hardware support for handling guest page tables (also referred to as shadow


page tables). The specific technologies released are Intel Extended Page Tables (EPT)

and AMD Nested Page Tables (NPT). Third generation hardware extensions, in the form

of input/output memory management units (IOMMUs), seek to improve the security of

virtual device direct memory access (DMA) and the performance of virtual I/O devices,

such as graphics, disk, and network. Specific IOMMU hardware releases include Intel's

VT-d and AMD's IOMMU. Other hardware virtualization technologies include Intel vPro

and AMD DASH, which use on-chip management capabilities and trusted platform module

(TPM) technology to provide various security and manageability opportunities.

The overall virtualization hardware and software co-evolution process is making virtual-

ization a ubiquitous part of commodity computing both in the server and desktop markets.

Further evidence that virtualization is making an impact on a wider audience is the XP

mode feature that was added to Windows 7. This feature uses virtual machine technology

to run Windows XP applications or a full Windows XP environment inside of Windows

7 [176].

2.1.3 Virtual Appliances

A more recent trend in the virtualization space is toward virtual machine appliances (or

simply virtual appliances). Virtual appliances are pre-configured virtual machine instances

that are designed for specific tasks. For example, appliances exist for user-level software,

such as browsers, and server software, such as web and database servers. The ability to

quickly deploy a pre-configured virtual appliance is a clear and compelling advantage of

virtualization and is becoming an increasingly popular method for software distribution.

Virtual appliances, at a high level, are analogous to household appliances that are used

for one particular task. An even better analogy for virtual appliances is a comparison

to information appliances. The term "information appliance" was coined by Jef Raskin

and was further described in the book "The Invisible Computer" by Don Norman [112].

The basic idea of information appliances is that the Personal Computer (PC) is a general

purpose device and since it tries to be everything to everyone, it fails at being usable.


Norman's proposed solution is to replace the PC with information appliances, or single

purpose devices, such as digital cameras, printers, document writers, etc. that each do

one job and do it well. He argues that special-purpose devices can be made more usable.

Information appliances together would then make up all the functions of the PC and the

computer itself would become invisible (behind the scenes). Computers already play this

role in part, but getting the computer industry to make the last big leap to a world of

information appliances is challenging. "The Invisible Computer" goes into many

aspects of the problem, from the market and business side of things to the complexities of

large programming projects and operating systems.

Sapuntzakis, et al., first introduced the concept of a virtual appliance [145], which

they described as a virtual machine that replaces a physical computer appliance (such as a

firewall). Their vision for virtual appliances was in the Collective architecture, which they

described as a compute utility that provides virtual appliances as a service. Further, they

explained that the virtual appliances would send their displays to a remote display on a thin

client. Their concept is basically what we know of today as a cloud service that provides

load balancing of infrastructure as a service (IaaS), or is perhaps more closely analogous to

desktop as a service (DaaS).

Virtual appliances have been a key component of architectures developed at Clarkson.

For example, virtual computer appliances were mentioned as a component of the architec-

ture proposed by Evanchik [47]. This was followed shortly by work in which we described

how virtual machine appliances fit into a Rapid Recovery Desktop System [93]. The virtual

machine appliance concept as proposed in our paper described creating virtual appliances

by placing one or more applications that have similar data and network access needs into

a virtual machine. Further, we recommended that appliances come with a virtual machine

contract that explicitly specifies those needs.

VMware further popularized the virtual appliance concept with marketing and virtual

appliance development contests with large cash prizes [61, 169]. Many other open source

and commercial vendors are also distributing virtual appliances [69, 140, 151, 165]. The


associated virtual machine contracts have not gained as much traction, however. Virtual

machines, and therefore also virtual appliances, often come with configuration files that

specify the basic hardware needs (CPU, memory, disk, etc.) of the virtual machine, but

there is still a need for contracts that allow for the configuration of more fine-grained data

and network resource access needs. This dissertation presents a basic, extensible contract

system implementation in order to address this need.

2.1.4 Virtual Machine Contracts

To the best of our knowledge, the concept of virtual machine contracts (VMCs) in the

context of virtual appliances was first developed at Clarkson. The VMCs put forth in

Evanchik's master's thesis [47] were based on the concept of having a virtual appliance

specify a set of very specific system calls that it would be allowed to make. For example,

any read or write system calls to files or directories would need to be specified. Further,

network-based system calls, such as bind and listen, would need to be specified in the

virtual appliance contract. The contract methodology of explicit allow and default deny is

an approach that this dissertation builds upon. The contract enforcement proposed in that

thesis was based on modifying the kernel of the virtual appliance and replacing system calls

with hypercalls (system calls into the hypervisor) that are intercepted and validated by a

contract enforcement element running in the hypervisor.
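
To make the explicit allow, default deny methodology concrete, a system call-based contract can be modeled as a set of permitted (system call, resource) pairs. The sketch below is an assumption-laden illustration and not the implementation from [47]: the contract format, the resource strings, and the validate_call function are hypothetical, and the real enforcement element would run in the hypervisor and validate hypercalls.

```python
# Hypothetical contract for a mail appliance: each entry is a
# (system call, resource) pair that the appliance is explicitly allowed to use.
MAIL_CONTRACT = frozenset({
    ("read", "/home/user/mail"),
    ("write", "/home/user/mail"),
    ("bind", "tcp:25"),
    ("listen", "tcp:25"),
})


def validate_call(contract, syscall, resource):
    """Return True only if the call is listed in the contract (default deny)."""
    return (syscall, resource) in contract


# A call named in the contract is validated.
print(validate_call(MAIL_CONTRACT, "bind", "tcp:25"))             # True
# Anything not listed is denied, including reads of other user data
# and attempts to listen on an unlisted port.
print(validate_call(MAIL_CONTRACT, "read", "/home/user/photos"))  # False
print(validate_call(MAIL_CONTRACT, "listen", "tcp:31337"))        # False
```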

This dissertation differs from that work in how contracts are specified and

enforced. In this dissertation, we implement the contract system in a much

more general and effective way, such that enforcement can occur outside of the virtual

appliance, where it is harder for attackers to subvert. We are not limited to relying on

enforcement within the kernel of the guest VM, as in the architecture proposed

by Evanchik [47]. We also developed OSCKAR to give more flexibility and control to

the virtual appliance and enforcement element designers. For example, any type of virtual

appliance contract rules can be specified as long as there is an enforcement element that can

respond to them. System call-based contracts could be employed with our OSCKAR system


(provided that appropriate enforcement elements are implemented), but that contract style

is not used in our current Rapid Recovery Desktop implementation. Chapters 3 and 4

present the design and implementation details of OSCKAR.

In "Data Protection and Rapid Recovery From Attack With A Virtual Private File

Server and Virtual Machine Appliances" [93], the focus of the virtual machine contracts

was on file system contract rules in which a dedicated file server virtual machine (FS-VM)

stored user data and allowed virtual appliances to mount specific portions of the data in read,

write, or append-only fashion. The FS-VM in that paper supported read and write rate-

limiting with a modified NFS server. Here, we extend the work done in that paper to add

two new components, the OSCKAR virtualization security framework and the NET-VM.

In [92], Matthews, et al., proposed a contract system and architecture, including the

concept of enforcement elements, very similar to the architecture presented in this dissertation.

In that paper, they demonstrated the feasibility and approach of such a system in a data

center environment and described extending the Open Virtualization Format (OVF) [11],

which is an open standard for packaging and distributing virtual appliances, to support

more advanced data and network access rules. We extend that work to a Rapid Recovery

Desktop system and implement an open source virtualization security framework that sup-

ports custom contract rules and enforcement elements, which could include support for the

OVF standard in the future.

2.2 Security

The security principles employed in this dissertation have been well-studied and applied

in general, but we suggest using them, in combination with other technologies, such as

virtualization, in a way that is not commonly seen in practice today. Specifically, we apply

the principle of least privilege, isolation, and access control to virtual appliances.


2.2.1 The Principle of Least Privilege

A number of security principles were first formally described by Saltzer and Schroeder in [143].

Among those principles was the principle of least privilege, which states that "Every program

and every user of the system should operate using the least set of privileges necessary

to complete the job." Saltzer and Schroeder explain that the rationale behind this principle

is to limit the damage that can occur from an accident or error, to limit the interaction

among privileged programs, and to provide a rationale for where to place protection mecha-

nisms. The goal of the system described in this dissertation is that virtual appliances adhere

to the principle of least privilege. Virtual appliances are an effective and practical way to

implement the principle of least privilege. We are not arguing that our design perfectly

achieves least privilege. Doing so would require perfect virtual appliance contracts and very

detailed processing of virtual appliance operations, and would likely be intolerably hard

to use for any user. However, as we will describe in the sections that follow, access control

methods that attempt to apply the principle of least privilege to various degrees are often

disabled by users. The fact that perfect adherence may not be possible is no excuse not

to apply reasonable constraints on virtual appliances. Some examples of mechanisms that

help to apply the principle of least privilege include isolation and access control, which we

discuss in the next sections.

2.2.2 Isolation

Complete isolation, as described in [143], is "a protection system that separates principals

into compartments between which no flow of information or control is possible." Two

approaches for achieving complete isolation described by Saltzer and Schroeder include

isolated virtual machines and authentication mechanisms. Virtual machines, as described

earlier in this chapter, have been used for many years on mainframe hardware, but until

the Disco project [17] in 1997 were considered computationally prohibitive on commodity

systems. So, traditional isolation has relied on authentication mechanisms, such as username

and password system login. This dissertation makes use of virtual machines to provide


isolation between applications stored in virtual appliances. We are certainly not the first to

make use of virtualization for this purpose, but are among a growing list of systems using

virtualization for security purposes. Examples of other research systems will be described

later in this chapter.

Although using virtual machines to provide isolation is very common in research, it is

more challenging to apply virtualization to real production systems, especially on the desk-

top. As part of this dissertation we hope to encourage taking research ideas and converting

them into real systems. This concept is exemplified by a recent alpha release of the Qubes

operating system [142], which is a new operating system based on the Xen hypervisor. We

will describe Qubes in more detail in the Virtualization and Security section of this chapter,

but for now we note that Qubes uses virtual machines and various virtualization hardware

extensions to provide isolation for applications.

2.2.3 Access Control

Discretionary Access Control

Early mechanisms for access control were described by Lampson in [80]. The principles

and mechanisms described in that paper provide the foundation for the discretionary access

control (DAC) [144] that is in common use today. The basic idea of DAC is an access control

matrix with domains (users, groups, etc.) labeling the rows, objects (files, directories,

processes, etc.) labeling the columns, and capabilities or access permissions (read, write,

execute, etc.) as the entries within the matrix. The typical implementation is done with

access control lists. One major weakness with DAC is that the granularity of access is too

coarse. More specifically, if a user has access to a file, then any program running as that user

has access to that file. Our approach to mitigating this problem is to apply access control

at the virtual appliance level, specifically with the explicit allow, default deny policy (as

was described in [47]).
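
The access control matrix and its coarse granularity can be illustrated with a short sketch. The users, objects, and function names below are hypothetical and chosen only for illustration; the matrix is stored sparsely as a dictionary keyed by (domain, object) pairs.

```python
# Sparse access control matrix: rows are domains (users), columns are objects,
# and entries are sets of access permissions. All names are illustrative.
ACCESS_MATRIX = {
    ("alice", "thesis.tex"): {"read", "write"},
    ("alice", "/usr/bin/editor"): {"read", "execute"},
    ("bob", "thesis.tex"): {"read"},
}


def dac_check(user, obj, right):
    """Return True if the matrix entry for (user, obj) contains the right."""
    return right in ACCESS_MATRIX.get((user, obj), set())


def program_can(program, running_as, obj, right):
    """Under DAC, the program's identity is ignored: only the user matters."""
    return dac_check(running_as, obj, right)


# Bob may read the file but not write it.
print(dac_check("bob", "thesis.tex", "write"))                     # False
# The weakness: any program run by alice inherits all of alice's access,
# whether it is her editor or a piece of malware.
print(program_can("editor", "alice", "thesis.tex", "write"))       # True
print(program_can("trojan", "alice", "thesis.tex", "write"))       # True
```

Applying the check per virtual appliance rather than per user, as we propose, would let the editor and the untrusted program receive different answers.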


Mandatory Access Control

Recognizing the limitations of DAC is not a new revelation. One common alternative

to DAC is mandatory access control (MAC), which is sometimes referred to as rule-based

access control [85] or lattice-based access control [41]. A traditional view of MAC associates

it with multi-level security (MLS), but it has been recognized that the MLS-based approach

is too limiting to meet many security requirements [87]. The basic idea behind MAC is that

interactions between subjects (users, programs, etc.) and objects (files, programs, etc.) are

handled by a set of system-wide security policies. The basic implementation is usually

that all subjects and objects are labeled and policy logic is separated from the enforcement

mechanism. MAC is significantly more sophisticated than DAC, but at a higher cost of

complexity.
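
The labeling of subjects and objects, and the separation of policy logic from the enforcement mechanism, can be sketched as follows. This is a simplified illustration under our own assumptions, not the policy language of any real MAC system; the label names, the policy table, and the enforce function are hypothetical.

```python
# Every subject and object carries a label. Label names are hypothetical.
LABELS = {
    "browser": "browser_t",
    "editor": "editor_t",
    "thesis.tex": "user_doc_t",
    "cache.db": "browser_cache_t",
}

# System-wide policy, kept separate from the enforcement code below:
# (subject label, object label) -> permitted operations.
POLICY = {
    ("editor_t", "user_doc_t"): {"read", "write"},
    ("browser_t", "browser_cache_t"): {"read", "write"},
}


def enforce(subject, obj, operation, labels=LABELS, policy=POLICY):
    """Enforcement mechanism: looks up labels and consults the policy table.

    Unlabeled entities and unlisted label pairs are denied.
    """
    subject_label = labels.get(subject)
    object_label = labels.get(obj)
    return operation in policy.get((subject_label, object_label), set())


# The editor may modify user documents.
print(enforce("editor", "thesis.tex", "write"))   # True
# The browser may not, even if both run as the same user.
print(enforce("browser", "thesis.tex", "write"))  # False
```

Changing what is permitted requires editing only the policy table, which is the sense in which mechanism is separated from policy.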

The contract system presented in this dissertation shares many of the goals of MAC

(for example, limiting user and application access, and separating mechanism from policy),

but since our system is designed and implemented at the virtualization layer and applied

to virtual appliances, we are able to specify resource restrictions at a relatively high level

of abstraction. For example, we are able to write contract rules in terms of the virtual

CPUs, memory, disks, and network resources of the virtual appliance. MAC policy, on the

other hand, is typically specified in terms of lower level constructs, such as system calls and

operating system objects (e.g., files and processes). Further, virtual appliances are likely

to provide more isolation than MAC, since malware could potentially disable MAC that is

running on a traditional operating system. However, malware within a virtual appliance

would not be able to turn off our contract system unless it was somehow able to subvert

the virtual machine by, for example, breaking out of the VM and into the hypervisor or

breaking into a component of our trusted computing base.

It is also worth mentioning the various implementations of MAC found in practice.

These include SELinux [87, 148] and AppArmor [12]. Microsoft has also added a MAC

model into its operating systems with the addition of Mandatory Integrity Control starting

in Windows Vista [97, 98]. Another interesting policy enforcement tool, called Systrace


[130], is touted as a lightweight replacement for MAC. This tool generates system call

signatures in a learning mode and then enforces those policies in real time. A tool such

as this could be used to generate system call-based contracts for applications. The output

of such a tool could have been used directly with the implementation proposed in [47].

Finally, SELinux Sandboxes [106, 173], based on SELinux, are an attempt to further limit

applications by making them run in a temporary sandbox directory that is cleaned after

the application exits. SELinux-sandboxed applications can also be run within their own X

server environment.
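
The learn-then-enforce approach described above for Systrace can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Systrace's actual policy format or interface: the function names are hypothetical, and a trace is assumed to be a plain list of system call names, whereas a real tool would track considerably more detail.

```python
def learn_policy(training_traces):
    """Learning mode: allow every system call name seen in the training runs."""
    allowed = set()
    for trace in training_traces:
        allowed.update(trace)
    return frozenset(allowed)


def check_trace(policy, trace):
    """Enforcement mode: return the calls in `trace` that the policy denies."""
    return [call for call in trace if call not in policy]


# Two hypothetical training runs of a well-behaved application.
training = [["open", "read", "close"], ["open", "write", "close"]]
policy = learn_policy(training)

# A later run that attempts a call never seen during learning.
print(check_trace(policy, ["open", "read", "execve", "close"]))
```

A learned set of this kind is the sort of output that could serve as a starting point for a system call-based contract for an application.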

Although MAC systems are becoming more powerful and easier to use, the most ad-

vanced features that they provide, such as SELinux Sandboxes, are generally only available

for Linux applications. Using virtualization, as is described in this dissertation, allows ap-

plications from other operating systems, such as Windows, to be supported. MAC concepts

and policies should also be considered complementary to our system for two reasons. First,

existing MAC application policy rules (for example, the application-specific confinement

rules that are written for existing MAC systems) could be used to help virtual appliance de-

signers build better virtual appliance contracts. Second, MAC support could also be added

to the virtualization layer of our system; the technologies that enable this, sVirt [155] and

XSM [28], are discussed in the next section.

2.3 Virtualization and Security

There is a vast amount of related work that attempts to apply virtualization techniques to

solve security problems. The VMM layer can be used to monitor the guest from below,

often without the guest OS knowing it is being watched (see footnote 1). Some popular applications

that make use of this unique perspective are intrusion detection systems [48, 54, 68, 76, 81,

178, 180], fault tolerance systems [15], virtual machine record and playback systems [44, 73],

malware analysis tools [13], honeypots [6], secure desktop systems [96, 142, 181], trusted

1 The red pill program can be used to detect whether you are running in a virtual machine; see: http://invisiblethings.org/papers/redpill.html

computing platforms [53], and sandboxes [159].

Some of the early work that applied virtualization to security includes the following:

Bressoud and Schneider developed fault-tolerant systems using virtual machine technology

to replicate the state of a primary system to a backup system [15]. Dunlap et al. used

virtual machines to provide secure logging and replay [44]. King and Chen used virtual

machine technology and secure logging to determine the cause of an attack after it had

occurred [73]. Reed et al. used virtual machine technology to bring untrusted code safely

into a shared computing environment [134]. Zhao et al. used virtual machines to provide

protection against rootkits [181].

2.3.1 Virtualization and Isolation

In this section we highlight two systems that use virtualization for isolation of applications.

The first system, called Isolated Execution [159], has been released by Intel in alpha

form as an open source sandbox system that allows a user to right click on a binary exe-

cutable file and run it in a sandbox VM. Although the Isolated Execution system is in an

early development stage, it does demonstrate useful concepts that could be applied to the

system described in this dissertation.

The second system is an operating system, called Qubes [142], that is built on top of

the Xen hypervisor. At a high level, Qubes shares many of the components of our Rapid

Recovery Desktop system. Specifically, they include a network domain, which is similar to

our NET-VM component, and a storage domain, which is similar to our FS-VM component.

The overall goal of their system, like ours, is to isolate desktop applications from each other

using virtual appliances. However, their approach differs from ours in several interesting

ways. First, their virtual appliances, which they refer to as AppVMs, are assumed to be

based on a common base file system so as to be able to make use of a set of read-only core

system files, which has the benefit of being able to do updates once to that shared system

core. Due to this architectural choice, the current release only supports Linux as the base,

but they are investigating ways to support other operating systems, such as Windows, in

the future. Architecturally, we make a different choice for our Rapid Recovery

Desktop system. By creating virtual appliances that store their own system state, we are

able to more easily support a variety of base operating systems.

Another difference in the Qubes architecture is that AppVMs store user data within the

AppVMs themselves, which is in contrast to our Rapid Recovery Desktop system that stores

user data in a dedicated FS-VM. This design choice exemplifies the different approaches

in terms of recovery and threat model between the Qubes system and our Rapid Recovery

Desktop system. The Qubes system uses the storage domain to store and back up user

and application data in an encrypted file system, thus treating the storage domain as an

untrusted entity and not part of the trusted computing base. Our system, on the other

hand, uses the FS-VM to store user data and allows virtual appliances to mount specific

parts of it. In this way, we treat our FS-VM as a part of our trusted computing base and do

file system enforcement and protection outside of the virtual appliance in order to provide

an easy way to roll back virtual appliances without affecting user data. Their threat model

is specifically based around reducing the trusted computing base (they apply the concepts

of disaggregation of the Xen management domain as described in [109]), so that malware

that compromises a particular component is not able to affect other parts of the system.

In contrast, our threat model is based on distrusting virtual appliances, so that malware

that compromises a virtual appliance is not able to compromise other appliances nor user

data that is stored in an isolated, hardened, and carefully protected FS-VM. In Chapter 3,

we will describe the methods and architectural decisions we use to protect our FS-VM.

Similar to the storage domain in the Qubes architecture, the network domain is removed

from the trusted computing base of their architecture and network policy enforcement is

done within each of the AppVMs. Their reasoning for this goes back to their overall threat

model concept of reducing the size of the trusted computing base and the assumption that

having an external network component cannot provide additional security to a compromised

AppVM. As before, our Rapid Recovery Desktop system architecture is in direct contrast

to theirs in that we treat our NET-VM component as part of the trusted computing base

and by placing it outside of the virtual appliances we use it to protect against malicious

network activity, even in the case that a virtual appliance is compromised. We believe that

by distrusting the virtual appliances, we can limit their ability to do harm to the rest of

the system and the rest of the world.

A final difference between the Qubes architecture and ours is that theirs relies on hard-

ware support for virtualization. Specifically, they make use of the IOMMU support to give

direct access to the network card to their network domain and the storage controller to

their storage domain. Further, they make use of Trusted eXecution Technology (TXT)

and the Trusted Platform Module (TPM) included in Intel's vPro to do cryptographic

signing of boot and disk images. We plan to make use of the IOMMU capabilities for our

NET-VM and FS-VM to improve performance and security, but we do not strictly rely on

them to complete our threat model like Qubes does. This difference allows our system to

be deployable on more hardware than Qubes.

Despite the differences in architecture between Qubes and our Rapid Recovery Desktop,

there are still ways that we could make use of some of their techniques for specific use cases.

In Section 6.1, we describe aspects of the Qubes architecture that we would like to

integrate.

2.3.2 Virtualization and Access Control

In the Mandatory Access Control (MAC) section earlier in this chapter, we mentioned that

MAC could be added to the virtualization layer. One interesting approach taken by Quynh

et al. [132] in their VMAC system was to add a special service VM that provides central

management of MAC policies for other virtual machines. A VM like the one in that paper

might be integrated, as future work, into our Rapid Recovery Desktop system.

MAC was added to the Xen hypervisor in the form of Xen Security Modules (XSM) [28,

29], which were implemented by the National Information Assurance Research Lab within

the National Security Agency (NSA). XSM allows various MAC policies to be enforced

at the Xen hypervisor level. MAC has also been integrated into the libvirt virtualization

toolkit [83] in the form of sVirt [155]. As will be described in Chapter 4 on implementation,

libvirt is used to interact with the various virtualization capabilities on Linux (and other

OSes). sVirt allows MAC policy enforcement for the various virtualization systems that

run on Linux. It does not yet (and may never completely) support Xen, since Xen is a

stand-alone hypervisor that is not intended to be integrated into Linux itself.

MAC policies at the hypervisor level could allow for much of the basic enforcement

that our Rapid Recovery Desktop system needs along with various other more complicated

scenarios. For example, it could be used to assign labels to VMs and enforce various policies,

such as VM A is only allowed to run if VM B is not running, at the hypervisor level. Adding

MAC support at the hypervisor level of the Rapid Recovery Desktop could be an interesting

area of future work.
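
The example policy above, in which VM A may run only if VM B is not running, can be sketched as a check over VM labels. The sketch below is an assumption-laden illustration in Python: the function may_start and the representation of conflicts as unordered label pairs are hypothetical, and it does not use or reproduce the XSM or sVirt policy languages.

```python
def may_start(label, running_labels, conflicts):
    """Return True if a VM with `label` may start alongside the running VMs.

    `conflicts` is a set of frozensets, each holding two labels that must
    never run at the same time (a hypothetical policy representation).
    """
    for other in running_labels:
        if frozenset({label, other}) in conflicts:
            return False
    return True


# Policy: VMs labeled "A" and "B" are mutually exclusive.
conflicts = {frozenset({"A", "B"})}

print(may_start("A", {"C"}, conflicts))       # no conflicting VM is running
print(may_start("A", {"B", "C"}, conflicts))  # "B" is running, so "A" is refused
```

Because each conflict is stored as an unordered pair, the same rule also prevents VM B from starting while VM A is running.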

2.4 Backup and Recovery

Our Rapid Recovery Desktop system is not intended to be a replacement for making back-

ups, but instead it should be considered complementary. Having backups is still required

in the case of hardware failure, for example. In this section, we consider the relationship

of backup and recovery systems to our Rapid Recovery Desktop system. With our Rapid

Recovery Desktop system, we focus on the problems of rapid system restoration and pro-

tection of user data. We are unaware of another system that has separated user data and

system data in the way that we are proposing. We optimize the handling of each to provide

rapid system restoration after an attack.

Our system also helps streamline the backup process by allowing efforts to focus on

the irreplaceable personal data rather than on the recoverable system data. This allows

backup efforts to be customized to the differing needs of system data and personal data.

The differing rates of change for system data and user data imply different backup needs

for each of these data types. Specifically, there is a mismatch between the overall rate of

change in system data and the user-visible rate of change.

System data changes at clearly predictable points (for example, when a new application

is installed or a patch is applied). Between these points, new system data may be written

(such as system logs), but often this activity is of little interest to users as long as the

system continues to function. For example, if a month's worth of system logs were lost,

most users would be perfectly happy as long as the system was returned to an internally

consistent and functioning state. Therefore, there is little need to protect this new system

data between change points.

With user data, however, even small changes are important. For example, a user may

only add one page of text to a report in an 8-hour workday, but the loss of that one day of

data would be immediately visible. This means that efforts to protect user data can be

effective even if targeted at a small percentage of overall data. Users also tend to retain a

large body of personal data that is not actively being changed. Incremental backups can

be kept much smaller when focused on changes to user data rather than system data.

One common approach to providing data protection and recovery from attack is making

full backups of all data on the physical machine, both personal and system data. There are

several ways to back up a system, including copying all files to alternate media that can be

mounted as a separate file system (for example, a data DVD) or making an exact bootable

image of the drive with a utility such as Clonezilla [27].

Burning data to DVD or other removable media creates a portable backup that is well-

suited to restoring personal data and transporting it to other systems. Mounting the backup

is also an easy way to verify its correctness and completeness. However, backups of this

type are rarely bootable and typically require system state to be restored via re-installation

of the operating system and applications. For example, even if all of the files associated

with a program are backed up, the program may still not run correctly from the backup

(for example, if it requires registry changes, specific shared libraries or kernel support).

Making an exact image of the drive with a utility such as Clonezilla is a better way to

back up system data. It maintains all dependencies between executables and the operating

system. Images such as this can typically be either booted directly or used to re-image the

damaged system to a bootable state. However, images such as this are not always portable

to other systems as they may contain dependencies on the hardware configuration (such

as CPU architecture). They are also not as convenient for mounting on other systems to

extract individual files or to verify the completeness of the backup. In contrast, backing up

virtual appliances makes it easier to test backups without disturbing the system state.

Despite the limitations of backup facilities, our system is designed to complement rather

than replace backup. One goal of our system is to avoid the need for restoration from backup

by preventing damage to personal data and providing rapid recovery of system data from

known-good checkpoints. While it is still important to make backups, in many cases using

our system's built-in features can mean that users do not need to make use of their backups

as often. Restoring a system from backups is often a cumbersome and manual process, not

to mention an error-prone one. Given the small percentage of users that regularly back up

their system (and the even smaller percentage that test the correctness of their backups), it

is important to reduce the number of situations in which restoring from backup is required.

Our virtual machine appliances also make backups of system data that are portable to

other machines. System data is made portable by the checkpoints of the virtual machine ap-

pliances. The virtualization system handles abstracting details of the underlying hardware

platform so that guests will run on any machine.

When restoring a traditional system from a backup, users are typically forced to choose

between returning their system to a usable state immediately or preserving the corrupted

system for analysis of the failure or attack and to possibly recover data. With our ar-

chitecture, users can save the corrupted system image while still immediately restoring

a functional image. These images are also much smaller than full backups because they

contain only system data, not personal data, such as a user's music collection.
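
The idea of saving the corrupted image while immediately restoring a functional one can be sketched as two file operations. The Python sketch below is a minimal illustration under stated assumptions: system images are assumed to be ordinary files, and the function name, the ".suspect" suffix, and the quarantine directory are hypothetical rather than part of our implementation.

```python
import shutil
from pathlib import Path


def restore_and_preserve(active, checkpoint, quarantine_dir):
    """Set the suspect active image aside, then restore a known-good checkpoint.

    Returns the path of the preserved (possibly corrupted) image.
    """
    active = Path(active)
    checkpoint = Path(checkpoint)
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)

    preserved = quarantine_dir / (active.name + ".suspect")
    # Move rather than copy, so the suspect image is kept unchanged for analysis.
    shutil.move(str(active), str(preserved))
    # Copy the checkpoint so the known-good image itself is never modified.
    shutil.copy2(checkpoint, active)
    return preserved
```

Since user data is held outside the system image, neither step touches personal data; only the system image is exchanged.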

A key advantage of our system relative to backups is that our architecture allows com-

promis
