Virtualization and Virtual Machines An Introduction to Their History, Theory and Application

By

Jack Edward Heald

Virtualization and Virtual Machines – An Introduction

Table of Contents

Introduction
Definition & Description
Origin and Development
    MULTICS
    IBM System 360/67
    UNIX
    Microsoft
    Java & the Java Virtual Machine
Advantages & Disadvantages of Virtual Machines
Inter-Virtual Machine Communications
Summary Review of Popular VMs
    JVM
    VMWare
    Virtual PC
    VirtualBox
    Parallels

Introduction

According to the wise folks at Princeton, the word “Virtual” has two meanings:

1. “Being actually such in almost every respect”.

2. “Existing in essence or effect though not in actual fact.”

When we say that a friend celebrated his 21st birthday by getting virtually blotto, what we

generally mean is that he was actually blotto in almost every respect – definition number One.

But if he exhibited every sign of being drunk, yet lacked any of the verifiable physical indicators of intoxication – elevated blood alcohol levels, dilated pupils, lagging eye-tracking – then we could say he was virtually drunk – definition number Two.

In the computer industry, it is the second definition of the word “virtual” that is used, most often in reference to virtual (as opposed to actual) machines. Virtual machines are the subject of this paper.

Definition & Description

So what is a “virtual” machine? According to author Norman Hardy:

“A virtual machine is the construct of a program (such as CP/370) that behaves so much like a real machine

that an OS, or other program written to run alone on a real machine, is fooled into thinking that it is running

on a real bare machine by itself!” 1

In other words, a virtual machine is software that makes one kind of hardware look like a

different kind of hardware. “Full virtualization requires that every salient feature of the

hardware be reflected into one of several virtual machines – including the full instruction set,

input/output operations, interrupts, memory access, and whatever other elements are used by

the software that runs on the bare machine, and that is intended to run in a virtual machine.

The obvious test of virtualization is whether an operating system intended for stand-alone use

can successfully run inside a virtual machine.”2

Perhaps an illustration would help.

A normal computer (without virtualization) is composed of the layers illustrated in Figure 1:

(For illustration purposes only, I am describing an x86-based hardware platform running the Macintosh

operating system and Macintosh software for the “actual” computer, and the same hardware setup running a

Windows “virtual” machine for the “virtual” computer.)

4. APPLICATION SOFTWARE LAYER (Macintosh)

3. OPERATING SYSTEM LAYER (Macintosh)

2. BIOS LAYER (x86)

1. HARDWARE LAYER (x86)

Figure 1: Computer System without Virtualization Layer

The “actual” computer, or the computer without virtualization, is composed of four “layers”

of systems:

Layer 1. Hardware – CPU/MPU/GPU etc. This is the bit you can hold in your hands, dunk

in the river or throw out your window.

Layer 2. BIOS – the Basic Input/Output System, a small set of low-level instructions written for that particular piece of hardware. It is stored as firmware in a chip on the motherboard. You can’t see it; you only see its effects.

Layer 3. Operating system – the set of instructions that communicates with the BIOS, which in turn communicates with the hardware, which actually executes the commands of the operating system. These instructions are usually stored on the computer’s hard disk. When the computer is powered on, the BIOS knows to fetch this particular set of instructions and load them into memory. From that point on, the operating system is pretty much in charge of

things. You as the user seldom interact directly with the operating system, other than when

you load a new program or tell the computer to shut down.

Layer 4. Application software – the set of instructions that communicates between the user and the operating system (which passes the instructions along to the BIOS, which passes them along to the hardware, with results travelling back up the chain in reverse). This is what we use when we are using our computers: browsers, word processors, games, spreadsheets. This is the “face” of the computer and how the user primarily interacts with the hardware.

By comparison, when we are talking about a virtual machine, we are talking about a “machine

inside a machine” that is constructed, (at least schematically), more like the one illustrated in

Figure 2:

7. APPLICATION for VIRTUAL MACHINE (Windows)

6. VIRTUAL OPERATING SYSTEM (Windows)

5. VIRTUAL BIOS LAYER (Windows)

4. VIRTUAL HARDWARE LAYER (Windows)

3. OPERATING SYSTEM LAYER (Macintosh)

2. BIOS LAYER (x86)

1. HARDWARE LAYER (x86)

Figure 2: Computer System with Virtualization Layer

As you can see, the virtual machine sits between the operating system of the original hardware machine and the actual application that runs on the virtual machine. The virtual machine resides on top of the host operating system layer, and the application reaches the hardware not directly through the host O/S but by way of the various layers of the virtual machine.

Layers 1, 2 and 3 are the same; it is at layer 4, where the application software resides on an actual machine, that we find the virtual machine. Note that the virtual machine itself includes hardware, BIOS and operating system layers (layers 4, 5 and 6), just like the actual machine on which it runs. Layer 7, the application software, thinks it is running on a Windows machine. In the sense that it is making calls to the Windows operating system, it can be said with accuracy that it is indeed running on a Windows system. But that is not the entire story, for the entire Windows system is hosted inside an actual Macintosh computer. Where the Windows BIOS would normally be communicating directly with the hardware, it is instead communicating with the virtual hardware layer, which in turn communicates with the Mac O/S.

In fairness, this is only one implementation of a virtual machine; a multitude of variations on this theme of a “machine inside a machine” are possible, but at a theoretical level, this is an accurate representation of most virtual machine implementations.
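The layering in Figures 1 and 2 can be sketched in a few lines of Python. This is only an illustrative model, not how any real virtual machine is built: the `Layer` class and the two builder functions are hypothetical names invented here, and each layer simply records that a request passed through it before handing the request to the layer below.

```python
class Layer:
    """One layer of the stack; it handles a request by passing it to the layer below."""

    def __init__(self, name, below=None):
        self.name = name
        self.below = below

    def handle(self, request, trace):
        trace.append(self.name)          # record that the request passed through here
        if self.below is None:           # the bottom layer (hardware) does the real work
            return f"executed {request!r}"
        return self.below.handle(request, trace)


def build_actual_machine():
    """Layers 1 to 3 of Figure 1; returns the top layer (the host operating system)."""
    hardware = Layer("1. hardware (x86)")
    bios = Layer("2. BIOS (x86)", hardware)
    return Layer("3. operating system (Macintosh)", bios)


def build_virtual_machine(host_os):
    """Layers 4 to 6 of Figure 2, stacked on top of the host operating system."""
    virtual_hw = Layer("4. virtual hardware", host_os)
    virtual_bios = Layer("5. virtual BIOS", virtual_hw)
    return Layer("6. virtual operating system (Windows)", virtual_bios)


host_os = build_actual_machine()
guest_os = build_virtual_machine(host_os)

native_trace, guest_trace = [], []
host_os.handle("draw window", native_trace)           # request from a Macintosh application
result = guest_os.handle("draw window", guest_trace)  # request from a Windows application

print(len(native_trace), "layers for a native request")
print(len(guest_trace), "layers for a request from the guest")
```

A request from the guest application crosses every layer a native request crosses, plus the three virtual layers above them, which is one way to picture where the extra overhead of virtualization comes from.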

Origin and Development

MULTICS

Multics (Multiplexed Information and Computing Service) was a mainframe time-sharing

operating system that began at MIT as a research project in 1965. In 1964, MIT made a

request to IBM to develop a next-generation time-sharing system. IBM failed to respond, so

MIT decided to develop such a system themselves.3 The project, known as “Project MAC”,

was a joint venture with MIT, General Electric, and Bell Labs.4 Although Multics did not

implement a virtual machine environment, the type of services promised by Multics time-

sharing was so enticing to various corporate customers that they signed new contracts for

Multics systems rather than renew their existing contracts with IBM.

IBM had been promising similar services would be available on its ground-breaking System

360, but had – in true IBM fashion – been somewhat laggard in bringing it to market. The loss

of major contracts spurred IBM to get serious about delivering the same sorts of functionality

promised by Multics.

Interestingly, the GE/Bell Labs/MIT consortium failed to deliver on the promises made to their corporate customers before IBM designed, developed and delivered its promised system, but it is likely that without the competition from Multics, IBM’s development of time-sharing services as embodied in its virtual machine would have been significantly delayed.

An interesting historical footnote is that not only did Multics play a significant role in the development of IBM’s virtual machine, but a number of people who worked on Multics in the 60s – most significantly, Ken Thompson and Dennis Ritchie5 – ended up working on that 800-pound gorilla of virtual systems – Unix – in the 70s.6

IBM System 360/67

The first instance I could find of a virtual machine implementation was in 1967, when IBM implemented a “virtual machine” on the System/360 Model 67. According to Tom Van Vleck7, IBM was slow to respond to requests from various customers for a machine that could provide time sharing. In response to the loss of significant contracts to General Electric and the MIT Multics team, IBM engineers at the IBM Cambridge Scientific Center in Massachusetts created a software solution to the time-sharing problem: a program for the IBM System/360 Model 67 called CP-67, which provided the illusion of several standard 360s. They also created a single-user operating system – CMS – that would run on one of the virtual machines provided by CP-67.

Unlike its competition, which time-sliced all the hardware for each user, the IBM 360/67 with the new CP/CMS (as it came to be called) “provided each user with a virtual IBM 360.

CP made a real 360/67 look like multiple virtual 360’s.”8

In other words, IBM engineers delivered a solution much better than mere “next-generation”

time slicing – they created a way for every user to have his or her own dedicated System/360.

(Admittedly, each was a virtual 360, not an actual implemented-in-hardware System/360, but

the user could not tell the difference.)

As testament to the robust design of CP-67, its direct software descendant – VM/370 – was still in commercial use on IBM’s Systems 370 and 390 as late as 2006.

UNIX

In 1969, the Multics team at Bell Labs – Dennis Ritchie, Ken Thompson, M.D. McIlroy and J.F. Ossanna – realized that the Multics project was dying a slow death. The Multics project

had not – at that point anyway – delivered on the promise of commercial time-sharing.

Nevertheless, the Bell Labs Multics team was enjoying some level of time-sharing, although it

was far too expensive in terms of processor time to be commercially viable. Recognizing that they would lose their time-sharing computer system once the Multics project shut down, the men who ultimately became the creators of Unix began working on an alternative operating system strictly for their own internal use.

Strangely, a game called “Space Travel” played a pivotal role in the origin of Unix. Originally

written on Multics, “Space Travel” simulated the movement of the major bodies of the solar

system. The player controlled a ship that could fly through the solar system and attempt

landings on the various planets and moons. According to Ritchie,9 it cost $75 of CPU time to play “Space Travel” on the big General Electric 635, and the graphics processing was so slow that the display was jerky and the game was hard to control. For those reasons, Ritchie and Thompson completely rewrote “Space Travel” to run on a DEC PDP-7, a much smaller minicomputer with much better graphics processing.

Their experience of programming for the PDP-7 directly led them to develop a new file

system for the machine. That was followed by a small set of user-level utilities – copy, print,

delete, edit – and finally a simple command processor. This was the primordial Unix. (Brian Kernighan suggested the name “Unix” as a bit of a joke as well as a tribute to Multics. Unix, he said, was “one of whatever Multics was many of”.)

In 1970, the “Unix team” acquired a DEC PDP-11 with a charter to create a system

specifically designed for editing and formatting text – in other words, a word processor. They

delivered a rudimentary word processor for internal use by the patent department in the last

half of 1971, and – as a result of the success of that effort – Bell Labs adopted Unix for internal use across the organization and in 1974 introduced the UNIX operating system to the world.

One of the outstanding features of UNIX is that the concept of virtual machines is embedded into its DNA; each UNIX user was presented with a full virtual machine as part and parcel of the operating system. The utility of the system was evident to many in the computer industry, and UNIX or some variant of it was adopted by companies such as DEC and Data General as the O/S of choice for their hardware.

The success of the UNIX approach is self-evident. Even though it originated in the early 70s, UNIX is alive and thriving today in the form of OS X (the Macintosh O/S), Linux, Solaris and other commercial variants such as AIX (IBM), HP-UX (Hewlett-Packard) and UnixWare (Caldera).

Microsoft

Notably, none of the operating systems developed by Microsoft – DOS or Windows – supported virtualization until the company purchased Connectix’s Virtual PC in 2003. This is even more astonishing when you consider that Microsoft’s Windows NT was developed by a team led by engineers who had previously built operating systems at DEC for the PDP-11 and its successor, the VAX.

Java & the Java Virtual Machine

The Java programming language was developed at Sun Microsystems in the early 1990s. Applications written in Java run only on a Java Virtual Machine (JVM). However, JVMs are available for a wide array of devices. This is a powerful concept; it means that a programmer can write an application one time, and it can then be run on any kind of hardware, as long as that hardware is running a Java Virtual Machine. Java and the JVM were designed from the ground up to be hardware-independent. Because Java applications are dependent upon Java Virtual Machines, you cannot discuss the virtual machine without discussing the programming language.
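The “write once, run anywhere” idea can be illustrated with a toy interpreter. The sketch below is written in Python rather than Java, and its three opcodes (`PUSH`, `ADD`, `MUL`) are hypothetical; they are not real JVM bytecode. The point is only that the program is a list of hardware-neutral instructions, and that any machine with a copy of the `run` function can execute it.

```python
def run(bytecode):
    """Interpret a list of (opcode, operand) pairs on a simple stack machine."""
    stack = []
    for opcode, operand in bytecode:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# The "application": computes (2 + 3) * 4 without referring to any particular CPU.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
answer = run(program)
print(answer)
```

Porting this “platform” to new hardware means porting `run` once; every existing program then works unchanged, which is the trade the Java designers made.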

Java began life in 1990 when Patrick Naughton wrote a lengthy diatribe to Sun President Scott McNealy complaining about all the shortcomings of Sun’s software application programming interfaces (APIs). Naughton had accepted a new position at NeXT Computer (Steve Jobs’ post-Apple creation) and had nothing to lose, so he pulled no punches in his critique of Sun’s shortcomings. To his surprise, senior executives at Sun saw the letter, agreed with it and ultimately made a counter-offer to Naughton. They asked him to stay and offered him a pure research position with the mandate to “make something cool”10. The team ultimately numbered thirteen, including Naughton as the originator, James Gosling as lead developer and eleven others.

Early in the development process, they decided they wanted to create applications that would

run on a wide variety of embedded systems with minimal resources. Because the team was

composed of serious coders and because it was the early 90s, they were partial to C++, but

that language required far too much overhead to be seriously considered for embedded

systems. Instead, they created a language like C++, the language that ultimately became known

as Java.

They took apart all sorts of home electronics and appliances – everything from TV remote controls to Nintendos to VCRs, set-top boxes and laser disk players – in an attempt to figure out a way to make home electronics communicate with each other. In the course of their research, they found that all these various devices used different CPUs.

Thus if a manufacturer wanted to add functions or features to a TV or VCR, they were stuck because they

were limited by what the hardware and its wired-in programming would allow them to do. This, coupled with the

fact that the chips used by many of these devices were limited in program space, suggested a fresh approach to

software programming that might be a key to enabling innovation in this product space.11

With this realization, the team was driven to create a virtual machine that could run on multiple devices, so that applications developed in their new language could run without the developer having to code for a specific piece of hardware.

In 1993, the team suffered a setback when a potential contract with Time-Warner was scrubbed. Time-Warner was going to make use of the new technology in its set-top boxes for TV-On-Demand, but backed out at the last minute. In 1994, with the NCSA’s Mosaic browser (released in 1993) bringing the World Wide Web to a wide audience, the team redirected its efforts toward creating a Java Virtual Machine for the browser. This was either a very insightful or a

very lucky decision, but one way or another, Java was exactly what the internet community

needed – a way for programmers to create applications that could be deployed across a wide

variety of platforms without worrying about the specific needs of individual CPUs and

operating systems. In 1995, Sun announced Java and made it available to the world.

Java is a bit of a unique animal in the virtualization world, because it takes the concept of

virtualization to a level previously unexplored. As I mentioned earlier, you cannot really talk

about Java the language without talking about Java the virtual machine.

If we refer back to our schematic of the difference between an actual machine and a virtual

machine, you will recall that a virtual machine is hosted on an actual machine and

communicates with the hardware layer through the actual machine’s operating system and

BIOS. A Java virtual machine is no different. Where it stretches the concept of virtualization,

however, is at the interface between the top layer – the application layer – and the virtual

machine.

As you can see from Figure 3, the application layer and the virtual machine layer are both

“Java-specific”. Java applications – by their very nature – are limited in scope and it is quite

normal for other hardware-specific applications to run on the same hardware with a Java

Virtual Machine.

5. JAVA APPLICATION for JAVA VIRTUAL MACHINE

4. VIRTUAL HARDWARE LAYER (Hardware Specific)

Other Applications (alongside layers 4 and 5)

3. OPERATING SYSTEM LAYER

2. BIOS LAYER

1. HARDWARE LAYER

Figure 3: Java application on Java Virtual Machine

Furthermore, it would not be beyond the realm of possibility for other virtual machines to be

running on the same hardware platform at the same time as a JVM.

Anyone who has used a web browser has in all likelihood used a Java applet, (a small, limited-

scope application), with a Java Virtual Machine, whether they knew it or not. JVMs are

available for all major browsers, (Internet Explorer, Firefox, Safari, Chrome and Opera) and

for well over 50 different CPU and operating system combinations.12

Advantages & Disadvantages of Virtual Machines

We can go back to the original virtual machine, the IBM 360 Model 67 running CP/CMS, and see that the advantages and disadvantages of that particular virtual machine implementation still apply today.

The advantages then and now are fairly straightforward: the user has sole use of the virtual

machine, and failures inside the virtual machine are limited to that machine alone.

The disadvantages are also the same: running an application on a virtual machine makes far greater demands on the host system than running it “directly” on the hardware, and because of those additional demands, the virtual machine will necessarily be less responsive than an actual machine.

When a user ran CP/CMS on the S/360 Model 67, it appeared to him or her that he or she had complete control of the computer. The user had access to all the resources of the machine, had 100% of the available (virtual) CPU cycles devoted exclusively to his or her use, and didn’t need to worry about the system being crashed or compromised by the actions of other users sharing the system.

In fact, it is this last point – isolation of CPU processes – that was and still is the primary goal

in the development of virtual systems. The original multi-user, (or multi-process), systems

suffered from the problem of little-to-no isolation between processes. For example, a poorly

written printer process could cause problems across the entire system, causing failures ranging

from insignificant bugs to complete system crashes. The impact of the problem was multiplied

on systems that time-sliced processor cycles, because you could have tens or hundreds or even

thousands of users and/or processes running on the same machine, and a small, insignificant

bug on a minor application could cause problems to cascade across the entire system.

Virtualization, when implemented well, isolated such bugs to a single virtual system. A bug

might crash a system, but it could at worst only crash a single instance of the virtual machine

rather than crashing the entire computer system.

“Inter-process communication” is the term coined to describe the interaction between various

processes, sub-routines and functions within a single computer system or between networked

systems. For example, IPC is what allows a process that displays a stock-ticker feed to

communicate with a process that displays a web page.

Here’s a definition of IPC from Webopedia:

Inter-process communications (IPC) is “…a capability supported by some operating systems that allows one process to

communicate with another process. The processes can be running on the same computer or on different computers connected

through a network. IPC enables one application to control another application, and for several applications to share the

same data without interfering with one another. IPC is required in all multiprocessing systems, but it is not generally

supported by single-process operating systems such as DOS.”13
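As a concrete illustration of the definition above, here is a minimal sketch of two processes communicating through a pipe, echoing the stock-ticker example. It assumes a Python interpreter is available to launch as the second process; the `format_quotes` helper, the `CHILD_SOURCE` program and the sample quotes are all hypothetical. The standard-library call `subprocess.run` connects the parent to the child’s standard input and output.

```python
import subprocess
import sys

# Code for the child process: read ticker lines from stdin, write formatted lines to stdout.
CHILD_SOURCE = """
import sys
for line in sys.stdin:
    symbol, price = line.split()
    print(f"{symbol}: ${float(price):.2f}")
"""


def format_quotes(quotes):
    """Send quotes to a separate process over a pipe and return its replies."""
    message = "".join(f"{symbol} {price}\n" for symbol, price in quotes)
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],  # start a second interpreter process
        input=message,                         # delivered to the child's stdin
        capture_output=True,                   # collect what the child writes to stdout
        text=True,
        check=True,
    )
    return completed.stdout.splitlines()


replies = format_quotes([("IBM", 101.5), ("GE", 12)])
print(replies)
```

The two processes share no memory at all; everything they know about each other arrives through the pipe, which is what lets them cooperate “without interfering with one another”.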

The key phrase in the previous definition is “share the same data without interfering with one

another”. Early time-sharing systems had to deal with the problem of multiple users of the

same system resources “stepping on” one another’s data and/or processes. You could easily

have 100 users all wanting access to the same printer, the same disk storage and – most

critically – the same memory. The control processes (at the BIOS and O/S level) of the

computer needed to keep track of the machine state of each user’s processes and manage the

interaction between them.

Although conceptually this problem is easy to state, in practice it proved to be nearly

impossible to manage. One reason is that on these time-slicing systems, users were not only

sharing external devices such as printers, disks, card readers and tape machines, they were also

sharing internal memory. It was in this shared memory area that the limitations and dangers of

time-slicing became apparent.

The greatly simplified schematic in Figure 4 shows the major sub-systems in early time-sliced

computers.

CPU

MEMORY (addresses for: user data, user states, OS states, device states, device data)

INTERFACE to DEVICES

Figure 4: Major sub-systems in Time-Sliced Computers (Greatly simplified)

In a time-sliced system, each user got a proportional share of the clock cycles of the CPU. The state of each user had to be tracked and recorded in memory. By itself, that is not a significant issue. The O/S assigned an address in memory for each user, and a corresponding address for the user’s state. The number of users was naturally limited on the one hand by the number of physical connections available, and on the other hand by the number of memory addresses available for storing user and state information.

The significant problems occurred when you began tracking the state of all connected devices, the state of the CPU for each user, and the state of each user’s data. User 1 might be accessing the disk to do a Write, while user 2 might be trying to do a Read, and user 3 might be doing an Erase – all on the same disk and perhaps even with the same data. Users could perform different (and sometimes mutually exclusive) actions with devices and data. Obviously, it is impossible to erase, read and write the same data at the same time, but under the right circumstances, a time-slicing system could find itself attempting to do exactly that.
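The kind of conflict described above can be reproduced with a small, deterministic simulation. This is a sketch under simplifying assumptions, not a model of any historical system: the `deposit` task, the account and the schedules are hypothetical, and each `yield` marks a point where the time-slicer could switch to another user.

```python
def deposit(account, amount):
    """A read-modify-write in two steps; `yield` marks where a time slice may end."""
    balance = account["balance"]           # step 1: read the shared data
    yield
    account["balance"] = balance + amount  # step 2: write back a possibly stale value


def run_schedule(tasks, schedule):
    """Advance the tasks in the order given by `schedule` (a list of task indexes)."""
    for index in schedule:
        try:
            next(tasks[index])
        except StopIteration:
            pass  # this task has just finished


def simulate(schedule):
    account = {"balance": 100}
    tasks = [deposit(account, 50), deposit(account, 25)]
    run_schedule(tasks, schedule)
    return account["balance"]


one_at_a_time = simulate([0, 0, 1, 1])  # user 1 finishes before user 2 starts
interleaved = simulate([0, 1, 0, 1])    # time slices alternate between the users
print(one_at_a_time, interleaved)
```

Run one after the other, the two deposits produce 175. Interleaved, both users read the balance of 100 before either writes, so the second write overwrites the first and the result is 125: the first user’s update is silently lost.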

The number of possible states rose exponentially with the number of users, devices, related

connections and possible actions, and with it, the number of possible toxic states rose

proportionally. Although keeping it all straight without the benefits of virtualization is

theoretically possible, (after all, the number of possible states is finite), in practice it was

impractical to try to anticipate and account for every possible combination of user, device and

CPU states.
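A little arithmetic shows why anticipating every combination was impractical. The sketch below assumes, purely for illustration, that every user or device can independently be in one of four states; the `joint_states` function and the figure of four are assumptions, not measurements of any real system.

```python
def joint_states(component_count, states_per_component):
    """Number of combined states when every component varies independently."""
    return states_per_component ** component_count


for users in (1, 5, 10, 20):
    print(users, "users ->", joint_states(users, 4), "joint states")
```

With ten such components there are already more than a million combinations, and each additional component multiplies the total by four.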

Inevitably, some combination of machine states would occur that the O/S designer and/or the

Application designer had never anticipated, with unpleasant and undesirable results: corrupted

memory, locked devices, crashed systems. If the sysadmin was fortunate, the effect of the

failure was immediate and impossible to miss. If not, the failures would build unobserved over

time until at last the system was overwhelmed, data irretrievably corrupted and irreplaceable

time lost.

It was the unpredictability of the finite but very large number of possible interactions within

system(s) that led early designers to attempt to isolate processes and thus reduce and manage

the number of possible interactions between system processes. The solution then as now was

virtual machines.

The schematic in Figure 5 shows how virtualization “compartmentalizes” memory so that a

failure on the part of one process is isolated from the rest of the system.

CPU

Virtualized MEMORY | Virtualized MEMORY | Virtualized MEMORY | Virtualized MEMORY

INTERFACE to DEVICES

Figure 5: Compartmentalized Memory

In this setup, while a toxic combination of processes is still able to muck up its own memory

space and thus corrupt data or crash a system, it is not able to bring the entire actual system to

its knees. In the event of a failure in one virtual system, the memory space for that virtual

system would be “rebooted”, (whatever that may mean for the particular VM implementation),

while the other still-functioning VMs remain untouched. Literally the only impact on the

overall system will be the clock cycles required to manage the virtual system.
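To make the idea concrete, here is a minimal sketch in Python of compartmentalized memory. Everything in it is hypothetical (the class names VirtualMachine and Host, the bounds check standing in for a hardware fault, and the meaning of "reboot" as simply clearing the guest's memory); it is not modeled on any real VM implementation. It only shows that a fault in one compartment can be contained and cleaned up without touching the others.

```python
class VirtualMachine:
    """A toy guest that owns a private block of memory (hypothetical model)."""

    def __init__(self, name, size):
        self.name = name
        self.size = size
        self.memory = bytearray(size)  # private to this guest

    def write(self, address, value):
        # An access outside the guest's own compartment is treated as a fault
        # in this guest only; it never reaches another guest's memory.
        if not 0 <= address < self.size:
            raise MemoryError(f"{self.name}: address {address} is out of bounds")
        self.memory[address] = value

    def reboot(self):
        # "Rebooting" in this sketch just means resetting the guest's memory.
        self.memory = bytearray(self.size)


class Host:
    """A toy host that runs guest writes and contains their failures."""

    def __init__(self):
        self.vms = {}

    def create_vm(self, name, size):
        vm = VirtualMachine(name, size)
        self.vms[name] = vm
        return vm

    def guest_write(self, name, address, value):
        """Return True on success; on a fault, reboot only the failing guest."""
        vm = self.vms[name]
        try:
            vm.write(address, value)
            return True
        except MemoryError:
            vm.reboot()
            return False


if __name__ == "__main__":
    host = Host()
    host.create_vm("vm_a", 16)
    host.create_vm("vm_b", 16)
    host.guest_write("vm_b", 0, 42)    # a well-behaved guest
    host.guest_write("vm_a", 999, 1)   # a misbehaving guest faults and is rebooted
    print(host.vms["vm_b"].memory[0])  # vm_b's data is untouched
```

The only cost the host pays for the misbehaving guest is the work done inside guest_write, which is the toy equivalent of the clock cycles mentioned above.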

(A clever developer will easily see holes in this concept, holes that could be exploited by

someone with malevolent intent, but the reader should remember that this schematic is greatly

simplified for the purposes of explaining the advantages and disadvantages of virtualization; it

was not intended to serve as a blueprint for system designers.)

Inter-Virtual Machine Communications

No machine – whether virtual or actual – is as valuable when running stand-alone as when communicating with other machines. In fact, the value of a machine tends to grow far faster than linearly with the number of machines to which it is connected. This is what’s commonly

known as “the network effect”. In other words, machines on a network are more

useful/powerful/valuable than stand-alone machines.
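One rough way to put numbers on the network effect is to count the distinct machine-to-machine connections that are possible among n machines. This is an assumption on my part rather than a definition (it is the counting argument usually associated with Metcalfe's law), and the function name pairwise_links is purely illustrative.

```python
def pairwise_links(n):
    """Number of distinct machine-to-machine connections among n machines."""
    return n * (n - 1) // 2


for n in (1, 2, 10, 100):
    print(n, pairwise_links(n))
# prints:
# 1 0
# 2 1
# 10 45
# 100 4950
```

Ten times as many machines gives roughly a hundred times as many possible connections, which is why a stand-alone machine is worth so little by comparison.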

The de facto standard protocol for machine-to-machine communication is TCP/IP. This

protocol was designed primarily to assure that messages passed between machines would

arrive intact. (Since the message originator normally knows almost nothing about the message receiver, the protocol had to manage all the possible problems associated with such a high degree of uncertainty.)

When we think of a network of machines, we normally think of discrete actual – as opposed to virtual – machines that share nothing except a connection to the network. In such an

environment, the benefit derived from using TCP/IP to communicate to machines on the

network is worth the processing overhead costs incurred.

However, when all the machines on the network are virtual machines that reside on the same

physical machine, then the underlying assumptions governing the use of TCP/IP are no longer

valid. Specifically, the hardware the machines are running on is known, the amount of memory

and storage available to the machines is known and the type and number of processors

available to the machines is known.

Because virtualization adds overhead processing demands to the processing unit,

communication to and from virtual machines tends to be slower than machine-to-machine

communications when virtualization is not involved. (All else being equal, if you are running

machines on a network, the ones without a virtualization layer will perform faster than those

with such a layer.) Therefore, to take full advantage of the benefits of networking, it has been

incumbent upon the industry to create better inter-machine communications for virtual

machines.

Before discussing ways to improve communications to and from virtual machines, I first want

to briefly review inter-machine communications with no regard to virtualization.

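In the OSI model, the IP part of TCP/IP corresponds to Layer 3 (the network layer) of the communications stack, with TCP one level up at Layer 4 (the transport layer). Layer 3 is the layer that manages the routing of data between different addresses on the network, as illustrated in Figure 6.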

Figure 6 - The OSI Model

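The leftmost column illustrates the 7-layer OSI communications stack. The rightmost column illustrates where TCP/IP sits relative to the OSI model, with IP at Layer 3 and TCP at Layer 4. The TCP/IP communications stack has its own layers, conventionally described as four: link, internet, transport and application.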

Since all virtual machines residing on a physical machine share storage space, memory space and processors, a successful VM operating system has to make sure these virtual machines don’t negatively affect the state of one another. This was the focus of early VM development.

However, with the rise of the internet and the benefits of networking taking more priority, VM

developers have realized that the shared memory, storage and processor space of virtual

machines can be leveraged to increase machine-to-machine communication speed.
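The difference between the two approaches can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions, not how any VM vendor implements it: the two "machines" here are just two endpoints in one program, the function names send_via_tcp and send_via_shared_memory are hypothetical, and multiprocessing.shared_memory (Python 3.8 or later) stands in for whatever shared region a real hypervisor would provide. The first function pushes a message through the loopback TCP/IP stack; the second simply places it in a block of memory that both sides can see.

```python
import socket
import threading
from multiprocessing import shared_memory


def send_via_tcp(message):
    """Pass a message through the loopback TCP/IP stack; return what arrives."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def receiver():
        conn, _ = server.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # the sender closed the connection
                    break
                chunks.append(data)
            received.append(b"".join(chunks))

    thread = threading.Thread(target=receiver)
    thread.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
    thread.join()
    server.close()
    return received[0]


def send_via_shared_memory(message):
    """Pass a message through a named shared memory block; return what is read."""
    writer = shared_memory.SharedMemory(create=True, size=len(message))
    try:
        writer.buf[:len(message)] = message
        # The "receiver" attaches to the same block by name and reads it directly.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return bytes(reader.buf[:len(message)])
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()


if __name__ == "__main__":
    print(send_via_tcp(b"hello from guest A"))
    print(send_via_shared_memory(b"hello from guest A"))
```

Both functions deliver the same bytes. The TCP version needs an address, a connection, a listener and a framing convention (here, "read until the sender closes"), while the shared memory version needs only an agreed name and length, which is the kind of simplification available when both parties are known to live on the same physical machine.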

TCP/IP is overkill in such an environment. To address this, a number of different VM-specific communication protocols have been developed. Perhaps the most popular (if only because the underlying VM is the most popular) is VMCI from VMware.

According to VMware, “the Virtual Machine Communication Interface (VMCI) supports fast

and efficient communication between a guest virtual machine and its host server, or between

multiple virtual machines on the same host.”14 Note that this interface works exclusively on a

single physical machine, in direct contrast to popular internet communications protocols

which are designed for machine-to-machine communications.

Thinking back to the OSI model, VMCI resides at the hardware layer of the virtual machine

rather than the network layer. That’s why it is so much faster and more efficient. The

schematic in Figure 7 illustrates this:

Figure 7 - VMCI schematic15

Summary Review of Popular VMs

Two major categories of VMs are most popular today: application VMs and platform VMs. An application VM is a virtual machine designed to host specific applications. A platform VM is designed to host a particular CPU and/or operating system. It would not be inaccurate to say that an application VM has a more focused, and therefore narrower, scope than a platform VM.

JVM

Probably the most widely used VM in existence, the Java Virtual Machine is an application virtual machine rather than a platform virtual machine. Because it is an application VM, the applications which run under the JVM tend to be narrowly focused, typically of the “do one thing and do it well” variety. As mentioned previously, the beauty of the JVM is that any application written in Java can be run on a JVM hosted on any kind of platform. With well over 50 different platforms currently supported, Java applications can be run on a wider variety of platforms than applications built in almost any other development environment.

Because it is an application VM, the JVM typically does not require as much processor

overhead as a platform VM.
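For readers who have never looked inside an application VM, the following toy interpreter may help. It is a sketch only: the four opcodes (PUSH, ADD, MUL, PRINT) and the function name run are made up for illustration and are not Java bytecode, although the JVM is likewise a stack-based machine. The point is that the "program" is just data, so the same program runs unchanged wherever the interpreter itself can run.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a toy stack machine.

    Returns the list of values emitted by PRINT instructions.
    """
    stack = []
    output = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "PRINT":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output


# A platform-independent "application": computes (2 + 3) * 4.
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("PRINT", None),
]

print(run(PROGRAM))  # prints [20]
```

Only the interpreter has to be ported to a new platform; PROGRAM never changes. That is the portability argument for the JVM in miniature.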

VMware16

VMware is the name of a family of virtual machine products from a company of the same name. Some VMware titles are free and others are commercial.

VMware Player is likely the most widely used free virtualization software. It can be loaded on any Intel x86-based machine and will run Windows, Solaris, Linux and Mac OS X, all with various limitations.

VMware Workstation is the commercial version of VMware Player. (In fact, Player is included in the Workstation distribution.) Its features are a superset of Player’s. One notable advantage of Workstation over Player is that it allows you to copy files from the virtual PC to the host PC. One disadvantage is that the additional features incur additional performance penalties.

Although it is possible to host Mac OS X in VMware on a Windows machine, VMware made its reputation by allowing Mac users to host Windows and/or Linux on their Mac.

Virtual PC17

Virtual PC is Microsoft’s free VM. It allows you to host the various flavors of Windows

operating systems on a Windows machine. It also supports OS/2 hosting. I’ve used Virtual PC

and found it to be light and fast. If you need to host one Windows machine on another

Windows machine, there is likely no better solution since Microsoft wrote both the host and

the guest applications and knows Windows better than anyone else in the world.

I did not attempt to host an OS/2 session, so I cannot address the performance of the OS/2

VM.

One criticism of Virtual PC is that it is not as feature-rich as some of the competition. On the other hand, it is very simple to install and it is free, so it would be a good

starting point for someone wanting to begin experimenting with virtualization on a Windows

platform.

VirtualBox18

VirtualBox is Sun’s entry in the VM world. Like Virtual PC, it is free. Unlike Virtual PC, it is not limited to Windows: there are versions of VirtualBox for Windows, Linux, Mac, Solaris and OpenSolaris hosts. VirtualBox is an Intel x86 and AMD64/Intel64 virtual machine which supports a wide variety of guest OSs, including all the Windows platforms since NT, DOS/Windows 3.x, Linux, Solaris, OpenSolaris and OpenBSD. It offers a wider range and depth of support for devices than any other popular platform.

I have installed and briefly used VirtualBox to host Linux on Windows Vista. I can report that

installation was straightforward and fairly easy. Although startup is noticeably slow, once the

VM is loaded, performance is acceptable. Because it offers such a wide range of features, the learning curve for VirtualBox is much steeper than that for Virtual PC, but if you need the features of VirtualBox – such as the ability to run on Mac or Linux and/or to host something other than Windows – then VirtualBox appears to be the best freeware choice.

Parallels19

Parallels is a commercial product from a company of the same name (notice a theme here?). Like VMware, it comes as multiple products which fall into two categories of virtualization:

desktop virtualization and server virtualization. In fairness to the competition, it should be

pointed out that Parallels markets some of their products as “virtualizations” when in fact they

are merely utilities for moving files from one OS to another. However, the company does

provide a very popular “Windows on Mac” version that does what it sounds like: it allows you

to run Windows applications on a Mac. They also sell a “Windows and Linux” version that

allows you to run multiple instances of both Windows and Linux on the same box at the same

time.

Sales figures for VMware and Parallels are not publicly available, so it is difficult to say which is more popular. It must be pointed out, however, that Parallels, at least in the desktop virtualization world, is narrowly focused, whereas VMware casts a much wider net.

End Notes

1 http://www.cap-lore.com/Software/CP.html
2 http://www.spiritus-temporis.com/ibm-system-360-model-67/virtualization.html
3 http://www.multicians.org/mgp.html#ProjectMAC
4 http://web.mit.edu/multics-history/
5 http://www.multicians.org/unix.html
6 http://en.wikipedia.org/wiki/Multics#Novel_ideas
7 http://www.multicians.org/thvv/360-67.html
8 Ibid.
9 http://cm.bell-labs.com/cm/cs/who/dmr/hist.html
10 http://web.archive.org/web/20010610045734/http://java.sun.com/nav/whatis/storyofjava.html
11 Ibid.
12 http://en.wikipedia.org/wiki/List_of_Java_virtual_machines
13 http://www.webopedia.com/TERM/I/interprocess_communication_IPC.html
14 http://www.vmware.com/support/developer/vmci-sdk/
15 http://pubs.vmware.com/vmci-sdk/vmci_architecture_8.jpg
16 http://www.vmware.com/
17 http://www.microsoft.com/windows/virtual-pc/
18 http://www.virtualbox.org/
19 http://www.parallels.com/products/desktop/pd4wl/