
EuroCloud

CloudViews 2010 Cloud Ecosystem

Proceedings of the CloudViews 2010
2nd Cloud Computing International Conference
May 20-21, 2010, Porto, Portugal

Editors

Benedita Malheiro
Miguel Leitão
Paulo Calçada
Pedro Assis

Program Committee

António Costa (ISEP)
António Pinto (ESTGF)
Benedita Malheiro (ISEP)
Miguel Leitão (ISEP)
Inês Dutra (UP)
Paulo Calçada (IPP)
Ricardo Costa (ESTGF)
Sérgio Lopes (IPP)
Hugo Magalhães
Juan Burguillo (UVigo)

Design: antoniocruz.eu
Print: Minerva, Artes Gráficas
ISBN: 978-989-96985-0-5
Depósito Legal:

Copyright © 2010 EuroCloud Portugal

Copying is permitted provided that copies are not made or distributed for direct commercial advantage, and credit to the source is given. Abstracting is permitted with credit to the source. Contact the editor or the publisher for other uses.

A publication of EuroCloud Portugal Association

Campus ISEP
Rua Dr. António Bernardino de Almeida, 431
4200-072 Porto
Portugal

Supporters

Proceedings Special Sponsor

Partners

Contents

VI   Authors
VII  Message from the Editors
1    How Hardware Virtualization Works
     Gregory Pfister
19   Using Private Clouds to Increase Service Availability Levels and Reduce Operational Costs
     Narciso Monteiro
29   CERN's Virtual Batch Farm
     Tony Cass, Sebastien Goasguen, Belmiro Moreira, Ewan Roche, Ulrich Schwickerath, Romain Wartel
41   Grid, PaaS for e-Science
     J. Gomes, G. Borges, M. David
49   Privacy for Google Docs: Implementing a Transparent Encryption Layer
     Lilian Adkinson-Orellana, Daniel A. Rodríguez-Silva, Felipe Gil-Castiñeira, Juan C. Burguillo-Rial
57   Web Based Collaborative Editor for LaTeX Documents
     Fábio Costa, António Pinto
63   Managing Cloud Frameworks through Mainstream and Emerging NSM Platforms
     Pedro Assis

Authors

António Pinto
CIICESI, Escola Superior de Tecnologia e Gestão de Felgueiras, Politécnico do Porto
INESC Porto
[email protected]

Belmiro Moreira
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland

Daniel A. Rodríguez-Silva
R&D Centre in Advanced Telecommunications, Lagoas-Marcosende s/n, 36310, Vigo, Spain
[email protected]

Ewan Roche
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland

Fábio Costa
CIICESI, Escola Superior de Tecnologia e Gestão de Felgueiras, Politécnico do Porto
[email protected]

Felipe Gil-Castiñeira
Engineering Telematics Department, Universidade de Vigo, C/ Maxwell, s/n, Campus Universitario de Vigo, 36310, Vigo, Spain
[email protected]

G. Borges
Laboratório de Instrumentação em Física Experimental de Partículas, Lisboa, Portugal

Gregory Pfister
Research Professor, Colorado State University, USA
IBM Distinguished Engineer (retired)
[email protected]
http://perilsofparallel.blogspot.com

J. Gomes
Laboratório de Instrumentação em Física Experimental de Partículas, Lisboa, Portugal
[email protected]

Juan C. Burguillo-Rial
Engineering Telematics Department, Universidade de Vigo, C/ Maxwell, s/n, Campus Universitario de Vigo, 36310, Vigo, Spain
[email protected]

Lilian Adkinson-Orellana
R&D Centre in Advanced Telecommunications, Lagoas-Marcosende s/n, 36310, Vigo, Spain
[email protected]

M. David
Laboratório de Instrumentação em Física Experimental de Partículas, Lisboa, Portugal

Narciso Monteiro
Porto, Portugal
[email protected]
http://narcisomonteiro.gamagt.com

Pedro Assis
School of Engineering, Porto Polytechnic
[email protected]

Romain Wartel
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland

Sebastien Goasguen
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland
Clemson University, Clemson, SC 29634, USA
[email protected]

Tony Cass
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland

Ulrich Schwickerath
European Organization for Nuclear Research, CERN CH-1211, Geneva, Switzerland

Message from the Editors

The motto of the 2nd Cloud Computing International Conference, which took place at the Instituto Superior de Engenharia do Porto on the 20th and 21st of May 2010, was the development of a seamless cloud ecosystem. It was organised by the EuroCloud Portugal Association as an open forum for the exchange of knowledge, ideas and technology, contributing to and promoting the materialization of a true cloud ecosystem.

The conference included several activities involving companies, researchers, managers and students and was organised into two main streams addressing, respectively, the state of the art (main event) and new challenges and ideas (Business Opportunity Forum). The main event hosted presentations of current scientific, technical and commercial solutions by some of the most relevant Cloud Computing players from industry (Cisco, IBM, Salesforce, EMC2, Microsoft, Novell, PHC, Primavera, Galileu, etc.) and from academia (OpenNebula, CERN, LIP, CMU Portugal, etc.). The Business Opportunity Forum hosted the presentation and discussion of new business ideas and commercial products, as well as scientific proposals and prototypes, promoting the exchange of knowledge and fostering cooperation between academia and industry. The best commercial idea/product was elected by an invited panel of experts.

This book gathers a selection of papers presented at the 2nd Cloud Computing International Conference. They cover distinct and relevant cloud ecosystem aspects such as hardware virtualization ["How Hardware Virtualization Works", G. Pfister], service availability ["Using Private Clouds to Increase Service Availability Levels and Reduce Operational Costs", N. Monteiro], cloud infrastructure development ["CERN's Virtual Batch Farm", T. Cass et al.], evolution and complementarity between the Grid and Cloud paradigms ["Grid, PaaS for e-Science", J. Gomes et al.], privacy ["Privacy for Google Docs: Implementing a Transparent Encryption Layer", L. Adkinson-Orellana et al.], collaboration ["Web Based Collaborative Editor for LaTeX Documents", F. Costa et al.] and management ["Managing Cloud Frameworks through Mainstream and Emerging NSM Platforms", P. Assis].

The Editors would like to express their gratitude to the authors for their valuable contribution to the conference and to this publication, as well as to the participants of the 2nd Cloud Computing International Conference for the lively and interesting discussions.

Porto, May 2010
EuroCloud Portugal Association


How Hardware Virtualization Works

Gregory Pfister

Research Professor, Colorado State University, USA
IBM Distinguished Engineer (retired)

[email protected]

http://perilsofparallel.blogspot.com

Abstract. Zero. Zilch. Nada. Nothing. Rien. That's the best approximation to the intrinsic overhead for computer hardware virtualization, with the most modern hardware and adequate resources. Judging from comments and discussions I've seen, there are many people who don't understand this. So I'll try to explain here how this trick is pulled off.

Keywords: Virtualization, Cloud Computing

1 Virtualization and Cloud Computing

Virtualization is not a mathematical prerequisite for cloud computing; there are cloud providers who do serve up whole physical servers on demand. However, it is very common, for two reasons:

− First, it is an economic requirement. Cloud installations without virtualization are like corporate IT shops prior to virtualization; there, the average utilization of commodity and RISC/UNIX servers is about 12%. (While this seems insanely low, there is a lot of data supporting that number.) If a cloud provider could only hope for 12% utilization at best, when all servers were used, the provider would have to charge a price well above competitors who do not have that disadvantage (see the short calculation after this list).

− Second, it is a management requirement. One of the key things virtualization does is reduce a running computer system to a big bag of bits, which can then be treated like any other bag o’ bits. Examples: It can be filed, or archived; it can be restarted after being filed or archived; it can be moved to a different physical machine; and it can be used as a template to make clones, additional instances of the same running system, thus directly supporting one of the key features of cloud computing: elasticity, expansion on demand.
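To make that pricing pressure concrete, here is a minimal back-of-the-envelope sketch in Python. The 12% figure comes from the text above; the target utilization and per-server cost are invented, purely illustrative numbers:

    # Rough consolidation arithmetic based on the ~12% average utilization cited above.
    AVG_UTILIZATION = 0.12        # typical utilization of a dedicated commodity server (from the text)
    TARGET_UTILIZATION = 0.80     # what a virtualized host might sustain (assumed)
    COST_PER_SERVER = 300.0       # illustrative monthly cost of one physical server (assumed)

    # How many lightly used workloads could, in principle, share one well-used host?
    consolidation_ratio = TARGET_UTILIZATION / AVG_UTILIZATION
    print(f"~{consolidation_ratio:.1f} workloads per physical host")   # ~6.7

    # Per-workload cost with and without sharing the hardware.
    print(f"dedicated: {COST_PER_SERVER:.0f}  "
          f"shared: {COST_PER_SERVER / consolidation_ratio:.0f} per workload")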

Notice that I claimed the above advantages for virtualization in general, not just the hardware virtualization that creates a virtual computer. Virtual computers, or "virtual machines," are used by Amazon AWS and other providers of Infrastructure as a Service (IaaS); they lease you your own complete virtual computers, on which you can load and run essentially anything you want.


In contrast, systems like Google App Engine and Microsoft Azure provide you with a complete, isolated, virtual programming platform: Platform as a Service (PaaS). This removes some of the pain of use, like licensing, configuring and maintaining your own copy of an operating system, database system, and so on. However, it restricts you to using their platform, with their choice of programming languages and services.

In addition, there are virtualization technologies that target a point intermediate between IaaS and PaaS, such as the containers implemented in Oracle Solaris, or the WPARs of IBM AIX. These provide independent virtual copies of the operating system within one actual instantiation of the operating system.

The advantages of virtualization apply to all the variations discussed above. And if you feel like stretching your brain, imagine using all of them at the same time. It’s perfectly possible: .NET running within a container running on a virtual machine.

Here, however, I will only be discussing hardware virtualization, the implementation of virtual machines as done by VMware and many others. Also, within that area, I am only going to touch lightly on virtualization of input/output functions, primarily to keep this article a reasonable length.

So, on we go to the techniques used to virtualize processors and memory.

2 The Goal

The goal of hardware virtualization is to maintain, for all the code running in a virtual machine, the illusion that it is running on its own, private, stand-alone piece of hardware. What a provider is giving you is a lease on your own private computer, after all.

"All code" includes all applications, all middleware like databases or LAMP stacks, and crucially, your own operating system – including the ability to run different operating systems, like Windows and Linux, on the same hardware, simultaneously. Hence: Isolation of virtual machines from each other is key. Each should think it still "owns" all of its own hardware.

The result isn't always precisely perfect. With sufficient diligence, operating system code can figure out that it isn't running on bare metal. Usually, however, that is the case only when specific programming is done with the aim of finding that out.

3 Trap and Map

The basic technique used is often referred to as "trap and map." Imagine you are a thread of computation in a virtual machine, running on one processor of a multiprocessor that is also running other virtual machines.

So off you go, pounding away, directly executing instructions on your own processor, running directly on bare hardware. There is no simulation or, at this point, software of any kind involved in what you are doing; you manipulate the real physical registers, use the real physical adders, floating-point units, cache, and so on. You are running as fast as the hardware will go. Fast as you can. Pounding on cache, playing with pointers, keeping hardware pipelines full, until...

BAM! You attempt to execute an instruction that would change the state of the physical machine in a way that would be visible to other virtual machines. (See Figure 1.)

Just altering the value in your own register file doesn't do that, and neither does, for example, writing into your own section of memory. That's why you can do such things at full-bore hardware speed.

Suppose, however, you attempt to do something like set the real-time clock – the one master real time clock for the whole physical machine. Having that clock altered out from under other running virtual machines would not be very good at all for their health. You aren't allowed to do things like that.

So, BAM, you trap. You are wrenched out of user mode, or out of supervisor mode, up into a new higher privilege mode; call it hypervisor mode. There, the hypervisor looks at what you wanted to do – change the real-time clock – and looks in a bag of bits it keeps that holds the description of your virtual machine. In particular, it grabs the value showing the offset between the hardware real time clock and your real time clock, alters that offset appropriately, returns the appropriate settings to you, and gives you back control. Then you start running as fast as you can again. If you later read the real-time clock, the analogous sequence happens, adding that stored offset to the value in the hardware real-time clock.

Fig. 1. Trap and Map

Not every such operation is as simple as computing an offset, of course. For example, a client virtual machine's supervisor attempting to manipulate its virtual memory mapping is a rather more complicated case to deal with, a case that involves maintaining an additional layer of mapping (kept in the bag 'o bits): a map from the hardware real memory space to the "virtually real" memory space seen by the client virtual machine. All the mappings involved can be, and are, ultimately collapsed into a single mapping step, so execution directly uses the hardware that performs virtual memory mapping.
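As a concrete illustration of the trap-and-map idea described above, here is a minimal, hypothetical sketch in Python. The instruction names, the per-VM "bag of bits" structure and the offsets are all invented for illustration; real hypervisors work at the level of machine instructions and hardware registers, not Python objects.

    # Minimal trap-and-map sketch: a per-VM clock offset maintained by a toy hypervisor.
    import time

    class VMState:
        """The 'bag of bits' describing one virtual machine (illustrative only)."""
        def __init__(self, name):
            self.name = name
            self.clock_offset = 0.0   # guest clock = hardware clock + this offset

    def hardware_clock():
        return time.time()            # stands in for the physical real-time clock

    def trap(vm, instruction, operand=None):
        """Called when a guest executes an instruction that could affect other VMs."""
        if instruction == "SET_CLOCK":
            # Don't touch the real clock; just remember how far this guest's clock differs.
            vm.clock_offset = operand - hardware_clock()
            return operand
        if instruction == "READ_CLOCK":
            # Apply the stored offset so the guest sees its own notion of time.
            return hardware_clock() + vm.clock_offset
        raise NotImplementedError(instruction)

    vm_a, vm_b = VMState("A"), VMState("B")
    trap(vm_a, "SET_CLOCK", hardware_clock() + 3600)                # guest A moves its clock 1h ahead
    print(trap(vm_a, "READ_CLOCK") - trap(vm_b, "READ_CLOCK"))      # ~3600; B is unaffected

Ordinary register and memory operations never reach the trap handler at all, which is why they run at full hardware speed.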

4 Concerning Efficiency

How often do you BAM? Unhelpfully, this is clearly application dependent. But the answer in practice, setting aside input/output for the moment, is not often at all. It’s usually a small fraction of the total time spent in the supervisor, which itself is usually a small fraction of the total run time. As a coarse guide, think in terms of overhead that is well less than 5%, or in other words, for most purposes, negligible. Programs that are IO intensive can see substantially higher numbers, though, unless you have access to the very latest in hardware virtualization support; then it’s negligible again. A little more about that later.

I originally asked you to imagine you were a thread running on one processor of a multiprocessor. What happens when this isn't the case? You could be running on a uniprocessor, or, as is commonly the case, there could be more virtual machines than physical processors or processor hardware threads. For such cases, hypervisors implement a time-slicing scheduler that switches among the virtual machine clients. It's usually not as complex as schedulers in modern operating systems, but it suffices. This might be pointed to as a source of overhead: You're only getting a fraction of the whole machine! But assuming we're talking about a commercial server, you were only using 12% or so of it anyway, so that's not a problem. A more serious problem arises when you have less real memory than all the machines need; virtualization does not reduce aggregate memory requirements. But with enough memory, many virtual machines can be hosted on a single physical system with negligible degradation.
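The time-slicing scheduler mentioned above can be sketched very simply. This is a hypothetical round-robin version in Python, far cruder than anything a real hypervisor ships, purely to show the shape of the idea:

    # Toy round-robin scheduler: more virtual machines than physical cores.
    from collections import deque

    def schedule(vms, cores, time_slices):
        """Rotate runnable VMs across a fixed pool of cores, one slice at a time."""
        runnable = deque(vms)
        for slice_no in range(time_slices):
            running = [runnable[i] for i in range(min(cores, len(runnable)))]
            print(f"slice {slice_no}: running {running}")
            runnable.rotate(-cores)      # next group of VMs gets the cores

    schedule(vms=["vm1", "vm2", "vm3", "vm4", "vm5"], cores=2, time_slices=4)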

5 Translate, Trap and Map

The technique described above depends crucially on a hardware feature: the hardware must be able to trap on every instruction that could affect other virtual machines. Prior to the introduction of Intel's and AMD's specific additional hardware virtualization support, that was not true. For example, setting the real time clock was, in fact, not a trappable instruction. It wasn't even restricted to supervisors. (Note, not all Intel processors have virtualization support today; this is apparently done to segment the market.)

Yet VMware and others did provide, and continue to provide, hardware virtualization on such older systems. How? By using a load-time binary scan and patch. (See Figure 2.)

Whenever a section of memory was marked executable – making that marking was, thankfully, trap-able – the hypervisor would immediately scan the executable binary for troublesome instructions and replace each one with a trap instruction. In addition, of course, it augmented the bag 'o bits for that virtual machine with information saying what each of those traps was supposed to do originally.

Now, many software companies are not fond of the idea of someone else modifying their shipped binaries, and can even get sticky about things like support if that is done. Also, my personal reaction is that this is a horrendous kluge. But it is a necessary kluge, needed to get around hardware deficiencies, and it has proven to work well in thousands, if not millions, of installations.

Thankfully, it is not necessary on more recent hardware releases.

Fig. 2. Translate, Trap and Map
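A toy model of that load-time scan-and-patch step might look like the following Python sketch. The opcodes, the single-byte encoding and the patch table are invented for illustration; real binary translators deal with variable-length x86 instructions and are far more involved:

    # Toy load-time binary patching: replace "sensitive" opcodes with a trap opcode
    # and remember what each patched location was supposed to do.
    SENSITIVE = {0x0F: "SET_CLOCK", 0x2A: "WRITE_PAGE_TABLE"}   # invented opcodes
    TRAP = 0xCC                                                 # stand-in trap opcode

    def patch_executable(code: bytes):
        patched = bytearray(code)
        patch_table = {}              # address -> original meaning (goes into the 'bag of bits')
        for addr, opcode in enumerate(code):
            if opcode in SENSITIVE:
                patched[addr] = TRAP
                patch_table[addr] = SENSITIVE[opcode]
        return bytes(patched), patch_table

    code = bytes([0x90, 0x0F, 0x90, 0x2A])    # a pretend guest binary
    patched, table = patch_executable(code)
    print(patched.hex(), table)               # 90cc90cc {1: 'SET_CLOCK', 3: 'WRITE_PAGE_TABLE'}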

6 Paravirtualization

Whether or not the hardware traps all the right things, there is still unavoidable overhead in hardware virtualization. For example, think back to my prior comments about dealing with virtual memory. You can imagine the complex hoops a hypervisor must repeatedly jump through when the operating system in a client machine is setting up its memory map at application startup, or adjusting the working sets of applications by manipulating its map of virtual memory.

One way around overhead like that is to take a long, hard look at how prevalent you expect virtualization to be, and seriously ask: Is this operating system ever really going to run on bare metal? Or will it almost always run under a hypervisor?

Some operating system development streams decided the answer to that question is: No bare metal. A hypervisor will always be there. Examples: Linux with the Xen hypervisor, IBM AIX, and of course the IBM mainframe operating system z/OS (no mainframe has been shipped without virtualization since the mid-1980s).

If that's the case, things can be more efficient. If you know a hypervisor is always really behind memory mapping, for example, provide an actual call to the hypervisor to do things that have substantial overhead. For example: Don't do your own memory mapping, just ask the hypervisor for a new page of memory when you need it. Don't set the real-time clock yourself, tell the hypervisor directly to do it. (See Figure 3.)

This kind of technique has become known as paravirtualization, and can lower the overhead of virtualization significantly. A set of "para-APIs" invoking the hypervisor directly has even been standardized, and is available in Xen, VMware, and other hypervisors.

The concept of paravirtualization actually dates back to around 1973 and the VM operating system developed in the IBM Cambridge Science Center. They had the not-unreasonable notion that the right way to build a time-sharing system was to give every user his or her own virtual machine, a notion somewhat like today's virtual desktop systems. The operating system run in each of those VMs used paravirtualization, but it wasn't called that back in the Computer Jurassic.

Virtualization is, in computer industry terms, a truly ancient art.

Fig. 3. Paravirtualization
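To contrast the paravirtual path with the trap-based path of Section 3, here is a minimal, hypothetical sketch in Python. The hypercall names and the ToyHypervisor class are invented; real para-APIs (Xen hypercalls, for instance) are low-level C interfaces with very different details:

    # Paravirtualization sketch: the guest calls the hypervisor directly instead of
    # executing a privileged instruction and being trapped.
    class ToyHypervisor:
        def __init__(self):
            self.clock_offsets = {}                      # per-VM state, the 'bag of bits'

        def hypercall(self, vm_id, name, *args):
            """Explicit entry point a paravirtualized guest is built to use."""
            if name == "set_clock_offset":
                self.clock_offsets[vm_id] = args[0]
                return True
            if name == "get_page":
                return f"page-for-{vm_id}"               # pretend to hand out memory
            raise NotImplementedError(name)

    hv = ToyHypervisor()
    # A paravirtualized guest never touches the real clock or its own page tables;
    # it asks the hypervisor, avoiding the trap-decode-emulate round trip.
    hv.hypercall("vm_a", "set_clock_offset", 3600.0)
    print(hv.hypercall("vm_a", "get_page"))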

7 Drown It In Silicon

In the previous discussion I might have led you to believe that paravirtualization is widely used in mainframes (IBM zSeries and clones). Sorry. That turns out not to be the case. Another method is used.

Consider the example of reading the real time clock. All that has to happen is that a silly little offset is added. It is perfectly possible to build hardware that adds an offset all by itself, without any "help" from software. So that's what they did. (See Figure 4.)

They embedded nearly the whole shooting match directly into silicon. This implies that the bag 'o bits I've been glibly referring to becomes part of the hardware architecture: now it's hardware that has to reach in and know where the clock offset resides. What happens with the memory mapping gets, to me anyway, a tad scary in its complexity. But, of course, it can be made to work.

Fig. 4. Do it all in Hardware

Nobody else is willing to invest a pound or so of silicon into doing this. Yet. As Moore's Law keeps providing us with more and more transistors, perhaps at some point the industry will tire of providing even more cores, and spend some of those transistors on something that might actually be immediately usable.

8 A Bit About Input and Output

One reason for all this mainframe talk is that it provides an existence proof: Mainframes have been virtualizing IO basically forever, allowing different virtual machines to think they completely own their own IO devices when in fact they're shared. And, of course, it is strongly supported in yet more hardware. A virtual machine can issue an IO operation, have it directed to its address for an IO device (which may not be the "real" address), get the operation performed, and receive a completion interrupt, or an error, all without involving a hypervisor, at full hardware efficiency. So it can be done.

But until very recently, it could not be readily done with PCI and PCIe (PCI Express) IO. Both the IO interface and the IO devices need hardware support for this to work. As a result, IO operations have for commodity and RISC systems been done interpretively, by the hypervisor. This obviously increases overhead significantly. Paravirtualization can clearly help here: Just ask the hypervisor to go do the IO directly.

However, even with paravirtualization this requires the hypervisor to have its own IO driver set, separate from that of the guest operating systems. This is a redundancy that adds significant bulk to a hypervisor and isn't as reliable as one would like, for the simple reason that no IO driver is ever as reliable as one would like. And reliability is very strongly desired in a hypervisor. Errors within it can bring down all the guest systems running under them.

Another thing that can help is direct assignment of devices to guest systems. This gives a guest virtual machine sole ownership of a physical device.

Together with hardware support that maps and isolates IO addresses, so that a virtual machine can only access the devices it owns, this provides full-speed operation using the guest operating system drivers, with no hypervisor involvement. However, it means you do need dedicated devices for each virtual machine, something that clearly inhibits scaling: imagine 15 virtual servers, all wanting their own physical network card. This support is also not an industry standard. What we want is some way for a single device to act like multiple virtual devices.

Enter the PCI SIG. It has recently released a collection – yes, a collection – of specifications to deal with this issue. I'm not going to attempt to cover them all here. The net effect, however, is that they allow industry-standard creation of IO devices with internal logic that makes them appear as if they are several, separate, "virtual" devices (the SR-IOV and MR-IOV specifications); and add features supporting that concept, such as multiple different IO addresses for each device.
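As a mental model of what such a self-virtualizing device looks like, here is a hypothetical sketch in Python. The class, fields and addresses are invented; the real SR-IOV specification defines physical and virtual functions at the PCIe configuration-space level, not as objects like these:

    # Toy model of a device that presents itself as several "virtual" devices,
    # each with its own address, assignable to a different virtual machine.
    class PhysicalNIC:
        def __init__(self, name, num_virtual_functions):
            self.name = name
            self.virtual_functions = [
                {"vf": i, "io_address": 0x1000 + i * 0x100, "owner_vm": None}
                for i in range(num_virtual_functions)
            ]

        def assign(self, vf_index, vm_name):
            """Give one virtual function to a guest; the guest drives it directly."""
            self.virtual_functions[vf_index]["owner_vm"] = vm_name
            return self.virtual_functions[vf_index]

    nic = PhysicalNIC("eth-phys0", num_virtual_functions=4)
    print(nic.assign(0, "vm_web"))    # {'vf': 0, 'io_address': 4096, 'owner_vm': 'vm_web'}
    print(nic.assign(1, "vm_db"))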

A key point here is that this requires support by the IO device vendors. It cannot be done just by a purveyor of servers and server chipsets. So its adoption will be gated by how soon those vendors roll this technology out, how good a job they do, and how much of a premium they choose to charge for it. I am not especially sanguine about this. We have done too good a job beating a low cost mantra into too many IO vendors for them to be ready to jump on anything like this, which increases cost without directly improving their marketing numbers (GBs stored, bandwidth, etc.).

9 Conclusion

There is a joke, or a deep truth, expressed by the computer pioneer David Wheeler, co-inventor of the subroutine, as “All problems in computer science can be solved by another level of indirection.”

Virtualization is not going to prove that false. It is effectively a layer of indirection or abstraction added between physical hardware and the systems running on it. By providing that layer, virtualization enables a collection of benefits that were recognized long ago, benefits that are now being exploited by cloud computing. In fact, virtualization is so often embedded in cloud computing discussions that many have argued, vehemently, that without virtualization you do not have cloud computing. As explained previously, I don't agree with that statement, especially when "virtualization" is used to mean "hardware virtualization," as it usually is.

However, there is no denying that the technology of virtualization makes cloud computing tremendously more economic and manageable.

Virtualization is not magic. It is not even all that complicated in its essence. (Of course its details, like the details of nearly anything, can be mind-boggling.) And despite what might first appear to be the case, it is also efficient; resources are not wasted by using it. There is still a hole to plug in IO virtualization, but solutions there are developing gradually if not necessarily expeditiously.

There are many other aspects of this topic that have not been touched on here, such as where the hypervisor actually resides (on the bare metal? inside an operating system?), the role virtualization can play when migrating between hardware architectures, and the deep relationship that can, and will, exist between virtualization and security. But hopefully this discussion has provided enough background to enable some of you to cut through the marketing hype and the thicket of details that usually accompany most discussions of this topic. Good luck.


Using Private Clouds to Increase Service Availability Levels and Reduce Operational Costs

Narciso Monteiro

Porto, Portugal [email protected]

http://narcisomonteiro.gamagt.com

Abstract. In an era marked by the so-called technological revolution, where computing power has become a critical production factor, much as manpower was in the industrial revolution, the dispersal of computing power across numerous facilities and sites quickly became unsustainable, both economically and in terms of maintenance and management. This constraint led to the need to adopt new strategies for the provision of information systems that are equally capable but more consolidated, while also increasing the desired levels of the classic set of information assurances: availability, confidentiality, authenticity and integrity. At this point enters a topic not as new as it may seem, virtualization, which after two decades of oblivion presents itself as the best candidate to solve the problem technology faces. Throughout this document, it will be explained what this technology represents and what benefits are obtained from the evolution of a classical systems architecture into a model based on optimization, high availability, redundancy and consolidation of information and communications technology using the virtualization of networks and systems. A number of good principles to adopt will also be taken into account to ensure the correct deployment of the technology, culminating with the analysis of a case study from the company Metro do Porto, SA.

Keywords: Virtualization, High Availability, Private Cloud

1 Introduction

The term virtualization is commonly seen as an emerging and revolutionary technology that arose in recent years as a capability never seen before, made possible only by the technological revolution of the 2000s. However, this concept was first launched in the '60s by Christopher Strachey, an Oxford University professor, with the aim of creating a way to take advantage of computing by sharing time, resources, processes and peripherals. Following this theory, two computers considered essential landmarks in the history of virtualization were born: the Atlas and IBM's M44/44X project. If, on the one hand, Atlas pioneered the introduction of the concepts of virtual memory and paging techniques, the M44/44X project was actually the first physical machine to run multiple virtual machines.


The cost and size of the mainframes of the time were IBM's main incentives: the company sought the best way to take advantage of its large investment in hardware, which earned it the credit for creating the "virtual machine" concept [1]. Interestingly, perhaps through a lack of vision or long-term planning, in the 80s and 90s the proliferation and the drastic reduction in the cost of computer equipment, coupled with the birth of more capable and easier-to-use operating systems, led to the abandonment of the idea of sharing resources in favor of the massive distribution of computing. What the industry didn't realize at the time were the operational costs of managing physically large infrastructures:

− Maintenance contracts
− Logistics and power consumption
− Complexity and diversity of the architecture, and the consequent need for a large number of skilled personnel dedicated solely to maintenance tasks, often manual
− Low usage of the infrastructure's total capacity
− Oversized disaster recovery and high availability scenarios
− Lack of scalability, leading to a continuous need for hardware investment and causing a snowball effect in the factors above

By the 2000s, when the management of these infrastructures had reached an unbearable and limiting point, very similar to the one that drove IBM in the 60s (and it should be noted that IBM never abandoned the idea and continued research and development in this area [1]), virtualization once again had a word to say, largely thanks to a company called VMware.

2 Virtualization Principles and State of the Art

Virtualization consists in the insertion of a software layer between the operating system and the underlying hardware, which implements an abstraction of that hardware and makes it available in a controlled manner to the upper layer. This layer is called the hypervisor, whose name derives from the original concept of the supervisor, introduced primarily by Atlas: a component created for managing system processes and provisioning resources such as memory and time sharing, separate from the component responsible for the effective execution of applications. This hypervisor layer can be of two types: Hypervisor type 1 or Hypervisor type 2. In the first case, the virtualization software acts as an operating system, running directly on the hardware, so virtual machines are considered to operate on the second layer above the hardware (Fig. 1). Naturally, this solution bears the closest resemblance to a real machine and allows better performance of the guest operating systems.


Fig. 1. Hypervisor Type 1

On the other hand, a type 2 hypervisor is unable to operate autonomously and requires another operating system to intermediate the communication between the virtualization software and the physical host, placing virtual machines on the third layer above the hardware (Fig. 2). The introduction of an additional layer naturally reduces the smoothness and performance of the interaction between the host and guest systems but, in return, becomes useful in testing and research scenarios, as it allows the intermediate operating system to capture and/or inject low-level virtual machine instructions, typically for diagnosis and simulation purposes.

Fig. 2. Hypervisor Type 2

In both cases, the hypervisor's function consists in making physical resources available to virtual machines in a transparent and secure manner, a sort of invisible mediator that guarantees the virtual abstraction of hardware to each system and its applications, which are totally unaware of the presence of a logical layer and act as if a whole physical machine were under their control. One of the virtues of this abstraction is the possibility of presenting a device or peripheral that doesn't actually exist, as well as emulating an interface type (for instance, the hard drive interface technology) that hides the true technological form of the hardware. Even so, there are known limitations to the types of controllers that can be emulated, especially at the graphics level and with more specific or less common devices.
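The difference between the two hypervisor types is simply where the hypervisor sits in the stack, which can be written down directly. This is a purely illustrative Python rendering of the layering shown in Figs. 1 and 2:

    # Layering of the two hypervisor types, bottom to top (illustrative only).
    TYPE_1 = ["hardware", "hypervisor", "guest OS + applications"]
    TYPE_2 = ["hardware", "host OS", "hypervisor", "guest OS + applications"]

    def layers_above_hardware(stack, component="guest OS + applications"):
        return stack.index(component)

    print(layers_above_hardware(TYPE_1))   # 2 -> VMs run on the second layer above the hardware
    print(layers_above_hardware(TYPE_2))   # 3 -> VMs run on the third layer above the hardware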

Following the general growing demand and interest in virtualization, many manufacturers and solutions have appeared in the market, with many flavors and goals. At least 73 different developments have been identified, ranging from proprietary systems to the most common ones, including the world of workstations.


For the virtualization of servers with high demands for performance and availability, three widely recognized Hypervisor type 1 products stand out in the market, having the largest number of implementations. These products are ESX from VMware, Hyper-V from Microsoft and Citrix's XenServer. Comparing each solution's specifications, ESX appears as the most complete and capable product, which justifies its market leadership [2]. Building on the work already developed by IBM, VMware directed all its efforts at a big challenge: the virtualization of x86 systems based on the Intel 32-bit architecture. The execution scheme of input/output instructions wasn't designed for this purpose, causing serious obstacles to the abstraction of resources. This is due to an identified set of instructions that require execution in privileged mode, in other words, direct contact with the hardware instead of communication with an interface between the virtual operating system and the physical host, creating protection exceptions that lead to system blocking. This was a key point, because the technology would not succeed if the stability of the system wasn't guaranteed. Through a proprietary mechanism to monitor processing, VMware was able to circumvent this constraint and thus successfully implement its conceptual model, becoming the pioneer in x86 virtualization and, subsequently, the market leader. This mechanism consists in the dynamic rewriting of key parts of the operating system kernel in order to capture these sensitive instructions and allow their interpretation to be performed by the virtual machine supervisor [6].

3 Creating a Private Cloud through the Implementation of a High Availability Cluster: Key Points and Expected Achievements

This implementation involves the deployment of a private cloud in the critical services infrastructure of Metro do Porto, SA. The company has a classical architecture, which provides good results in the field of availability and, due to its recent creation, is expanding at a high rate. It is against this background that the company wants this growth, and its management, handled in a sustainable manner and with a clear improvement in service quality. To this end, virtualization presents itself as the technology with the right attributes and, based on a previous cost comparison study, is also the most economically advantageous. The aim is to provide a development that will achieve higher scalability, lower costs of operation and maintenance as well as an increase in the availability rate, while also improving the data/services backup methodologies. In the end, the results must achieve the following goals:

− At least 99.99% annual average service availability (see the short calculation after this list)
− Consolidation of the infrastructure management
− Reduction in the annual costs for operation and maintenance
− Capacity growth without hardware investment over the medium term
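For a sense of scale of the 99.99% target, here is a minimal calculation in Python; the figures are simple arithmetic, not data from the case study:

    # Downtime budget implied by an availability target over one year.
    MINUTES_PER_YEAR = 365 * 24 * 60          # 525,600

    for target in (0.999, 0.9995, 0.9999):
        budget = MINUTES_PER_YEAR * (1 - target)
        print(f"{target:.2%} availability allows ~{budget:.0f} minutes of downtime per year")

    # 99.90% -> ~526 minutes, 99.95% -> ~263 minutes, 99.99% -> ~53 minutes.
    # The tables later in the paper report unavailability minutes summed across nine services.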

To support this technology, it was decided to purchase a cluster of three servers with VMware ESX and the high availability module, based on an HP BladeCenter platform and using an HP EVA 4400 Storage Area Network (SAN) storage system.


This SAN is composed of four drawers with twelve 300GB drives, totaling approximately 12TB of raw space. This capacity was estimated to be able to store all systems and their data and to provide growth opportunities for a year without adding disks. It should be noted that the company, due to its business area, has atypical storage needs, culminating in an average quarterly growth of about 11%. At the processing level, the blade servers are equipped with dual quad-core Intel Xeon 2.83 GHz processors and 32GB of memory. Abstracting these resources in the cluster, a total of 128GB of memory and 24 processing cores are made available. This central unit will start by holding 9 standard servers but, with these features, growth in the medium term without any further investment is expected.

Before setting up the cluster, storage volumes must be configured in the SAN according to the provisioning requirements. VMware good practices recommend that no more than 16 virtual machines share the same volume (defined as a datastore), although this allocation depends on the system's purpose, according to its data type and required I/O capabilities. In this provisioning, it is necessary to take into account that one of the capabilities of virtualization is the possibility of taking system snapshots. This method, as its name suggests, allows a sort of picture to be taken of the binary structure of a file system at a specific point in time. With this resource, it is possible to recover the system to that point, allowing the rollback of upgrade and maintenance tasks. It also makes it possible to copy the main file, as it is unlocked for reading and writing during that period.

For this reason, each presented volume should have 30% of extra headroom, as recommended by the manufacturer, even though this value could be excessive in situations where storage capacity isn't abundant; a small sizing sketch follows this paragraph. The networking architecture must also be defined. The blade center used in this demonstration is equipped with four network interface cards and, following security recommendations [5], two of them will be exclusively dedicated to the management network, which includes the service console. It is in this network that inter-virtual-machine traffic occurs and tasks like VMotion are executed, therefore requiring isolation from other networks. The other network cards are dedicated, in load balancing mode, to regular traffic and connected to the physical network devices. It is also advisable, in order to prevent unusual situations, to set up each pair of network cards as an alternative to the other pair, allowing an additional fault tolerance scenario.
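The volume sizing rules mentioned above (at most 16 virtual machines per datastore and roughly 30% of headroom for snapshots) can be turned into a small planning sketch in Python. The VM disk sizes below are invented for illustration; only the two rules come from the text:

    # Datastore sizing sketch: group VM disks into datastores of at most 16 VMs,
    # then add ~30% headroom for snapshots, as recommended above.
    MAX_VMS_PER_DATASTORE = 16
    SNAPSHOT_HEADROOM = 0.30

    def plan_datastores(vm_disk_gb):
        datastores = []
        for i in range(0, len(vm_disk_gb), MAX_VMS_PER_DATASTORE):
            group = vm_disk_gb[i:i + MAX_VMS_PER_DATASTORE]
            size_gb = sum(group) * (1 + SNAPSHOT_HEADROOM)
            datastores.append({"vms": len(group), "size_gb": round(size_gb)})
        return datastores

    # Nine standard servers with hypothetical disk sizes (GB).
    print(plan_datastores([80, 80, 120, 60, 200, 150, 80, 100, 250]))
    # -> [{'vms': 9, 'size_gb': 1456}]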

The process of creating virtual machines is a rather easy step, and there are three ways of introducing them to the cluster: starting from scratch, physical-to-virtual conversion, or importing vApps. The physical-to-virtual process, or P2V, is provided by an additional application called VMware Converter. This kind of conversion usually completes without any problem on Windows systems and most Linux releases, as long as they are conveniently prepared. This preparation means the removal of specific drivers which are not automatically recognized by the operating system without additional software (hard locks, special graphics cards, etc.), as well as hardware management products tied to the physical platform, like Systems Homepage from HP. The Converter is also capable of importing into ESX files created on VMware Server or even with third-party software, like Acronis True Image and Symantec Ghost images.


Before considering the cluster fully implemented, there is another security recommendation that should be taken into account. It is highly advisable to use the native ESX functionality that restricts the resources used by each virtual machine, in order to prevent extraordinary situations that could affect the other platforms. This process requires some learning time, so the most effective option is to define limits that seem acceptable and in accordance with the characteristics assigned to each service. Within the normal course of production, it will be easier to understand the needs of each system and thus adjust the restrictions to a more appropriate level.

This kind of migration process, when performed with a comprehensive study of needs and with the implementation scheme properly defined and planned, has every condition to run as smoothly as in this case study, where no abnormal situation occurred and the commitment to complete it without any impact on users was met exceptionally well. After reaching the optimal configuration, the implementation is considered complete and outcome indicators can be collected, allowing a comparative analysis of results.

3.1 Outcome Assessment Indicators

The process of evaluating outcomes consists in a direct comparison between the values observed before and after the conversion of the infrastructure. Naturally, such data require that the collection has been carried out over an acceptable period of time, so that medium-term results can effectively be obtained. In this case, figures are available for the classical model since the first half of 2007, and it is easy to compute an average for a reasonable period. But in the case of the virtual infrastructure, given its recent implementation, the time period under review will be just the first quarter of 2010. Although the implementation occurred before, this is the period when all the adjustments that are part of the learning process were considered to be at a stable point, hence it is only legitimate to compare data from this state onwards. Using the reporting module of the application the company uses for monitoring systems and services, data can be exported into a spreadsheet format for a given period of time. These values include, among others, the response time and service status, which allows the availability observed in this period to be evaluated. This evaluation is performed by considering a service as available whenever it is running within a given range of performance values.
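The availability figures in the tables that follow are essentially this computation: the share of a period during which each service was not flagged as unavailable. A minimal sketch in Python, with made-up downtime values, assuming the same minutes-based bookkeeping used in the tables:

    # Availability over a period, computed from accumulated unavailability minutes.
    def availability(downtime_minutes, period_minutes):
        return 100.0 * (1 - downtime_minutes / period_minutes)

    PERIOD_2007 = 365 * 24 * 60                     # one calendar year, in minutes

    # Hypothetical per-service downtime (minutes) for one year:
    services = {"Domain Controller": 1250.0, "E-mail Service": 30.0, "ERP": 40.0}
    for name, downtime in services.items():
        print(f"{name}: {availability(downtime, PERIOD_2007):.2f}%")

    # A fleet-wide average, as reported at the bottom of each table:
    average = sum(availability(d, PERIOD_2007) for d in services.values()) / len(services)
    print(f"average: {average:.2f}%")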

3.1.1 Before Virtualization

Before this process, every new service required the acquisition of new hardware and its corresponding maintenance contract. This also meant that every added service would increase the data center's cooling and power needs, while adding another point of failure to the infrastructure. Note that this need for hardware isn't just the acquisition of the server itself. For each new piece of equipment, fiber optic connections to the SAN and to the backup devices (cables and interface cards) are required, along with Ethernet connections and corresponding switch ports.


structure management complexity and dispersion, making it harder to keep under control. As for service availability, the values obtained in the three years before virtualization were as follows. In 2007, the first full year of operation of the Office of Information Systems, an average availability rate of 99.95% was achieved, representing a total of approximately 1836 minutes without service, as shown in Table 1.

Table 1. Availability data for 2007

Servers - January 01, 2007 00:00:00 - December 31, 2007 00:00:00

Service                          Unavailability time (minutes)  Availability (%)
Domain Controller 1              1253,15                        99,73%
Domain Controller 2              80,1                           99,98%
E-mail Service                   30,02                          99,99%
ERP                              40,01                          99,99%
Maintenance Service              20,44                          99,98%
File Server                      80,04                          99,98%
Print Server                     50,63                          99,99%
Geographical Information System  220,92                         99,95%
Webmail                          60,36                          99,99%
Total                            1835,67                        99,95%

The following year, due to some technological changes and various unexpected equipment failures, availability declined to 99.83%, totaling about 6503 minutes of accumulated downtime (Table 2).

Table 2. Availability data for 2008

Servers - January 01, 2008 00:00:00 - December 31, 2008 00:00:00

Service                          Unavailability time (minutes)  Availability (%)
Domain Controller 1              352,54                         99,92
Domain Controller 2              810,82                         99,81
E-mail Service                   1456,91                        99,67
ERP                              472,42                         99,89
Maintenance Service              80,05                          99,98
File Server                      80,06                          99,98
Print Server                     133,89                         99,97
Geographical Information System  1558,21                        99,64
Webmail                          1558,07                        99,64
Total                            6502,97                        99,83


In 2009, the year the infrastructure was converted, there was an improvement over the previous year, although still below the desired levels. The rate was 99.90%, which represents about 4420 total minutes without service (Table 3).

Table 3. Availability data for 2009

Servers - January 01, 2009 00:00:00 - December 31, 2009 23:30:00

Service                          Unavailability time (minutes)  Availability (%)
Domain Controller 1              270,81                         99,95
Domain Controller 2              250,75                         99,95
E-mail Service                   601,19                         99,88
ERP                              410,89                         99,92
Maintenance Service              711,74                         99,86
File Server                      790,98                         99,84
Print Server                     570,89                         99,89
Geographical Information System  180,84                         99,96
Webmail                          631,71                         99,87
Total                            4419,8                         99,90

3.1.2 After Virtualization

As indicated, the values for this scenario are still too recent for a medium-term comparison. However, if the present situation continues and these data are extrapolated to the rest of the year, the outlook is fairly encouraging. In the first quarter of 2010, the much sought-after rate of 99.99% was obtained, which corresponds to less than 153 total minutes without service (Table 4).

Table 4. Availability data for 2010

Servers - January 01, 2010 00:00:00 - March 31, 2010 23:30:00

Service                          Unavailability time (minutes)  Availability (%)
File Server                      70,4                           99,95
Print Server                     10,01                          99,99
ERP                              0                              100,00
Domain Controller 1              0,33                           100,00
Domain Controller 2              10,34                          99,99
E-mail Service                   20,33                          99,98
Maintenance                      10,34                          99,99
Geographical Information System  20,34                          99,98
Webmail                          10,33                          99,99
Total                            152,42                         99,99


Such values are only possible thanks to the ability the IT staff gained to perform maintenance tasks without any downtime window, as the result of a cluster that can migrate and change the services' characteristics without interruption. In addition to these data, it should be noted that no subsequent purchase of equipment was made and five new services were added, with a few more already in line to enter production. Nine physical servers were discontinued, along with their corresponding maintenance contracts, and the power consumption of the data center was reduced by 42%. With this type of solution, IT management gains a new, consolidated way of dealing with the infrastructure, not only because of its ease of use but also because most maintenance tasks can now be performed at any time, without the need for hard and stressful overtime interventions, sometimes just to perform small operations. Every time one of the nodes is placed in maintenance mode (or when it actually fails), ESX automatically evacuates the virtual machines it contains to the other nodes, informing the high-availability module that there is an unavailable node so that the whole resource provisioning plan can be reorganized. All of this is done without any service disruption. It is a solution of this nature that allows establishing an official SLA that can actually be met without the need for an extraordinarily high budget.

4 Conclusion

Desirably, this work will provide a basis for a better understanding of the sustainability problem that the world of IT faces. The fast changes that professionals in this area confront every day make their speed of adaptation and their ability to respond to these challenges key factors of success. And as all the parties involved become aware of the need to align the business with its information systems, the more critical becomes the ability to supply large-scale, uninterrupted and high-quality IT services. The focus must necessarily move from the technology itself to the results it produces.

As Nascimento notes in his notable work about the changes that information systems professionals have faced over the last years, the informatics department is no longer an area that independently defines and imposes working methods but instead one that, just like every other department in an organization, works in a collaborative way. In this scenario, IT is only the vehicle that provides these services to the upper layer, to the information systems, which in turn supply essential data for development and decision making. Supplying these services is unarguably a complex task that requires specialized professionals and high-level practices, but without putting the emphasis on the vehicle instead of the data it produces. Ideally, there will come a time when accessing a service will be like turning on a light: everyone expects it to work without questioning by which method or by which means the energy is supplied. Its existence and purpose will be taken for granted, and its unavailability will be unacceptable [3].

Just as virtualization abstracts resources and makes them available to the services in a controlled manner, so IT is moving towards the creation of an abstract cloud whose working details only its managers and technicians understand, leaving the clients of that cloud free to benefit from it in a correct and continuous way, with a single concern: productivity.


References

1. Dittner, R., Rule, D. (2007) "The Best Damn Server Virtualization Book Period". Syngress Publishing, Inc., Elsevier, Inc.

2. IT 2.0: Next Generation IT Infrastructures. [Online]. Available: http://www.it20.info/misc/virtualizationscomparison.htm [Accessed on May 2010]

3. Nascimento, J. C. (2006) "Gestão de Sistemas de Informação e os Seus Profissionais". FCA

4. Oltsik, J. (2009) "The new security management model". Enterprise Strategy Group

5. VMWare (2009) "Network segmentation in virtualized environments"

6. Virtualization Basics: History of virtualization. [Online]. Available: http://www.vmware.com/virtualization/history.html [Accessed on December 2009]


GRID, PaaS for e-science

J. Gomes, G. Borges, M. David

Laboratório de Instrumentação em Física Experimental de Partículas, Lisboa, Portugal

[email protected]

Abstract. Grid computing shares many of the characteristics of a platform as a service cloud. Still, grid computing has remained confined to large scientific communities that need access to vast amounts of distributed resources, while cloud computing is gaining adoption and emerging as a more flexible way to use remote computing resources. In this paper we highlight some of the grid computing achievements and shortcomings and provide some insights as to how they can be improved through the combined use of infrastructure as a service clouds.

Keywords: Cloud Computing, Grid Computing, PaaS

1 Introduction

The grid computing paradigm is widely used to address the needs of demanding computational applications in the scientific research domain. Europe has been very successful in the development of grid technologies applied to e-science, making possible the deployment of large production infrastructures such as the one operated by the project Enabling Grids for E-science (EGEE) [1].

The EGEE project enabled the integration of computing resources across Europe and elsewhere, creating the largest multidisciplinary grid infrastructure for scientific computing worldwide. The infrastructure was built using the gLite [2] grid middleware, which made possible the integration of computing clusters in distinct geographic locations.

However, the adoption of grid computing by smaller and less organized research communities has been slow. Low flexibility, high complexity, and inadequate business models are often mentioned as barriers to the adoption of the grid technologies.

Cloud computing is a possible approach to complement grid computing and address some of its current limitations. Cloud computing has the potential to provide a more flexible environment for grid computing and other paradigms, while enabling higher flexibility, elasticity and better optimization of the underlying infrastructures.

2 Grid and cloud computing

Cloud computing is based on the concept that user needs can be satisfied by the provisioning of remote computing services through the Internet. The user can instantiate and use these services without caring about the infrastructure behind them.

Grid computing and cloud computing have much in common; in fact, grid computing can be seen as a Platform as a Service (PaaS) cloud. A grid infrastructure is a platform to manage data and execute processing jobs. The users access the resources through the Internet regardless of their location or nature and without having to worry about what is behind them. The technologies that support both grid computing and cloud computing interfaces are in many aspects similar. For the end user the biggest conceptual difference is that in grid computing the infrastructure architecture and details are more open, while cloud computing infrastructures tend to be a black box where the underlying details are much more hidden. Since the first clouds appeared as commercial services, it was natural that the architectures and software behind them were closed and proprietary. As the interest grew, open implementations appeared that made possible the deployment of private clouds. In contrast, grid computing was born in the academic domain and as such the architectures and implementations were open from the beginning.

Although PaaS clouds are emerging, what has made cloud computing gain momentum are the Infrastructure as a Service (IaaS) clouds, in which users can instantiate and access remote computer systems usually provided as virtual machines. This type of service is highly flexible. Users can gain immediate access to virtual machines that can be tailored and customized to perform whatever task is needed. The user has access to the operating system and is no longer restricted to specific software interfaces as in the grid or in PaaS clouds. Finally, users can instantiate more resources as needed.

In architectural terms, grid computing sits on top of a layer constituted by physical processing clusters and storage systems. It is conceivable that this layer can be replaced by a virtual one composed of resources provided by IaaS clouds. In this way the advantages of both paradigms could be combined and fully exploited.

3 The grid achievements

Grid computing provides a software layer between the user and the actual computing resources. By implementing common interfaces, the grid middleware can hide the specificities of each resource (processing, storage, instruments) and promote their integration under a unified computing infrastructure. In this way the users can have transparent access to a wider range of resources regardless of their location, ownership and characteristics. The grid middleware effectively simplifies the access to distributed resources, making possible their combined use to solve complex and demanding computational problems. The grid middleware can also facilitate data management in data intensive applications. The gLite middleware can keep track of files, manage data replication, schedule data transfers and provide transparent and efficient access to many types of storage systems.


Grid computing has attained many achievements. The standards effort organized around the Open Grid Forum (OGF) [3] contributed to interoperability among different grid middleware stacks and to better user and programming interfaces. The development of sophisticated data and resource management capabilities has made possible a high degree of efficiency for distributed computing. The International Grid Trust Federation (IGTF) [4] created a global authentication domain for grid users and services based on national certification authorities. Common usage and security policies contributed to establish clear responsibilities and promote trust between users, resource providers and infrastructures. The developments around the virtual organization concept enabled the creation of structured user communities that share common resources, in which users may have different roles and responsibilities and be further structured in groups, with excellent access rights granularity. The Geant [5] European academic network and the national research networks (NRENs) made possible the deployment of high performance international scientific computing grids supporting distributed processing for data intensive applications at an unprecedented scale.

Grid computing has become a vital tool for many research communities that rely on it as an integration technology to unify and provide seamless access to their computing resources. Consequently, a sustainable model for grid computing infrastructures in Europe was needed to ensure the long term availability of the relevant technologies and services. A two layer model was defined, with National Grid Initiatives (NGI) in each country supported by the governments, and the European Grid Initiative (EGI) [6], a body established to unify the NGIs and operate a pan-European grid. EGI is now taking over the role of EGEE, ensuring a smooth transition and continuous growth of the European grid infrastructure.

Most national grid initiatives also encompass other distributed computing technologies as a complement or alternative to traditional grids. In this context there is a growing interest in cloud computing by both the NGIs and the communities that they serve. This interest extends to the European Grid Initiative, which in its EGI-Inspire [7] project highlights cloud computing as one of the new distributed computing technologies that it seeks to integrate. EGI will establish a roadmap as to how clouds and virtualization can be integrated into EGI, exploring not only the technology issues but also the total cost of ownership under multiple scenarios. This trend is not completely new, as the EGEE project had already started some exploratory work on the provisioning of gLite services on top of clouds within the context of the RESERVOIR [8] project and the StratusLab [9] collaboration.

Europe now has an excellent framework for distributed computing that can potentially be exploited by several paradigms and technologies.

4 Grid user communities

The grid has been very successful and effective for many user communities. Some communities, such as High Energy Physics, match the grid computing paradigm perfectly. Some of the ideal grid community properties are: a very structured and well organized community, a large user base, geographically dispersed users and resources, good technical skills, sharing of common goals and common data, huge processing and data management requirements, and the will to share and collaborate to achieve common goals. In short, an evident motivation and a reward for sharing computing resources within the community is needed.

However, many communities do not match these properties so well. Small communities may not have the necessary technical skills and resources. Communities that are not structured will have more problems adhering to the virtual organizations model. Communities with fierce competition or without a tradition of cooperation do not have the will to share. Communities owning some computing capacity but having isolated peaks of load may find the grid too much of an effort for their needs. These communities have a low motivation for grid computing. Their needs can be better satisfied through the use of other paradigms.

5 Grid business model

Scientific grids are based on the virtual organizations model. In this model users organize themselves and create virtual organizations, which are basically user communities that share their own resources. Therefore the grid infrastructure can be seen as a bus into which the computing resources and the virtual organizations are plugged so that resource sharing can happen.

This model is ideal for distributed user communities that have their own resources and want to share them to achieve some common goal. However, it does not promote resource sharing outside of the VO boundaries, because there is no compensation or reward for sharing resources with other VOs with which there are no common goals. An economic model promoting compensation for resource sharing is missing.

This issue extends to the computing resource owners, which, in the absence of local users pushing for grid integration, have no motivation to share their resources in a grid infrastructure. Even if they have such motivation, they tend to commit and share the least possible resources. As an extreme consequence, the VOs become isolated islands and there is no resource sharing outside of their scope. This behavior tremendously reduces the elasticity of the grid infrastructures. Consequently, although some VOs may have idle resources, others that could benefit from this capacity will not be able to access it. Although the aggregated grid capacity can be large, small user communities that don't own computing resources may not benefit from joining the grid.

Cloud computing offers a more generic solution that fits a much wider range of user requirements. Therefore cloud computing has the potential to attract more users and resource owners that could share their capacity through a scientific cloud. A large pool of cloud resources could be built by joining resources from academic and research organizations, leading to a better optimization of the installed capacity. This capacity could then be exploited for grid computing and other uses.


6 Grid adoption barriers

From user feedback and experience we identified a number of limitations that constitute barriers to the adoption of the current grid computing technologies:

– Mostly oriented to batch processing
– Limited support for HPC applications
– Steep learning curve
– Hard to deploy, maintain and operate
– May require considerable human resources
– Creation of new VOs is a heavy task
– First time user induction is hard
– Several middleware stacks without full interoperability
– Not very user friendly
– Reduced range of supported operating systems
– Too heavy for small sites
– Too heavy for users without very large processing requirements

Many improvements addressing these and other concerns have been introduced to simplify the deployment and use of the current infrastructures. Nevertheless, many of these issues remain problematic. We believe this is one of the reasons why the number of active VOs in infrastructures such as EGEE is becoming flat. Further simplification of the processes and technology may expand the grid to new communities. Still, there will always be users and applications that do not match the grid computing model well. For these, cloud computing or other computing paradigms can provide a better solution.

7 Combining clouds and grids

Several scenarios for the combined use of IaaS clouds and grids are being studied. Notably, the StratusLab [9] collaboration has tested several methods for running grids on top of clouds. Here we provide an overview of some models and their advantages and disadvantages in the scope of scientific computing infrastructures.

7.1 Partially clouded grid

The elasticity of grid infrastructures can be increased using one or more clouds. When the grid resources become saturated, clouds could be used to provide additional computing systems to run grid processing nodes. The grid would expand and run on top of the cloud infrastructures. This model is economically appealing because the capacity of the native grid resources could be minimized and dimensioned to sustain only the typical load scenarios; the cloud would then be used to sustain the usage peaks. This model would also minimize cloud usage costs and dependency, and as such can be more safely used with commercial clouds. This can be applied both to expand the capacity of individual grid computing sites, making them elastic, and to create virtual grid computing sites that would be nothing more than front ends to computing nodes running on top of clouds. In figure 1, a grid site composed of a computing element (CE) giving access to a computing cluster composed of worker nodes (WN) is expanded by joining additional worker nodes instantiated from cloud providers. Optionally, all worker nodes could be instantiated from cloud providers.

Fig. 1. Expanding grid sites to the cloud.
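A minimal sketch of this elastic expansion is given below. The queue-length threshold and the helper functions (queued_jobs, running_worker_nodes, start_cloud_worker_node) are hypothetical placeholders for site-specific monitoring and IaaS provisioning calls; they are not part of gLite or of any particular cloud API.

"""Hypothetical control loop that grows a grid site onto a cloud
when the local batch queue saturates (sketch only)."""

MAX_CLOUD_NODES = 20          # assumed budget cap on cloud worker nodes
JOBS_PER_NODE_THRESHOLD = 10  # assumed saturation criterion

def queued_jobs():
    """Placeholder: number of jobs waiting at the computing element."""
    raise NotImplementedError("query the site's batch system here")

def running_worker_nodes():
    """Placeholder: worker nodes currently registered, local plus cloud."""
    raise NotImplementedError

def start_cloud_worker_node():
    """Placeholder: instantiate a pre-built WN image at an IaaS provider
    and contextualize it to join the CE's batch system."""
    raise NotImplementedError

def scale_out_once(cloud_nodes_started):
    """One iteration of the loop: add a cloud WN if the site is saturated."""
    waiting, nodes = queued_jobs(), running_worker_nodes()
    saturated = nodes == 0 or waiting / nodes > JOBS_PER_NODE_THRESHOLD
    if saturated and cloud_nodes_started < MAX_CLOUD_NODES:
        start_cloud_worker_node()
        return cloud_nodes_started + 1
    return cloud_nodes_started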

7.2 Fully clouded grid

A grid fully based on cloud services, where the whole infrastructure runs on top of clouds, could be used to provide fully dynamic allocation of resources. There can be several advantages in basing the full grid infrastructure on clouds: no need for a physical infrastructure, all management efforts can be focused on running the grid service, easier deployment without the trouble of installing physical machines, and possibly load balancing and higher resiliency if supported by the cloud service provider. For sustained high usage this model may become very expensive when implemented on top of commercial clouds; it therefore requires a careful estimation of the expected usage and related costs.

8 Potential issues for scientific clouds

For scientific computing, the described models present the following additional concerns.

For data intensive applications the network bandwidth between the computing nodes, the storage systems and the users is extremely important. In Europe, grids such as EGEE were built on top of Geant, the European academic and research backbone, possibly the best performing network backbone in the world. Commercial clouds are outside of the Geant backbone and therefore must be accessed through the commercial Internet, with much higher costs and less available bandwidth. We believe that for data intensive applications the computing and storage resources must be inside the Geant network or directly accessible without additional costs.

Also for data intensive applications, data storage is a concern. For efficiency reasons the scientific data must be stored near the processing nodes, and for very large storage requirements the commercial clouds may provide neither the required capacity nor a competitive price. Even if they do, there are concerns regarding data privacy, data availability, and long term data access.


High performance computing applications, namely parallel applications, frequently require low latency communication and parallel file systems. This type of application may need specialized low latency hardware interconnects and software setups that are not commonly available in cloud infrastructures. Even when the application latency requirements can be satisfied with Ethernet networks, the use of virtual machines increases the communication latency considerably, making it unacceptable for many parallel computing applications. The clouds currently available may only be usable for high throughput computing applications.

The black box approach of the commercial cloud providers prevents a full understanding of the architecture and scalability of these cloud services, which raises concerns given the complexity of scientific application requirements.

The lack of standards among cloud providers reduces market competitiveness and increases the fear of dependence on a specific provider (vendor lock-in). Furthermore, it limits interoperability and increases the application porting effort.

The service level agreements offered by most cloud providers fall short of providing enough confidence in the availability and reliability of the services.

For the scientific community, financial sustainability is also a concern. For long term projects the payment of commercial cloud services would depend on sustained funding over the years, which in many countries is not easy to ensure. A project may receive considerable funding in its first year but be severely cut in the next. In this context, buying hardware when there is funding can be a better solution. But frequently the reverse also happens, when projects buy hardware but do not receive sustained funding to pay for its maintenance and operation.

Finally, there are legal concerns in depending on a commercial cloud provider, such as what happens to the data stored with a cloud provider that closes.

9 Conclusions

Many factors have to be weighed when deciding between building and operating a computing infrastructure or buying the service from a cloud provider. Independently of the cost issues, many of these aspects suggest that commercial clouds may not be the ideal solution for the scientific research community, and that scientific clouds operated by the scientific community may provide a more adequate service. In a first approach, commercial clouds may only be feasible for high throughput computing applications without large data processing requirements.

In many countries the national grid initiatives have been deploying computing facilities that, although initially intended for grid computing, are also suitable for other types of distributed computing. The availability of computing resources made possible by the grid can now be used to promote other types of usage, such as cloud computing, to complement or maximize the return on these investments.

An interesting solution for the scientific community would be the creation of a scientific cloud on top of the existing computing resources operated by the national grid initiatives. These clouds could be used to provide grid computing, cloud computing and other types of services, thus covering a wider range of users and needs.

Applications that do not fit the present grid computing models well could be run directly on the cloud resources, or the users could themselves deploy in the cloud whatever middleware they consider most adequate for their applications. Users would have the power to choose and could also take care of their specific needs themselves. In addition, services such as databases, repositories, web servers and others that are not adequately supported in grid computing could be deployed in the cloud.

Since cloud computing provides a more generic approach suitable for a wider range of requirements, it can be more widely accepted and adopted than grid computing. Organizations that do not see a benefit in grid computing could more easily become interested in joining their resources to a scientific cloud infrastructure. By increasing the resources available in the cloud, the potential universe of computing resources for the grids running on top of it also increases, and as such grid computing would benefit from the increased capacity offered by the clouds. This approach could be complemented by an economic model that would grant the organizations sharing resources some processing time in other cloud sites, thus enabling organizations to get something in return for the idle time that they share.

These scenarios show that infrastructures mixing cloud computing and grid computing can be mutually beneficial, and that grid computing can profit from the flexibility and elasticity of the cloud technologies.

References

1. Enabling Grids for E-SciencE. Web site: http://www.eu-egee.org/
2. Lightweight Middleware for Grid Computing. Web site: http://glite.web.cern.ch/glite/
3. Open Grid Forum. Web site: http://www.ogf.org/
4. The International Grid Trust Federation. Web site: http://www.igtf.net/
5. Pan-European data network for research and education. Web site: http://www.geant.net/
6. European Grid Initiative. Web site: http://web.eu-egi.eu/
7. Integrated Sustainable Pan-European Infrastructure for Researchers in Europe. Web site: http://www.egi.eu/
8. Resources and Services Virtualization without Barriers. Web site: http://www.reservoir-fp7.eu/
9. Enhancing Grid Infrastructures with Cloud Computing. Web site: http://www.stratuslab.org/


Privacy for Google Docs: Implementing a Transparent Encryption Layer

Lilian Adkinson-Orellana1, Daniel A. Rodríguez-Silva1, Felipe Gil-Castiñeira2, Juan C. Burguillo-Rial2,

1 GRADIANT, R&D Centre in Advanced Telecommunications,

Lagoas-Marcosende s/n, 36310, Vigo, Spain {ladkinson, darguez}@gradiant.org

2 Engineering Telematics Department, Universidade de Vigo,

C/ Maxwell, s/n, Campus Universitario de Vigo, 36310, Vigo, Spain {xil, jrial}@det.uvigo.es

Abstract. Cloud Computing is emerging as a mainstream technology thanks to the cost savings it provides in deployment, installation, configuration and maintenance. But not everything is positive in this new scenario: the user's (or company's) confidential information is now stored on servers possibly located in foreign countries and under the control of other companies acting as infrastructure providers, so its security and privacy can be compromised. This fact discourages companies and users from adopting new solutions implemented following the Cloud Computing paradigm. In this paper we propose a solution for this problem. We have conceived a new transparent user layer for Google Docs, and implemented it as a Firefox add-on, which encrypts the information before storing it on Google servers, making it virtually impossible to access the information without the right password.

Keywords: cloud computing, google docs, security, privacy, firefox add-on.

1 Introduction

The continuous evolution of Information Technologies (IT) and the lower cost of servers and desktop PCs (which are becoming more and more powerful) are promoting the emergence of new IT services. Among them stands out the Cloud Computing (or simply "Cloud") paradigm. We can say that Cloud Computing was born as the evolution and combination of several technologies, mainly distributed computing [1], distributed storage [2] and virtualization [3]. Cloud Computing implies a change in the traditional paradigms, basically because the infrastructure is completely hidden from the final user. In Cloud Computing we can find 3 levels or layers, as shown in Fig. 1.


Fig. 1. Cloud Computing levels: Infrastructure, Platform and Software as a Service.

IaaS (Infrastructure as a Service) is the lowest level and includes infrastructure services, i.e. (virtual) machines used to run applications. We can find examples of IaaS in Amazon EC2, GoGrid or RackSpace.

PaaS (Platform as a Service) is the next abstraction level. Here we can find a platform that allows developers to build applications following a specific API. Examples of PaaS are Google App Engine, Microsoft Azure or Sales Force.

SaaS (Software as a Service) is the highest level and involves applications offered as a service that are executed in the Cloud (over a PaaS or an IaaS). Examples of SaaS are Google Apps, Salesforce or Zoho.

Some of the specific advantages of using cloud services are scalability, ubiquity, pay-per-use and no hardware/maintenance investment; however, there are some problems related to integration with current systems and especially to the security and reliability of the service [4].

As well-known examples of Cloud Computing applications (belonging to the SaaS level) we may cite Google's Gmail email service or Google Docs, a web based editor for text documents and spreadsheets that offers its service to users who have a Google account. This paper explains the development of an add-on for the Firefox browser [5], which allows users of the Google Docs service to use a security layer to protect their documents in a transparent way.

This paper is organized as follows. Section 2 describes in more depth the privacy problem in Cloud Computing, as well as the services that offer the possibility of editing documents on the Cloud, with particular emphasis on Google Docs. Section 3 describes the functionality and the internal structure of the presented add-on, as well as an example of the requirements and the behavior for users. Finally, section 4 gives some conclusions and raises possible future enhancements to extend the add-on functionality, explaining the current difficulties in implementing them.



2 Privacy in Cloud Computing

2.1 Cloud Privacy

Security in the Cloud Computing paradigm does not only include aspects of confidentiality or privacy of the information; it could also involve the loss of data, although that is out of the scope of this paper. Since the processing of applications is moved to the cloud servers, sensitive user data is exposed to the infrastructure provider. This means that users must trust the providers; nevertheless, this is not always feasible, so some security mechanisms are required to solve the problem.

The case in which users store sensitive data remotely is especially critical, because if the cloud servers containing that information suffered an attack, the users' data would be compromised. For example, [6] explores information leakage in third-party clouds (Amazon EC2) and describes how it is possible, under specific circumstances, to access the information of a cloud server (a virtual machine) from a different virtual machine hosted on the same physical server.

In the case of a service like Google Docs, the documents of the user are simply protected by the password associated with his Google account. If the session is not properly closed or his password is stolen, all the documents kept using this service are exposed.

2.2 Document editing cloud solutions (SaaS)

Currently, there are many SaaS applications that offer the possibility of editing documents on the Cloud. In Table 1 some well-known solutions are compared taking into account their main features.

The table shows a representative set of solutions, most of them free, but many more web based editors with characteristics similar to those described in the table can be found, mainly because Cloud software solutions are becoming more and more common. For example, OpenOffice also offers an online version, which is very similar to the desktop application, but it is still a beta version. Many people use this kind of software for free, and this means that they have to be careful with the sensitive information they store in the Cloud.


Table 1. Comparison among different Cloud office applications.

                       Maximum document size  Maximum storage  Price          Real time collaboration  Edit uploaded documents  Type of documents
Google Docs            500K                   1 GB             free           Yes                      Yes                      Text, Spreadsheets, Presentations
Zoho                   -                      1 GB             free           No                       Yes                      Text, Spreadsheets, Presentations
Microsoft Office Live  25MB                   5 GB             free           No                       No                       Text, Spreadsheets, Presentations
ThinkFree              10 MB                  1 GB             30 days trial  Yes                      -                        Text, Spreadsheets, Presentations
Feng Office            -                      300MB            30 days trial  Yes                                               Text, Spreadsheets, Presentations
Adobe BuzzWord         10 MB                  -                free           Yes                      No                       Text

The reason why we have chosen Google Docs is that it is a very popular and free service, with a complete and well-documented API. The use of this API simplifies the development of possible extensions. In addition, it has an easy interface that allows users to make changes in real time on shared documents.

3 Security layer to protect Google Docs documents

3.1 Firefox add-on to protect Google Docs documents

The security layer we have implemented to add privacy to Google Docs documents relies on a Firefox add-on based on JavaScript [7] and XUL [8], a language similar to XML used to create Firefox extensions.

The add-on uses two hidden documents, created using the Google Docs API, which contain all the information needed to encrypt and decrypt the user's information. One of the documents contains the data about the user's ciphered documents (algorithm used, password and encryption options, if necessary). The other one maintains the same information, but only for the documents that are currently being shared.

When the add-on is enabled, it starts an asynchronous communication with the Google Docs servers using the API, sending AJAX requests to authenticate the user and obtain data about all the documents owned by the user, their sharing permissions and the content of the documents. In this way, the add-on also gets access to the hidden documents described before, which act as indices of the ciphered documents, whether they are shared or not.

Furthermore, as we can see in Fig. 2, when the process starts it creates two channel listeners to capture all the data that is sent to and received from the servers. When, for example, a document is saved, the message with the data is intercepted, encrypting only the user's content and leaving the rest unmodified. A password chosen by the user is required in the encryption process, so nobody else can access the information in the document. Afterwards, the plaintext is replaced with the ciphered text, and the message is released, so it continues on its way to the server. With this method, the user's information received by the server is indecipherable, but the server will not notice any difference because only the document's content is modified. It is also remarkable that every time a document is encrypted, the information related to the process is stored in the indices. If any condition of the ciphering changes, such as the document's password or the algorithm used, the indices are automatically updated accordingly.


Fig. 2. Behavior of the add-on: components and actions involved in a secure editing.

When an encrypted document is requested, the same process is executed, but in the opposite direction. After identifying the document, the hidden index is read and the information associated with the encrypted document is obtained. Then the ciphered data of the incoming message is accessed, decrypted with the information recovered from the index (algorithm, key size, mode…) and finally replaced with the plaintext. When the document is finally shown to the user, it is completely readable, and he can work with it as if it were a normal one.
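The following Python sketch mirrors this flow conceptually: only the content field of an intercepted message is encrypted, and the parameters of the operation are recorded in an index keyed by document identifier. Fernet from the cryptography package is used here as a stand-in for the add-on's selectable algorithms, and the message structure is an assumption for illustration; the real add-on works in JavaScript inside the browser.

"""Conceptual sketch of the interception logic (not the actual
Google Docs wire format; Fernet stands in for the real ciphers)."""
from cryptography.fernet import Fernet

index = {}  # docId -> encryption parameters, like the hidden index document

def encrypt_content(message):
    """Encrypt only the document content of an outgoing message."""
    key = Fernet.generate_key()  # in the add-on, derived from the user's password
    index[message["docId"]] = {"algorithm": "Fernet", "key": key}
    message["content"] = Fernet(key).encrypt(message["content"].encode())
    return message               # every field except 'content' is untouched

def decrypt_content(message):
    """Restore the plaintext of an incoming message using the index."""
    entry = index[message["docId"]]
    message["content"] = Fernet(entry["key"]).decrypt(message["content"]).decode()
    return message

# Round trip: what the server stores is unreadable, the user sees plaintext.
outgoing = encrypt_content({"docId": "doc-1", "title": "notes", "content": "secret text"})
print(decrypt_content(outgoing)["content"])  # -> "secret text"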


3.2 Using the secure Google Docs

In this section we will describe the functionality of the add-on, from the user’s perspective.

The first step to be able to use the add-on is to install the .xpi file, a compressed file that follows the typical structure of a Firefox add-on. Once the add-on has been installed and the user accesses Google Docs with his Google account (http://docs.google.com), it is necessary to activate the add-on by pressing a new button with a lock image that appears in the status bar, or alternatively through the browser's tools menu.

Once it has been enabled, the main difference the user will find with respect to the normal use of Google Docs is that the index table with his/her documents contains more information, indicating which ones have been previously ciphered and which algorithms were used in each case (see Fig. 3).

Fig. 3. Google Docs interface showing the index table of available documents.

The supported algorithms are shown in Table 2 together with their main properties. The user can choose any of them, and his/her choice will influence the security level and the speed of the encryption process [9]. The choice of the user's password is very important too, since the most secure environment could be compromised by the use of a weak password.

If the add-on is enabled, when the user is about to save the changes in a new document, or in one that had not been ciphered until that moment, a new popup window appears, asking the user for the password to cipher the information and for the encryption algorithm with its corresponding options (such as the key size, or the mode where applicable).
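The paper does not detail how the password typed in this dialog is turned into a key for the chosen algorithm; a common approach is a password-based key derivation function such as PBKDF2, sketched below with Python's standard library (the salt handling and iteration count are illustrative assumptions).

import hashlib
import os

def derive_key(password, salt=None, length=32, iterations=100_000):
    """Derive a fixed-length key from a user password with PBKDF2-HMAC-SHA256.
    The salt would have to be stored alongside the other encryption
    parameters (e.g. in the hidden index document) to allow decryption."""
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations, dklen=length)
    return key, salt

key, salt = derive_key("correct horse battery staple")
print(len(key))  # 32 bytes, e.g. suitable as a 256-bit AES key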


Table 2. List of supported encryption algorithms and their main features.

Name        Full name                          Block size               Key size            Security           Speed      Speed depends on key size?
AES         Advanced Encryption Standard       128 bits                 128, 192, 256 bits  Secure             Fast       Yes
DES         Data Encryption Standard           64 bits                  56 bits             Insecure           Slow       -
Triple DES  Triple Data Encryption Algorithm   64 bits                  56-168 bits         Moderately secure  Very slow  No
Blowfish    -                                  64 bits                  32-448 bits         Moderately secure  Fast       No
RC4         Rivest Cipher 4                    64 bits                  8-2048 bits         Insecure           Very fast  No
TEA         Tiny Encryption Algorithm          64 bits                  128 bits            Insecure           Fast       No
xxTEA       Corrected Block TEA                arbitrary (min 64 bits)  128 bits            Moderately secure  Fast       No

After this step, the user will be able to work with the data as usual, the process of ciphering the data being completely transparent. If the add-on is disabled and the user tries to access his ciphered documents, the result will be unintelligible to him, as can be observed in Fig. 4.

Fig. 4. User’s ciphered document opened without using the add-on

Once the encryption parameters have been set for a document, it is also possible to modify them, for example by changing the password or the algorithm that was used the last time the document was saved. If the user wants to remove the ciphering of a particular document (deleting its password), he can do that as well.


4 Conclusion and future work

In this paper we have presented a new security mechanism for SaaS applications that gives the users of the Google Docs service an additional privacy layer to protect their documents on the cloud server side, with a very simple interface. Without the user's password used to encrypt the documents, the information cannot be recovered, even by the person concerned; so if the user forgets it, the data will not be readable.

This application is currently being improved with the possibility of sharing encrypted documents with other users, with the only condition that all users have installed the Firefox add-on and know the shared password.

As a future enhancement of the service, we are working on applying the same solution to spreadsheets, but in this case some interesting problems arise: it is possible to encrypt the data of the spreadsheet, but the operations usually involved in this type of document are not performed on the client side; instead they are carried out on the Cloud servers. Therefore, a specific platform would be required to allow processing operations on encrypted data [10,11], and this feature depends exclusively on the provider.

5 References

1. Mei-Ling Liu, "Computación Distribuida. Fundamentos y Aplicaciones", Ed. Pearson Educación, 2004

2. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, R. E. Gruber, "Bigtable: A Distributed Storage System for Structured Data", Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI'06), November 6-8, 2006, Seattle, WA (USA)

3. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., and Warfield, A. 2003. Xen and the art of virtualization. SIGOPS Oper. Syst. Rev. 37, 5 (Dec. 2003), 164-177

4. Tim O'Reilly, "The Fuss About Gmail and Privacy: Nine Reasons Why It's Bogus", http://oreillynet.com/pub/wlg/4707

5. Firefox add-ons website: https://addons.mozilla.org

6. T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. "Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds", in CCS'09: Proceedings of the 16th ACM Conference on Computer and Communications Security, pages 199-212, New York, NY, USA, 2009

7. T. Negrino, D. Smith, "JavaScript", Pearson – Prentice Hall, 5th ed., Madrid, 2005

8. Mozilla Development Center: XUL [Online]. Available: https://developer.mozilla.org/en/XUL [Accessed: April 9, 2010]

9. A. A. Tamimi, "Performance Analysis of Data Encryption Algorithms". Available: www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf.pdf [Accessed May 11, 2010]

10. Ernest F. Brickell, Yacov Yacobi. "On Privacy Homomorphisms (Extended Abstract)", Advances in Cryptology – EUROCRYPT'87, LNCS, Springer-Verlag 1987, pp. 117-125

11. Juan Ramón Troncoso-Pastoriza, Stefan Katzenbeisser, and Mehmet Celik. "Privacy preserving error resilient DNA searching through oblivious automata". In 14th ACM Conference on Computer and Communications Security, pages 519-528, Alexandria, Virginia, USA, October 29 – November 2, 2007. ACM Press.


Web based collaborative editor for LATEX documents

Fabio Costa1 and Antonio Pinto2

1 CIICESI, Escola Superior de Tecnologia e Gestão de Felgueiras, Politécnico do Porto
[email protected]
2 CIICESI, Escola Superior de Tecnologia e Gestão de Felgueiras, Politécnico do Porto
INESC Porto
[email protected]

Abstract. Document editing is one of the tasks that is now seen as possible in a cloud computing environment. This is mainly due to novel applications such as Google Docs that are now offered as a service. Collaborative document editing implies that multiple authors are allowed to perform document edition, supporting functionalities such as revision, commenting and modification management. Cloud computing consists in the idea of moving all files and applications to the "Cloud", and allowing user access from any system and platform, requiring only an Internet connection. Moreover, these applications are expected to work similarly to desktop applications. The capability of these applications to work without connectivity to the Internet appears to be the key issue to address.

1 Introduction

Novel applications such as Google Docs are now offering document editing functionalities as a service. The large adoption of this type of application triggered the evolution of web standards, namely HTML, to cope with their requirements. This type of application is classified as Software as a Service (SaaS). SaaS is software that is available directly from the Internet, not requiring a typical software installation. This eliminates compatibility problems between different platforms, the browser being the homogenization layer. This model reduces implementation costs and makes software maintenance easier [6,9].

The present work consists in developing a web application that allows creation, storage, and collaborative editing of LATEX files with revision history support. The key innovation of this work relies on the foreseen capability to work off line. There are some feature-rich LATEX editors online [2,3,4], while others focus on particular uses of LATEX such as equation editing [5]. However, none allows its use without a permanent Internet connection.


Listing 1.1. Google Gears sample manifest file

{
  "betaManifestVersion": 1,
  "version": "my_version_string",
  "redirectUrl": "login.html",
  "entries": [
    { "url": "main.html", "src": "main_offline.html" },
    { "url": ".", "redirect": "main.html" },
    { "url": "main.js" },
    { "url": "formHandler.html", "ignoreQuery": true }
  ]
}

2 Off line operation

The support for temporary off line operation, and the required data synchronization when the service comes back on line, can be provided by Google Gears [8] or, more recently, by HTML 5 [7].

2.1 Google Gears

Google Gears is a JavaScript API package that includes a local server, a database and a worker pool. The main purpose of the local server is to enable web applications to start and operate without Internet connectivity by caching and serving resources locally, using HTTP. The database allows the developer to store the user's structured data on the client side; when the application reconnects to the Internet, this data will have to be synchronized with the main database server. The worker pool allows the execution of JavaScript functions in the background without hampering the main page execution. For instance, this asynchronous operation of the worker pool is helpful when synchronizing large amounts of data while still maintaining the application's responsiveness.

The local server is the component that manages the application-specific cache. To do so, Google Gears makes available the ResourceStore and the ManagedResourceStore classes. The first implements a typical URL cache, storing ad-hoc URLs locally, such as PDF files. The latter implements mechanisms to automatically download and update a set of URLs identified in a manifest file.


Listing 1.2. HTML 5 sample manifest file

1 CACHE MANIFEST
2 NETWORK:
3 comm.cgi
4 CACHE:
5 images/sound-icon.png
6 images/background.png
7 style/default.css

An example manifest is shown in Listing 1.1. The manifest file is composed of attribute-value pairs that identify the contents to be cached. Namely, the entries attribute lists the URLs of all resources to be used by the application while operating without connectivity.

2.2 HTML 5

On the other hand, the newer version of HTML (HTML 5) contains features that help web applications operate off line, using manifest files on the web server similarly to Google Gears. Manifests include a list of files required by the application when working off line. The web browser then stores these files in a local cache so that they can still be accessed even if the Internet connection is lost.

An example HTML5 manifest is shown in Listing 1.2. These manifests may be composed of up to three sections: cache, network and fallback. The first section (lines 4 to 7) represents the resources that will be downloaded and cached locally, to be used instead of the online resources whenever there is no Internet connection. The network section (lines 2 and 3) enumerates the resources that must never be cached locally, always requiring an Internet connection to be used. A fallback section can also be used to identify substitutes for online resources that were not cached successfully, for whatever reason.

Listing 1.3 exemplifies how to associate an HTML5 cache manifest with an HTML page. This association, shown in line 2, must be done for every HTML page that will require offline operation.


Listing 1.3. Sample inclusion of a manifest file in HTML

<!DOCTYPE HTML>
<html manifest="cache.manifest">
<body>
...
</body>
</html>

2.3 Summary

Both Google Gears and HTML5 support the development of web applications that are capable of operating in scenarios of intermittent connection to the Internet. The use of manifest files that list the resources to be cached locally is present in both approaches.

The major drawback of HTML5 resides in the fact that it requires an active Internet connection on the initial access to the web application. Google Gears, by including its local server, is capable of serving the application even if there is no active Internet connection.

The major drawback of Google Gears resides in the fact that it is not an open standard, which reduces its availability on client desktops. Google Gears also requires installation on the client desktop prior to its use.

3 Related work

Multiple web based LATEX document editors exist online, namely ScribTeX [3], MonkeyTex [2] and Verbosus [4]. Table 1 compares their functionalities. In particular, it shows that none of the three enables off line operation, Verbosus does not allow document editing by more than one user, only ScribTeX enables users to impose document access permissions, and MonkeyTex does not render LATEX files into PDF files.

           Off line operation  File sharing  Permission management  PDF rendering
ScribTex   N                   Y             Y                      Y
Verbosus   N                   N             N                      Y
MonkeyTex  N                   Y             N                      N

Table 1. Comparison of online LATEX document editors


4 Proposed solution

The proposed solution consists of a web application for collaborative editing of LATEX documents, using the novel features of HTML5 combined with LATEX document segmentation and latexdiff [1]. The requirements identified for this application include:

1. User and session authentication
2. User management
3. User's permissions management
4. Revision history
5. File differences mark-up

The first requirement imposes that a user must not access the application without being previously authenticated. The second requirement imposes that each document owner must be able to identify other users that will be able to collaboratively edit one or more LATEX documents. The third requirement imposes that each document owner must be able to specify which types of operations the remaining users are allowed to perform over the document. The fourth requirement imposes that the document owner must be able to view all modifications made to documents. The fifth requirement imposes that users, when editing the same document, must be able to visibly identify changes introduced by other users.

The proposed solution, named LATEX Web Editor (LWE), will make use of: HTML5 to support off line operation of the application; latexdiff to generate PDF files that visually mark up significant differences; and LATEX document segmentation, which will enable synchronization and manipulation of document parts while being edited by multiple users.
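As an illustration of how the server side of such an application could produce the marked-up PDF, the sketch below invokes latexdiff on two revisions of a document and then compiles the result; the file names are placeholders, and error handling and the integration with the revision history are omitted.

"""Sketch: generate a change-marked PDF from two LaTeX revisions
using latexdiff and pdflatex (both assumed to be installed)."""
import subprocess

def marked_up_pdf(old_tex, new_tex, diff_tex="diff.tex"):
    # latexdiff writes the marked-up LaTeX source to stdout.
    with open(diff_tex, "w") as out:
        subprocess.run(["latexdiff", old_tex, new_tex], stdout=out, check=True)
    # Compile the marked-up source into a PDF (diff.pdf).
    subprocess.run(["pdflatex", "-interaction=nonstopmode", diff_tex], check=True)

marked_up_pdf("revision_41.tex", "revision_42.tex")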

5 Conclusion

We propose to develop a web based collaborative LATEX editor that can operate despite temporary losses of Internet connection, using the novel features of HTML5 that enable off line operation.

References

1. latexdiff, available online at http://www.ctan.org/tex-archive/support/latexdiff, April 2010.
2. MonkeyTex, available online at http://monkeytex.bradcater.webfactional.com, April 2010.
3. ScribTeX, available online at http://www.scribtex.com, April 2010.
4. Verbosus, available online at http://www.verbosus.com, April 2010.
5. CodeCogs. Online equation editor, available online at http://www.codecogs.com/components/equationeditor/equationeditor.php, April 2010.
6. M. Corlan, D. Nickul and J. Wilber. Software as a service: A pattern for modern computing. May 2009.
7. Ian Hickson. HTML5 (including next generation additions still in development). February 2010.
8. Omar Kilani. Taking web applications offline with Gears. August 2007.
9. Lizhe Wang and Gregor von Laszewski. Scientific cloud computing: Early definition and experience. October 2008.


Managing Cloud Frameworks through Mainstream and Emerging NSM Platforms

Pedro Assis

School of Engineering, Porto Polytechnic Institute

Portugal [email protected]

Abstract. In the future, Cloud interoperability shall be a key requirement, notably in hybrid model federated spaces, a scenario that can be envisaged for “academic Clouds.” This paper proposes the integration of Cloud computing software frameworks, commonly called Infrastructure as a Service, with Network and Systems Management (NSM) platforms. In spite of current efforts addressing Cloud interoperability, the author argues that state of the art management technologies can already provide such support. This proposal envisages the development of several adapters to expose Cloud framework data as an SNMP agent, a CIM provider and a SPARQL endpoint. Hence, the heterogeneity of Cloud monitoring, configuration and event handling is addressed through integration with mainstream and emerging management domains. Design issues concerning the development of a CIM provider for OpenNebula are highlighted.

Keywords: Network and Systems Management Standards, Cloud Computing, IaaS Management

1 Introduction

According to Mell and Grance, NIST (National Institute of Standards and Technology) researchers, Cloud computing is both a deployment and a service model [1] that aims to transform ICT (Information and Communication Technologies) platforms into elastic, highly available, fault tolerant, secure and multi-tenant systems. As such an evolution takes place, it is expected that ICT technicians will focus their work on their companies’ core business rather than on technology complexity. This complexity has been referred to by IBM researchers Kephart and Chess as the “main obstacle to further progress in the ICT industry” [2]: complexity results from the deployment of larger, more sophisticated computer-based systems, revealing an increasing need to access “everything”, “anywhere”, at “anytime.” The IBM Autonomic Computing Manifesto identifies system complexity and the human inability to manage it as key issues that must be addressed. According to IBM, to enable the sustained evolution of ICT platforms the solution, and the challenge, is to develop self-managing computer-based systems.


This vision, named Autonomic Computing, relates to natural self-organizing systems, which account for large numbers of interacting entities at different levels.

As Cloud computing technology matures, different platforms are being deployed that make use of specific interfaces and tools to implement management functions. This proposal addresses management interoperability between Cloud computing software frameworks (e.g. OpenNebula, Eucalyptus and Nimbus), and between these and mainstream NSM platforms, namely the Simple Network Management Protocol (SNMP) and Web-Based Enterprise Management (WBEM), as well as emerging semantic Web technologies that are being applied to NSM, namely the Resource Description Framework (RDF) and the Web Ontology Language (OWL). The development of several adapters to expose Cloud Computing Framework (CCF) data as an SNMP (sub)agent, a CIM (Common Information Model) provider and a SPARQL (Simple Protocol and RDF Query Language) endpoint promotes the integration of CCF management with current management domains.

It is widely accepted that Cloud computing adoption will benefit from synergies with other research and standardization initiatives, namely where management is concerned. Why? First, by promoting the integration of Cloud computing frameworks with mainstream management domains, CCF management shall profit from widely deployed management standards and from the widespread knowledge regarding their use. Secondly, CCFs will capitalize on emerging management technologies and tools, which address contemporary management requirements. Thirdly, NSM platforms offer a common “interface” to unify Cloud framework monitoring, configuration and event handling.

Although the proposed approach sounds valuable, several questions must be investigated in the course of this research: Which Cloud computing scenarios are most likely to profit from this work? Which initiatives, including standardization efforts, are taking place? What are the CCF management requirements? Can Cloud computing management be seamlessly integrated into current management domains? To answer these questions, two case studies are presented. The first relates to the identification and analysis of a scenario that shall benefit from this research: a case study on European academic Clouds. The second aims to evaluate the architectural and design issues of integrating OpenNebula’s management within the WBEM/CIM platform. This effort is based on the OpenPegasus management broker.

The remainder of this paper offers an overview of network and systems management (Section 2) based on three key questions: Why, What and How. In Section 3 a brief introduction to Cloud computing is given. A Cloud federation hybrid model for academic Clouds is discussed in Section 4. In Section 5, related research and the project proposal are presented. As a proof of concept, a case study on WBEM/CIM and the OpenNebula CCF is discussed in Section 6. The main conclusions and further work are presented in Section 7.


2 Network and Systems Management: Why, What and How?

Management functional areas follow OSI’s classification: Fault, Configuration, Accounting, Performance and Security (FCAPS). Fault, accounting and performance account for different views (e.g. abnormal operation detection) of system monitoring, while configuration and security are related to system control (e.g. user access). The deployment of such management functions enables network and systems administrators to pursue users’ and organizations’ requirements. Some of the most cited characteristics a system must provide from the users’ perspective are (in no particular order) improved automation, personalization and ease of use; security and monitoring features; and adequate response time and restore capability. From the organizations’ point of view, the most relevant features are the ability to control corporate strategic assets, complexity and costs; to reduce downtime and improve services; and to support integrated management.

In the 1990s, distributed management found its way out of the research labs to address the evolution of networks and computer-based systems, with which the centralized management paradigm could no longer cope. Although earlier versions of TCP/IP SNMP only supported a “weak” form of distribution (based on the client/server paradigm), SNMPv3, RMON (Remote Monitoring) and OSI CMIP (Common Management Information Protocol) support management by delegation [3] (hierarchical management).

In recent times, the challenge has been to leap from mere delegation to collaboration between the entities involved in the management process. Collaboration requires a “strong” form of the distributed paradigm, based either on object distribution or on code mobility. Regarding object distribution, the SNMP Script MIB, Sun JMAPI (Java Management API), OMG CORBA (Common Object Request Broker Architecture) and DMTF WBEM are examples of this approach. These systems support object distribution across heterogeneous environments (object model), supporting a set of interactions and access to common services (reference model). Regarding code mobility, there are two kinds: strong and weak [4]. In the first case, the management code plus its execution state migrates between manageable nodes; Telescript, Agent Tcl and Emerald are examples of frameworks that support strong mobility. Weak mobility, on the other hand, concerns code migration only, meaning that the embedded management tasks are reinitialized every time they move to another location; Mole, Tacoma, M0, Facile, Obliq and Safe-Tcl are examples of such platforms.

Although object and code distribution yield truly distributed systems, these approaches only address the “What” and “How” questions; answers to these questions should disclose the object and code distribution strategies, respectively. To achieve true collaboration, the “Why” question must also be tackled. Answers to this question are goal driven and should provide the required guidelines for the previous questions. The Semantic Web effort (RDF, OWL, SPARQL, etc.) and Distributed Artificial Intelligence (DAI) can play an important role in the deployment of semantic management environments. Further information regarding distributed network and systems management taxonomies can be found in [5].


3 Cloud Computing

The Cloud computing paradigm has been under the scrutiny of researchers and business. Between criticism and praise, Cloud computing is affirming itself as capable of integrating existing technologies and tools into its ecosystem. In the author’s view, the reuse of technologies and standards, on demand provisioning (elasticity) and a new business model (pay-as-you-go) are among the Cloud computing highlights that justify the paradigm’s added value for ICT evolution.

The roots of Cloud computing lie in utility computing back in the 1990s, when Application Service Providers (ASP) started to deliver software as a service. Web services followed, and with them the promise of a new model for software delivery based on a registry for dynamic binding and discovery. Tightly coupled with Web services, the Service-Oriented Architecture (SOA) generalized the service provider-consumer pattern. Finally, Grid computing stands side by side with Cloud computing, although the latter offers much more than a simple batch submission interface. According to Keahey et al., “Cloud computing represents a fundamental change from the Grid computing assumption: when a remote user ‘leases’ a resource, the service provider turns control of that resource over to the user” [6].

In the real world, Cloud computing should provide the means to handle user demand for services, applications, data, and infrastructure in such a way that these requests can be rapidly orchestrated, provisioned, and scaled up or down through a pool of computing, networking, and storage resources.

Cloud services are made available through different deployment models. Mell and Grance envisage the following: Private, Community, Public and Hybrid Clouds. A Private Cloud is operated by a single entity, while a Community Cloud is operated by a set of organizations that share common interests. Public Clouds are made available to the public or to a large industry group and are owned by an organization that sells Cloud services. A Hybrid Cloud is a composition of two or more Clouds as described before. These deployment models do not require an in-house Cloud infrastructure, nor its management or control; these can be provided by a third party under an outsourcing agreement.

OpenCrowd’s (www.opencrowd.com) taxonomy establishes four areas for Cloud computing: Infrastructure services (e.g. storage and computational resources), Cloud services (e.g. appliances, file storage and Cloud management), Platform services (e.g. business intelligence, database, development and testing) and Software services (e.g. billing, financial, legal, sales, desktop productivity, human resources and content management). On the other hand, NIST advises that Cloud computing should offer three main types of services, each addressing specific user needs: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). IaaS offers the provision of raw computing resources, including processing, storage and network. The consumer has control over the assigned resources, but not over the underlying Cloud platform; OpenNebula, Eucalyptus and Nimbus are examples of Cloud computing software frameworks. PaaS provides a development platform, comprising programming languages and tools, which enables the consumer to develop and deploy applications onto the Cloud infrastructure.


Following the e-Science initiative (www.eu-egee.org), European higher education institutions currently providing Grid computing services would be integrated into this virtualized infrastructure, offering Grid services as a platform as a service. In this way, native Grid applications could cohabit as resources in a Cloud computing ecosystem, alongside Google Apps, Microsoft Windows Azure, SalesForce.com and others. Finally, SaaS provisions applications and services running on top of the Cloud platform; the consumer has no control except over user configuration data (e.g. Facebook, Gmail). The main difference between these two taxonomies is the emphasis that OpenCrowd places on the need to “create customized Clouds,” while the work of Mell and Grance does not.

4 Cloud Federation Hybrid Model: A case study on European academic Clouds

Cloud computing is an opportunity to promote cooperation among Higher Education Institutions (HEIs) regarding knowledge and resource sharing, as it provides technological solutions to share infrastructures, applications, services and data. Cloud technology enhances the ability to cooperate, speeds up processes, increases service availability and scales resources, with potential cost reductions. A federated space of academic Clouds will embrace HEIs’ private Clouds as well as public resources. Such a hybrid model requires interoperability among infrastructures to overcome technology heterogeneity.

A federation of European academic Clouds would involve higher education institutions with different backgrounds, some of which may have had no prior contact with Cloud computing technology. Such multidisciplinary cooperation between partners is a strong point: the input of the participating institutions does not originate in one specific area but rather spans different areas, bringing together institutions that have not had joint projects before and promoting cooperation between different fields.

HEI globalization allows students and higher education staff to acquire the proficiency demanded to ensure their success in a global workplace. Such competences are no longer confined to scientific and technical issues, but include language skills as well as social, cultural, political and ethical knowledge. Cross-credit processes and international dual awards are among the initiatives that HEIs are already deploying, and they require technological support to become effective. Cloud computing SaaS services might present feasible solutions to this use scenario, as it is necessary to develop common interfaces to promote interoperability among HEIs’ academic and administrative applications and information systems. One area that must be addressed is the enhancement of the European Authentication and Authorization Infrastructure (AAI) to support secure academic information transactions using standard procedures related to metadata description and information mapping, authentication, and data confidentiality [7]. It is likely that some of the open issues can be tackled using Semantic Web and policy management standards, namely regarding the enrichment of information description, data consolidation, account management interoperability and Service Level Agreements (SLA).


Internationalization requires a steady flow of financial support for institutions and mobility scholarships. In the years to come, this may prove problematic for Southern European countries such as Portugal. According to the OECD (Organization for Economic Co-operation and Development), overall funding per student in OECD countries “has slowed down since the early 1990s” [8]. The same study concludes that direct public funding in 2003 was still the main source of revenue for most European public HEIs, namely the Portuguese ones (about 90%), while in Asia/Pacific the scenario was quite different: in 2003, direct public funding in Japan, the Republic of Korea and Australia was less than 40%, with the remainder coming from household expenditure. In OECD countries private funding has a small impact on HEI budgets, except in the United States, but it grew 5% from the early 1990s to 2003. Despite these facts, internationalization is not only about costs; it is an investment that provides important direct revenues: in the academic year 2007-08, international (in-bound) students in Australia accounted for the third place in the country’s export balance ($14.1 billion).

The trend should be the establishment of partnerships and, through them, the improvement of HEIs’ portfolios, the attraction of more foreign students, and the reduction of operational costs by sharing “academic commodities.” It is in this context that the Cloud computing paradigm can make a difference: datacenter consolidation, cluster resource sharing, and the use of third party (business) Clouds for academic services (email and others) shall bring, in the mid term, financial advantages, as the costs associated with software and hardware acquisitions (on-site installations) and technical staff are reduced. The University of Westminster reports, “The cost of using Google Mail was literally zero. It was estimated that providing the equivalent storage on offer on internal systems would cost the University around £1,000,000” [9].

To gain economic advantages from Cloud computing, HEIs can, on the one hand, start to use free or low cost services provided by businesses (education programs) and, on the other hand, migrate their monolithic datacenters to (private) Clouds. However, this is just the tip of the iceberg. The deployment of a Cloud community whose members, assuming both the provider and the consumer roles, openly cooperate in a Cloud federation supporting transparent and elastic provision, i.e. allowing the dynamic scaling up and down of HEIs’ resources (IaaS), will unlock the full potential of the Cloud computing paradigm. In this case, each federated HEI should take the provider role and contribute to a common resource pool, accept common management and control policies, deploy common provision rules, and agree on SLA principles. In this context, a federation of identity providers must be established, similar to the AAI platform deployed to support the federated space of Learning Management Systems (LMS).

5 NSM & CCF Partnership: Related research and project proposal

Although several initiatives are taking place toward Cloud computing framework interoperability, none is actually widely supported, as each IaaS platform has a specific management console and internal API.


CCF management interoperability is being addressed, among others, by the Open Grid Forum OCCI (Open Cloud Computing Interface) working group, the DMTF Open Cloud Standards Incubator and Zend’s Simple Cloud API. OCCI (www.occi-wg.org) is developing an “API specification to remote management of Cloud computing infrastructure” to support the deployment, autonomic scaling and monitoring required by the life-cycle management of virtualized infrastructures. With the Open Cloud Standards Incubator (www.dmtf.org/about/cloud-incubator), the DMTF is working toward the standardization of Cloud management and interactions to facilitate interoperability. Finally, the Simple Cloud API is designing a common API to support file and document storage, as well as simple queue services.

The present proposal (Figure 1) has goals similar to those of the above initiatives, as it addresses CCF management interoperability through integration with current management domains. However, this proposal’s main focus is to reuse current management technologies, promote the identification of possible shortcomings and lay out feasible solutions. Pursuing such an “integrated management scenario,” the development of a set of adapters is proposed to “expose” CCF monitoring, configuration and event handling data as an SNMP (Simple Network Management Protocol) agent (or sub-agent), a CIM (Common Information Model) provider, and a SPARQL (Simple Protocol and RDF Query Language) endpoint.

The SNMP adapter consists of a Management Information Base (MIB) module and an SNMP (sub)agent. The MIB describes the Cloud framework’s data model, limited to what is exposed through the framework’s API. The adapter processes BER (Basic Encoding Rules) encoded SNMP Protocol Data Units (PDU), invoking native (CCF) scripts. As far as the WBEM adapter is concerned, it comprises a CIM provider, a CIM-to-XML (Extensible Markup Language) encoder/decoder and a HyperText Transfer Protocol (HTTP) server. The CIM provider interacts with the Cloud computing framework, while the remaining modules support the interaction with the WBEM platform using HTTP/XML.

Fig. 1. Integration of CCF management with current NSM platforms
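
A minimal sketch of the SNMP adapter’s core idea follows: a table that resolves MIB object identifiers to handlers invoking native CCF command line tools. The BER encoding and agent transport are omitted, as a real (sub)agent would delegate them to an SNMP toolkit (e.g. Net-SNMP via AgentX); the OID prefix is hypothetical and the onevm/onehost commands are assumed to be available on the front end.

```python
# Conceptual sketch of the SNMP adapter core: a table mapping MIB object
# identifiers (OIDs) to handlers that invoke native CCF scripts/CLI tools.
# BER encoding/decoding and the agent transport are deliberately omitted.
import subprocess

ENTERPRISE_OID = "1.3.6.1.4.1.99999.1"   # hypothetical enterprise sub-tree

def _run(cmd):
    """Invoke a native CCF command and return its textual output."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Scalar objects exposed by the adapter: OID -> callable returning the value.
OID_HANDLERS = {
    # number of virtual machines (output has one header line)
    ENTERPRISE_OID + ".1.0": lambda: max(0, len(_run(["onevm", "list"]).splitlines()) - 1),
    # number of registered hosts
    ENTERPRISE_OID + ".2.0": lambda: max(0, len(_run(["onehost", "list"]).splitlines()) - 1),
}

def snmp_get(oid: str):
    """Handle a GET request for one OID (the decoded part of an SNMP PDU)."""
    handler = OID_HANDLERS.get(oid)
    return handler() if handler else None

if __name__ == "__main__":
    print(snmp_get(ENTERPRISE_OID + ".1.0"))
```

Keeping the OID-to-handler mapping declarative mirrors the MIB structure and isolates the native CCF invocations from the SNMP machinery.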


Finally, the SPARQL endpoint enables the storage of CCF data in RDF/OWL triple stores, enabling its integration and reasoning with other data sources (eventually in the Linked Data space). To provide an integrated view of heterogeneous IaaS infrastructures, the exposed (manageable) data must be described according to a common RDF/OWL ontology, or ontology merging mechanisms must be explored.
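
A minimal sketch of this idea, using the rdflib library and a hypothetical ontology namespace, is shown below; a production endpoint would expose a persistent triple store over HTTP instead of an in-memory graph.

```python
# Minimal sketch: expose CCF monitoring data as RDF triples and query them
# with SPARQL. The ontology namespace and property names are hypothetical.
from rdflib import Graph, Literal, Namespace

CCF = Namespace("http://example.org/ccf#")   # hypothetical ontology
g = Graph()
g.bind("ccf", CCF)

# Facts that an adapter could derive from the framework's API.
g.add((CCF.vm42, CCF.state, Literal("RUNNING")))
g.add((CCF.vm42, CCF.hostedOn, CCF.host01))
g.add((CCF.host01, CCF.freeMemoryMB, Literal(2048)))

# SPARQL query: which hosts run at least one RUNNING virtual machine?
query = """
    PREFIX ccf: <http://example.org/ccf#>
    SELECT DISTINCT ?host WHERE {
        ?vm ccf:state "RUNNING" .
        ?vm ccf:hostedOn ?host .
    }
"""
for row in g.query(query):
    print(row.host)
```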

6 A Case Study on WBEM/CIM and OpenNebula

The Distributed Management Task Force WBEM/CIM standards, which evolved from an industry initiative in 1996 to an industry standard in 1999, use Web technologies to promote management interoperability between different management domains. These standards reflect the DMTF’s understanding of the fundamental requirements for management success: a common data description; an on-the-wire encoding; and a set of operations to manipulate the data.

The Common Information Model infrastructure (currently in version 2.6) is an object-oriented modeling tool, described by the MOF (Managed Object Format) language and supported by a UML profile (version 1.0.0b) developed by the DMTF in partnership with the OMG (Object Management Group). CIM establishes a three layer model: the Core, Common and Extension models. The Core model (version 2.25.0) captures notions applicable to all areas of management, while the Common model (version 2.25.0) captures notions common to particular management areas but independent of technology or implementation. The Extension models are technology-specific extensions of the previous ones.

Web-Based Enterprise Management (WBEM) allows CIM implementations to operate in an open, standardized manner. To achieve this, it encapsulates in HTTP messages XML documents describing CIM constructs and operations (versions 2.3.1 and 1.3.1, respectively). The HTTP messages expose CIM operation information in their headers to allow efficient firewall/proxy handling. A CIM query language specification (version 1.0.0) is also available.
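
As a minimal sketch of these operations, assuming the pywbem client library and placeholder endpoint, credentials and class name, a management application could query a CIMOM hosting the CCF provider as follows:

```python
# Minimal WBEM client sketch: issue CIM-XML operations over HTTP(S) against a
# CIMOM (e.g. OpenPegasus) that hosts the CCF provider. The URL, credentials
# and the ONE_* class name are illustrative placeholders.
import pywbem

conn = pywbem.WBEMConnection(
    "https://cimom.example.org:5989",        # hypothetical CIMOM endpoint
    ("operator", "secret"),                  # hypothetical credentials
    default_namespace="root/cimv2",
)

# Enumerate instances surfaced by the (future) OpenNebula provider.
for instance in conn.EnumerateInstances("ONE_ComputerSystem"):  # hypothetical class
    print(instance.classname, instance["Name"])
```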

The WBEM/CIM framework architecture is presented in Figure 2. In this architecture, the OpenNebula provider will act as a broker between the CIM Object Manager (CIMOM) and the OpenNebula CCF. Several open source projects provide integrated WBEM/CIM frameworks: OpenPegasus (currently in version 2.10.0), developed and maintained by The Open Group; WBEM Services (version 1.0.2), based on JSR 48, by Sun Microsystems; OpenWBEM (version 3.2.2), by Quest Software and Novell; and the SNIA CIMOM, now obsolete.

Fig. 2. WBEM/CIM framework architecture


OpenNebula’s provider specification establishes OpenPegasus as the development framework (Figure 3) due to its CMPI (Common Manageability Programming Interface) support, performance, completeness and development platform. This project is also actively supported, providing updates in a timely manner. Besides full-fledged WBEM/CIM frameworks, an interesting option is the IBM WBEM/CIM server called SFCB (Small Footprint CIM Broker), developed by the SBLIM project. This product is available for many Linux distributions and supports the CMPI interface. According to results published by IBM, SFCB has a smaller footprint and, in some scenarios, a better response time than OpenPegasus (the test scenarios are described in [10]).

The CIMOM/OpenNebula provider interface (Figure 3) will be based on The Open Group’s CMPI, version 2.0 [11]. The main advantage of using this interface is provider re-use in any WBEM-based management server that supports it: “write once, run everywhere.” The interface supports C bindings, although C++ and others are also possible (but not standardized); it reduces the effort required to write a provider (e.g. memory management issues), offers interoperability between CMPI-compliant CIMOMs, supports all common CIMOM functions, is scalable (i.e. thread-safe), and also allows remote management (using Remote CMPI).

The OpenNebula provider/CCF interface (Figure 3) can be based either on the XML-RPC interface (version 1.4), which has support for Java, Ruby and C/C++, or on OCA (OpenNebula Cloud API), version 1.4, which has Ruby and Java bindings (a minimal XML-RPC query sketch is given after the profile list below). Both interfaces provide access to the data related to the framework’s monitoring, configuration and event handling. Such data is organized in a set of categories, e.g. host management, virtual machine management, virtual network management, and user management. These concepts are suitably described by a set of CIM classes, namely CIM_Application, CIM_User, CIM_Device, CIM_Network, CIM_Security, and CIM_System. However, to ensure a consistent description of management domains and attain interoperability between management applications, the guidelines provided by the DMTF management profiles must be followed. These profile documents identify the classes that must be instantiated, as well as the properties, methods and values that must be manipulated to “represent and manage a given domain.” The DMTF provides several profile documents related to virtualization, namely:

− Resource Allocation Profile and Allocation Capabilities Profile (both abstract patterns)
− System Virtualization Profile and Virtual System Profile (autonomous profiles)

Fig. 3. OpenNebula provider interfaces
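
A minimal sketch of the XML-RPC option follows; the endpoint is OpenNebula’s usual front end address, while the session string and the method’s argument conventions vary between OpenNebula releases and should therefore be taken as assumptions to be checked against the deployed version.

```python
# Minimal sketch: query OpenNebula's XML-RPC interface for the VM pool.
# Endpoint, credentials and argument conventions are assumptions; check the
# XML-RPC reference of the deployed OpenNebula release before use.
import xmlrpc.client

ONE_ENDPOINT = "http://localhost:2633/RPC2"     # default front-end endpoint
SESSION = "oneadmin:oneadmin"                   # placeholder user:password session string

proxy = xmlrpc.client.ServerProxy(ONE_ENDPOINT)

# one.vmpool.info typically returns [success_flag, xml_body_or_error, ...];
# the filter/range arguments below follow the convention of later releases.
success, body = proxy.one.vmpool.info(SESSION, -2, -1, -1, -1)[:2]
if success:
    print(body)          # XML document describing the virtual machine pool
else:
    print("OpenNebula error:", body)
```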

7 Conclusion

In this paper the author presented a feasible approach, based on an OpenNebula CIM provider, to support Cloud management interoperability through integration with network and systems management domains: NSM platforms offer common interfaces to unify Cloud framework monitoring, configuration and event handling. Moreover, the development of NSM adapters shall allow the integration of CCF platform management with the management of the remaining resources, virtualized or not.

The merits of the proposal were identified and an application scenario based on a hybrid model federated space of academic Clouds was analyzed. Although the OpenNebula CIM provider is still an ongoing effort and some use scenarios remain to be addressed, from the study (based on the on-the-wire encoding, data descriptions and supported operations) and the implementation made so far (based on the OpenPegasus management broker), the author is confident of the suitability of WBEM/CIM to seamlessly support all of OpenNebula’s management functions. In the future, a similar provider is envisaged for EC2-based Clouds (Figure 3). Only then can true interoperability in a federated Clouds scenario be evaluated.

References

1. Mell, P. and Grance, T. (2009) NIST Definition of Cloud Computing v15. NIST.
2. Kephart, J. O. and Chess, D. M. (2003) “The Vision of Autonomic Computing”, IEEE Computer, Vol. 36, No. 1, pp 41-50.
3. Goldszmidt, G. and Yemini, Y. (1995) “Distributed Management by Delegation”, Proceedings of the 15th International Conference on Distributed Computing Systems.
4. Ghezzi, C. and Vigna, G. (1997) “Mobile Code Paradigms and Technologies: A Case Study”, Mobile Agents 97, Rothermel, K. and Popescu-Zeletin, R. (Eds.), Lecture Notes in Computer Science, Vol. 1128, Springer-Verlag.
5. Martin-Flatin, J.-P. and Znaty, S. (2000) “Two Taxonomies of Distributed Network and Systems Management Paradigms”, Emerging Trends and Challenges in Network Management, Ho, L. and Ray, P. (Eds.), Plenum Publishers.
6. Keahey, K., Tsugawa, M., Matsunaga, A. and Fortes, J. (2009) “Sky Computing”, IEEE Internet Computing, Vol. 13, No. 5, pp 43-51.
7. Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R. H., Konwinski, A., Lee, G., Patterson, D. A., Rabkin, A., Stoica, I. and Zaharia, M. (2009) Above the Clouds: A Berkeley View of Cloud Computing. Electrical Engineering and Computer Sciences, University of California at Berkeley, Technical Report No. UCB/EECS-2009-28.
8. Kärkkäinen, K. (2006) “Emergence of Private Higher Education Funding Within the OECD Area”. OECD.
9. Sultan, N. (2010) “Cloud computing for education: A new dawn?”, International Journal of Information Management, Elsevier, Vol. 30, pp 109-116.


10. Schuur, A. (2005) SFCB: Small Footprint CIM Broker. Linux Technology Center System Management, IBM.

11. The Open Group (2006) Systems Management: Common Manageability Programming Interface (CMPI), Issue 2.0.
