globus virtual workspaces

Cloud Computing and Virtualization with Globus Oakland, May 2008 Kate Keahey ([email protected]) Tim Freeman ([email protected]) University of Chicago Argonne National Laboratory


Page 1: Globus Virtual Workspaces

Cloud Computing and Virtualization

with Globus

Oakland, May 2008

Kate Keahey ([email protected]), Tim Freeman ([email protected])

University of Chicago, Argonne National Laboratory

Page 2: Globus Virtual Workspaces

05/14/08 Virtual Workspaces: http://workspace.globus.org

Cloud Computing Tutorial Hands-on

To participate in the hands-on part of the tutorial, send your PKI X509 subject line to [email protected]

The first 10 requests will be given access to the nimbus cloud

Hurry!

Page 3: Globus Virtual Workspaces

Overview
  Motivation
  The Workspace Ecosystem: Abstractions and Background
  The Workspace Deployment Tools
  Managing Resources with Virtual Workspaces
  Appliance management and contextualization
  Virtual Cluster Management with Workspace Tools
  Application Example: the STAR experiment
  Cloud Computing
  Run on the cloud: hands-on tutorial

Page 4: Globus Virtual Workspaces

Motivation

Page 5: Globus Virtual Workspaces

A Good Workspace is Hard to Find

1) Configuration: finding an environment tailored to my application

2) Leasing: negotiating a resource allocation tailored to my needs

Page 6: Globus Virtual Workspaces

Consumer’s Perspective: Quality of Life
Real-life applications are complex
  STAR example: developed over more than 10 years by more than 100 scientists, comprises ~2 M lines of C++ and Fortran code
… and require complex, customized environments
  Rely heavily on the right combination of compiler versions and available libraries
Environment validation
  To ensure reproducibility and result uniformity across environments

Page 7: Globus Virtual Workspaces

Consumer’s Perspective: Quality of Service
There is life beyond submitting batch jobs
  Resource leases rather than job submission
  Control of resources
Explicit SLA: different sites offer different quality of service
Satisfying peak demand
  Experiment season, paper deadlines, etc.

Page 8: Globus Virtual Workspaces

Provider’s Perspective
Providing resources is easy, providing environments is hard
  User comment: “I have 512 nodes I cannot use” ;-)
Fine-tuning environments for different communities is expensive
  Evaluating, installing and maintaining software packages, etc.
  Reconciling conflicts
  Coordinating update schedules for different communities is a nightmare
It may be hard to justify configuring/dedicating resources if they are only needed 1% of the time, even if the 1% is very important for one of your users

Page 9: Globus Virtual Workspaces

The Workspace Ecosystem: Abstractions and Background

Page 10: Globus Virtual Workspaces

Virtual Workspaces
A dynamically provisioned environment
  A complete environment: a complete (software) environment as required by our community or applications, provisioned on demand.
  Resource allocation: provision the resources the workspace needs (CPUs, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions.
  Deployment point of view
Appliances/virtual appliances
  A complete environment that can be packaged in various formats
  Packaging point of view

Page 11: Globus Virtual Workspaces

Workspace Implementations
Traditional tools
  Base environment (discovery)
  Automated configuration
  Typically long deployment time
  Isolation
  Performance isolation
  Runtime environment
Virtual machines
  Complete environment
  Contextualization
  Short deployment time
  Very good isolation
  Runtime performance impact
Paper: “Virtual Workspaces: Achieving Quality of Service and Quality of Life in the Grid”

Page 12: Globus Virtual Workspaces

The Virtues of Virtualization
[Diagram: hardware at the bottom; a Virtual Machine Monitor (VMM) / hypervisor above it; three VMs on top, each running applications on a guest OS (Linux, NetBSD, Windows). Example VMMs: Xen, VMware, UML, KVM, Parallels, etc.]
Bring your environment with you
Excellent enforcement and isolation
Fast to deploy, enables short-term leasing
Has a performance impact, but it is acceptable for most modern hypervisors
Suspend/resume, migration

Page 13: Globus Virtual Workspaces

Creating a Virtual Cluster that Works
1) Obtain a lease on a raw resource
2) Deploy VMs onto the resource (deploy virtual machines)
3) Put the VMs in context (contextualization layer)
Create a functioning virtual ensemble
[Diagram: VMs placed on a leased resource]

Page 14: Globus Virtual Workspaces

The Workspace Ecosystem
Resource Providers:
  Grid providers: TeraGrid, OSG, etc.
  Commercial providers: EC2, Sun, etc.
Appliance Providers:
  off-the-shelf environment bundles
  certified/endorsed for safety
  leverage appliance software
  commercial and open “marketplaces”
Appliance Deployment:
  Mapping environments onto leased computing resources
  Coordinating creation of virtual resources
  A mix of open source software and proprietary tools communicating via common protocols

Page 15: Globus Virtual Workspaces

Roles and Responsibilities
Division of labor
  Resource providers provide resources
  Virtual organizations provide appliances
  Middleware that maps appliances onto resources
Appliance management software
  Appliance creation, maintenance, validation, etc.
  Not an appliance provider
Shifting the work around
  Into the hands of the parties most motivated and qualified to do it

Page 16: Globus Virtual Workspaces

Workspace Deployment Tools

Page 17: Globus Virtual Workspaces

Virtual Workspaces: Vital Stats
Virtual Workspace software allows an authorized client to dynamically deploy and manage workspaces
  Virtual Workspace Service (VWS), workspace control, Context Broker
Currently implements workspaces as Xen VMs
  KVM coming this summer
Also, contextualization layer
Globus incubator project
  Started ~2003, first release in September 2005
  Current release 1.3.1 (March ‘08)
Download it from: http://workspace.globus.org

Page 18: Globus Virtual Workspaces

Using Workspaces (Deployment)
[Diagram: the VWS Service in front of a pool of twelve pool nodes]
Workspace
- Workspace metadata
- Pointer to the image
- Logistics information
- Deployment request
- CPU, memory, node count, etc.
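The two parts of a request listed above (workspace metadata plus a deployment request) could be modelled roughly as follows. This is an illustrative Python sketch only: the class names, field names, the example image URL in the usage note, and the `validate` helper are all hypothetical and do not reflect the actual VWS schema or client API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkspaceMetadata:
    """What the workspace is (hypothetical fields)."""
    image_pointer: str                              # pointer to the VM image
    logistics: dict = field(default_factory=dict)   # logistics information


@dataclass(frozen=True)
class DeploymentRequest:
    """What resources the workspace asks for (hypothetical fields)."""
    cpus: int
    memory_mb: int
    node_count: int


def validate(request):
    """Return a list of human-readable problems; an empty list means acceptable."""
    problems = []
    for name in ("cpus", "memory_mb", "node_count"):
        if getattr(request, name) < 1:
            problems.append(f"{name} must be at least 1")
    return problems
```

For example, `validate(DeploymentRequest(cpus=1, memory_mb=512, node_count=0))` reports that `node_count` must be at least 1, while a request for four nodes passes with no problems.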

Page 19: Globus Virtual Workspaces

Using Workspaces (Interaction)
[Diagram: the VWS Service, the Trusted Computing Base (TCB), and a pool of twelve pool nodes]
The workspace service publishes information on each workspace as standard WSRF Resource Properties.
Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to).
Users can interact directly with their workspaces the same way they would with a physical machine.

Page 20: Globus Virtual Workspaces

Workspace Service (what sits inside)
[Diagram: the VWS Service, the Trusted Computing Base (TCB), and a pool of twelve pool nodes]
Workspace WSRF front-end that allows clients to deploy and manage virtual workspaces
Workspace back-end: resource manager for a pool of physical nodes. Deploys and manages workspaces on the nodes.
Contextualization creates a common context for a virtual cluster
Each node must have a VMM (Xen) installed, as well as the workspace control program that manages individual nodes

Page 21: Globus Virtual Workspaces

Workspace Service Components
GT4 WSRF front-end
  Leverages GT core and services, notifications, security, etc.
  Roughly follows the OGF WS-Agreement provisioning model
  Lease descriptions
  Publishes available lease terms
Workspace Service back-end
  Works with multiple Resource Managers
  Workspace Control for on-the-node functions
Contextualization
  Put the virtual appliance in its deployment context

Page 22: Globus Virtual Workspaces

Managing Resources with Virtual Workspaces

Page 23: Globus Virtual Workspaces

Workspace Back-Ends
Default resource manager (basic slot fitting)
  “datacenter technology” equivalent
  Used for OSG Edge Services project
Challenge: finding Xen-enabled resources
Amazon Elastic Compute Cloud (EC2)
  Software similar to Workspace Service (no virtual clusters, contextualization, fine-grain allocations, etc.)
  Solution: develop a back-end to EC2
    Grid credential admission -> EC2 charging model
    Address contextualization needs
Challenge: integrating VMs into current provisioning models
  Solution: gliding in VMs with the Workspace Pilot
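The slides do not say how the default resource manager's “basic slot fitting” works. As a rough illustration of the general idea, the sketch below reads it as first-fit placement by free memory. That reading, the function name `first_fit`, and the data shapes are assumptions, not the Workspace Service implementation.

```python
def first_fit(nodes, requests):
    """Place each VM on the first node with enough free memory.

    nodes:    dict mapping node name -> free memory in MB
    requests: list of (vm_name, memory_mb) tuples, handled in order
    Returns a dict mapping vm_name -> node name, or None if nothing fits.
    """
    free = dict(nodes)  # work on a copy so the caller's view is untouched
    placement = {}
    for vm_name, needed in requests:
        for node, available in free.items():
            if available >= needed:
                placement[vm_name] = node
                free[node] = available - needed  # claim the slot
                break
        else:
            placement[vm_name] = None  # no node can host this VM right now
    return placement
```

With two nodes offering 2048 MB and 1024 MB, three 1024 MB requests fill both nodes and a fourth request is left unplaced, which is the point at which a richer lease model (queues, reservations) would have to take over.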

Page 24: Globus Virtual Workspaces

The Workspace Pilot
Challenge: how can I provide a “cloud” using virtualization without disrupting the current operation of my cluster?
Flying Low: the Workspace Pilot
  Integrates with popular LRMs (such as PBS)
  Implements “best effort” leases
  Glidein approach: submits a “pilot” program that claims a resource slot
  Includes administrator tools
Deployment
  Testing @ U of Victoria (Atlas), Ian Gable and collaborators
  Adapting for the use of the Atlas experiment @ CERN, Omer Khalid
  TeraPort (small partition)

Page 25: Globus Virtual Workspaces

Workspace Pilot in Action
[Diagram: VWS, the LRM (PBS), three Xen dom0 nodes, and the VMs deployed on them]
Level 1: provision raw resources
Level 2: provision VMs

Page 26: Globus Virtual Workspaces

The Pilot Program
Uses Xen balloon driver to reduce/restore domain0 memory so that guest domains (VMs) can be deployed
Secure VM deployment
  The pilot requires sudo privilege and thus can be used only with site administrator’s approval
  The workspace service provides fine-grained authorization for all requests
Signal handling
  SIGTERM: pilot exceeded its allotted time
    Notifies VWS, allows it to clean up
    After a configurable time period, takes things into its own hands
Default policy: one VM per physical node
Available for download
  Workspace Release 1.3.1: http://workspace.globus.org/downloads/index.html
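The SIGTERM behaviour described above (notify the service first, force cleanup only after a grace period) can be sketched as a small two-phase shutdown helper. This is not the pilot's actual code: the class `PilotShutdown`, its callbacks, and the polling design are hypothetical, and only the standard `signal` and `time` modules are real APIs.

```python
import signal
import time


class PilotShutdown:
    """Hypothetical two-phase shutdown: notify first, force cleanup later."""

    def __init__(self, notify_service, force_cleanup,
                 grace_seconds=30.0, clock=time.monotonic):
        self.notify_service = notify_service  # e.g. tell the service time is up
        self.force_cleanup = force_cleanup    # e.g. destroy leftover VMs locally
        self.grace_seconds = grace_seconds    # the configurable time period
        self.clock = clock
        self.deadline = None
        self.forced = False

    def install(self):
        """Register the handler for SIGTERM sent by the resource manager."""
        signal.signal(signal.SIGTERM, self.on_sigterm)

    def on_sigterm(self, signum, frame):
        # Phase 1: start the grace period and let the service clean up.
        if self.deadline is None:
            self.deadline = self.clock() + self.grace_seconds
            self.notify_service()

    def poll(self, service_done):
        """Call periodically from the main loop; True means the pilot may exit."""
        if self.deadline is None:
            return False
        if service_done:
            return True
        # Phase 2: grace period expired, take cleanup into our own hands.
        if self.clock() >= self.deadline and not self.forced:
            self.force_cleanup()
            self.forced = True
        return self.forced
```

Keeping the handler itself tiny (it only records a deadline and sends one notification) and doing the forced cleanup from the main loop avoids running long operations inside a signal handler.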

Page 27: Globus Virtual Workspaces

Workspace Control
VM control
  Starting, stopping, pausing, etc.
Integrating a VM into the network
  Assigning MAC addresses and IP addresses
  DHCP delivery tool
  Building up a trusted (non-spoofable) networking layer
VM image propagation
Image management and reconstruction
  Creating blank partitions, sharing partitions
Contextualization information management
Talks to the workspace service via ssh
Can be used as a standalone component
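To make the “assigning MAC addresses and IP addresses” step above concrete, here is a minimal sketch of handing out one MAC/IP pair per VM from a configured subnet. The `AddressPool` class is hypothetical and is not how workspace control is implemented; it does not cover DHCP delivery or the non-spoofable networking layer. The `00:16:3e` prefix is the OUI registered to XenSource and commonly used for Xen guests.

```python
import ipaddress

XEN_OUI = "00:16:3e"  # OUI registered to XenSource, common for Xen guest MACs


class AddressPool:
    """Hypothetical allocator that gives each VM a stable (MAC, IP) pair."""

    def __init__(self, cidr):
        # hosts() skips the network and broadcast addresses of the subnet.
        self._hosts = iter(ipaddress.ip_network(cidr).hosts())
        self._count = 0
        self.leases = {}

    def assign(self, vm_name):
        """Return the (mac, ip) pair for vm_name, allocating one if needed."""
        if vm_name in self.leases:
            return self.leases[vm_name]
        ip = next(self._hosts, None)
        if ip is None:
            raise RuntimeError("address pool exhausted")
        n = self._count
        self._count += 1
        # Encode the allocation counter in the last three octets of the MAC.
        mac = "%s:%02x:%02x:%02x" % (XEN_OUI, (n >> 16) & 0xFF,
                                     (n >> 8) & 0xFF, n & 0xFF)
        self.leases[vm_name] = (mac, str(ip))
        return self.leases[vm_name]
```

Asking for the same VM name twice returns the same pair, so the lease table can double as the input for whatever tool delivers addresses to the guests.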

Page 28: Globus Virtual Workspaces

Workspace Back-Ends
Default resource manager (basic slot fitting)
  “datacenter technology” equivalent
  Used for OSG Edge Services project
Challenge: finding Xen-enabled resources
Amazon Elastic Compute Cloud (EC2)
  Software similar to Workspace Service (no virtual clusters, contextualization, fine-grain allocations, etc.)
  Solution: develop a back-end to EC2
    Grid credential admission -> EC2 charging model
    Address contextualization needs
Challenge: integrating VMs into current provisioning models
  Solution: gliding in VMs with the Workspace Pilot
Long-term solutions
  Leasing model with explicit terms
  Semantically rich leases: advance reservations, urgent leases, renegotiable leases, etc.
  Cost-effective lease semantics

Page 29: Globus Virtual Workspaces

Appliance Management and Contextualization

Page 30: Globus Virtual Workspaces

Where Do Appliances Come From?
[Diagram: Appliance Provider (a user, a VO, a Grid…); appliance description; Marketplaces (VMware, EC2, Workspace …)]
Good… but: maintenance? ease of use? formats?

Page 31: Globus Virtual Workspaces

Where Do Appliances Come From?
[Diagram: Appliance Provider (a user, a VO, a Grid…); appliance description; Appliance Management Software (OSFarm, rPath, …); output formats Xen, VMware, CDROM; Marketplaces (VMware, EC2, Workspace …)]
Better

Page 32: Globus Virtual Workspaces

Deploying Appliances
Appliances need to be “portable”
  So that they can be reused in many contexts
Making the appliance context-aware:
  Other appliances
  Site-specific information (e.g. a DNS server)
  User/group/VO/Grid-specific information (e.g. public keys, host certs, gridmapfiles, etc.)
Security issues
  Who do I trust to provide legitimate context information?
  How do I make sure that appliances adhere to my site policies?
[Diagram: VMs, site, Virtual Organization]

Page 33: Globus Virtual Workspaces

Where Do Appliances Come From?
[Diagram: Appliance Provider (a user, a VO, a Grid…); appliance description, appliance assertions, appliance contextualization; Appliance Management Software (OSFarm, rPath, CohesiveFT…); output formats Xen, VMware, CDROM; Marketplaces (VMware, EC2, Workspace …)]

Page 34: Globus Virtual Workspaces

Creating Virtual Clusters with Workspace Tools

Page 35: Globus Virtual Workspaces

Make Me a Working Cluster
You got some VMs and you’ve deployed them… Now what?
  What network are they connected to?
  Do they actually represent something useful? (like a ready-to-use OSG cluster?)
  Do the VMs know about each other? Can they share some disk?
  How do they integrate into the site storage/account system?
  Do they have host certificates? And a gridmapfile?
  And all the other things that will integrate them into my VO?
Challenge: what is a virtual cluster?
  A more complex virtual machine
    Networking, shared storage, etc.
  Available at the same time and sharing a common context
  Example: an OSG cluster
Solutions
  Ensemble management
  Exporting and sharing common context
  Sophisticated networking configurations
Paper: “Virtual Clusters for Grid Communities”, CCGrid 2006

Page 36: Globus Virtual Workspaces

Contextualization
Challenge: Putting a VM in the deployment context of the Grid, site, and other VMs
  Assigning and sharing IP addresses, name resolution, application-level configuration, etc.
Solution: Management of Common Context
  Configuration-dependent provides & requires
  Common understanding between the image “vendor” and deployer
  Mechanisms for securely delivering the required information to images across different implementations
Paper: “A Scalable Approach To Deploying And Managing Appliances”, TeraGrid conference 2007
[Diagram: contextualization agent; Common Context holding IP, hostname, pk]
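The “provides & requires” idea above can be illustrated with a toy resolver: every node publishes some values into a common context and declares which keys it needs from the others. This sketch is not the Context Broker protocol; the function `resolve_context`, the key names, and the dictionary layout are all assumptions made for illustration, and secure delivery is out of scope.

```python
def resolve_context(nodes):
    """Match each node's requirements against what all nodes provide.

    nodes: dict mapping node name -> {"provides": {key: value},
                                      "requires": [key, ...]}
    Returns a dict mapping node name -> {key: [values provided under key]}.
    Raises KeyError if a required key is provided by no node.
    """
    # Step 1: pool everything that is provided into one common context.
    common = {}
    for node in nodes.values():
        for key, value in node["provides"].items():
            common.setdefault(key, []).append(value)

    # Step 2: hand each node only the keys it declared as required.
    resolved = {}
    for name, node in nodes.items():
        missing = [key for key in node["requires"] if key not in common]
        if missing:
            raise KeyError(f"{name} requires unprovided context: {missing}")
        resolved[name] = {key: list(common[key]) for key in node["requires"]}
    return resolved
```

In a head-node-plus-workers cluster, the head node might provide an `nfs_server` address and require the list of `worker` addresses, while each worker does the reverse; a requirement that nobody provides is reported before any VM is configured.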

Page 37: Globus Virtual Workspaces

Contextualizing Appliances
[Diagram labels: Appliance Provider; Appliance Deployer; Resource Provider; Context Broker; appliance (appliance content, disk image, context agent, application-specific context agents); appliance context template; generic context; appliance context]

Page 38: Globus Virtual Workspaces

Application Example: Virtualization with the STAR experiment

Page 39: Globus Virtual Workspaces

Virtual Workspaces for STAR
STAR image configuration
  A virtual cluster composed of one OSG headnode and multiple STAR worker nodes
Using the workspace service over EC2 to provision resources
  Allocations of up to 100 nodes
  Dynamically contextualized for out-of-the-box cluster

Page 40: Globus Virtual Workspaces

Virtual Workspaces for STAR
Deployment stages:
  Create an “ensemble” defining the virtual cluster
  Deploy the virtual machines
  Contextualize to provide an out-of-the-box cluster
Contextualization:
  Cluster applications: NFS & PBS
  Grid information: gridmapfile and host certificates
Runs
  Using VWS on the nimbus cloud for small node allocations (VWS + default + Context Broker)
  Using VWS with EC2 backend for allocations of ~100 nodes (VWS + EC2 backend + Context Broker)

Page 41: Globus Virtual Workspaces

[Animated figure: running-job counts over time at the participating sites (PDSF, Fermi, VWS/EC2, BNL, WSU), with job completion and file recovery indicators]

with thanks to Jerome Lauret and Doug Olson of the STAR project

Page 42: Globus Virtual Workspaces

[Figure: accelerated display of a workflow job state for NERSC PDSF, EC2 (via Workspace Service), and WSU. Y = job number, X = job state]

with thanks to Jerome Lauret and Doug Olson of the STAR project

Page 43: Globus Virtual Workspaces

Cloud Computing

Page 44: Globus Virtual Workspaces

The Workspace Cloud Client
We took the workspace client and made it easy to use
  Narrowing down the functionality
  Wrapper on top of the workspace client
Allows scientists to lease VMs roughly following Amazon’s EC2 model (simplified)
  PKI X509 credentials and quotas instead of payment
The goal is to restore/evolve this functionality as user requests come in
  Saving VMs, network configurations
  In the future: richer leases, etc.
“Cloudkit” coming out in next release, due soon

Page 45: Globus Virtual Workspaces

Nimbus @ University of Chicago
Objectives
  Make it easy for scientific community to experiment with this mode of resource provisioning
  Learn about the requirements of scientific projects and evolve the infrastructure
    Features, SLAs, security and sharing concerns, etc.
Vital Stats
  Deployed on 16 nodes of TeraPort cluster @ UC
  Powered by the workspace set of tools
  Image management handled via gridFTP
  Made available mid-March ‘08
  http://workspace.globus.org/clouds/
To obtain access mail [email protected]
  Available to scientific, educational projects, open source testing, etc.

Page 46: Globus Virtual Workspaces

Science Clouds
A group of clouds making resources available “on the nimbus model”
  Nimbus, Stratus@UFL (Mauricio Tsugawa), FZK in Germany (almost done, Lizhe Wang), others expressed interest
  EC2
Some differences in setup, policies
  UFL requires private networks (using OpenVPN)
    Currently you’d use the same credential for the cloud and for the virtual private network
  EC2 requires payment
Cloud federation
  Moving an app from a hardware platform to a cloud is relatively hard
    Need image, learn new paradigm, etc.
  Moving between clouds is relatively easy
    … if you have “rough consensus” on interfaces, image formats, etc.

Page 47: Globus Virtual Workspaces

Who runs on the clouds and what do they do?

Page 48: Globus Virtual Workspaces

Related Projects
Portal development (Josh Boverhof, LBNL)
Workspace KVM backend (Michael Fenn, Clemson University)
Integration with the Nebula project (University of Madrid)

Page 49: Globus Virtual Workspaces

Let’s get on the cloud!

Page 50: Globus Virtual Workspaces

Parting Thoughts

Page 51: Globus Virtual Workspaces

Parting Thoughts
Come and run on science clouds
Not just cloud computing
  A bunch of technologies have to come together to make cloud computing widespread
The way we do computing is changing
  Today we build horseless carriages
  Tomorrow we might do things differently

Page 52: Globus Virtual Workspaces

Credits
Workspace team:
  Kate & Tim
Guest appearances
  Ian Foster, Frank Siebenlist
With thanks to many collaborators:

Jerome Lauret (STAR, BNL), Doug Olson (STAR, LBNL), Marty Wesley (rPath), Stu Gott (rPath), Ken Van Dine (rPath), Predrag Buncic (Alice, CERN), Haavard Bjerke (CERN), Rick Bradshaw (Bcfg2, ANL), Narayan Desai (Bcfg2, ANL), Duncan Penfold-Brown (Atlas,uvic), Ian Gable (Atlas, uvic), David Grundy (Atlas, uvic), Ti Legget (University of Chicago), Greg Cross (University of Chicago), Lizhe Wang (FZK), Marcel Kunze (FZK), Mauricio Tsugawa (UFL), Jose Fortes (UFL), Renato Figueiredo (UFL), Omer Khalid (CERN), Artem Harutyunyan (CERN), Mike Fenn (U of Clemson), Sebastien Goasguen (U of Clemson), Josh Boverhof (LBNL), Leve Hajdu (STAR, BNL), Lidia Didenko (STAR, BNL), David Bartle (Atlas, uvic), Lee Liming (ANL), Frank Wuerthwein (OSG, SDSC), Abhishek Rana (OSG, SDSC), Jeff Chase (Duke), and many others.

Page 53: Globus Virtual Workspaces

Sponsors
NSF SDCI “Missing Links”
NSF CSR “Virtual Playgrounds”
TeraGrid
DOE SciDAC CEDPS