Grid Computing Anda Iamnitchi Federated Distributed Systems, Fall ‘06 Including slides adapted from presentations by Ian Foster, Lee Liming, Paul Jeffreys

Grid Computing

Anda Iamnitchi
Federated Distributed Systems, Fall ‘06

Including slides adapted from presentations by Ian Foster, Lee Liming, Paul Jeffreys

Front page FT, 7th March 2000

But…

What is the Grid?

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations” (“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001)

“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)

Motivation (1): Revolution in Science

• Pre-Internet
– Theorize &/or experiment, alone or in small teams; publish paper
• Post-Internet
– Construct and mine large databases of observational or simulation data
– Develop simulations & analyses
– Access specialized devices remotely
– Exchange information within distributed multidisciplinary teams

Motivation (2): Revolution in Business

• Pre-Internet
– Central data processing facility
• Post-Internet
– Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
– Business processes increasingly computing- & data-rich
– Outsourcing becomes feasible => service providers of various sorts

The (Power) Grid: On-Demand Access to Electricity

[Figure. Axes: Time (horizontal); Quality, economies of scale (vertical)]

By Analogy, A Computing Grid

• Decouple production and consumption
– Enable on-demand access
– Achieve economies of scale
– Enhance consumer flexibility
– Enable new devices

Not Exactly a New Idea …

• “The time-sharing computer system can unite a group of investigators …. one can conceive of such a facility as an … intellectual public utility.”
– Fernando Corbato and Robert Fano, 1966

• “We will perhaps see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
– Len Kleinrock, 1967

But Things are Different Now …

Computing isn’t Really Like Electricity

• I import electricity but must export data
• “Computing” is not interchangeable but highly heterogeneous: data, sensors, services, …
• This complicates things; but also means that the sum can be greater than the parts
– Real opportunity: Construct new capabilities dynamically from distributed services
• Raises fundamental questions
– Achieving economies of scale
– Quality of service across distributed services
– Applications that exploit synergies

How Can We Tell Hype from Facts?

• Everyday problem, isn’t it?
• Learn/verify the facts
• Know the context
– Multi-institutional (== federated)
  • Thus, a cluster? (Sun Grid Engine!!!)
– Dynamic (somewhat)
• Look at results
– Research innovation (in computer and computational science)
– Scientific discovery
– Existing/deployed grids

P2P and Grids: Resource Sharing Across Administrative Domains

“We must address scale & failure”

“We need infrastructure”

“On Death, Taxes and the Convergence of P2P and Grids”, Foster, Iamnitchi 2003

Compare & Contrast (1): Definitions

Grid:
• “Infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities” (1998)
• “A system that coordinates resources not subject to centralized control, using open, general-purpose protocols to deliver nontrivial QoS” (2002)

P2P:
• “Applications that takes advantage of resources at the edges of the Internet” (2000)
• “Decentralized, self-organizing distributed systems, in which all or most communication is symmetric” (2002)

Compare and Contrast (2): Details of Deployed Systems

• Target communities and incentives

• Resources engaged

• Applications

• Scale and failure

• Services and infrastructure

Target Communities & Incentives

Grid
• Established communities
– Science, some industry
– Homogeneous
– Restricted participation
• Good behavior:
– Implicit incentives
– Means to enforce it
Consequences:
• Trust
• Well-defined “tax base”
• Less flexibility?

P2P
• Anonymous individuals
• No implicit incentives for good behavior
Consequences:
• No trust
• Free riding
• Implicit incentives for cheating: Seti@home, music sharing

Resources

Grid
• More diverse (in type):
– Files, storage, computing power, network, instruments
• More powerful
• Good availability
• Well connected
• Technical support
Consequence:
• Costly resource integration

P2P
• Computing cycles XOR files
• Less powerful
• Intermittent participation
– Gnutella: avg. lifetime 1h (‘01)
– MojoNation: 1/6 users always on
– Overnet: 50% nodes available 70% of time over a week (‘02)
• Variably connected
• Some technical support as community effort
Consequence:
• Ease of integration of new resources an early priority
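
To get a feel for what intermittent participation implies, here is a back-of-the-envelope sketch. It assumes each peer is online independently with probability p and that a file is copied to k peers, so the file is reachable with probability 1 - (1 - p)^k. The independence assumption, the replication scheme, and the function names are illustrative and do not come from the slides; the inputs 0.5 and 1/6 merely echo numbers quoted above and are not measured per-node online probabilities.

```python
def availability(p: float, k: int) -> float:
    """Probability that at least one of k independent replicas is online."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


def replicas_needed(p: float, target: float) -> int:
    """Smallest k whose availability reaches the target (0 <= target < 1)."""
    if p <= 0.0 or not 0.0 <= target < 1.0:
        raise ValueError("need p > 0 and 0 <= target < 1")
    k = 0
    while availability(p, k) < target:
        k += 1
    return k


# Illustrative inputs only (see the assumptions stated above).
for p in (0.5, 1 / 6):
    print(f"p={p:.2f}: {replicas_needed(p, 0.99)} replicas for 99% availability")
```

Under these assumptions, the less often peers are online, the more copies are needed, which is one reason cheap integration of new resources matters in P2P systems.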

Applications

Grid
• Often complex & involving various combinations of
– Data manipulation
– Computation
– Tele-instrumentation
• Wide range of computational models, e.g.
– Embarrassingly parallel
– Tightly coupled
– Workflow
Consequences:
– Complexity often inherent in the application itself
– (Inevitably?) Complex infrastructure to support applications

P2P
• Some
– File sharing
– Number crunching
– Content distribution
– Measurements
• “Toy” applications only?
– Albeit very popular “toys”!
Consequence:
– Complexity often derives from scale

Scale and Failure

Grid
• Moderate number of entities
– 100s institutions, 1000s users
• Large amounts of activity
– 4.5 TB/day (D0 experiment)
• Approaches to failure reflect assumptions
– E.g., centralized components

P2P
• Large numbers of entities:
– Millions of users
• Moderate activity
– E.g., 1-2 TB in Gnutella (’01)
• Diverse approaches to failure
– Some centralized (SETI, …)
– Some highly self-configuring

FastTrack: 3,488,719
eDonkey: 1,661,132
iMesh: 1,211,965
Overnet: 1,146,880
MP2P: 250,927
Gnutella: 219,009
DirectConnect: 204,237
(www.slyck.com, January 25, 2004)
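
For a rough sense of scale, the 4.5 TB/day figure quoted for the D0 experiment can be converted into the sustained transfer rate it implies. The sketch below assumes decimal units (1 TB = 10^12 bytes) and perfectly even traffic over 24 hours; neither assumption comes from the slides, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB = 1e12  # assumption: decimal terabyte


def sustained_rate(bytes_per_day: float) -> tuple[float, float]:
    """Return (MB/s, Mbit/s) for a daily volume spread evenly over the day."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


mb_per_s, mbit_per_s = sustained_rate(4.5 * TB)
print(f"{mb_per_s:.1f} MB/s, {mbit_per_s:.0f} Mbit/s")  # 52.1 MB/s, 417 Mbit/s
```

Real traffic is bursty, so peak rates would be higher than this average.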

Grids for Physics: LHC Computing Grid

Services and Infrastructure

Grid
• Standard protocols (Global Grid Forum, etc.)
• De facto standard software (open source Globus Toolkit)
• Shared infrastructure (authentication, discovery, resource access, etc.)
Consequences:
• Reusable services
• Large developer & user communities
• Interoperability & code reuse

P2P
• Each application defines & deploys completely independent “infrastructure”
• JXTA, BOINC, XtremWeb?
• Efforts started to define common APIs, albeit with limited scope to date
Consequences:
• New (albeit simple) install per application
• Interoperability & code reuse not achieved

Convergent Environment: Large, Dynamic, Self-Configuring Grids

[Figure: Grids and P2P placed on two axes, “Scale & volatility” and “Functionality & infrastructure”]

• Large scale
• Weaker trust assumptions
• Ease of integration
• No centralized authority
• Intermittent resource/user participation
• Diversity in:
  • Shared resources
  • Sharing characteristics
• Variable technical support
• Infrastructure (sharable services)
• Support for diverse applications

Existing Technologies are Helpful, but Not Complete Solutions

• Peer-to-peer technologies
– Limited scope and mechanisms
• Enterprise-level distributed computing
– Limited cross-organizational support
• Databases
– Vertically integrated solutions
• Web services
– Not dynamic
• Semantic web
– Limited focus

What’s Missing is Support for …

• Sharing & integration of resources, via
– Discovery
– Provisioning
– Access (computation, data, …)
– Security
– Policy
– Fault tolerance
– Management

• In dynamic, scalable, multi-organizational settings

Building the Grid

• Open source software
– Globus Toolkit®, UK OGSA DAI, Condor, …
• Open standards
– OGSA, other GGF, IETF, W3C standards, …
• Open communities
– Global Grid Forum, Globus International, collaborative projects, …
• Open infrastructure
– UK eScience, NSF Cyberinfrastructure, StarLight, AP-Grid, …

Globus Toolkit® History

[Figure: downloads per month from ftp.globus.org, 1997 to 2002 (vertical axis 0 to 30,000), annotated with the following milestones]

• DARPA, NSF, and DOE begin funding Grid work
• NASA begins funding Grid work, DOE adds support
• The Grid: Blueprint for a New Computing Infrastructure published
• GT 1.0.0 Released
• Early Application Successes Reported
• NSF & European Commission Initiate Many New Grid Projects
• Anatomy of the Grid Paper Released
• Significant Commercial Interest in Grids
• Physiology of the Grid Paper Released
• GT 2.0 Released

Does not include downloads from: NMI, UK eScience, EU Datagrid, IBM, Platform, etc.

How It Started

Across efforts to build and integrate a diverse range of distributed applications, the same problems kept showing up over and over again:

– Too hard to keep track of authentication data (ID/password) across institutions

– Too hard to monitor system and application status across institutions

– Too many ways to submit jobs

– Too many ways to store & access files and data

– Too many ways to keep track of data

– Too easy to leave “dangling” resources lying around (robustness)

Forget Homogeneity!

• Trying to force homogeneity on users is futile. Everyone has their own preferences, sometimes even dogma.

• The Internet provides the model…
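
One way to read “the Internet provides the model” is as a narrow common interface layered over heterogeneous local systems, rather than an attempt to make the systems themselves uniform. The sketch below illustrates that idea for job submission. Every class and function name is hypothetical, and the two “sites” are in-memory stand-ins; none of this is the Globus API or any real scheduler's interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRequest:
    """Common, site-independent description of a job (hypothetical)."""
    executable: str
    arguments: tuple[str, ...] = ()
    cpus: int = 1


class LocalScheduler(ABC):
    """Adapter that maps the common request onto one site's own mechanism."""

    @abstractmethod
    def submit(self, job: JobRequest) -> str:
        """Submit the job and return a site-specific job identifier."""


class BatchQueueSite(LocalScheduler):
    """Stand-in for a site that numbers jobs in a batch queue."""

    def __init__(self) -> None:
        self._next_id = 1

    def submit(self, job: JobRequest) -> str:
        job_id = f"batch-{self._next_id}"
        self._next_id += 1
        return job_id


class CyclePoolSite(LocalScheduler):
    """Stand-in for a site that keeps jobs in a pool of idle machines."""

    def __init__(self) -> None:
        self._jobs: list[JobRequest] = []

    def submit(self, job: JobRequest) -> str:
        self._jobs.append(job)
        return f"pool-{len(self._jobs)}"


def submit_everywhere(sites: dict[str, LocalScheduler], job: JobRequest) -> dict[str, str]:
    """The caller uses one interface; each site keeps its own preferences."""
    return {name: site.submit(job) for name, site in sites.items()}


ids = submit_everywhere(
    {"site_a": BatchQueueSite(), "site_b": CyclePoolSite()},
    JobRequest("/bin/hostname"),
)
print(ids)
```

The design choice being illustrated is that heterogeneity stays behind the adapters, so adding a site means writing one new adapter instead of changing every application.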

What Does the Globus Toolkit Cover?

[Figure with labels: Goal, Today]

Theory -> Practice

Building a Grid (in practice)

Methodology

• Building a Grid system or application is currently an exercise in software integration.

– Define user requirements

– Derive system requirements or features

– Survey existing components

– Identify useful components

– Develop components to fit into the gaps

– Integrate the system

– Deploy and test the system

– Maintain the system during its operation

• This should be done iteratively, with many loops and eddies in the flow.

How it Really Happens

[Figure: layered architecture diagram]

Captions:
• Users work with client applications
• Application services organize VOs & enable access to other services
• Collective services aggregate &/or virtualize resources
• Resources implement standard access & management interfaces

Components shown: Web Browser, Compute Server (×2), Data Catalog, Data Viewer Tool, Certificate Authority, Chat Tool, Credential Repository, Web Portal, Database service (×3), Simulation Tool, Camera (×2), Telepresence Monitor, Registration Service

How it Really Happens (without Globus)

[Figure: the layered architecture diagram from the preceding slide, with the same components and captions, plus the labels A, B, C, D, E]

Application Developer: 10
Off the Shelf: 12
Globus Toolkit: 0
Grid Community: 0

How it Really Happens (with Globus)

[Figure: layered architecture diagram, now including Globus components]

Captions:
• Users work with client applications
• Application services organize VOs & enable access to other services
• Collective services aggregate &/or virtualize resources
• Resources implement standard access & management interfaces

Components shown: Web Browser, Compute Server (×2), Globus MCS/RLS, Data Viewer Tool, Certificate Authority, CHEF Chat Teamlet, MyProxy, CHEF, Database service (×3), Simulation Tool, Camera (×2), Telepresence Monitor, Globus Index Service, Globus GRAM (×2), Globus DAI (×3)

Application Developer: 2
Off the Shelf: 9
Globus Toolkit: 4
Grid Community: 4
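
The legend counts on the two “How it Really Happens” slides can be compared directly. The sketch below reads them as the number of components obtained from each source, which is an interpretation of the figure legend and not something the slides state, and computes the share that falls to the application developer. The variable and function names are illustrative.

```python
# Legend counts copied from the two slides; keys are the legend labels.
WITHOUT_GLOBUS = {
    "Application Developer": 10,
    "Off the Shelf": 12,
    "Globus Toolkit": 0,
    "Grid Community": 0,
}
WITH_GLOBUS = {
    "Application Developer": 2,
    "Off the Shelf": 9,
    "Globus Toolkit": 4,
    "Grid Community": 4,
}


def developer_share(counts: dict[str, int]) -> float:
    """Fraction of all counted components attributed to the application developer."""
    total = sum(counts.values())
    return counts["Application Developer"] / total


for label, counts in (("without Globus", WITHOUT_GLOBUS), ("with Globus", WITH_GLOBUS)):
    total = sum(counts.values())
    print(f"{label}: {total} components, {developer_share(counts):.0%} developer-written")
```

Under that reading, the developer-written portion drops from 10 of 22 components to 2 of 19.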

What Is the Globus Toolkit?

• The Globus Toolkit is a collection of solutions to problems that frequently come up when trying to build collaborative distributed applications.

• Not turnkey solutions, but building blocks and tools for application developers and system integrators.
– Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance.

• To date (v1.0 - v4.0), the Toolkit has focused on simplifying heterogeneity for application developers.

• The goal has been to capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF).
– The Toolkit also includes reference implementations of new/proposed standards in these organizations.

How To Use the Globus Toolkit

• By itself, the Toolkit has surprisingly limited end user value.
– There’s very little user interface material there.
– You can’t just give it to end users (scientists, engineers, marketing specialists) and tell them to do something useful!
• The Globus Toolkit is useful to application developers and system integrators.
– You’ll need to have a specific application or system in mind.
– You’ll need to have the right expertise.
– You’ll need to set up prerequisite hardware/software.
– You’ll need to have a plan.

Globus Toolkit Components

[Figure: component matrix. Columns: Data Management, Security, Common Runtime, Execution Management, Information Services. Rows: Web Services Components, Non-WS Components. Version labels in the figure: GT2, GT3, GT4.]

Components shown:
• Pre-WS Authentication Authorization
• GridFTP
• Grid Resource Allocation Mgmt (Pre-WS GRAM)
• Monitoring & Discovery System (MDS2)
• C Common Libraries
• WS Authentication Authorization
• Reliable File Transfer
• OGSA-DAI [Tech Preview]
• Grid Resource Allocation Mgmt (WS GRAM)
• Monitoring & Discovery System (MDS4)
• Java WS Core
• Community Authorization Service
• Replica Location Service
• XIO
• Credential Management
• Python WS Core [contribution]
• C WS Core
• Community Scheduler Framework [contribution]
• Delegation Service

The Emergence of Open Grid Standards

[Figure: timeline from 1990 to 2010; vertical axis: increased functionality, standardization]

Labels in the figure:
• Custom solutions
• Globus Toolkit
– De facto standard, single implementation
– Internet standards
• Open Grid Services Arch
– Real standards, multiple implementations
– Web services, etc.
• Managed shared virtual systems
– Computer science research

Grid Communities

• Global Grid Forum
– Standards, information exchange, advocacy
– 1000+ participants in tri-annual meetings
• Application communities
– E.g., physics, earthquake engineering, biomedical, etc.
• Software development and support
– NSF Middleware Initiative, UK eScience, Globus Toolkit, EGEE, …

Grid Communities & Technologies

• Yesterday
– Small, static communities, primarily in science
– Focus on sharing of computing resources
– Globus Toolkit as technology base
• Today
– Larger communities in science; early industry
– Focused on sharing of data and computing
– Open Grid Services Architecture
• Tomorrow
– Large, dynamic, diverse communities that share a wide variety of services, resources, data
– Challenging computer science research issues

Grid Dynamics: Vision vs. Reality

• Vision: On-demand access to computing
– New communities form easily
– On-demand resources from providers
– Adapt easily to new missions, requirements
• Reality: Much manual configuration, e.g.:
– Manually deployed services on dedicated hardware
– Manually maintained access control lists
– Sysadmin-maintained allocation policies
– Human-mediated resource reservation

Ian Foster

Reading Sources

• http://www-fp.mcs.anl.gov/~foster/talks.htm

• http://www.globus.org/

• The Grid Book

• (other links on the course page)