grid computing - staff.fnwi.uva.nl · grid computing adam belloum nikhef topical lectures, national...

42
Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Upload: others

Post on 21-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

GridComputing

AdamBelloumNikheftopicallectures,

NationalInstituteforNuclearPhysics,andHighEnergyPhysics

2007

Page 2: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 3: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Doing Science in the 21th century

•  Nowadays Scientific Applications are – CPU intensive – Produce/process Huge sets of Data – Requires access to geographically distributed and

expensive instruments

Page 4: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago tomographic reconstruction

real-time collection

wide-area dissemination

desktop & VR clients with shared controls

Advanced Photon Source

archival storage

FromtheGridtutorialsavailableat:http:.//www.globus.org

OnlineAccesstoScientificInstruments

Page 5: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

CPU intensive Science: Optimization problem NUG30

•  The problem, a quadratic assignment problem (QAP) known as NUG30 –  given a set of n locations and n

facilities, the goal is to assign each facility to a location.

–  There are n! possible assignments

•  NUG30 proposed in 1968 as a test of computer capabilities, but remained unsolved because of its great complexity.

Nug30QuadraticAssignmentProblemSolvedby1,000https://scout.wisc.edu/archives/r7125

Page 6: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Tosolvetheseproblems?

“VL-e project” UvA

Grid Services Harness multi-domain distributed

resources

Management of comm. & computing

Management of comm. & computing

Potential Generic part

Potential Generic part

Application Specific

Part

Application Specific

Part

NUG30 Online Access

Management of comm. & computing

Potential Generic part

LHC

Application Specific

Part

Page 7: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Grid Services Harness multi-domain distributed resources

Application Application Application

Management of comm. & computing

Management of comm. & computing

Management of comm. & computing

NUG30 Online Access LHC

Page 8: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 9: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  Flexible,secure,coordinatedresourcesharingamongdynamiccollectionsofindividuals,institutions,andresources

FromtheGridtutorialsavailableat:http:.//www.globus.org

TheGridProblem

•  Enablecommunities(“VirtualOrganizations”)tosharegeographicallydistributedresourcesastheypursuecommongoals--assumingtheabsenceof:centrallocation,centralcontrol,existingtrustrelationships.

Page 10: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

“A Computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”. Karl Kesselman & Ian Foster.

Some Definitions of the Grid?

“The overall motivation for Grids is to enable the routine interactions of resources geographically and organizationally dispersed to facilitate Large-scale Science and engineering” The Vision for a DOE Science Grid, William Johnston, Lawrence Berkeley Nat. Lab.

“Making possible a shared large wide-area Computational infrastructure a concept which has been named the Grid” Peter Dinda, Gorgia Tech, 2001.

Page 11: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

“TheGridisthenextevolutionarystepforsupercomputing.” JimGray,MicrosoftResearch

“TheGridrepresentsthefirstwaveofComputinginthewell-

connectedworldintowhichweareheading” DavidCuller,UniversityofCalifornia,Berkeley

“Itisclearthatdistributedinformationprocessingwilllieat

theheartofmanyofthetechnologyofthe21stcentury” WilliamJ.Feiereisen,ProgramManagerHighPerformanceComputingandCommunications,NASA

etc.

What people think of the Grid?

Page 12: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

“thetermGridhasbeenconflated,atleastinthepopularperception,toembraceeverythingfromadvancednetworkingtoartificialintelligence”.

“Onemaywonderwhetherthetermhasanyrealsubstance

andmeaning”

From“TheAnatomyoftheGrid:EnablingScalableVirtualOrganizations”Fosteretal

Misuse of the term Grid?

Page 13: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  “ThebiggesthurdlefortheGridrightnowisthatthereisalotofreallyhardworktodo.Therehasbeenconsiderableunderestimationofthelevelofdifficultyoftheproblemsthatonemustaddressinordertodeploythemostsophisticatedvisionofthegrid”

Prof.FrancineBerman,UCSD,DirectorNPACIandSDSC(July2002)

ThepromiseoftheGridhasbeennotbeenoversold,butthedifficultyofdevelopingtherequisiteGridinfrastructure

hasbeenunderestimated

“BAEgrid”, Alan Gould BAE System, Advanced Technology Center, London 2004

Amorerealisticpointofview

Page 14: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

The real Grid target

•  A Grid is a system that is able to

– Coordinate resources •  not subject to centralized control

– Use standard, open, general-purpose protocols and interfaces

– Deliver nontrivial qualities of service.

“IanFoster’s3pointchecklist”

Page 15: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Coordinated Sharing

From“TheAnatomyoftheGrid:EnablingScalableVirtualOrganizations”Fosteretal

•  sharingrelationships–  client-server,peer-to-peer,andbrokered–  accesscontrol:fineAC,delegation,local/globalpolicies

•  Thesharingiscontrolledbytheprovidersandconsumers–  whatisshared?–  whoisallowedtoshare?–  andtheconditionsunderwhichsharingoccurs?

Page 16: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 17: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

What is Grid Computing

•  Grid computing is the use of hundreds, thousands, or millions of geographically and organizationally disperse and diverse resources to solve:

è problems that require more computing power than is available from a single machine or from a local area distributed system

Page 18: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Potential Grid Application

•  An application which requires the grid solution is likely distributed (Distributed Computing) and fit in one of the following paradigms: – High throughput Computing – High performance Computing

Grid computing will be mainly needed for large-scale, high-performance computing.

Page 19: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Distributed Computing

•  Distributed computing is a programming model in which processing occurs in many geographically distributed places. – Processing can occur wherever it makes the most

sense, whether that is on a server, Web site, personal computer, etc.

•  Distributed computing and grid computing either

– overlap or distributed computing is a subset of grid computing

From“TheAnatomyoftheGrid:EnablingScalableVirtualOrganizations”Fosteretal

Page 20: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

High Throughput Computing

•  HTC employs large amounts of computing power for very lengthy periods – HTC is needed for doing sensitivity analyses,

parametric studies or simulations to establish statistical confidence.

•  The features of HTC are – Availability of computing power for a long period of

time – Efficient fault tolerance mechanism

•  The key to HTC in grids – Efficiently harness the use of all available resources

across organizations.

Page 21: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

High Performance Computing

•  HPC brings enormous amounts of computing power to bear over relatively short periods of time. – HPC is needed for decision-support or applications

under sharp time-constraint, such as weather modeling •  HPC applications are:

– Large in scale and complex in structure. – Real time requirements. – Ultimately must run on more than one type of HPC

system.

Page 22: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

HPC/HTC requirements

•  HPC/HTC requires a balance of computation and communication among all resources involved. – Managing computation, – communication, – data locality

Page 23: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 24: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Programming Model for the grid

•  To achieve petaflop rates on tightly/loosely coupled grid clusters, applications will have to allow: –  extremely large granularity or produce massive

parallelism such that high latencies can be tolerated.

•  This type of parallelism, and the performance delivered by it in a heterogeneous environment, is –  currently manageable by hand-coded applications

Page 25: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Programming Model for the grid •  Aprogrammingmodelcanbepresentedindifferentforms:alanguage,alibraryAPI,oratoolwithextensiblefunctionality.

•  Thesuccessfulprogrammingmodelwill–  enablebothhigh-performanceandtheflexiblecompositionandmanagementofresources.

–  influencetheentiresoftwarelifecycle:design,implementation,debugging,operation,maintenance,etc.

–  facilitatetheeffectiveuseofallmannerofdevelopmenttools,e.g.,compilers,debuggers,performancemonitors,etc

Page 26: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Grid Programming Issues

•  Portability, Interoperability, and Adaptability •  Discovery •  Performance •  Fault Tolerance •  Security

Page 27: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Programming models

•  Shared-state models •  Message passing models •  RPC and RMI models •  Hybrid Models •  Peer to Peer Models •  Web Service Models •  ...

Page 28: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 29: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  Architectureidentifiesthefundamentalsystemcomponents,specifiespurposeandfunctionofthesecomponents,andindicateshowthesecomponentsinteractwitheachother.

•  Gridarchitectureisaprotocolarchitecture,withprotocolsdefiningthebasicmechanismsbywhichVOusersandresourcesnegotiate,establish,manageandexploitsharingrelationships.

•  Gridarchitectureisalsoaservicestandard-basedopenarchitecturethatfacilitatesextensibility,interoperability,portabilityandcodesharing.

“Introduction to Grid Technology” B.Ramamurthy

GridMiddlewareDefinition

Page 30: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Application

Fabric “Controlling things locally”: Access to, & control of resources

Connectivity “Talking to things”: communication (Internet protocols) & security

Resource “Sharing single resources”: negotiating access, controlling use

Internet Transport

Application

Link

Internet Protocol Architecture

Collective “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services

Architecture

Page 31: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Incr

ease

d fu

nctio

nalit

y,

stan

dard

izat

ion

Custom solutions

1990 1995 2000 2005

Open Grid Services Arch

Real standards Multiple implementations

Web services, etc.

Managed shared virtual systems Computer science research

Globus Toolkit

Defacto standard Single implementation

Internet standards

2010 “Grid Computing and Scaling Up the Internet” I. Foster, IPv6 Forum, an

Diego, 2003

EmergenceofOpenGridStandards

Page 32: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Examples of Grid Middleware

•  Globus Toolkit (GT4.X) now (GT5.X) – www.globus.org

•  Legion/Avaki – http://www.avaki.com/ – http://legion.virginia.edu/

•  Grid Sun engine – http://www.sun.com/service/sungrid/overview.jsp

•  Unicore – http://www.unicore.org

Page 33: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

The Grid Middleware

•  Softwaretoolkitaddressingkeytechnicalareas–  Offeramodular“bagoftechnologies”–  Enableincrementaldevelopmentofgrid-enabledtoolsandapplications

–  DefineandstandardizegridprotocolsandAPIs

•  Focusisoninter-domainissues,notclustering–  Collaborativeresourceusespanningmultipleorganizations–  Integratescleanlywithintra-domainservices–  Createsa“collective”servicelayer

“Basics Globus Toolkit™ Developer Tutorial” Globus Team, 2003

Page 34: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  Focusonarchitectureissues–  ProvideimplementationsofgridprotocolsandAPIsasbasicinfrastructure

–  Usetoconstructhigh-level,domain-specificsolutions

•  Designprinciples–  Keepparticipationcostlow–  Enablelocalcontrol–  Supportforadaptation

Diverse global services

Core Globus services

Local OS

A p p l i c a t i o n s

“Basics Globus Toolkit™ Developer Tutorial” Globus Team, 2003

GlobusApproach

Page 35: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Globus Security Infrastructure

MDS client API calls to locate resources

Query current status of resource

Create

RSL Library

Parse

Allocate & create processes

Process Monitor &

control

Site boundary Client MDS: Grid Index Info Server

Gatekeeper

MDS: Grid Resource Info Server

Local Resource Manager

MDS client API calls to get resource info

GRAM client API state change callbacks

Process

Process

Job Manager

1

2

3

4

5

6

7 8

GRAM client API calls to request resource allocation

and process creation.

GlobusToolkit2.0Components

Page 36: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 37: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  Increasinglypopularstandards-basedframeworkforaccessingnetworkapplications–  W3Cstandardization;Microsoft,IBM,Sun,others

•  WSDL:WebServicesDescriptionLanguage–  InterfaceDefinitionLanguageforWebservices

•  SOAP:SimpleObjectAccessProtocol–  XML-basedRPCprotocol;commonWSDLtarget

•  WS-Inspection–  Conventionsforlocatingservicedescriptions

•  UDDI:UniversalDesc.,Discovery,&Integration–  DirectoryforWebservices

“Globus Toolkit Futures: An Open Grid Services Architecture” Ian Foster et al. Globus Tutorial, Argonne National Laboratory, January 29, 2002

WebServices

Page 38: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  “Webservices”addressdiscovery&invocationofpersistentservices–  Interfacetopersistentstateofentireenterprise

•  InGrids,mustalsosupporttransientserviceinstances,created/destroyeddynamically–  Interfacestothestatesofdistributedactivities–  E.g.workflow,videoconf.,dist.dataanalysis

•  Significantimplicationsforhowservicesaremanaged,named,discovered,andused–  Infact,muchoftheworkisconcernedwiththemanagementofserviceinstances

“Globus Toolkit Futures: An Open Grid Services Architecture” Ian Foster et al. Globus Tutorial, Argonne National Laboratory, January 29, 2002

TheNeedtoSupportTransientServiceInstances

Page 39: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

outline

•  e-Science •  Grid approach •  Grid computing •  Programming models for the Grid •  Grid-middleware •  Web Services •  Open Grid Service Architecture (OGSA)

Page 40: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Open Grid Services Architecture •  Service orientation to virtualize resources •  From Web services:

– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency

•  Building on Globus Toolkit: – Grid service: semantics for service interactions – Management of transient instances (& state) – Factory, Registry, Discovery, other services – Reliable and secure transport

•  Multiple hosting targets: J2EE, .NET, …

“Globus Toolkit Futures: An Open Grid Services Architecture” Ian Foster et al. Globus Tutorial, Argonne National Laboratory, January 29, 2002

Page 41: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

•  Manageresourcesacrossdistributedheterogeneousplatforms

•  DeliverseamlessQoS•  Provideacommonbaseforautonomicmanagementsolutions

•  Defineopen,publishedinterfaces•  Exploitindustry-standardintegrationtechnologies

–  WebServices,SOAP,XML,...

•  IntegratewithexistingITresources

‘Open Grid Services Architecture Evolution, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004

OpenGridServicesArchitectureObjectives

Page 42: Grid Computing - staff.fnwi.uva.nl · Grid Computing Adam Belloum Nikhef topical lectures, National Institute for Nuclear Physics, and High Energy Physics 2007

Web Services

Business integration

Secure and universal access

Applications on demand

Grid Protocols

Vast resource scalability

Global Accessibility

Resources on demand

access manage share

Open Grid Services Architecture

‘Open Grid Services Architecture Evolution, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004

BestofTwoWorlds