Promoting and Standardizing Grid Computing OGSA - A View From The Trenches Andrew Grimshaw GGF Architecture Area Co-Director January, 2005

Promoting and Standardizing Grid Computing

OGSA - A View From The Trenches

Andrew Grimshaw
GGF Architecture Area Co-Director
January, 2005


Agenda

• Background – quick
• OGSA objectives and process
• OGSA design teams
• Opportunities for collaboration


What is an architecture?

• In the computer systems world, an architecture is the definition of the components, their interactions, and the design philosophy used in the development of the whole system. In a grid (a high-performance, secure, shared, collaborative distributed system), the architecture defines the services, their interactions, and the design philosophy. In other words: what are the pieces of the puzzle, how do they fit together, and what does the puzzle look like when complete? One of our design philosophies is that the pieces can be replaced, extended, and tailored to particular use cases. Further, a systems architecture is the architecture on which applications, application services, and specialized views or profiles of the architecture are built. OGSA is a grid system architecture.


Architecture Requirements

• Simple
• Secure
• Standards-based
• Multiple interoperable implementations
• Scalable
• Extensible
• Site Autonomy
• Persistence & I/O
• Multi-Language
• Legacy Support
• Transparency
• Heterogeneity
• Fault-tolerance & Exception Management

Success Requires an integrated model at the foundation.

Manage Complexity!!


The Importance of a Strong Foundation


OGSA Aims and Perspective

• Goals
  − Interoperable solutions to Grid based applications
    • Grid definitions sidebar
  − Addressing loosely coupled distributed computing
• Philosophy
  − Standardization at the architectural level
    • Similar to profiling.
    • Developed before and/or during standards development
  − Use existing standards and technology where possible
  − Use case driven gap analysis
  − Gaps are filled proactively
    • Not exclusively within the GGF (e.g. naming).


OGSA Process

• Use Case Driven
  − 21 Detailed Use Cases (~ 6 pages each)
    • Tier 1 available at: http://www.ggf.org/documents/GWD-I-E/GFD-I.029.pdf
• Distributed Specification and Standardization
  − Identify and/or develop open and accessible standard specifications
  − Active current work in GGF, OASIS, W3C, and DMTF.
• “Design Team” Working Model
  − Facilitate cross fertilization within and outside GGF.
  − Avoid redundant work across applicable efforts
  − Focus mind share (the most valuable commodity)
    • e.g. DAIS-WG and OGSA-Data Design Team
• Iterative Refinement
  − Abstract service evolving to concrete specifications
• Documents:
  − OGSA: Use Cases, Informal Specification, Recommendation


OGSA – What is it?

• Two streams
  − Profiles
  − Design teams and working groups
• Process for design team, working group, profile development interaction
  − Draw circle


Profiles

• Define a usage pattern and include specifications developed by working groups both within and external to GGF.
• Issue: How mature and “widely adopted”?
• Three “in the pipe”
  − Basic
  − Data
  − Execution Management


Design Teams

• Naming – the foundation on which distributed systems are built
• Security – deeply dependent on WS-Security
• Data of all types
• Execution Management Services – EMS
• Logging – spun off into a working group


“A Rose by any other name would smell as sweet”

Terms
• Resource
• Abstract resource name
• Human name (paths and attributes)
• Resource address
• Resource identity
• Binding scheme
• Bind time


Why names?

• Transparencies
  − Location
  − Migration
  − Failure
  − Replication
  − Scalability
  − and so on


Distributed naming is a well-understood area - properties

• Unique
• Provide identity
• Comparable
• Location portable
• Widely adopted
• Scalable – high performance
• Extensible
• Dynamic binding
• …
• Two and three level name schemes dominate


Two level schemes

• Human name -> address
  − E.g., DNS, Unix file system (string -> inode)
• Abstract name -> address


Three level schemes

• Human -> abstract -> address
• In OGSA,
  − Human -> address and Human -> abstract will likely be handled by RNS – Resource Naming Service being developed by the GGF GFS-WG
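
The three-level scheme can be sketched in a few lines of Python. This is an illustrative sketch only, not RNS: the class `NameService`, its method names, and the example paths, identifiers, and addresses are all hypothetical. It shows why the extra level gives migration transparency: when a resource moves, only the abstract -> address binding changes, and every human name keeps resolving.

```python
class NameService:
    """Toy three-level naming: human name -> abstract name -> address.

    Hypothetical illustration; not the RNS interface.
    """

    def __init__(self):
        self._human_to_abstract = {}    # path-like name -> opaque unique identifier
        self._abstract_to_address = {}  # opaque identifier -> current location

    def register(self, human_name, abstract_name, address):
        # Several human names may point at the same abstract name.
        self._human_to_abstract[human_name] = abstract_name
        self._abstract_to_address[abstract_name] = address

    def rebind(self, abstract_name, new_address):
        # Migration: only the second-level binding is updated.
        if abstract_name not in self._abstract_to_address:
            raise KeyError(abstract_name)
        self._abstract_to_address[abstract_name] = new_address

    def resolve(self, human_name):
        abstract_name = self._human_to_abstract[human_name]
        return self._abstract_to_address[abstract_name]


ns = NameService()
ns.register("/projects/climate/run42", "urn:example:res-0001",
            "https://siteA.example.org/svc")
ns.register("/home/alice/latest", "urn:example:res-0001",
            "https://siteA.example.org/svc")

before = ns.resolve("/projects/climate/run42")
ns.rebind("urn:example:res-0001", "https://siteB.example.org/svc")  # resource migrates
after = ns.resolve("/projects/climate/run42")
alias = ns.resolve("/home/alice/latest")
```

In a two-level scheme (human name -> address) the same migration would require finding and updating every human name that refers to the resource; here a single rebind is enough.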


OGSA Security

• Process is not moving rapidly
  − Partially because they are waiting on WS-Security
• Maybe too focused on one set of use cases (big government labs working together) (my opinion)


OGSA Data & InfoD

• Use case driven
• Many different data “types” and use scenarios, from HEP to business intelligence
• Strong consensus emerging, with some issues still around meta-data and information dissemination
• Strawman services defined for flat files, interacting with GFS. Pushing for early specs.
• Interacting with existing GGF WGs including GFS, GSM, DAIS, Info-D
• Interaction begun with WSDM


Info Services

• Troubleshooting
• Event Management
• Discovery
• Logging – spun off


EMS Overview

• Basic problem: provision, execute and manage services (including legacy applications) in a grid
  − Some use cases
    • start up a cache service
    • on-demand, utility computing
    • start up and manage a set of legacy applications
• Want to be able to “instantiate” a service and have the grid figure it out, and provide management interfaces throughout the lifetime of the service


EMS addresses issues such as:

• Where can a service execute? What are the locations it can execute because of resource restrictions such as memory, CPU and binary type, available libraries, and available licenses? Given the above, what policy restrictions are in place that may further limit the candidate set of execution locations?

• Where should the service execute? Once it is known where the service can execute, the question is where should it execute? This may involve different selection algorithms that optimize different objective functions or are trying to enforce different policies or service level agreements.

• Prepare the service to execute. Just because a service can execute somewhere does not necessarily mean it can execute there without some setup. Setup could include deployment and configuration of binaries, libraries, staging data, or other operations to prepare the local execution environment to execute the service.

• Get the service executing. Once everything is ready, actually start the service and register it in the appropriate places.

• Manage (monitor, restart, move, etc.). Once the service is started it must be managed and monitored. What if it fails, or fails to meet its service agreements? Should it be restarted in another location? What about state? Should the state be “checkpointed” periodically to ensure restartability? Is the service participating in some sort of fault-detection and recovery scheme?
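
The first four questions above map onto a small pipeline. The following Python sketch is only an illustration under stated assumptions: `Host`, `ServiceSpec`, `candidate_set`, `select`, and `launch` are hypothetical names, not OGSA interfaces, and "least loaded host" stands in for whatever objective function or service level agreement a real selection service would enforce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    memory_gb: int
    arch: str                 # binary type, e.g. "x86_64"
    load: float               # 0.0 (idle) to 1.0 (saturated)
    libraries: frozenset = frozenset()


@dataclass(frozen=True)
class ServiceSpec:
    name: str
    min_memory_gb: int
    arch: str
    libraries: frozenset = frozenset()


def candidate_set(spec, hosts, policy_allows):
    """Where CAN it run? Filter on resource restrictions, then on policy."""
    return [
        h for h in hosts
        if h.memory_gb >= spec.min_memory_gb
        and h.arch == spec.arch
        and spec.libraries <= h.libraries   # all required libraries present
        and policy_allows(spec, h)
    ]


def select(candidates):
    """Where SHOULD it run? Assumed objective: pick the least-loaded host."""
    if not candidates:
        raise RuntimeError("no host satisfies the requirements and policy")
    return min(candidates, key=lambda h: h.load)


def launch(spec, hosts, policy_allows, registry):
    host = select(candidate_set(spec, hosts, policy_allows))
    # Prepare: a real system would deploy binaries and stage data here.
    # Start and register, so that later management can find the service.
    registry[spec.name] = host.name
    return host


hosts = [
    Host("a", 16, "x86_64", 0.9, frozenset({"libfoo"})),
    Host("b", 32, "x86_64", 0.2, frozenset({"libfoo"})),
    Host("c", 64, "x86_64", 0.1, frozenset()),            # missing library
    Host("d", 64, "x86_64", 0.0, frozenset({"libfoo"})),  # excluded by policy below
]
spec = ServiceSpec("cache", 8, "x86_64", frozenset({"libfoo"}))
registry = {}
chosen = launch(spec, hosts, lambda s, h: h.name != "d", registry)
```

Keeping "can" and "should" as separate functions mirrors the split in the bullets: the candidate set depends only on requirements and policy, while the selection rule can be swapped without touching the filter. The management step (monitoring, checkpointing, restarting elsewhere) is left out of this sketch.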


EMS Services fall into three sets

• Resources that model processing, storage, executables, resource management, and provisioning

• Job management and monitoring services; and

• Resource selection services that collectively decide where to execute a service.


Typical Pattern

[Diagram: components of a typical EMS pattern]

• Provisioning
  − Deployment
  − Configuration
• Information Services
• Service Container
• Persistent State Handle Service
• Accounting Services
• Execution Planning Services
• Candidate Set Generator (Work-Resource mapping)
• Job Manager
• Reservation


Opportunities For Collaboration

• OMII and EGEE efforts intersect with OGSA design team efforts
• We all win if we can come to consensus
• EMS
  − The basic problem that everyone (Globus, SGE, LSF, Legion, EGEE, OMII) solves is the same.
  − Solutions have many similarities
  − EMS team spent quite a bit of time hammering those out
  − We’re here to make sure that OMII input is part of the design
• Similarly for data