
Page 1: Grid Computing - UPM

Grid Computing

María S. Pérez

[email protected]

Page 2: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 3: Grid Computing - UPM

Introduction

At the beginning of the 20th century, if you wanted electricity you needed to live near an electric generator. Nowadays large generators supply numerous clients (the electric power grid).
As regards information, the World Wide Web allows us to share information all around the world.
New challenges:

– Complex problems require analysing a great amount of data

– Researchers are located in geographically separated places

Page 4: Grid Computing - UPM

Introduction

Grid Computing follows the same sharing philosophy as electricity and information, allowing us to access other kinds of heterogeneous and geographically separated resources.
The Grid provides the sharing of:

– Computational resources
– Storage elements
– Specific applications

Thus, the Grid is based on:
– Internet protocols
– Ideas from parallel and distributed computing

Page 5: Grid Computing - UPM

Introduction

Grid technology is an important part of several research areas because it provides computational and storage support to applications that need great computational capacity and analyse large amounts of data.

Page 6: Grid Computing - UPM

Grid

A grid can be defined as: “coordinated resources that are not subject to a centralized control ... using standard, open and general-purpose protocols and interfaces ... deliver non-trivial qualities of services ...” [Ian Foster]

Extending the definition of grid:
– Special form of distributed computing
– Heterogeneous resources
– Computational and storage resources geographically distributed
– Resources are usually connected by wide area networks (WAN)
– Servers, supercomputers, clusters, … are grid resources

Page 7: Grid Computing - UPM

Usual Grids

Grid Computing is used in applications with the following characteristics:
– Have a community of distributed users
– Need great computational power
– Need great storage capacity

Above all, it is used in research areas that face complex problems and store large amounts of data
– High Energy Physics
– Earth Observation
– Bio-medicine

Page 8: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 9: Grid Computing - UPM

Grid architecture

Layers, from top to bottom:
– Application
– High-level middleware: EDG, Crossgrid
– Low-level middleware: Globus, Unicore, Legion
– Operating systems: Unix, Linux, Windows
– Hardware

Page 10: Grid Computing - UPM

Grid architecture

Grid protocol layers, from top to bottom:
– Application
– Collective: coordination of several resources
– Resource: resource sharing and access negotiation
– Connectivity: communication by means of Internet protocols and security
– Fabric: local resource access and control

Internet protocol layers (also in the figure): Application, Transport, Network, Link

Page 11: Grid Computing - UPM

Elements

Resource providers
– Publish the availability of their resources by means of information systems
– Define their own security policies

Broker
– Registers and categorizes the published services, providing collective search

Requesters
– Use brokering services to find and use resources
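The three roles above can be illustrated with a small sketch. It is not taken from any Grid middleware: the class names, the `kind` field and the sample resources are all hypothetical, chosen only to show providers publishing, a broker categorizing, and a requester searching.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A resource advertised by a provider (all fields are illustrative)."""
    name: str
    kind: str       # e.g. "compute" or "storage"
    provider: str


@dataclass
class Broker:
    """Registers published resources and answers searches by category."""
    registry: dict = field(default_factory=dict)

    def publish(self, resource: Resource) -> None:
        # Categorize each published resource by its kind.
        self.registry.setdefault(resource.kind, []).append(resource)

    def find(self, kind: str) -> list:
        # Collective search across every provider that published this kind.
        return list(self.registry.get(kind, []))


broker = Broker()
broker.publish(Resource("cluster-a", "compute", "site-1"))
broker.publish(Resource("disk-pool", "storage", "site-2"))
broker.publish(Resource("cluster-b", "compute", "site-2"))

# A requester asks the broker instead of contacting each provider.
compute_names = [r.name for r in broker.find("compute")]
print(compute_names)
```

In a real Grid the registry would be an information system and each provider would also enforce its own security policy; both are left out here.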

Page 12: Grid Computing - UPM

Example (figure): communication between resource providers, broker and requester

Page 13: Grid Computing - UPM

Basic pillars

Page 14: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 15: Grid Computing - UPM

Need for security

Distributed resources
No centralized control
Different resource providers
Each resource provider uses different security policies

Page 16: Grid Computing - UPM

Security in Grid

Generic Security Services (GSS)
– Authentication, delegation, integrity and confidentiality
– Public Key Infrastructure (PKI) with X.509 certificates
– Kerberos
– Secure Socket Layer (SSL)

Grid Security Infrastructure (GSI)
– Delegation
– Single Sign-On
  Proxy certificates

Page 17: Grid Computing - UPM

Certificate request

A user requests a certificate from a Certification Authority (CA).
The CA checks the user's identity.
Then the CA signs the request, creating a certificate, and returns it to the user.
– Certificates can be revoked
  Certificate Revocation List (CRL)

The purpose of the certificates is described in the certificate policy (CP).
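The request, sign and revoke workflow can be modelled with a toy sketch. This is only an illustration of the flow: an HMAC stands in for the CA's signature, so it is not real X.509 or public-key cryptography, and the `ToyCA` class, its methods and the subject `CN=Alice` are hypothetical.

```python
import hashlib
import hmac


class ToyCA:
    """Toy Certification Authority; the HMAC is NOT a real X.509 signature."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._next_serial = 1
        self.crl = set()    # serial numbers of revoked certificates

    def _sign(self, serial: int, subject: str) -> str:
        message = f"{serial}:{subject}".encode()
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()

    def issue(self, subject: str, identity_checked: bool) -> dict:
        # The CA only signs after it has checked the user's identity.
        if not identity_checked:
            raise PermissionError("identity not verified")
        serial = self._next_serial
        self._next_serial += 1
        return {"serial": serial, "subject": subject,
                "signature": self._sign(serial, subject)}

    def revoke(self, cert: dict) -> None:
        # Revocation adds the serial number to the CRL.
        self.crl.add(cert["serial"])

    def is_valid(self, cert: dict) -> bool:
        expected = self._sign(cert["serial"], cert["subject"])
        return (hmac.compare_digest(expected, cert["signature"])
                and cert["serial"] not in self.crl)


ca = ToyCA(b"demo-secret")
cert = ca.issue("CN=Alice", identity_checked=True)
valid_before = ca.is_valid(cert)
ca.revoke(cert)
valid_after = ca.is_valid(cert)
print(valid_before, valid_after)
```

The point of the sketch is that a certificate with a correct signature still has to be checked against the CRL before it is trusted.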

Page 18: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
  Grid Monitoring
Data Management
Workload Management
References

Page 19: Grid Computing - UPM

Information Systems

Provide information on:
– The Grid itself
  The user may query the status and performance of the Grid
– Grid applications

Register and monitor resources
Standardization is required for interoperation among different grid projects
– Globus: MDS (Monitoring and Discovery Service)
– European Data Grid: R-GMA (Relational Grid Monitoring Architecture)
– UNICORE: Incarnation Database (IDB)

Page 20: Grid Computing - UPM

Performance

Traditional performance measures:
– Speed
– Throughput
– Bandwidth

In Grid environments, it is necessary to measure:
– Allocation of resources to processes
– QoS
– Availability

Page 21: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 22: Grid Computing - UPM

Data Grid

Set of storage resources and data retrieval components which allow applications to access data by means of special software mechanisms
Data grid problems:
– Data location
– Replication
– I/O performance

Page 23: Grid Computing - UPM

Data Transfer

GridFTP: protocol for transferring data securely in a grid environment
– Extends the FTP protocol
– Uses the Grid Security Infrastructure (GSI)
– Several storage systems provide GridFTP interfaces:
  Castor
  EDG’s SRM

Reliable File Transfer (RFT): Grid Service which provides interfaces to manage and monitor file transfers by using GridFTP servers

Page 24: Grid Computing - UPM

Data replication

Due to the complexity of a grid environment, keeping replicas of files can be advisable
Need to identify and locate replicas
Replica Location Service (RLS): a Grid Service for registering data replicas and discovering them later
– Mappings between logical and physical identifiers
– Database for metadata
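The logical-to-physical mapping kept by a replica location service can be sketched as a small in-memory catalog. This is not the RLS API: the `ReplicaCatalog` class, the `lfn:` naming and the `example.org` storage hosts are assumptions made for illustration.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Maps a logical file name (LFN) to the physical locations of its replicas."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, lfn: str, pfn: str) -> None:
        # Several physical file names (PFNs) may share one logical name.
        self._replicas[lfn].add(pfn)

    def locate(self, lfn: str) -> list:
        # Sorted so that the result is deterministic.
        return sorted(self._replicas.get(lfn, ()))


catalog = ReplicaCatalog()
catalog.register("lfn:run42.dat", "gsiftp://se1.example.org/data/run42.dat")
catalog.register("lfn:run42.dat", "gsiftp://se2.example.org/store/run42.dat")
locations = catalog.locate("lfn:run42.dat")
print(locations)
```

An application works with the logical name only; choosing which replica to read (for example the closest one) is a separate decision made after the lookup.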

Page 25: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 26: Grid Computing - UPM

Resource management system

Resource management includes the efficient use of computing and storage resources
– Processor time
– Memory
– Storage
– Network

User-transparent
Interacts with the rest of the Grid components

Page 27: Grid Computing - UPM

Job Execution

A job can be any kind of executable that requires CPU or storage
Resource manager:
– Resource brokering
  Find suitable resources
– Matchmaking
  Assign a job to a resource that satisfies the job requirements
– Job execution
  Execute the jobs and retrieve output
  Error management

Job execution requires finding the right Computing Element

Page 28: Grid Computing - UPM

Job submission (figure)

Components: UI, Workload Manager, Replica Catalogue, Information Service, Computing Element, Storage Element, RB storage
Arrow labels: Job, Input/Output Sandbox files, Data Localization, Status, SE status, CE status, “Grid enabled” data transfers/accesses

Page 29: Grid Computing - UPM

Intensive jobs

Used in parallel and distributed environments
– Parallel machines
– Clusters

A Grid is understood as a set of clusters or parallel machines
Possibility of executing MPI jobs
– MPICH-G2
– LAM-MPI 7.0.4

Matchmaking
– The resource broker must select nodes that have MPI installed and at least n CPUs
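The matchmaking rule for MPI jobs can be written as a simple filter. This is a minimal sketch, not a resource broker implementation: the `ComputingElement` fields and the sample elements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    has_mpi: bool


def match(elements, cpus_needed, needs_mpi=True):
    """Return the elements that satisfy the job requirements."""
    return [
        ce for ce in elements
        if ce.free_cpus >= cpus_needed and (ce.has_mpi or not needs_mpi)
    ]


elements = [
    ComputingElement("ce-small", free_cpus=4, has_mpi=True),
    ComputingElement("ce-nompi", free_cpus=64, has_mpi=False),
    ComputingElement("ce-big", free_cpus=32, has_mpi=True),
]

# An MPI job that needs n = 16 CPUs.
candidates = match(elements, cpus_needed=16)
print([ce.name for ce in candidates])
```

A real broker would rank the surviving candidates (by load, data locality and so on); the sketch stops at the filtering step described on the slide.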

Page 30: Grid Computing - UPM

Job queue managers

Condor-G: Condor High-throughput computing project
– http://www.cs.wisc.edu
Portable Batch System (PBS)
Sun Grid Engine (SGE)

Page 31: Grid Computing - UPM

Overview

Introduction
Architecture
Security
Information Systems
Data Management
Workload Management
References

Page 32: Grid Computing - UPM

References

“The Grid: Blueprint for a New Computing Infrastructure”. I. Foster and C. Kesselman. Morgan Kaufmann. 1998.
“The Anatomy of the Grid: Enabling Scalable Virtual Organizations”. I. Foster, C. Kesselman and S. Tuecke. International Journal of Supercomputer Applications. 2001.
“The Globus Alliance”. http://www.globus.org

Page 33: Grid Computing - UPM

Post-XML Grids

Page 34: Grid Computing - UPM

Outline

Web services
– SOAP
– WSDL
– UDDI

Grid Services & OGSI
WS-RF

Page 35: Grid Computing - UPM

Web evolution

Past: Web of documents
– Static pages
– Web as a huge repository of information
– Technologies: HTTP + HTML

Present: Web of applications
– Pages dynamically generated by Web applications
– Applications export their interface to users through the Web
– Commercial transaction environment (business to consumer, B2C)
– Technologies: CGI, ASP, PHP, JSP, servlets, ...

Future (and present): Web of services (functions/methods)
– “Libraries” offer services to programs (not to users)
– Web as a huge services API (Web of components)
– “Added value” enterprises (business to business, B2B)
– Distributed systems over the Internet

Web Service: RPC in the Web using XML

Page 36: Grid Computing - UPM

Web applications: Common scenario

Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362

Page 37: Grid Computing - UPM

Web Services: Common scenario

Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362

Page 38: Grid Computing - UPM

Web service

Module which exports a set of functions (methods) to applications through the Web, providing hardware/software platform independence
Similar to RPC or RMI but integrated in the Web
Standardization managed by W3C:
– http://www.w3.org/2002/ws/

Questions:
– Transport protocol → HTTP
– Representation format → XML
– Communication protocol → SOAP
– IDL (Interface Definition Language) → WSDL
– Binding → UDDI

Page 39: Grid Computing - UPM

Transport protocol: HTTP

POST used for the request and the answer of the RPC
– Universally available
– It passes through firewalls

POST /~ssoo/consultaBD.cgi HTTP/1.0
Content-length: 76
.....................

DNI=87654321&MAT=980000&Asignatura=sod&Curso=2002&Convocatoria=Jun&Tipo=acta

HTTP/1.1 200 OK
Content-Type: text/html; charset=iso-8859-1
.....................

<HTML>
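As a check on the request shown above, the form body can be rebuilt with the Python standard library and its length compared with the `Content-length` header. The field names and values come from the slide; the `Content-Type` header is an assumption, since the slide elides the remaining headers.

```python
from urllib.parse import urlencode

# Form fields exactly as they appear in the request body on the slide.
fields = [
    ("DNI", "87654321"), ("MAT", "980000"), ("Asignatura", "sod"),
    ("Curso", "2002"), ("Convocatoria", "Jun"), ("Tipo", "acta"),
]
body = urlencode(fields)

request = (
    "POST /~ssoo/consultaBD.cgi HTTP/1.0\r\n"
    # Assumed header: the slide does not show the content type.
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
)
print(request)
```

The computed body length is 76 bytes, which matches the header value on the slide.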

Page 40: Grid Computing - UPM

Representation format: XML

RPC information coded in XML
– Flexible and powerful
– XML Schema allows us to define data types accurately
– E.g., float GetLastTradePrice(string symbol);

Request:
<GetLastTradePrice>
  <symbol>DIS</symbol>
</GetLastTradePrice>

Answer:
<GetLastTradePriceResponse>
  <Price>34.5</Price>
</GetLastTradePriceResponse>

Schema:
<element name="GetLastTradePrice">
  <complexType>
    <all>
      <element name="symbol" type="string"/>
    </all>
  </complexType>
</element>
<element name="GetLastTradePriceResponse">
  <complexType>
    <all>
      <element name="Price" type="float"/>
    </all>
  </complexType>
</element>
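The request and answer above can be produced and consumed with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch of the encoding step only; it does not validate against the schema, and no network call is made.

```python
import xml.etree.ElementTree as ET

# Build the request for: float GetLastTradePrice(string symbol)
request = ET.Element("GetLastTradePrice")
ET.SubElement(request, "symbol").text = "DIS"
request_xml = ET.tostring(request, encoding="unicode")

# Parse the answer and convert the price to a float.
response_xml = (
    "<GetLastTradePriceResponse><Price>34.5</Price>"
    "</GetLastTradePriceResponse>"
)
price = float(ET.fromstring(response_xml).findtext("Price"))
print(request_xml, price)
```

The string argument and the float result of the remote function map directly onto the `string` and `float` types declared in the schema.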

Page 41: Grid Computing - UPM

Communication protocol: SOAP

Simple Object Access Protocol (Candidate Recommendation)
SOAP = HTTP + XML
– It specifies how to send XML messages over HTTP
– It defines the message container (in XML)
– General protocol (not only for RPC)

Message container structure:
– Envelope: Header [optional] + Body
  Header: complementary information (e.g., in RPC, the transaction ID)
  Body: original message

Page 42: Grid Computing - UPM

SOAP and RPC

Request:

POST /StockQuote HTTP/1.1
......................
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="http://example.com/stockquote.xsd">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Answer:

HTTP/1.1 200 OK
...............
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="http://example.com/stockquote.xsd">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
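The SOAP exchange above can be sketched with the standard library alone. The two namespace URIs are the ones used on the slide; the helper names `build_request` and `parse_response` are hypothetical, and the answer is a hard-coded string rather than a reply from a server. The `encodingStyle` attribute is omitted for brevity.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
STOCK_NS = "http://example.com/stockquote.xsd"


def build_request(symbol: str) -> str:
    """Wrap a GetLastTradePrice call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{STOCK_NS}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text: str) -> float:
    """Extract the price from the SOAP body of the answer."""
    root = ET.fromstring(xml_text)
    price = root.find(
        f"{{{SOAP_ENV}}}Body/{{{STOCK_NS}}}GetLastTradePriceResponse/Price"
    )
    return float(price.text)


# Hard-coded answer with the same structure as the one on the slide.
answer = (
    f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}"><SOAP-ENV:Body>'
    f'<m:GetLastTradePriceResponse xmlns:m="{STOCK_NS}">'
    "<Price>34.5</Price></m:GetLastTradePriceResponse>"
    "</SOAP-ENV:Body></SOAP-ENV:Envelope>"
)
request_xml = build_request("DIS")
price = parse_response(answer)
print(price)
```

ElementTree chooses its own namespace prefixes when serializing, so the generated request may use prefixes other than `SOAP-ENV` and `m`; the message is equivalent because the namespace URIs are the same.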

Page 43: Grid Computing - UPM

Service Definition: WSDL

Web Services Description Language
IDL for Web Services based on XML
A WSDL document describes the Web service:
– Data types (XML Schema)
– Exported functions and request/answer messages
– Protocols: usually SOAP over HTTP
– Service address → URL with server and “component”
  E.g., http://www.stockquoteserver.com/StockQuote

Usually, it is generated automatically from the service code

Page 44: Grid Computing - UPM

UDDI

Universal Description, Discovery, and Integration
Distributed registry of web services offered by enterprises
It is accessed as a web service
Query by using different criteria:
– Activity, kind of service, geographical location

Page 45: Grid Computing - UPM

Web service registration

Picture extracted from “Understanding Web Services”: http://www7.software.ibm.com/vad.nsf/Data/Document4362

Page 46: Grid Computing - UPM

Information of a UDDI Registry

White pages: listing of organizations (contact information) and of the services provided by those organizations
Yellow pages: classifications of companies and Web Services according to taxonomies
Green pages: describe how a Web service can be invoked (pointers to service description documents). Usually stored outside the registry.

Page 47: Grid Computing - UPM

Grid Services

OGSI

Page 48: Grid Computing - UPM

Generation Game (figure, adapted from Ian Foster GGF7 Plenary)

Axes: Time (horizontal); increased functionality, standardization (vertical)
Stages along the time axis:
– Custom solutions
– Globus Toolkit, Condor, Unicore; de facto standards GridFTP, GSI; X.509, LDAP, FTP, …
– Open Grid Services Architecture; Web services; app-specific services

Earlier Grids:
– Computationally intensive
– File access/transfer
– Bag of various heterogeneous protocols & toolkits
– Monolithic design
– Recognised internet, ignored Web
– Academic teams

Open Grid Services Architecture:
– Data and knowledge intensive
– Open services-based architecture
– Builds on Web services
– GGF + OASIS + W3C
– Multiple implementations
– Global Grid Forum
– Industry participation

Page 49: Grid Computing - UPM

Grid Services

Grid services were first introduced in “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration” by Foster et al.
“Grid Service: a Web service that provides a set of well-defined interfaces and that follows specific conventions”.
“The interfaces address discovery, dynamic service creation, lifetime management, notification, and manageability; the conventions address naming and upgradeability”.

Page 50: Grid Computing - UPM

Grid Services

The Physiology paper and its sister paper, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, were the first papers to discuss using Web services to build Grids.
They described an architecture built on special types of Web services, Grid services.
There is now an OGSA working group at the Global Grid Forum (GGF) trying to tie the various grid standards coming out of GGF into a coherent whole.

Page 51: Grid Computing - UPM

OGSA

Defined by the Global Grid Forum
Open Grid Services Architecture
– Grid Computing + Web Services
– Concepts of both technologies

Page 52: Grid Computing - UPM

OGSA

What does it provide?
– Distributed services among distributed, dynamic and heterogeneous VOs

To whom?
– Grid communities
– Web Services communities

Page 53: Grid Computing - UPM

OGSI

OGSI (Open Grid Services Infrastructure)
Formal and technical specification of what a Grid Service is.
Detailed specification of how Grid Services work.

Page 54: Grid Computing - UPM

Globus (GT3), OGSA and OGSI

Source: The Globus Toolkit 3 Programmer's Tutorial. Borja Sotomayor. http://www.casa-sotomayor.net/gt3-tutorial

Page 55: Grid Computing - UPM

Grid Services

A Web service with a number of extensions that make it suitable for grid-based applications
Main improvements:
– Stateful and potentially transient services
– ServiceData
– Notifications
– portType extension
– Lifecycle management
– GSH & GSR

Page 56: Grid Computing - UPM

Writing a Grid Service

Define the service’s interface
– GWSDL
Implement the service
– Java
Define the deployment parameters
– WSDD
Compile everything and generate the GAR file
– Ant
Deploy the service
– Ant

Source: The Globus Toolkit 3 Programmer's Tutorial. Borja Sotomayor. http://www.casa-sotomayor.net/gt3-tutorial

Page 57: Grid Computing - UPM

Evolution of Grid Standards

OGSI drawbacks:
– Long and dense specification
– It does not work well with current Web Services tools
– Too object oriented

WSRF & GT4
– http://www.globus.org/wsrf
– WSRF presented in January 2004
– GT4 is the current release

However, WSRF and OGSI are conceptually the same thing.

Page 58: Grid Computing - UPM

OGSI references

The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration – http://www.globus.org/research/papers/ogsa.pdf
The Anatomy of the Grid: Enabling Scalable Virtual Organizations – http://www.globus.org/research/papers/anatomy.pdf
Final OGSI Specification V1.0 – https://forge.gridforum.org/projects/ogsi-wg/document/Final_OGSI_Specification_V1.0/en/1
OGSI V1.0 Primer – https://forge.gridforum.org/projects/ogsi-wg/document/draft-ggf-ogsi-gridserviceprimer/en/1
From Open Grid Services Infrastructure to WS-Resource Framework: Refactoring and Extension – http://www.globus.org/wsrf/specs/ogsi_to_wsrf_1.0.pdf
A Grid Application Framework based on Web Services Specifications and Practices – http://www.neresc.ac.uk/ws-gaf/documents.html
GGF – http://www.ggf.org/

Page 59: Grid Computing - UPM

WS-RF: WS-Resources Framework

Page 60: Grid Computing - UPM

Grid and Web Services (figure)

Eras: Pre-XML, Post-XML
Grid track: GT2, GT3 (OGSI), GT4 (WS-RF)
Web Services track

Page 61: Grid Computing - UPM

WS-RF: Web Services Resource Framework

WS-RF has effectively replaced OGSI since January 2004.
Addresses the issues with OGSI.
Doesn’t use inheritance; instead, portTypes are composed.
Simply a re-factoring of OGSI?
Instead of Grid service instances we have WS-Resources.

Page 62: Grid Computing - UPM

WS-Resource Counter (figure, adapted from Marc McKeown Slides)

Elements: Client, Web Service, Counter Resource (counterID=1), Counter Resource (counterID=2), WS-Addressing EPR
Operations: createResource, add, Destroy

Page 63: Grid Computing - UPM

“Implied” Resource Pattern

The WS-Resource definition codifies the relationship between Web services and stateful resources in terms of the implied resource pattern

– A set of conventions on Web services technologies that allow the state of a resource to be defined and associated with the description of a Web service interface.

WS-Addressing standardizes the endpoint reference construct used to represent the address of a Web service deployed at a given network endpoint.

– WS-Addressing Endpoint Reference: the client uses this EPR to communicate with the WS-Resource.

– The EPR holds an identifier for the WS-Resource.
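The implied resource pattern can be illustrated with a toy counter. This is plain Python, not a WSRF toolkit: the `CounterService` class, the dictionary used as an endpoint reference and the `example.org` address are all hypothetical. What it shows is that the resource identifier travels in the EPR, not as an explicit argument of the operation.

```python
import itertools


class CounterService:
    """Web service front end; each counter is a resource selected by the EPR."""

    def __init__(self, address: str):
        self.address = address
        self._resources = {}            # counterID -> current value
        self._ids = itertools.count(1)

    def create_resource(self) -> dict:
        counter_id = next(self._ids)
        self._resources[counter_id] = 0
        # The endpoint reference carries the service address plus the
        # identifier of the WS-Resource.
        return {"address": self.address, "counterID": counter_id}

    def add(self, epr: dict, amount: int) -> int:
        # The target resource is implied by the EPR used for the call.
        self._resources[epr["counterID"]] += amount
        return self._resources[epr["counterID"]]

    def destroy(self, epr: dict) -> None:
        del self._resources[epr["counterID"]]


service = CounterService("http://example.org/counter")
epr1 = service.create_resource()
epr2 = service.create_resource()
service.add(epr1, 5)
total1 = service.add(epr1, 2)
total2 = service.add(epr2, 10)
service.destroy(epr1)
print(total1, total2)
```

Each client keeps only the EPR it received from `create_resource`; two clients holding different EPRs talk to the same service address but operate on different state.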

Page 64: Grid Computing - UPM

OGSI vs WS-RF

OGSI → WS-Resource Framework
– GSR → WS-Addressing Endpoint Reference
– GSH → WS-Addressing Endpoint Reference & WS-RenewableReferences
– HandleResolver portType → WS-RenewableReferences
– Service Data definition → Resource properties definition
– GridService portType service data access → WS-ResourceProperties
– GridService portType lifetime management → WS-ResourceLifetime
– Notification portTypes → WS-Notification
– Factory portType → (none listed)
– ServiceGroup portTypes → WS-ServiceGroup
– GWSDL → WSDL
– Base fault type → WS-BaseFault

Page 65: Grid Computing - UPM

WSRF Implementations

Globus GT4 supports WSRF
WSRF.NET from University of Virginia
Python implementation from Lawrence Berkeley Laboratory
Java implementation from University of Indiana
Perl implementation from University of Manchester

Page 66: Grid Computing - UPM

WSRF References

Modeling Stateful Resources with Web Services – http://www.globus.org/wsrf
The WS-Resource Framework – http://www.globus.org/wsrf
From Open Grid Services Infrastructure to WS-Resource Framework: Refactoring and Extension – http://www.globus.org/wsrf
WSRF OASIS working group – http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf
WS-Notification OASIS working group – http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn