Connecting OurGrid & GridSAM A Short Overview

Upload: randolf-green

Post on 27-Dec-2015

Connecting OurGrid & GridSAM

A Short Overview

Content

• Goals

• OurGrid: architecture overview

• OurGrid: short overview

• GridSAM: short overview

• GridSAM: example deployment with Condor

• Different paradigms: OurGrid

• Different paradigms: GridSAM

• Issues: File staging

• Issues: many related job submissions

• OurGrid<>GridSAM connector

Goals

• To maintain two grid environments in parallel: OurGrid & Condor

• To handle the job submission process through a common interface: JSDL (Job Submission Description Language), using GridSAM

• To build connector for GridSAM to talk to OurGrid

• GridSAM can already talk to Condor through a connector, no problems here

OurGrid: architecture overview

OurGrid: short overview

• Workers are typically desktop computers that can run jobs directly in their OS or through virtualization (Xen, VMware, VirtualBox etc.)

• "Clouds of Workers" are controlled by Peers

• Jobs are submitted through Brokers

• Two possibilities here:

– Broker can be a dedicated web-site interfacing with specific Peers

– Broker can be any machine with the MyGrid tool installed that communicates with specified Peers

GridSAM: short overview

• Web Service-type middleware sitting between the job submitter and the core grid machinery

• Modular architecture: can talk to many grid infrastructures through specific connectors

• Collects job submissions sent as XML JSDL files

• Manages multiple submissions thanks to persistence, and monitors the lifecycle of each submission

• After accepting JSDLs, re-submits jobs directly to the underlying grid machinery as defined in specific connectors

GridSAM: example deployment with Condor

• Machine (B) runs a GridSAM instance in a secured OMII container

• Machine (B) is capable of directly re-submitting jobs to the Condor Pool (C)

• An authorized job submitter (A) can submit jobs over the internet to the GridSAM instance running on (B)

Different paradigms: OurGrid

• Designed for labs that have access to a pool of desktop machines whose free CPU cycles can be utilized

• Bag-of-Tasks: jobs are usually disjoint units with independent input and output

• Data sets are usually small enough to be transferred many times across many machines

• As end-user friendly as possible: asks job submitter only for JDL job submission specification, input files and output files

• All details of job scheduling and file transfer are hidden from job submitter

Different paradigms: GridSAM

• Designed primarily for labs utilizing high performance computing (HPC) techniques on a few powerful machines

• HPC is typically used for CPU-demanding computations that use extensive data sets

• Every millisecond is important: job specification, input and output files must be handled with minimum human and OS intervention

• Jobs are often dependent on very large datasets, so file transfer should be minimized

• Data must be accessed in a fast and secure way, preferably through URIs, which require minimum external intervention

• The URIs must be specified directly in the JSDL file

Issues: File staging

• In OurGrid, MyGrid tool takes care of transfer of input files, distributing them according to BoT paradigm, and transfer of output files back to job submitter

• Also, when submitting through web-site, feedback is sent when output files are available for download

• Job submitter can simply point to files on their own machine, or upload them to some storage server accessible to MyGrid

• No dedicated storage is needed for MyGrid to work

Issues: File staging

• GridSAM does not handle input and output files by itself; it delegates this subtask to yet another middleware, Apache VFS

• VFS was designed to access resources identified by URIs based on fully qualified hostnames and a few recognized protocols (FTP/SFTP, HTTP, GridFTP, WebDAV etc.)

• When submitting JSDL using the GridSAM client on a particular machine, one cannot simply point to local files; they must be uploaded to some dedicated storage space that the VFS machinery can identify through a URI

• Only when correctly specified in JSDL (reliable URIs!) and uploaded to dedicated storage can files be further processed by GridSAM
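
To make the URI requirement concrete, the sketch below builds one JSDL DataStaging element with Python's standard library. The element names and namespace follow the JSDL 1.0 schema; the helper name `data_staging`, the `overwrite` creation flag and the host `storage.example.org` are illustrative assumptions, not part of GridSAM.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespace; every staging element lives in it.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"


def q(tag):
    """Qualify a tag name with the JSDL namespace."""
    return f"{{{JSDL_NS}}}{tag}"


def data_staging(file_name, uri, direction="in"):
    """Build a DataStaging element for one file (hypothetical helper).

    direction="in" stages the file from `uri` before the job runs;
    direction="out" stages it to `uri` after the job finishes.
    """
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    staging = ET.Element(q("DataStaging"))
    ET.SubElement(staging, q("FileName")).text = file_name
    ET.SubElement(staging, q("CreationFlag")).text = "overwrite"
    endpoint = ET.SubElement(staging, q("Source" if direction == "in" else "Target"))
    ET.SubElement(endpoint, q("URI")).text = uri
    return staging


if __name__ == "__main__":
    # storage.example.org is a placeholder for the dedicated storage server.
    element = data_staging("input.dat", "sftp://storage.example.org/data/input.dat")
    print(ET.tostring(element, encoding="unicode"))
```

The file name is what the job sees in its working directory; the URI is the only way the staging layer learns where the data actually lives.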

Issues: File staging

• Possible solution 1: define dedicated storage in the form of SFTP/GridFTP file server, accessible both to OurGrid and GridSAM, and write all URIs in JSDL files according to this dedicated storage

• Possible solution 2: let job submitter decide its own storage mechanisms; accept URI if it is accessible (readable/writable), process the job as usual, let VFS do the rest
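
For solution 2, a connector could at least reject URIs that the staging layer has no chance of resolving before it accepts the job. The following is a minimal sketch of such a pre-check; the scheme list is an assumption derived from the protocols named earlier (with `gsiftp` assumed as the GridFTP scheme), and a syntactic check like this does not prove the location is actually readable or writable.

```python
from urllib.parse import urlsplit

# Assumed scheme names for the protocols listed in the slides; adjust to the
# set actually configured in the target deployment.
ASSUMED_SCHEMES = {"ftp", "sftp", "http", "gsiftp", "webdav"}


def looks_stageable(uri):
    """Return True if `uri` has a recognized scheme and an explicit host.

    This is only a precondition: real accessibility still has to be verified
    by trying to open the resource with proper credentials.
    """
    parts = urlsplit(uri)
    return parts.scheme.lower() in ASSUMED_SCHEMES and bool(parts.hostname)


if __name__ == "__main__":
    for candidate in (
        "sftp://storage.example.org/data/input.dat",  # remote, recognized scheme
        "file:///home/user/input.dat",                # local file, no host
        "input.dat",                                  # bare local path
    ):
        print(candidate, looks_stageable(candidate))
```

Bare local paths fail the check, which matches the observation that a GridSAM client cannot simply point to files on the submitter's machine.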

Issues: File staging

• In both cases, security is an important feature to consider

• JSDL processing is secure enough in GridSAM but secure access to external storage must be maintained separately

Issues: many related job submissions

• In OurGrid, job submitter can submit JDL job specification with many jobs defined

• Also, specific environment variables set by OurGrid can be utilized to differentiate between multiple jobs and multiple input/output files

• No specific support for parameter sweep concept is provided, but job submitter can simulate it by using properly written JDL job specification

Issues: many related job submissions

• With GridSAM, job submitter is submitting JSDL that contains details for single job only

• In theory, it is possible to submit multiple JSDLs in a short time; GridSAM should schedule them internally using its persistence mechanisms, then gradually re-submit them to the grid machinery through the specified queuing strategy

• The parameter sweep JSDL extension is currently not supported in GridSAM; in theory, the job submitter can submit a batch of JSDLs that simulates it
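
Simulating a parameter sweep with a batch of single-job JSDLs can be sketched as follows. This is only an illustration under stated assumptions: the executable path, the `--seed` flag and the function names are hypothetical, and the documents contain just the POSIXApplication part of a job description.

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def single_job_jsdl(executable, arguments):
    """Return a minimal JSDL document (as text) describing exactly one job."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    application = ET.SubElement(description, f"{{{JSDL}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = str(argument)
    return ET.tostring(job, encoding="unicode")


def simulate_sweep(executable, flag, values):
    """Expand one swept parameter into one JSDL document per value."""
    return [single_job_jsdl(executable, [flag, value]) for value in values]


if __name__ == "__main__":
    # Hypothetical executable and flag, used only to show the expansion.
    documents = simulate_sweep("/usr/local/bin/simulate", "--seed", range(3))
    print(len(documents), "JSDL documents generated")
```

Each generated document would then be submitted separately, so the number of submissions grows with the number of parameter values.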

Issues: many related job submissions

• Possible solution 1: rely on GridSAM scheduling mechanisms; accept multiple submissions within a very short time and let GridSAM re-submit them according to its own strategies

• Possible solution 2: implement parameter sweep JSDL extension in OurGrid connector or even in GridSAM core module itself

• Solution 1 is very straightforward; however, the behaviour of GridSAM under those conditions needs to be examined closely

• Solution 2 is feasible, but requires considerable time and resources

OurGrid<>GridSAM connector

• For OurGrid, a MyGrid tool instance (either installed on a local machine or as a component of a job submission web-site) is the single "contact point" for the job submitter, hiding all the underlying grid-specific mechanisms

• The connector should be a wrapper over MyGrid instance
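
The wrapper idea can be sketched as follows, under stated assumptions. The class name `OurGridConnectorSketch` and its methods are hypothetical, and the actual MyGrid invocation is deliberately left as an injected callable, because neither the real GridSAM connector interface nor the MyGrid command syntax is described here. The sketch only shows the translation step: read the executable and arguments from a JSDL document and hand one command line to MyGrid.

```python
import shlex
import xml.etree.ElementTree as ET
from typing import Callable

POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


class OurGridConnectorSketch:
    """Hypothetical wrapper: JSDL in, one command line handed to MyGrid."""

    def __init__(self, submit: Callable[[str], str]):
        # `submit` stands in for whatever actually talks to the MyGrid
        # instance; it takes a command line and returns a job identifier.
        self._submit = submit

    def command_line(self, jsdl_text: str) -> str:
        """Extract the POSIX executable and arguments from a JSDL document."""
        root = ET.fromstring(jsdl_text)
        executable = root.find(f".//{{{POSIX}}}Executable")
        if executable is None or not (executable.text or "").strip():
            raise ValueError("JSDL has no POSIXApplication/Executable")
        arguments = [a.text or "" for a in root.iter(f"{{{POSIX}}}Argument")]
        return shlex.join([executable.text.strip(), *arguments])

    def submit_job(self, jsdl_text: str) -> str:
        """Translate the JSDL and delegate the submission to MyGrid."""
        return self._submit(self.command_line(jsdl_text))
```

Keeping the MyGrid call behind a single callable means the same translation logic works whether MyGrid runs on the local machine or behind a job submission web-site.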