
www.eu-eela.org

E-science grid facility for Europe and Latin America

E2GRIS1

Francisco Prieto (PhD), Maria Boton, Raul Priego – CETA-CIEMAT

Itacuruça (Brazil), 2-15 November 2008

PORTING APPLICATIONS TO THE GRID. THE gLite WORKLOAD MANAGEMENT SYSTEM

CONTENTS

1. INTRODUCTION TO WMS

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM
1.2 JOB PREPARATION
1.3 JDL. THE JOB DESCRIPTION LANGUAGE
1.4 SPECIAL JOBS
1.5 JOB SUBMISSION AND STATUS MONITORING

2. APPLICATIONS PORTING

2.1 PROBLEM ANALYSIS
2.2 GRID IMPLEMENTATION
2.3 PORTING EXAMPLE

1 INTRODUCTION TO WMS

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM I

WMS COMMITMENTS

From the gLite 3.1 User Guide:

The purpose of the Workload Management System (WMS) is to accept user jobs, to assign them to the most appropriate Computing Element, to record their status and retrieve their output. The Resource Broker (RB) is the machine where the WMS services run.

In order to accomplish these tasks, the WMS implements the following services:

1) Job submission
2) Job execution according to prescribed schemes (match-making)
3) Job status monitoring
4) Retrieval of execution output

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM III

WMS CONTEXT IN THE gLITE INFRASTRUCTURE

The Resource Broker (RB) in gLite 3.0, WMS in version 3.1

[Figure: WMS context in the gLite infrastructure. Components shown: UI (JDL), Resource Broker, Job Submission Service, Computing Element, LFC Catalog, Information Service, Storage Element, Logging & Book-keeping. Labelled flows: Input "sandbox", Output "sandbox", DataSets info, Globus RSL, Job Status.]

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM IV

WMS INTERACTIONS

WMS GRID INTERACTIONS
a) Information Service
b) LFC Catalogue
c) Logging and Bookkeeping (LB)
d) Policy Management Systems

WMS USER INTERACTION IN JOB SUBMISSION
a) WMS
b) Computing Element
c) Worker Node

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM V

WMS INTERACTIONS IN THE gLITE INFRASTRUCTURE

b) Job submitted. The JDL can specify one or more files to be copied from the UI to the WN; these are initially copied to the RB. An event is logged in the LB and the status of the job is SUBMITTED.

c) The WMS looks for the best available CE to execute the job. To do so, it interrogates the Information Supermarket (ISM), an internal cache of information which in the current system is read from the BDII, to determine the status of computational and storage resources, and the File Catalogue to find the location of any required input files. Another event is logged in the LB and the status of the job is WAITING.

d) The gLite WMS prepares the job for submission, creating a wrapper script that will be passed, together with other parameters, to the selected CE. An event is logged in the LB and the status of the job is READY.

e) The CE receives the request and sends the job for execution to the local LRMS. An event is logged in the LB and the status of the job is SCHEDULED.

f) The LRMS handles the execution of jobs on the local Worker Nodes. The Input Sandbox files are copied from the gLite WMS to an available WN, where the job is executed. An event is logged in the LB and the status of the job is RUNNING.

i) If the job ends without errors, the output (not large data files, but just the small output files specified by the user in the so-called Output Sandbox) is transferred back to the gLite WMS node. An event is logged in the LB and the status of the job is DONE.

j) At this point, the user can retrieve the output of the job to the UI. An event is logged in the LB and the status of the job is CLEARED.
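The status sequence described in the steps above can be sketched as a small state machine. This is only an illustration in Python: the class and function names are hypothetical and not part of gLite, and only the error-free path named on the slide is modelled.

```python
from enum import Enum


class JobState(Enum):
    """Job states named in the steps above, in the order they are reached."""
    SUBMITTED = 1
    WAITING = 2
    READY = 3
    SCHEDULED = 4
    RUNNING = 5
    DONE = 6
    CLEARED = 7


def next_state(state):
    """Return the state that follows `state` on the error-free path."""
    if state is JobState.CLEARED:
        raise ValueError("CLEARED is the final state")
    return JobState(state.value + 1)


def happy_path():
    """List the state names a successful job passes through."""
    state = JobState.SUBMITTED
    names = [state.name]
    while state is not JobState.CLEARED:
        state = next_state(state)
        names.append(state.name)
    return names
```

Each transition corresponds to one event logged in the LB; failure paths are deliberately left out of this sketch.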

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM VI

WMS ELEMENTS

WM
a) Web service interface (WM Proxy)
b) Task queue
c) Match-Maker system
d) Information SuperMarket

LOGGING AND BOOK-KEEPING
a) Registers every WMS action (in particular job actions)
b) Accounts for job status

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM VII

Information SuperMarket
Basically, a repository of resource information, available in read-only mode to the matchmaking engine.

Task queue system
Makes it possible to keep a submission request for a while if no resources are immediately available.

Match-making mechanism
The matchmaker aims to find the most suitable CE on which to execute the job.

To accomplish this task, the WMS interacts with the other EGEE/LCG components (Replica Location Service and Information Service).

There are three different scenarios to be dealt with separately:

a) Direct job submission
b) Job submission without data-access requirements
c) Job submission with data-access requirements

1.1 OVERVIEW OF THE WORKLOAD MANAGEMENT SYSTEM VIII

Batch Schedule policies.

A WM can adopt different policies to schedule a job. There are two job submission models (according to user requests and site policies):

1 Eager scheduling (PUSH):
The job is bound to a resource as soon as possible. Once the decision has been taken, the job is passed to the selected resource for execution.

2 Lazy scheduling (PULL):
The job is held by the WM until a resource becomes available. When this happens, the resource is matched against the submitted jobs.

Intermediate approaches are also possible.
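The difference between the two models can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the function names, the "most free slots" criterion for the eager case, and the first-come-first-served choice in the lazy case are not taken from the WMS implementation.

```python
from collections import deque


def eager_schedule(jobs, free_slots):
    """PUSH: bind each job at once to the resource with the most free slots."""
    slots = dict(free_slots)
    assignment = {}
    for job in jobs:
        best = max(slots, key=slots.get)  # decision is taken immediately
        assignment[job] = best
        slots[best] -= 1                  # the job then waits or runs at that resource
    return assignment


def lazy_schedule(jobs, availability_events):
    """PULL: hold jobs in the WM and match one each time a resource frees up."""
    held = deque(jobs)
    assignment = {}
    for resource in availability_events:
        if not held:
            break
        # Simplification: the oldest held job is taken; a real matchmaker
        # would pick the held job that best suits the resource.
        assignment[held.popleft()] = resource
    return assignment, list(held)
```

In the eager case every job has a destination as soon as it is submitted; in the lazy case a job without a matching availability event simply stays in the WM queue.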


1.2 JOB PREPARATION

The following information needs to be specified when a job is submitted:

1) Job characteristics
2) Job requirements and preferences on the computing resources
3) Software dependencies
4) Job data requirements

This information is specified using the Job Description Language (JDL), which is based on Condor's CLASSified ADvertisement language (ClassAd).

Hence, JDL allows the definition of a set of attributes which are taken into account by the WMS when making its scheduling decision.

1.3 JDL- THE JOB DESCRIPTION LANGUAGE I

•An attribute is a (key, value) pair, where the value may be an integer, string, boolean, etc.:
<key> = <value>;

•Comments must either be preceded by a hash (#) or follow the C++ syntax

•JDL is sensitive to tabs and white spaces

•String values must be enclosed in double quotes; double quotes inside a string must be escaped with a backslash (\):
Arguments = "\"Hello World!\" 10";

•Special characters, such as & , | , < , > , are only allowed through the following syntax:
Arguments = "-f file1\\\&file2";

•Single quotes (') are not allowed
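The quoting rules above can be exercised with a small rendering helper. This is a minimal Python sketch, assuming hypothetical helper names (`jdl_string`, `jdl_value`, `to_jdl`); it covers only double-quote escaping and the ban on single quotes, and does not handle the triple-backslash rule for special characters.

```python
def jdl_string(value):
    """Quote a Python string as a JDL string literal."""
    if "'" in value:
        raise ValueError("single quotes are not allowed in JDL")
    return '"' + value.replace('"', '\\"') + '"'


def jdl_value(value):
    """Render a str, bool, number or list as a JDL value."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return jdl_string(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    raise TypeError(f"unsupported JDL value: {value!r}")


def to_jdl(attributes):
    """Render a dict as '<key> = <value>;' lines, one attribute per line."""
    return "\n".join(f"{key} = {jdl_value(val)};" for key, val in attributes.items())
```

For instance, `to_jdl({"Arguments": '"Hello World!" 10'})` yields the escaped `Arguments` line shown on the slide.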

1.3 JDL- THE JOB DESCRIPTION LANGUAGE II

JDL FIELDS

•JobType (optional field)
–Normal (simple, sequential)
–Interactive
–MPICH
–Checkpointable
–Partitionable
–any combination of these

Examples:
JobType = "Interactive";
JobType = {"MPICH", "Checkpointable"};

"Interactive" + "MPICH" not allowed

1.3 JDL- THE JOB DESCRIPTION LANGUAGE III

•Executable
–Command, script or program to be sent to the GRID for execution.
–The user can specify an executable located on a remote CE or on the UI (in the latter case the executable must be included in the InputSandbox).

Examples:
Executable = "/grid/ceta/testrungdir/RunTrayectory.sh";

Executable = "RunTrayectory.sh";
InputSandbox = {"/home/fprieto/GRID_TEST/RunTrayectory.sh"};

1.3 JDL- THE JOB DESCRIPTION LANGUAGE III

•Arguments (optional)
String in which all the arguments needed for the execution are specified.

Example:
If the "sum" program takes N1 and N2 as arguments:
$ sum N1 N2 -output output.out

In the JDL:
Executable = "sum";
Arguments = "N1 N2 -output output.out";

1.3 JDL- THE JOB DESCRIPTION LANGUAGE IV

•Environment (optional)
Environment variables needed for a proper execution

Example:
Environment = {"JAVABIN=/usr/local/java"};

1.3 JDL- THE JOB DESCRIPTION LANGUAGE IV

•StdInput (optional)
–Standard input of the job execution

•StdOutput (optional)
–Standard output of the job execution

•StdError (optional)
–Standard error of the job execution

Examples:
StdOutput = "message.txt";
StdError = "stderror";

1.3 JDL- THE JOB DESCRIPTION LANGUAGE V

•InputSandbox (optional)
–List of files, located at the UI, needed for job execution
–Every listed file will be automatically sent to the remote resource

Example:
InputSandbox = {"my-script.sh", "/tmp/cc.sh"};

•OutputSandbox (optional)
–List of files generated by the execution which are intended to be retrieved

Example:
OutputSandbox = {"std.out", "std.err", "points.data"};

1.3 JDL- THE JOB DESCRIPTION LANGUAGE V

•VirtualOrganisation (optional)
–Name of the VO to which the user belongs

Example:
VirtualOrganisation = "ceta";

1.3 JDL- THE JOB DESCRIPTION LANGUAGE VI

•Requirements (optional)
–Computational resources needed by the job
–Requirements are specified using the GLUE schema attributes published through the Information Service
–If no requirements are given, the default value from the UI configuration is used:
Default.Requirements = other.GlueCEStateStatus == "Production";

Examples:
Requirements = other.GlueCEUniqueID == "grid006.cecalc.ula.ve:2119/jobmanager-pbs-infinite";
Requirements = Member("ALICE-3.07.01", other.GlueHostApplicationSoftwareRunTimeEnvironment);

1.3 JDL- THE JOB DESCRIPTION LANGUAGE VII

•Rank (optional)
–Floating-point expression used to rank the CEs that satisfy the requirements
–The Rank expression may contain the attributes that describe the CE in the Information System (IS)
–The Rank expression is evaluated by the Resource Broker during the "match-making" process
–The higher the numeric value, the better the Rank
–When Rank is not set, the default value from the UI configuration is used:
Default.Rank = other.GlueCEStateFreeCPUs;
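How Requirements and Rank work together during match-making can be sketched as a filter followed by a sort. The Python below is only an illustration under stated assumptions: the CE records, host names and function names are hypothetical, and only the dictionary keys reuse the GLUE attribute names from the slides.

```python
def match(ces, requirements, rank):
    """Return the CEs that satisfy `requirements`, best rank first."""
    candidates = [ce for ce in ces if requirements(ce)]
    return sorted(candidates, key=rank, reverse=True)


# Hypothetical snapshot of three CEs as the Information System might describe them.
CES = [
    {"GlueCEUniqueID": "ce1.example.org:2119/jobmanager-pbs-short",
     "GlueCEStateStatus": "Production", "GlueCEStateFreeCPUs": 4},
    {"GlueCEUniqueID": "ce2.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Production", "GlueCEStateFreeCPUs": 12},
    {"GlueCEUniqueID": "ce3.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Draining", "GlueCEStateFreeCPUs": 50},
]


def default_requirements(ce):
    """Mirror of: other.GlueCEStateStatus == "Production"."""
    return ce["GlueCEStateStatus"] == "Production"


def default_rank(ce):
    """Mirror of: other.GlueCEStateFreeCPUs (higher is better)."""
    return float(ce["GlueCEStateFreeCPUs"])


ranked = match(CES, default_requirements, default_rank)
best = ranked[0]["GlueCEUniqueID"]
```

Note that the third CE has the most free CPUs but is never ranked, because Requirements removes it before Rank is evaluated.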

1.3 JDL- THE JOB DESCRIPTION LANGUAGE VIII

•InputData (optional)
–List of Logical File Names (LFNs) or Grid Unique Identifiers (GUIDs) representing the input files needed for the job execution
–This list is used by the Resource Broker to find a CE from which those files can be accessed most efficiently

Example:
InputData = {"lfn:cmstestfile", "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"};

1.3 JDL- THE JOB DESCRIPTION LANGUAGE IX

•DataAccessProtocol (needed if InputData has been specified)
–Protocol or list of protocols that the application can use to access the files listed in InputData on a given SE.
–Not all SEs support the same set of protocols (gridftp, file, rfio …)

Example:
DataAccessProtocol = {"file", "gridftp"};

1.3 JDL- THE JOB DESCRIPTION LANGUAGE X

•OutputData (optional)
–This field lets the user request that output data generated on the Worker Node (WN) be automatically uploaded to a Storage Element (SE) and registered in the catalogue
–Its main attributes are:
OutputFile
StorageElement
LogicalFileName

•OutputFile (needed if OutputData is specified)
–The name of the output file, generated on the WN, which needs to be uploaded to an SE and registered by the WMS.

1.3 JDL- THE JOB DESCRIPTION LANGUAGE XI

•StorageElement (optional)
–The URI of the SE to which the WMS will upload the output file specified in the OutputFile attribute

•LogicalFileName (optional)
–The LFN under which the output file will be registered in the catalogue.

1.3 JDL- THE JOB DESCRIPTION LANGUAGE XII

•StorageIndex (needed if InputData and OutputData are specified)
–The URL of the StorageIndex service to be contacted in order to resolve the names of the files specified in InputData and/or OutputData

Example:
StorageIndex = "https://glite.org:9443/StorageIndex";

•OutputSE (optional)
–The URL of the SE where the output data is intended to be stored.
–This attribute is used by the Resource Broker to find the CE "closest" to the specified SE and, hence, to plan the execution there.

Example:
OutputSE = "grid003.cecalc.ula.ve";

1.3 JDL- THE JOB DESCRIPTION LANGUAGE XIII

JDL EXAMPLES

•Example 1

The resource must use PBS as its LRMS and have at least 2 CPUs

Requirements = other.GlueCEInfoLRMSType == "PBS" && other.GlueCEInfoTotalCPUs > 1;

(Notice that attribute names are preceded by other.)

•Example 2

The resource must have the specified software installed (this information is published in the resource environment)

Requirements = Member("CMSIM-133", other.GlueHostApplicationSoftwareRunTimeEnvironment);

RunTimeEnvironment is a multi-valued attribute

The Member operator returns "true" if the given value exists in the RunTimeEnvironment list

1.3 JDL- THE JOB DESCRIPTION LANGUAGE XIV

•Example 3

The job must be executed in a CE belonging to cern.ch

Requirements = (RegExp("cern.ch", other.GlueCEUniqueID));
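The behaviour of the pattern in Example 3 can be explored with Python's `re` module. This is a sketch under two assumptions: that RegExp performs an unanchored search (which is what the example relies on), and that Python's regular-expression dialect is close enough to the one used by ClassAd for this simple pattern. The `regexp` helper and the host names are hypothetical.

```python
import re


def regexp(pattern, target):
    """Mimic RegExp(pattern, target): true if the pattern matches anywhere in target."""
    return re.search(pattern, target) is not None


# The intended match: a CE whose identifier contains "cern.ch".
intended = regexp("cern.ch", "ce101.cern.ch:2119/jobmanager-lsf-grid")

# An unescaped dot matches any character, so this unrelated name also matches.
accidental = regexp("cern.ch", "ce.cernxch.example.org:2119/jobmanager-pbs")

# Escaping the dot restricts the match to a literal ".".
strict = regexp(r"cern\.ch", "ce.cernxch.example.org:2119/jobmanager-pbs")
```

How a backslash must be written inside a JDL string to reach the regular-expression engine depends on the ClassAd implementation and should be checked against its documentation.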

1.3 JDL- THE JOB DESCRIPTION LANGUAGE XV

•Example 4

The job requires at least 25 minutes of CPU time and 100 minutes of real time

Requirements = other.MaxCPUTime >= 1500 && other.MaxWallClockTime >= 6000;

•Example 5

The resource must have 2 packages installed (VO-alice-Alien and VO-alice-Alien-v4-01-Rev-01) and the job must be able to run for more than 86000 seconds

Requirements = Member("VO-alice-Alien", other.GlueHostApplicationSoftwareRunTimeEnvironment) && Member("VO-alice-Alien-v4-01-Rev-01", other.GlueHostApplicationSoftwareRunTimeEnvironment) && (other.GlueCEPolicyMaxWallClockTime > 86000);

1.4 SPECIAL JOBS I

MPI IMPLEMENTATION

• Currently, parallel jobs can run only inside a single Computing Element (CE);

• Several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs

1.4 SPECIAL JOBS II

MPI Infrastructure Requirements

1) The MPICH software must be installed on every WN of the CE, and its location must be in the PATH environment variable.

2) Some MPI applications require a filesystem shared among the WNs in order to run.

The variable VO_<name_of_VO>_SW_DIR will contain the name of a directory if there is a SHARED filesystem.

The variable VO_<name_of_VO>_SW_DIR will contain "." if there is NO SHARED filesystem.

INSTALL: glite-MPI_utils at CE and Worker Nodes

1.4 SPECIAL JOBS III

MPI Usage

From the user's point of view, a job to be run as MPI is specified by setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well.

Example:
JobType = "MPICH";
NodeNumber = 4;

When these two attributes are included in a JDL, the User Interface (UI) automatically adds the following expression to the JDL Requirements expression in order to find the best resource where the job can be executed:

(other.GlueCEInfoTotalCPUs >= NodeNumber) && Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)

1.4 SPECIAL JOBS IV

MPI Example:

Type = "Job";
JobType = "MPICH";
Executable = "cpi";
NodeNumber = 2;
StdOutput = "cpi.out";
StdError = "cpi.err";
InputSandbox = {"cpi"};
OutputSandbox = {"cpi.err", "cpi.out"};
RetryCount = 0;

1.4 SPECIAL JOBS V

JOB Collections

A job collection is a set of independent jobs that the user wants to submit and monitor via a single request.

Jobs of a collection are submitted as DAG nodes without dependencies.

The JDL is a list of ClassAds, each of which describes a subjob:

Type = "collection";
VirtualOrganisation = "gilda";
nodes = {
  [ <job descr 1> ],
  [ <job descr 2> ],
  …
};

1.4 SPECIAL JOBS VI

Job Collection Example:

type = "collection";
InputSandbox = {"date.sh"};
RetryCount = 0;
nodes = {
  [ file = "jobs/job1.jdl"; ],
  [ Executable = "/bin/sh"; Arguments = "date.sh"; StdOutput = "date.out"; StdError = "date.err"; OutputSandbox = {"date.out", "date.err"}; ],
  [ file = "jobs/job3.jdl"; ]
};

1.4 SPECIAL JOBS VII

Parametrized Jobs

-A parametric job is a job in which one or more attributes are parameterized: the values of those attributes vary according to a parameter

Example:

JobType = "Parametric";
Executable = "/bin/sh";
Arguments = "md5.sh input_PARAM_.txt";
InputSandbox = {"md5.sh", "input_PARAM_.txt"};
StdOutput = "out_PARAM_.txt";
StdError = "err_PARAM_.txt";
Parameters = 4;
ParameterStart = 1;
ParameterStep = 1;
OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};

-Job monitoring / managing is always done through a unique jobID, as if the job were a single one (see submission of collection)

-Parameters can also be a list of strings
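The substitution of `_PARAM_` can be sketched in Python as follows. The helper names are hypothetical, and one point is an explicit assumption: the sketch treats an integer `Parameters` as an exclusive upper bound, so values run from `ParameterStart` in steps of `ParameterStep` while they stay below `Parameters`. Whether the upper bound is included should be checked against the JDL specification.

```python
def substitute(value, param):
    """Replace _PARAM_ in a string, or in each string of a list."""
    if isinstance(value, str):
        return value.replace("_PARAM_", str(param))
    if isinstance(value, list):
        return [substitute(v, param) for v in value]
    return value


def expand_parametric(template, parameters, start=0, step=1):
    """Return one job description per parameter value.

    `parameters` is either an integer (assumed exclusive upper bound)
    or a list of strings used as the parameter values themselves.
    """
    if isinstance(parameters, int):
        values = range(start, parameters, step)
    else:
        values = parameters
    return [{key: substitute(val, p) for key, val in template.items()}
            for p in values]


TEMPLATE = {
    "Executable": "/bin/sh",
    "Arguments": "md5.sh input_PARAM_.txt",
    "StdOutput": "out_PARAM_.txt",
    "OutputSandbox": ["out_PARAM_.txt", "err_PARAM_.txt"],
}

jobs = expand_parametric(TEMPLATE, 4, start=1, step=1)
```

Under the stated assumption the slide's example expands to the parameter values 1, 2 and 3; with a list such as `["a", "b"]` the list entries are used directly.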

1.5 JOB SUBMISSION AND STATUS MONITORING I

glite-wms-job-submit [-r <res_id>] [-c <config file>] [-vo <VO>] [-o <output file>] <job.jdl>

where:
-r  the job is submitted directly to the computing element identified by <res_id>
-c  the configuration file <config file> is used by the UI instead of the standard configuration file
-vo  the Virtual Organisation (if the user does not want the one specified in the UI configuration file)
-o  the generated edg_jobId is written to <output file>

The output file is useful for other commands, e.g.:
glite-wms-job-status -i <input file> (or edg_jobId)
-i  the status information about the edg_jobId contained in <input file> is displayed

1.5 JOB SUBMISSION AND STATUS MONITORING II

USEFUL UI COMMANDS

glite-wms-job-list-match
Lists resources matching a job description
The --rank option prints the ranking of each resource
Performs the matchmaking without submitting the job

glite-wms-job-cancel
Cancels a given job

glite-wms-job-status -i jobid (--noint)
Displays the status of the job

glite-wms-job-get-output --dir output -i jobid (--noint)
Returns the job output (the OutputSandbox files) to the user

glite-wms-job-get-logging-info
Displays logging information about submitted jobs (all the events "pushed" by the various components of the WMS)
Very useful for debugging purposes (see next slide)

2 PORTING APPLICATIONS TO THE GRID. HANDS ON

2.1 PROBLEM ANALYSIS I

– Which quantities are expected to be obtained in the study?

– Identify the stages of the process needed to obtain such quantities, and the intermediate results/files

– What is the input data (parameters, input files, etc.)? Can the input data be partitioned to generate different configurations? (data partitioning)

– Starting from the same input data, is there a stage where the intermediate products can be partitioned? (process partitioning)

– Which elements should be stored? (estimate the volume of data)

– What success policies apply? (retry counts, etc.)

– How will output data be treated? (post-processing)

THESE RESULTS MUST GUIDE THE GRID IMPLEMENTATION

2.1 PROBLEM ANALYSIS II

From the preceding analysis, the nature of the jobs must be determined.

Jobs can be:

1) Independent
2) Dependent

and also:

1) Non-interactive
2) Interactive

2.2 GRID IMPLEMENTATION I

– Generation of the configuration space
Set of all entities which constitute an analysis (model parameters, initial conditions, other)

– Job execution tasks
Send jobs to the GRID infrastructure
Monitor job status according to the configuration space

– Storage
Store large files at the SE. Use specific LFNs
GRID federated database interaction?

– Navigation over results/retrieval
According to the configuration space and the success policies

2.2 GRID IMPLEMENTATION II

BUILD UP A GLOBAL DESIGN!!

1) What is a job? (what is executed?)

2) Which elements are needed for job execution at:
a) UI
b) WN
c) SE (big files)
d) Metadata at AMGA (DB)

3) How do these elements interact?

4) How is the output treated?

2.3 PORTING EXAMPLE I

ANALYSIS OF THE PHASE SPACE OF THE FORCED NONLINEAR PENDULUM

GOAL
Orbits are integrated in time for different parameter values in order to find patterns in the phase space (such as attractors).

REQUIREMENTS
Java
Bash
Apache Commons Math libraries

PARTITION SCHEME
The parameter space is partitioned. Then, for each point of this space, an orbit is numerically integrated.
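The partition scheme can be sketched in Python as the generation of one configuration per point of a parameter grid. This is an illustration only: the function name and the sample values are hypothetical, the two parameters are taken to be omega and mu as in the job names used later in the example, and the naming pattern merely imitates `omegaXX_muYY`.

```python
from itertools import product


def configurations(omegas, mus):
    """Return one configuration per point of the (omega, mu) grid."""
    return [
        {"omega": omega, "mu": mu, "name": f"omega{omega}_mu{mu}"}
        for omega, mu in product(omegas, mus)
    ]


# Hypothetical sweep: 2 values of omega times 2 values of mu gives 4 independent jobs.
grid = configurations([1.0, 2.0], [0.1, 0.2])
names = [cfg["name"] for cfg in grid]
```

Each configuration is independent of the others, so every grid point can become a separate job with its own JDL file.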

2.3 PORTING EXAMPLE II

[Figure: location of the elements of the example. Labels: At the UI; At the WN; Stored in SE; wrapper, executable, JDLs, etc.; configuration files, executable, wrapper, ...]

2.3 PORTING EXAMPLE III

GENERATE CONFIGURATIONS

[Figure: generated configurations. Labels: TO WN 1; TO WN 2]

2.3 PORTING EXAMPLE IV

[Figure: elements handled by a job. Labels: External Library; Configuration Files; Executable; Output File; Associated LFN; to SE with LFN]

2.3 PORTING EXAMPLE V

THE JDL FOR THIS PROBLEM

Type = "Job";
JobType = "Normal";
Executable = "/bin/bash";
Arguments = "RunTrayectory.sh ParamsSweep.conf ParamsName.conf IC.conf Integrator.conf omegaXX_muYY";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"/GridTest/Scripts/RunTrayectory.sh", "/GridTest/jarfile/OrbitIntegrator.jar", "/GridTest/conf/ParamsSweep.conf", "/GridTest/conf/ParamsName.conf", "/GridTest/conf/IC.conf", "/GridTest/conf/Integrator.conf", "/GridTest/lib/commons-math-1.2.jar"};
OutputSandbox = {"std.out", "std.err", "OrbitName", "Orbit.log"};

JOB SUBMISSION
glite-wms-job-submit -o JOBID.id /temp/jdl/JDL_omegaXX_muYY.jdl

JOB STATUS MONITORING
glite-wms-job-status -i JOBID.id --noint

JOB RETRIEVE
glite-wms-job-get-output --dir output -i JOBID.id --noint

2.3 PORTING EXAMPLE VI

GENERATING AMGA ENTRIES

ORBITS ARE NAVIGATED LATER THROUGH AMGA QUERIES:

Query> omega=2.0 mu=02. Associated LFN

obtain this file from the SE


www.eu-eela.eu Itacuruça (Brazil) , E2GRIS1, 2.11.2008 – 15.11.2008

Questions …


THANKS

Dr. Francisco Prieto Castrillo, Science and Technology Unit Coordinator, CETA-CIEMAT