GLOBUS TOOLKIT (GT5) CENG 546 Dr. Esma Yıldırım

Post on 15-Jan-2016

Page 1: CENG 546 Dr. Esma Yıldırım.  A fundamental enabling technology for the "Grid," letting people share computing power, databases, and other tools securely

GLOBUS TOOLKIT (GT5)

CENG 546

Dr. Esma Yıldırım

What is Globus?

A fundamental enabling technology for the "Grid," letting people share computing power, databases, and other tools securely online across corporate, institutional, and geographic boundaries without sacrificing local autonomy

What does GT5 provide?

It includes software services and libraries for resource monitoring, discovery, and management, plus security and file management.

Areas covered: security, information infrastructure, resource management, data management, communication, fault detection, and portability.

GT5 Component Structure

History

How did Globus become the “de facto standard” for Grid computing?

A small team led by Ian Foster at Argonne created new protocols that allowed I-WAY users to run applications on computers across the country at Supercomputing '95.

The experiment got the attention of DOE and NSF.

With funding from many national agencies, the Globus project began in 1996.

The project has spurred a revolution in the way science is conducted.

High-energy physicists designing the Large Hadron Collider at CERN are developing Globus-based technologies through the European Data Grid and through U.S. efforts such as the Grid Physics Network (GriPhyN) and the Particle Physics Data Grid.

GRAM5

The Grid Resource Allocation and Management (GRAM5) component is used to locate, submit, monitor, and cancel jobs on Grid computing resources.

GRAM5 is not a Local Resource Manager (LRM), but rather a set of services and clients for communicating with a range of different batch/cluster job schedulers using a common protocol.

GRAM5 Architecture

GRAM5 Components

Gate Keeper

Job Manager

Scheduler Event Generator

LRM Adaptor

Component 1: Gate Keeper

The globus-gatekeeper service provides a network interface to the GRAM5 system.

It authenticates client identities and starts Job Manager processes using the local user account to which the client identity is mapped.

One instance of the globus-gatekeeper process runs to accept network connections, and it forks a new short-lived process to handle each new connection.

Component 2: Job Manager

The globus-job-manager daemon processes job requests and coordinates file transfers.

There is one long-lived instance of this per user per LRM and one short-lived instance per job.

Component 3: Scheduler Event Generator

The globus-scheduler-event-generator process parses LRM-specific data relating to job startup, execution, and termination into an LRM-independent data format.

There is optionally one instance of this program per LRM.

Component 4: LRM Adaptor

The LRM adapter provides an interface between the GRAM5 system components and the LRM.

It provides concrete implementations of the submit, cancel, and poll functionality for a particular system's LRM, and can generate job status change events.

Overview

GRAM jobs consist of file transfers and program execution on one or more compute elements managed by a local resource manager

The GRAM client can submit the job and then later poll for its status, or it can request that the GRAM service notify it when the job changes state or completes.

While the job is executing, the client may send control messages to the GRAM service to monitor or modify the job.

GRAM provides reliable job submission, job recovery in case of service or client failures, file staging, and asynchronous notification messages.

Overview

GRAM achieves its uniform interface by implementing a domain-specific language called the Resource Specification Language (RSL) which provides a simple way to express job requirements, environment, and commands in a specification which is independent of the local resource manager which will actually execute the job.
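
To make the shape of an RSL specification concrete, here is a minimal Python sketch that assembles the kind of string used with globusrun in this deck, for example &(executable=/bin/hostname)(count=5). The helper names build_rsl and rsl_value are hypothetical, and the quoting rule (double quotes around values with special characters, embedded quotes doubled) is a simplifying assumption rather than a full statement of the RSL grammar.

```python
def rsl_value(value):
    """Render one RSL value, quoting it only when it looks unsafe.

    Assumption: values containing whitespace, parentheses, or quotes are
    wrapped in double quotes, with embedded double quotes doubled.
    """
    text = str(value)
    if text == "" or any(ch in text for ch in ' \t\n()=&|+"\''):
        return '"' + text.replace('"', '""') + '"'
    return text


def build_rsl(attributes):
    """Build a conjunctive RSL string from (name, value) pairs.

    A list or tuple value is rendered as space-separated values,
    as in (arguments=a b c).
    """
    relations = []
    for name, value in attributes:
        values = value if isinstance(value, (list, tuple)) else [value]
        rendered = " ".join(rsl_value(v) for v in values)
        relations.append(f"({name}={rendered})")
    return "&" + "".join(relations)


# The two job descriptions used in the globusrun examples of this deck.
print(build_rsl([("executable", "/bin/hostname"), ("count", 5)]))
print(build_rsl([("executable", "/bin/sleep"), ("arguments", [500])]))
```

The resulting string is independent of the LRM: the same text can be sent to a PBS, SGE, or fork job manager.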

GRAM and Security

GRAM uses a proxy certificate, which is a short-term credential digitally signed with the user's private key.

You must first obtain a security credential (an X.509 certificate).

GRAM Resource Names

Before interacting with a GRAM service, you must know its contact address

grid.example.org:2120/jobmanager-sge:/C=US/O=Example/OU=Grid/CN=host/grid.example.org

Host name: grid.example.org

Port number: 2120

Service name: /jobmanager-sge

Credential name: /C=US/O=Example/OU=Grid/CN=host/grid.example.org
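
The four parts of the contact address can be pulled apart mechanically. The Python sketch below is illustrative only: parse_gram_contact is a hypothetical helper, and its regular expression covers just the host[:port][/service][:credential] forms that appear in this deck, not every variant a GRAM client may accept.

```python
import re

# Assumed layout: host[:port][/service-name][:credential-subject]
CONTACT_RE = re.compile(
    r"^(?P<host>[^:/]+)"        # host name
    r"(?::(?P<port>\d+))?"      # optional :port
    r"(?P<service>/[^:]*)?"     # optional /service-name
    r"(?::(?P<subject>.*))?$"   # optional :credential-subject
)


def parse_gram_contact(contact):
    """Split a GRAM contact string into host, port, service, and subject."""
    match = CONTACT_RE.match(contact)
    if match is None:
        raise ValueError(f"unrecognized GRAM contact: {contact!r}")
    parts = match.groupdict()
    if parts["port"] is not None:
        parts["port"] = int(parts["port"])
    return parts


full = "grid.example.org:2120/jobmanager-sge:/C=US/O=Example/OU=Grid/CN=host/grid.example.org"
for key, value in parse_gram_contact(full).items():
    print(f"{key}: {value}")
```

Parts that are absent from the string, as in the short form grid.example.org/jobmanager-pbs used later, come back as None.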

Basic Client Interface

globus-job-run : waits until the job terminates before exiting and prints job standard output and stderr after the job completes

globus-job-submit : submit the job and then exit immediately, printing the job contact to its standard output stream

globusrun : runs jobs described in the Resource Specification Language (RSL)

globus-job-run Examples

Minimal job running

submits a single instance of the /bin/hostname executable to the resource named by grid.example.org/jobmanager-pbs

% globus-job-run grid.example.org/jobmanager-pbs /bin/hostname
node1.grid.example.org

globus-job-run Examples

Multiprocess job running

Submits four instances of the executable /bin/hostname.

The output of the job is the names of the four hosts that the job ran on. The -np COUNT option causes globus-job-run to run COUNT instances of the executable.

% globus-job-run grid.example.org/jobmanager-pbs -np 4 /bin/hostname
node1.grid.example.org
node3.grid.example.org
node2.grid.example.org
node10.grid.example.org

globus-job-run Examples

Staging an executable file

submits an executable which is local to the submit machine to the GRAM resource, then executes it.

The executable is removed automatically from the GRAM resource after the job completes.

The -s option prior to the executable name causes globus-job-run to stage the executable using GASS (an https-based protocol) from the machine running globus-job-run to the GRAM resource.

% globus-job-run grid.example.org/jobmanager-pbs -s my-executable
node1.grid.example.org

globus-job-run Examples

Providing an input file to a job

Submits a job to a GRAM resource. When this job runs, its standard input will read from the file $HOME/inputfile.txt, which is located on the GRAM resource.

The -stdin command-line option indicates this path.

% globus-job-run grid.example.org/jobmanager-pbs -stdin inputfile.txt /bin/cat
Hello, Grid

globus-job-run Examples

Staging an input file to a job

Submits a job to a GRAM resource. When this job runs, its standard input will read from the file inputfile.txt, which is located on the submit client machine.

The -stdin -s command-line option combination causes the input file to be staged in the same way as the executable in the staging example above.

% globus-job-run grid.example.org/jobmanager-pbs -stdin -s inputfile.txt /bin/cat
Hello, staged input on the Grid

globus-job-submit Example

% globus-job-submit grid.example.org/jobmanager-pbs /bin/hostname
https://grid.example.org:38843/16001600430615223386/5295612977486013582/
% globus-job-status https://grid.example.org:38843/16001600430615223386/5295612977486013582/
PENDING
% globus-job-status https://grid.example.org:38843/16001600430615223386/5295612977486013582/
ACTIVE
% globus-job-status https://grid.example.org:38843/16001600430615223386/5295612977486013582/
DONE
% globus-job-get-output -r grid.example.org/jobmanager-fork \
    https://grid.example.org:38843/16001600430615223386/5295612977486013582/
node1.grid.example.org
% globus-job-clean -r grid.example.org/jobmanager-fork \
    https://grid.example.org:38843/16001600430615223386/5295612977486013582/

    WARNING: Cleaning a job means:
        - Kill the job if it still running, and
        - Remove the cached output on the remote resource

    Are you sure you want to cleanup the job now (Y/N) ?
y
Cleanup successful.
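
The submit-then-poll pattern in this session can be automated. The following Python sketch shows one possible polling loop; wait_for_job is a hypothetical helper, the status source is passed in as a function so the loop can be tried without a Grid installation, and treating DONE and FAILED as the only terminal states is an assumption made for the example.

```python
import time

# Assumption: these are the states after which polling can stop.
TERMINAL_STATES = {"DONE", "FAILED"}


def wait_for_job(get_status, poll_interval=5.0, max_polls=100, sleep=time.sleep):
    """Poll get_status() until a terminal state is seen.

    Returns the list of distinct consecutive states observed,
    for example ["PENDING", "ACTIVE", "DONE"].
    """
    seen = []
    for _ in range(max_polls):
        state = get_status().strip()
        if not seen or seen[-1] != state:
            seen.append(state)
        if state in TERMINAL_STATES:
            return seen
        sleep(poll_interval)
    last = seen[-1] if seen else "unknown"
    raise TimeoutError(f"job still {last} after {max_polls} polls")


# Demo with canned states instead of a real GRAM service.
canned = iter(["PENDING", "PENDING", "ACTIVE", "DONE"])
history = wait_for_job(lambda: next(canned), sleep=lambda _seconds: None)
print(history)
```

In real use, get_status could wrap a call to globus-job-status with the job contact printed by globus-job-submit, for instance through Python's subprocess module.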

globusrun Examples

Basic interactive job

Submits an interactive job with globusrun. When the -s option is used, the output of the job command is returned to the client and displayed as if the command ran locally. This is similar to the behavior of the globus-job-run program described earlier.

% globusrun -s -r example.grid.org/jobmanager-pbs "&(executable=/bin/hostname)(count=5)"
node03.grid.example.org
node01.grid.example.org
node02.grid.example.org
node05.grid.example.org
node04.grid.example.org

globusrun Examples

Basic batch job

submit, monitor, and cancel a batch job using globusrun. This method is useful for the case where the job may run for a long time, the job may be queued for a long time, or when there are network reliability issues between the client and service

% globusrun -b -r grid.example.org/jobmanager-pbs "&(executable=/bin/sleep)(arguments=500)"
globus_gram_client_callback_allow successful
GRAM Job submission successful
https://grid.example.org:38824/16001608125017717261/5295612977486019989/
GLOBUS_GRAM_PROTOCOL_JOB_STATE_PENDING
% globusrun -status https://grid.example.org:38824/16001608125017717261/5295612977486019989/
PENDING
% globusrun -k https://grid.example.org:38824/16001608125017717261/5295612977486019989/
%

GridFTP

One of the foundational issues in HPC is the ability to move large (multi-gigabyte, and even terabyte) file-based data sets between sites.

Simple file transfer mechanisms such as FTP and SCP are not sufficient either from a reliability or performance perspective.

GridFTP extends the standard FTP protocol to provide a high-performance, secure, reliable protocol for bulk data transfer

GridFTP Key Features

Performance - GridFTP protocol supports using parallel TCP streams and multi-node transfers to achieve high performance.

Checkpointing - GridFTP protocol requires that the server send restart markers (checkpoint) to the client.

Third-party transfers - The FTP protocol on which GridFTP is based separates control and data channels, enabling third-party transfers, that is, the transfer of data between two end hosts, mediated by a third host.

Security - Provides strong security on both control and data channels. Control channel is encrypted by default. Data channel is authenticated by default with optional integrity protection and encryption.
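
To illustrate why restart markers help, the Python sketch below computes which parts of a file still need to be sent after an interrupted transfer. It is a simplified model, not the GridFTP wire format: missing_ranges is a hypothetical helper, and it assumes the markers have already been parsed into half-open (start, end) byte ranges that the receiver has confirmed.

```python
def missing_ranges(total_size, received):
    """Return the half-open byte ranges [start, end) not yet received.

    `received` is a list of (start, end) pairs taken from restart markers.
    The pairs may arrive out of order or overlap, which can happen when
    several parallel streams deliver blocks independently.
    """
    gaps = []
    cursor = 0  # everything before this offset is known to be complete
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, min(start, total_size)))
        cursor = max(cursor, end)
        if cursor >= total_size:
            break
    if cursor < total_size:
        gaps.append((cursor, total_size))
    return gaps


# A 100-byte file where two blocks arrived before the connection dropped.
print(missing_ranges(100, [(50, 70), (0, 30)]))
```

On restart, a client following this model would request only the returned gaps instead of retransmitting the whole file.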

Globus Implementation of GridFTP

A server implementation called globus-gridftp-server,

A scriptable command line client called globus-url-copy,

A set of development libraries for custom clients.

GridFTP Client Examples

globus-url-copy -vb -p 4 source_url destination_url

-vb -> outputs transfer performance

-p -> sets the number of parallel streams

Directory transfer

globus-url-copy -vb -p 4 -r -cd -cc 4 source_url destination_url

-r -> copy files in subdirectories

-cd -> create destination directory

-cc -> number of concurrent connections

GridFTP Client Examples

Source and Destination URLs

file:///path/to/my/file if you are accessing a file on a file system accessible by the host on which you are running your client.

gsiftp://hostname/path/to/remote/file if you are accessing a file from a GridFTP server.

GridFTP Client Examples

Uploading a File

globus-url-copy -vb -p 4 file:///tmp/foo gsiftp://remote.machine.my.edu/tmp/bar

Downloading a File

globus-url-copy -vb -p 4 gsiftp://remote.machine.my.edu/tmp/bar file:///tmp/foo

Third-party Transfers

globus-url-copy -vb -p 4 gsiftp://other.machine.my.edu/tmp/foo gsiftp://remote.machine.my.edu/tmp/bar
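
When transfers are scripted, it can help to assemble the globus-url-copy command line in one place. The Python sketch below only builds and prints the argument list and does not run anything; build_url_copy_command is a hypothetical helper, and it uses only the options shown on these slides (-vb, -p, -r, -cd, -cc).

```python
import shlex


def build_url_copy_command(source_url, destination_url, parallel_streams=4,
                           recursive=False, concurrency=None, verbose=True):
    """Return the argv list for a globus-url-copy invocation."""
    argv = ["globus-url-copy"]
    if verbose:
        argv.append("-vb")                      # report transfer performance
    argv += ["-p", str(parallel_streams)]       # parallel TCP streams
    if recursive:
        argv += ["-r", "-cd"]                   # recurse, create destination dirs
    if concurrency is not None:
        argv += ["-cc", str(concurrency)]       # concurrent connections
    argv += [source_url, destination_url]
    return argv


# Upload and third-party transfer, using the URLs from the slides.
upload = build_url_copy_command(
    "file:///tmp/foo", "gsiftp://remote.machine.my.edu/tmp/bar")
third_party = build_url_copy_command(
    "gsiftp://other.machine.my.edu/tmp/foo", "gsiftp://remote.machine.my.edu/tmp/bar")
print(shlex.join(upload))
print(shlex.join(third_party))
```

Returning a list rather than a single string keeps the URLs intact if the command is later passed to subprocess.run without a shell.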

Summary

Job Submission -> GRAM

Data Transfer -> GridFTP

Security -> GSI (coming soon)