Use of Condor on the Open Science Grid Chris Green, OSG User Group / FNAL Condor Week, April 30 2008


Use of Condor on the Open Science Grid

Chris Green, OSG User Group / FNAL

Condor Week, April 30 2008

April 30, 2008, Condor Week

Chris Green, OSG User Group / FNAL

What is OSG?

• Collection of mostly US-based scientific / academic sites sharing computing and storage resources via a common software stack.

• "Virtual Organizations" (VOs): trust point for authorization; role-based personalities.

• Works with multiple underlying batch systems (Condor, PBS family, LSF, SGE).

• Job submission and management based around Globus / CondorG.

Links

• OSG home page.

• VORS resource map and information.

• VDT (Virtual Data Toolkit) home page.

• Current use of OSG.


OSG facts and figures

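• 83 registered computing resources.

• 30 registered VOs.

• Usage breakdown for 2008/04/19 – 2008/04/25:

[Chart: Wall Time (d) by batch system; categories: Condor, PBS/LSF, SGE; extracted values: 65014, 30582, 165]

[Chart: Computing Resource Batch Managers; categories: Condor, SGE, PBS, LSF; extracted values: 534, 22, 2]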


Survey of Condor use on OSG

• Out of the box:
  • CondorG for inter-site job transfer via Globus/GRAM: GT2 submissions via CondorG still (by far) the most common method of grid job submission on OSG.
  • Task scheduling for site health monitoring.
  • One of several batch systems supported on OSG.
  • "ManagedFork" job management.
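To make the CondorG-over-GT2 submission path concrete, here is a minimal sketch that generates a submit description from Python. The gatekeeper host `gatekeeper.example.edu`, the `jobmanager-condor` suffix, and the file names are hypothetical placeholders; `universe = grid` and `grid_resource = gt2 ...` are standard Condor submit commands, but a real site may need further settings.

```python
def gt2_submit_description(gatekeeper, executable, jobmanager="jobmanager-condor"):
    """Return the text of a submit description for a GT2 (pre-WS GRAM) site.

    The values passed in below are hypothetical placeholders.
    """
    lines = [
        "universe = grid",
        # GT2 resources are addressed as "gt2 <host>/<jobmanager>".
        f"grid_resource = gt2 {gatekeeper}/{jobmanager}",
        f"executable = {executable}",
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The printed text could be saved to a file and passed to condor_submit.
    print(gt2_submit_description("gatekeeper.example.edu", "analysis.sh"))
```

The jobmanager suffix selects the batch system behind the gatekeeper, which is how one submission method can serve sites running different batch managers.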


Survey of Condor use on OSG

• External projects:
  • Glidein / WMS: "pilot" job submission and management.
  • FermiGrid: job forwarding, "campus grid" management.
  • OSGMM / ReSS: job forwarding and attribute-based matchmaking across multiple OSG sites.
  • "condorview": enhanced job monitoring and control (not the web-based statistics client of the same name).
  • Complex workflows (e.g. LIGO: Pegasus/DAGMAN).
  • Gratia: accounting system leverages features of Condor where available: condor_history, PER_JOB_HISTORY_DIR, DN.
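As an illustration of the kind of job record an accounting system can draw on, here is a rough sketch that parses one job ad in the "Attribute = value" text form and extracts owner, DN, and wall time. This is not Gratia's code. The sample record is invented, and the attribute names (`Owner`, `x509userproxysubject`, `RemoteWallClockTime`) are given as they commonly appear in Condor job ads; treat them as assumptions to check against a real site.

```python
def parse_classad_text(text):
    """Parse 'Attribute = value' lines into a dict of strings."""
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip blank or malformed lines
        ad[name.strip()] = value.strip().strip('"')
    return ad


def usage_record(ad):
    """Pick out the fields an accounting system would need from a job ad."""
    return {
        "owner": ad.get("Owner"),
        "dn": ad.get("x509userproxysubject"),
        "wall_seconds": float(ad.get("RemoteWallClockTime", 0)),
    }


# Hypothetical sample record, not taken from a real site.
SAMPLE = '''Owner = "uscms01"
x509userproxysubject = "/DC=org/DC=example/OU=People/CN=Jane Doe"
RemoteWallClockTime = 3600.000000
'''

if __name__ == "__main__":
    print(usage_record(parse_classad_text(SAMPLE)))
```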


More detail: Glidein/WMS

• Workload Management System (Igor Sfiligoi, FNAL) uses Condor glideins: a startd submitted as a grid job ("pilot") makes remote batch nodes look like local ones.

• Two main components:
  • One or more glidein factories: manage available grid sites and submit pilot jobs.
  • One or more VO frontends: receive payload submissions from users for distribution to sites.

• Pilots receive user payloads as distributed by VO frontends.
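The division of labour between factory, frontend, and pilot can be sketched as a toy model. This is not glideinWMS code: the class and method names (`Frontend`, `Factory`, `submit_pilots`) and the site and job names are hypothetical, and real pilots are Condor startds rather than Python objects. The sketch only shows that a payload is bound to a site when a pilot asks for work, not when the user submits it.

```python
from collections import deque


class Frontend:
    """Holds user payloads until a running pilot asks for work."""

    def __init__(self):
        self.payloads = deque()

    def submit(self, payload):
        self.payloads.append(payload)

    def next_payload(self):
        return self.payloads.popleft() if self.payloads else None


class Factory:
    """Knows the available sites and sends one pilot to each of them."""

    def __init__(self, sites):
        self.sites = list(sites)

    def submit_pilots(self, frontend):
        assignments = []
        for site in self.sites:
            # Each pilot pulls a payload from the frontend once it is running.
            payload = frontend.next_payload()
            if payload is None:
                break  # no work left, so no further pilots are needed
            assignments.append((site, payload))
        return assignments


if __name__ == "__main__":
    frontend = Frontend()
    for job in ["reco_001", "reco_002", "reco_003"]:
        frontend.submit(job)
    factory = Factory(["site_a", "site_b"])
    print(factory.submit_pilots(frontend))
```

With two sites and three payloads, two payloads are handed out and the third stays queued at the frontend until another pilot starts.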


More detail: Glidein/WMS

• Uses GCB for firewall / NAT management.

• Intra-VO priority management.

• Works with glExec: application running on worker nodes which handles authorization and UID mapping for payloads – per user accountability to the site.

• Unaffected by grid site batch manager choice.

• v1.0 released Dec. '07; v1.1 Jan. '08.

• In use by: CDF; Minos (FNAL); being commissioned for CMS.


More detail: "condorview"

• Michael Thomas, Caltech.

• Graphical tool for browsing and managing a condor queue.

• Hooks to vacate and kill jobs.

• Hooks to ssh into job directory on worker node and print out process tree.

• Uses condor_q, condor_config_val, and condor_fetchlog.
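A tool of this kind can be built by wrapping the three command-line programs named above. The following is a minimal sketch, not condorview's actual code: the helper names and the host `wn001.example.edu` are hypothetical, and the example only prints the command lines so that it runs without a Condor installation.

```python
import subprocess


def queue_command():
    """List the full ClassAd of every job in the local queue."""
    return ["condor_q", "-long"]


def config_command(variable):
    """Look up one configuration variable, for example SPOOL."""
    return ["condor_config_val", variable]


def fetchlog_command(hostname, subsystem):
    """Retrieve a daemon log (for example STARTD) from a remote machine."""
    return ["condor_fetchlog", hostname, subsystem]


def run(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(
        argv, stdout=subprocess.PIPE, check=True, universal_newlines=True
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command lines; call run(...) on them where Condor is installed.
    for argv in (
        queue_command(),
        config_command("SPOOL"),
        fetchlog_command("wn001.example.edu", "STARTD"),
    ):
        print(" ".join(argv))
```

Keeping command construction separate from execution makes the wrappers easy to check without a live pool.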


Concluding statements

• Condor essential to the OSG.

• Condor use underpins connectivity of sites within the OSG.

• Close ties: Miron is OSG PI; VDT team at Wisconsin; new Condor features often a result of OSG needs.

• Widely used on OSG; many novel uses of and applications building on Condor features.

• More details in later talks!