Derek Wright
Computer Sciences Department
University of Wisconsin-Madison
wright@cs.wisc.edu
http://www.cs.wisc.edu/condor

Condor and MPI
Paradyn/Condor Week
Madison, WI 2001

Overview

› MPI and Condor: Why Now?

› Dedicated and Opportunistic Scheduling

› How Does it All Work?

› Specific MPI Implementations

› Future Work

What is MPI?

› MPI is the “Message Passing Interface”

› Basically, a library for writing parallel applications that use message passing for inter-process communication

› MPI is a standard with many different implementations

MPI and Condor: Why Haven’t We Supported It Until Now?

› MPI's model is a static world

› We always saw the world as dynamic, opportunistic, ever-changing

› We focused our parallel support on PVM, which supported a dynamic environment

MPI With Condor: Why Now?

› More and more Condor pools are being formed from dedicated resources

› MPI's API is also starting to move towards supporting a dynamic world (e.g. LAM, MPI-2)

› Few schedulers (if any) handle both opportunistic and dedicated resources at the same time

Dedicated and Opportunistic Scheduling

› Resources can move between 'dedicated' and 'opportunistic' status

› Users submit jobs that are either dedicated (e.g. Universe = MPI) or opportunistic (e.g. Universe = standard)

Dedicated and Opportunistic (Cont'd)

› Condor leaves all resources as opportunistic unless it sees dedicated jobs to service

› The Dedicated Scheduler ('DS') claims opportunistic resources and turns them into dedicated ones, which it can then schedule into the future

Dedicated and Opportunistic (Cont'd)

› When the DS has no more jobs, it releases the resources, which go back to serving opportunistic jobs
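The life cycle described on the last three slides can be sketched as a toy model. This is only an illustration: `Resource`, `ToyDedicatedScheduler` and their methods are hypothetical names invented here, not part of Condor, and the real DS also has to negotiate for resources before claiming them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    name: str
    status: str = "opportunistic"  # every resource starts out opportunistic


class ToyDedicatedScheduler:
    """Hypothetical stand-in for the DS; not Condor's actual code."""

    def __init__(self) -> None:
        self.pending: List[int] = []       # machine counts of queued dedicated jobs
        self.claimed: List[Resource] = []  # resources currently held as dedicated

    def submit(self, machines_needed: int) -> None:
        self.pending.append(machines_needed)

    def claim_for_pending(self, pool: List[Resource]) -> None:
        """Claim opportunistic resources only if dedicated jobs are waiting."""
        needed = max(self.pending, default=0) - len(self.claimed)
        for res in pool:
            if needed <= 0:
                break
            if res.status == "opportunistic":
                res.status = "dedicated"
                self.claimed.append(res)
                needed -= 1

    def finish_job(self) -> None:
        """Complete the oldest job; release everything once the queue is empty."""
        if self.pending:
            self.pending.pop(0)
        if not self.pending:
            for res in self.claimed:
                res.status = "opportunistic"
            self.claimed.clear()
```

With a four-machine pool, calling `claim_for_pending` before any job is submitted claims nothing. After `submit(3)` it turns three resources dedicated, and `finish_job()` on the last queued job returns all of them to opportunistic status.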

Dedicated Scheduling and "Back-Filling"

› There will always be "holes" in the dedicated schedule: sets of resources that can't be filled with dedicated jobs for certain periods of time

› Traditional solution is “back-filling” the holes with smaller dedicated jobs

› However, these might not be preemptable

Back-Filling (Cont’d)

› Instead of back-filling with dedicated jobs, we give the resources to Condor’s opportunistic scheduler

› Condor runs preemptable opportunistic jobs until the DS decides it needs the resources again and reclaims them
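A minimal sketch of the idea, assuming a machine's dedicated schedule is represented as a list of half-open `(start, end)` busy intervals in arbitrary time units. The function names and the representation are assumptions made for illustration, not Condor's data structures.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # [start, end) in arbitrary time units


def find_holes(busy: List[Interval], horizon: int) -> List[Interval]:
    """Return the gaps in one machine's dedicated schedule up to `horizon`."""
    holes: List[Interval] = []
    cursor = 0
    for start, end in sorted(busy):
        if start >= horizon:
            break
        if start > cursor:
            holes.append((cursor, start))  # nothing dedicated runs here
        cursor = max(cursor, end)
    if cursor < horizon:
        holes.append((cursor, horizon))
    return holes


def owner_at(busy: List[Interval], t: int) -> str:
    """Which scheduler gets the machine at time t in this toy model."""
    in_dedicated_slot = any(start <= t < end for start, end in busy)
    return "dedicated" if in_dedicated_slot else "opportunistic"
```

For a machine with dedicated work in `[(0, 4), (6, 10)]` and a horizon of 10, the only hole is `(4, 6)`. In this model the opportunistic scheduler gets the machine during the hole and the DS reclaims it at time 6, which is why the jobs that fill the hole must be preemptable.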

Dedicated Resources are Opportunistic Resources

› Even “dedicated” resources are really opportunistic

  › Hardware failure, software failure, etc.

  › Condor handles these failures better than traditional dedicated schedulers, since our system already deals with them after years of opportunistic scheduling experience

How Does MPI Support in Condor Really Work?

› Changes to the resource agent (condor_startd)

› Changes to the job scheduling agent (condor_schedd)

› Changes to the rest of the Condor system

How Do You Make a Resource Dedicated in Condor?

› Just have to change a few config file settings; no new startd binary is required

› Add an attribute to the resource's ClassAd saying which scheduler, if any, this resource is willing to become dedicated to

Other Configuration Changes for the startd

› In addition, you must change the policy expressions:

  › Must always be willing to run jobs from the DS

  › While the resource is claimed by the DS, the startd should never suspend or preempt jobs
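The two policy rules above, together with the ClassAd attribute from the previous slide, can be modelled as plain Python predicates. This is a sketch of the logic only, not Condor configuration syntax: the attribute keys, the scheduler string and the `owner_idle` flag (standing in for the usual opportunistic policy) are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical names throughout: the attribute keys, the scheduler string and
# the `owner_idle` flag are illustrative assumptions, not Condor syntax.
DS_NAME = "DedicatedScheduler@submit.example.org"

machine_ad: Dict[str, str] = {"Name": "node0", "DedicatedScheduler": DS_NAME}


def from_ds(machine: Dict[str, str], job: Dict[str, str]) -> bool:
    """True when the job comes from the scheduler this resource is dedicated to."""
    ds = machine.get("DedicatedScheduler")
    return ds is not None and job.get("Scheduler") == ds


def start(machine: Dict[str, str], job: Dict[str, str], owner_idle: bool) -> bool:
    """Always willing to run DS jobs; other jobs wait for an idle owner."""
    return from_ds(machine, job) or owner_idle


def suspend(machine: Dict[str, str], job: Dict[str, str], owner_idle: bool) -> bool:
    """Never suspend DS jobs; opportunistic jobs pause when the owner returns."""
    return not from_ds(machine, job) and not owner_idle


def preempt(machine: Dict[str, str], job: Dict[str, str], owner_idle: bool) -> bool:
    """Never preempt DS jobs; in this toy, preemption mirrors suspension."""
    return suspend(machine, job, owner_idle)
```

A job whose ad names the same scheduler as the machine's attribute is always started and never suspended or preempted, even when `owner_idle` is false. Any other job falls back to the opportunistic rules.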

Submitting Dedicated Jobs

› Requires a new "contrib" version of the condor_schedd

› Condor "wakes up" the dedicated scheduler logic inside the condor_schedd when MPI jobs are submitted

How Does Your Job Get Resources?

› The DS queries for all resources that are willing to become dedicated to it

› The DS sends out "resource request" ClassAds and negotiates for resources with the negotiator (the opportunistic scheduler)

How Does Your Job Get Resources? (Cont’d)

› DS then claims resources directly

› Once resources are available, the DS schedules and spawns jobs

› When jobs complete, if more MPI jobs can be serviced with the same resources, the DS holds onto them and uses them immediately
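The sequence on these two slides (query, negotiate, claim, spawn, reuse) can be traced with a short sketch. It assumes a FIFO queue of machine counts and reduces negotiation to picking the next willing resource. The function names, the dictionary keys and the log format are hypothetical.

```python
from typing import Dict, List


def willing_resources(ads: List[Dict[str, str]], ds_name: str) -> List[Dict[str, str]]:
    """Query step: find resources willing to become dedicated to this DS."""
    return [ad for ad in ads if ad.get("DedicatedScheduler") == ds_name]


def run_queue(ads: List[Dict[str, str]], ds_name: str, jobs: List[int]) -> List[str]:
    """Trace claiming, spawning and reuse for a FIFO queue of machine counts."""
    log: List[str] = []
    candidates = willing_resources(ads, ds_name)
    claimed: List[Dict[str, str]] = []
    for needed in jobs:
        if needed > len(candidates):
            log.append(f"skip job needing {needed}: only {len(candidates)} willing")
            continue
        # Claim only what is missing; resources held from earlier jobs are reused.
        while len(claimed) < needed:
            ad = candidates[len(claimed)]
            claimed.append(ad)
            log.append(f"claim {ad['Name']}")
        log.append(f"run job on {needed} machines")
    # Queue drained: hand everything back to the opportunistic scheduler.
    log.append(f"release {len(claimed)} resources")
    return log
```

With two willing resources and jobs needing 2, 1 and 3 machines, the trace shows two claims, two runs that reuse the same claims, one skipped job, and a single release at the end.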

Changes to the rest of Condor?

› Very few other changes required

› Users can use all the same tools, interfaces, etc.

› Just need a new condor_starter to actually spawn MPI jobs (will also be offered as a contrib module)

Specific MPI Implementations

› MPICH

› LAM

› Others?

Condor and MPICH

› Currently we support MPICH on Unix

› Working on adding MPICH-NT support

  › NT’s MPICH has a different mechanism to spawn jobs than the Unix MPICH...

Condor + LAM = "LAMdor"

› LAM's API is better suited for a dynamic environment, where hosts can come and go from your MPI universe

› Has a different mechanism for spawning jobs than MPICH

› Condor is working to support LAM's methods for spawning

LAMdor (Cont’d)

› LAM working to understand, expand, and fully implement the dynamic scheduling calls in their API

› LAM also considering using Condor’s libraries to support checkpointing of MPI computations

MPI-2 Standard

› The MPI-2 standard contains calls to handle dynamic resources

› Not yet fully implemented by anyone

› When it is, we'll support it

Other MPI implementations

› What are people using?

› Do you want to see Condor support any other MPI implementations?

› If so, send email to [email protected] and let us know

Future work

› Implementing more advanced dedicated scheduling algorithms

› Support for all sorts of MPI implementations (LAM, MPICH-NT, MPI-2, others)

More Future work

› Solving problems with MPI on the Grid

  › "Flocking" MPI jobs to remote pools, or even spanning pools with a single computation

  › Solving issues of resource ownership on the Grid (i.e. how do you handle multiple dedicated schedulers on the Grid wanting to control a given resource?)

More Future work

› Checkpointing entire MPI computations

› "MW" implementation on top of Condor-MPI

More Future work

› Support for other kinds of dedicated jobs

  › Generic dedicated jobs (we just gather and schedule the resources, then call your program, give it the list of machines, and let the program spawn itself)

  › LINDA

How do I start using MPI with Condor?

› MPI support is still alpha, not quite ready for production use

› A beta release should be out soon as a contrib module

› Check the web site www.cs.wisc.edu/condor

Thanks for Listening!

› Questions?

› For more information: http://www.cs.wisc.edu/condor mailto:[email protected]