Greg Thain
Computer Sciences Department
University of Wisconsin-Madison
gthain@cs.wisc.edu
http://www.cs.wisc.edu/condor/mw
Master-Worker Tutorial
Condor Week 2006
www.cs.wisc.edu/condor/mw
Agenda
› What is M-W
› When to use M-W
› How to build a simple M-W application
› Q & A
Why M-W?
› M-W addresses a weakness in Condor:
Short jobs
› Also, for dynamic, parallel workflows
A Condor Job…
An easy solution:
› Why not just wrap up smaller jobs into a bigger Condor job?
› Partial failures?
› Load balancing?
› Dynamic creation of work?
Solution: Lightweight Tasks
Multiplexed on top of Jobs
› Process : Thread :: Condor Job : MW Task
› An MWTask dispatches in milliseconds; a Condor job can take minutes to start
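To make the milliseconds-versus-minutes point concrete, here is a small arithmetic sketch. The numbers in the comments (1 second of useful work, 60 seconds of job startup, 5 milliseconds of task dispatch) are hypothetical values chosen for illustration, not measurements from Condor or MW, and `overhead_fraction` is an illustrative helper name.

```cpp
#include <cassert>

// Fraction of wall-clock time lost to dispatch overhead for one unit of work.
// work_seconds:     useful compute time per unit (hypothetical value)
// overhead_seconds: time spent starting/dispatching that unit (hypothetical value)
double overhead_fraction(double work_seconds, double overhead_seconds) {
    assert(work_seconds > 0.0 && overhead_seconds >= 0.0);
    return overhead_seconds / (work_seconds + overhead_seconds);
}

// Example inputs, all assumed for illustration:
//   overhead_fraction(1.0, 60.0)   one short unit per Condor job:
//                                  almost all of the time is startup
//   overhead_fraction(1.0, 0.005)  the same unit as a task on an
//                                  already-running worker: under 1% overhead
```

With short units of work, the per-unit startup cost dominates unless it is paid once per worker rather than once per unit, which is the motivation for multiplexing tasks on top of jobs.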
MW is…
› C++ Framework
› Re-uses Condor worker jobs
› Each worker job runs many tasks
› Results in a highly parallel application
MW is not
› MPI
› General parallel programming scheme
MW in action
[Figure: condor_submit, the submit machine, the master exe with a queue of tasks (T), and three workers each holding one task]
You Must Write 3 Classes
Subclasses of …
› MWDriver
› MWTask
› MWWorker
[Figure labels: Master exe, Worker exe]
Your_MWTask
› Subclass MWTask
› Data members for inputs
› Data member for results
› Serialization of inputs and results
› Distinct instances on each side
The Four Task Methods
› void MyTask::pack_work(void);
› void MyTask::unpack_work(void);
› void MyTask::pack_results(void);
› void MyTask::unpack_results(void);
› Also ctor/dtor!
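The way the four methods pair up can be sketched as follows. This is a minimal illustration, not MW code: `Buffer` is a hypothetical in-memory stand-in for the real communication layer, and `SumTask`, its fields, and `round_trip_sum` are invented names. The sketch does not derive from the real MWTask base class.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for the communication layer: a FIFO of ints.
struct Buffer {
    std::deque<int> data;
    void pack(const int *a, int n) {
        for (int i = 0; i < n; ++i) data.push_back(a[i]);
    }
    void unpack(int *a, int n) {
        assert(static_cast<int>(data.size()) >= n);
        for (int i = 0; i < n; ++i) { a[i] = data.front(); data.pop_front(); }
    }
};

// Illustrative task: the inputs are a list of ints, the result is their sum.
class SumTask {
public:
    std::vector<int> inputs;  // data members for inputs
    int result = 0;           // data member for the result

    explicit SumTask(Buffer *b) : buf(b) {}

    // Master side: serialize the inputs (length first, then the values).
    void pack_work() {
        int n = static_cast<int>(inputs.size());
        buf->pack(&n, 1);
        if (n > 0) buf->pack(inputs.data(), n);
    }
    // Worker side: rebuild the inputs in the same order they were packed.
    void unpack_work() {
        int n = 0;
        buf->unpack(&n, 1);
        inputs.resize(n);
        if (n > 0) buf->unpack(inputs.data(), n);
    }
    // Worker side: serialize the result.
    void pack_results() { buf->pack(&result, 1); }
    // Master side: read the result back into the master's copy of the task.
    void unpack_results() { buf->unpack(&result, 1); }

private:
    Buffer *buf;
};

// Round trip: master copy -> worker copy -> master copy.
int round_trip_sum(const std::vector<int> &values) {
    Buffer wire;
    SumTask master_copy(&wire);
    master_copy.inputs = values;
    master_copy.pack_work();

    SumTask worker_copy(&wire);  // a distinct instance on the worker side
    worker_copy.unpack_work();
    for (int v : worker_copy.inputs) worker_copy.result += v;
    worker_copy.pack_results();

    master_copy.unpack_results();
    return master_copy.result;
}
```

The main thing to notice is the symmetry: each unpack method reads fields in exactly the order its matching pack method wrote them, because the two task instances share nothing except the serialized bytes.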
RMComms
› Abstraction for communication (and some other stuff…)
› RMC->pack(int *array, int length);
› RMC->unpack(int *array, int length);
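One way to picture the abstraction is an interface with interchangeable backends. The sketch below is an assumption-laden illustration: `CommSketch` and `InMemoryComm` are hypothetical classes that only mirror the two calls shown above, and they are not the real RMComm class hierarchy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interface mirroring the two calls on the slide.
class CommSketch {
public:
    virtual ~CommSketch() = default;
    virtual void pack(int *array, int length) = 0;
    virtual void unpack(int *array, int length) = 0;
};

// One possible backend: everything stays in this process's memory.
class InMemoryComm : public CommSketch {
public:
    void pack(int *array, int length) override {
        buffer.insert(buffer.end(), array, array + length);
    }
    void unpack(int *array, int length) override {
        // Values come back in the order they were packed.
        assert(read_pos + static_cast<std::size_t>(length) <= buffer.size());
        for (int i = 0; i < length; ++i) array[i] = buffer[read_pos++];
    }

private:
    std::vector<int> buffer;
    std::size_t read_pos = 0;
};

// Code written against the interface does not care which backend it gets.
bool echo_through(CommSketch *RMC) {
    int sent[3] = {7, 8, 9};
    int received[3] = {0, 0, 0};
    RMC->pack(sent, 3);
    RMC->unpack(received, 3);
    return received[0] == 7 && received[1] == 8 && received[2] == 9;
}
```

Because application code only sees `pack` and `unpack`, a networked backend could replace the in-memory one without changing the task classes.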
MWWorker
› Just one method:
› executeTask(MWTask *t)
› Also ctor/dtor!
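A worker's single method might look like the sketch below. `TaskBase` and `WorkerBase` are hypothetical stand-ins for the framework base classes (the real MW headers are not used here), and `SquareTask`, `SquareWorker`, and `run_one` are invented for illustration.

```cpp
#include <cassert>

// Stand-ins for the framework base classes (hypothetical, not MW's headers).
struct TaskBase {
    virtual ~TaskBase() = default;
};
struct WorkerBase {
    virtual ~WorkerBase() = default;
    virtual void executeTask(TaskBase *t) = 0;
};

// Illustrative task: one input, one result.
struct SquareTask : TaskBase {
    int input = 0;
    int result = 0;
};

struct SquareWorker : WorkerBase {
    void executeTask(TaskBase *t) override {
        // The caller hands over a base pointer; recover the concrete type.
        SquareTask *task = dynamic_cast<SquareTask *>(t);
        assert(task != nullptr);
        // Do the actual work and store the result in the task object.
        task->result = task->input * task->input;
    }
};

// Drive one task through the worker via the base-class interface.
int run_one(int input) {
    SquareTask task;
    task.input = input;
    SquareWorker worker;
    WorkerBase *w = &worker;
    w->executeTask(&task);
    return task.result;
}
```

The worker only computes; serialization stays in the task class, so the same task type works with any worker that understands it.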
MWDriver
› get_userinfo(int argc, char **argv)
  RMC->add_executable(char *exe, char *requirements);
› setup_initial_tasks(int num_tasks, MWTask ***init_tasks)
› act_on_completed_task(MWTask *t)
  RMC->add_task(MWTask *t)
› Also ctor/dtor
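The interplay between the initial tasks and the completion callback can be sketched with a toy driver. Everything here is a simplified assumption: `TreeDriverSketch`, `NodeTask`, and `count_nodes` are invented names, the signatures are deliberately simpler than the ones listed above, and a local loop stands in for the framework dispatching tasks to remote workers.

```cpp
#include <cassert>
#include <queue>

// Illustrative task: one node of a binary search tree at a given depth.
struct NodeTask {
    int depth;
    bool done;
};

class TreeDriverSketch {
public:
    explicit TreeDriverSketch(int max_depth) : max_depth_(max_depth) {}

    // Seed the to-do list (plays the role of setup_initial_tasks).
    void setup_initial_tasks() { add_task(NodeTask{0, false}); }

    // Called once per finished task; may create follow-up work dynamically.
    void act_on_completed_task(const NodeTask &t) {
        assert(t.done);
        ++completed_;
        if (t.depth < max_depth_) {
            add_task(NodeTask{t.depth + 1, false});
            add_task(NodeTask{t.depth + 1, false});
        }
    }

    // Stand-in for the framework's main loop, with a single local "worker".
    int run() {
        setup_initial_tasks();
        while (!todo_.empty()) {
            NodeTask t = todo_.front();
            todo_.pop();
            t.done = true;  // a worker's executeTask would run here
            act_on_completed_task(t);
        }
        return completed_;
    }

private:
    void add_task(const NodeTask &t) { todo_.push(t); }

    std::queue<NodeTask> todo_;
    int max_depth_;
    int completed_ = 0;
};

// Number of tasks completed for a full binary tree of the given depth.
int count_nodes(int max_depth) {
    TreeDriverSketch driver(max_depth);
    return driver.run();
}
```

The total amount of work is not known when the run starts; it grows as completed tasks report back, which is the "dynamic creation of work" case that a single pre-packaged job handles poorly.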
Putting it all together: new_skel
› ./new_skel MY_PROJECT
› Use ./configure --help for options
› make
Debugging with Independent Mode
› Special RMComm for debugging
› Single process, can run under gdb
Running on the Grid…
› Just launch the appropriate master
› condor_q to see it in action
Advice for Large Runs
› Use a personal Condor
  Flock, glide-in, schedd-on-side, hobblein
› Use checkpointing!
› Set_worker_increment high
User-level Checkpointing
› MWTask::write_chkpt_info(FILE *)
› MWTask::read_chkpt_info(FILE *)
› MWDriver::read_master_state(FILE *)
› MWDriver::write_master_state(FILE *)
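A task-level checkpoint pair could be as simple as the sketch below. `CounterTask`, its two fields, and `checkpoint_round_trip` are hypothetical; only the `FILE *` style of the two method names follows the list above, and the text format written to the file is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdio>

// Illustrative task state: an input and a partially computed result.
struct CounterTask {
    int input;
    int partial_result;

    // Write enough state to rebuild this task after a restart.
    void write_chkpt_info(std::FILE *f) const {
        assert(f != nullptr);
        std::fprintf(f, "%d %d\n", input, partial_result);
    }

    // Read the fields back in exactly the order they were written.
    // Returns false if the checkpoint data is incomplete or malformed.
    bool read_chkpt_info(std::FILE *f) {
        assert(f != nullptr);
        return std::fscanf(f, "%d %d", &input, &partial_result) == 2;
    }
};

// Save a task to a temporary file, then restore it into a fresh object.
bool checkpoint_round_trip() {
    std::FILE *f = std::tmpfile();
    if (f == nullptr) return false;  // no temporary file available

    CounterTask saved{42, 7};
    saved.write_chkpt_info(f);
    std::rewind(f);

    CounterTask restored{0, 0};
    bool ok = restored.read_chkpt_info(f);
    std::fclose(f);
    return ok && restored.input == 42 && restored.partial_result == 7;
}
```

As with pack and unpack, the read method must mirror the write method field for field; the master-state pair would follow the same pattern for the driver's own data.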
Example codes with MW
› Matmul
› Blackbox
› knapsack
MW Philosophy
› Reuse either code or concept
› Key idea: Late binding
Other resources
› http://www.cs.wisc.edu/condor/mw
› Online manual
› MW-users mailing list
Thank You!
Questions?
MW Home page: http://www.cs.wisc.edu/condor/mw