Scheduling Generic Parallel Applications – Classification, Meta-scheduling
Sathish Vadhiyar
Sources/Credits: papers listed in the "References" slide
Scheduling Architectures
- Centralized schedulers
  - Single-site scheduling – a job does not span across sites
  - Multi-site scheduling – the opposite: a job may span sites
- Hierarchical structures – a central scheduler (metascheduler) for global scheduling and local scheduling on individual sites
- Decentralized scheduling – distributed schedulers interact, exchange information and submit jobs to remote systems
  - Direct communication – a local scheduler directly contacts remote schedulers and transfers some of its jobs
  - Communication via a central job pool – jobs that cannot be executed immediately are pushed to a central pool; other local schedulers pull jobs out of the pool
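The central-job-pool variant can be sketched in a few lines; the class and method names here are illustrative, not taken from any particular system:

```python
from collections import deque

class JobPool:
    """Sketch of decentralized scheduling via a central job pool: an
    overloaded local scheduler pushes jobs it cannot run immediately;
    schedulers with free capacity pull jobs out."""
    def __init__(self):
        self.pool = deque()

    def push(self, job):
        # called by a local scheduler that cannot start the job now
        self.pool.append(job)

    def pull(self):
        # called by a local scheduler with free capacity; FIFO order
        return self.pool.popleft() if self.pool else None
```

The pool decouples the schedulers: no site needs to know which other site will eventually run the job.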
Various Scheduling Architectures
[Two slides of architecture diagrams]
Metascheduler across MPPs
Types:
- Centralized
  - A metascheduler and local dispatchers
  - Jobs are submitted to the metascheduler
- Hierarchical
  - Combination of central and local schedulers
  - Jobs are submitted to the metascheduler
  - The metascheduler sends each job to the site with the earliest expected start time
  - Local schedulers can follow their own policies
- Distributed
  - Each site has a metascheduler and a local scheduler
  - Jobs are submitted to the local metascheduler
  - Jobs can be transferred to the sites with the lowest load
Evaluation of schemes

Centralized:
1. Global knowledge of all resources – hence optimized schedules
2. Can become a bottleneck for large numbers of resources and jobs
3. May take time to transfer jobs from the metascheduler to local schedulers – needs strategic positioning of the metascheduler

Hierarchical:
1. Medium-level overhead
2. Sub-optimal schedules
3. Still needs strategic positioning of the central scheduler

Distributed:
1. No bottleneck – workload evenly distributed
2. Needs all-to-all connections between MPPs
Evaluation of Various Scheduling Architectures
Experiments to evaluate slowdowns in the 3 schemes:
- Based on an actual trace from a supercomputer centre – a 5000-job set
- 4 sites were simulated – 2 with the same load as the trace, the other 2 with run times multiplied by 1.7
- FCFS with EASY backfilling was used
- slowdown = (wait_time + run_time) / run_time
- 2 more schemes:
  - Independent – local schedulers act independently, i.e. the sites are not connected
  - United – the processors of all sites are combined to form a single site
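The slowdown metric above is straightforward to compute; a minimal sketch:

```python
def slowdown(wait_time, run_time):
    """The slide's metric: slowdown = (wait_time + run_time) / run_time.
    A job that never waits has slowdown 1.0."""
    return (wait_time + run_time) / run_time
```

Averaging this over a job set is what the experiments compare across the schemes.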
Results
Observations
1. Centralized and hierarchical performed slightly better than united
   a. Compared to hierarchical, in united scheduling decisions have to be made for all jobs and all resources – the overhead and hence the wait time are high
   b. Comparing united and centralized:
      i. 4 categories of jobs, corresponding to the 4 combinations of 2 parameters – execution time (short, long) and number of resources requested (narrow, wide)
      ii. There are usually many more long narrow jobs than short wide jobs
      iii. Why are centralized and hierarchical better than united?
2. Distributed performed poorly
   a. Short narrow jobs incurred more slowdown
   b. Short narrow jobs are large in number and the best candidates for backfilling
   c. Backfilling dynamics are complex
   d. A site with a light average load may not always be the best choice; SN jobs may find the earliest holes in a heavily loaded site
Newly Proposed Models

K-distributed model
- A distributed scheme where the local metascheduler distributes each job to the k least loaded sites
- When the job starts on a site, a notification is sent to the local metascheduler, which in turn asks the other k-1 schedulers to dequeue the job
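A sketch of the K-distributed dispatch logic; the data structures are hypothetical (a dict mapping site name to a (load, queue) pair):

```python
import heapq

def k_distribute(job, sites, k):
    """Submit a copy of `job` to the queues of the k least loaded sites.
    `sites` maps site name -> (load, queue_list). Returns the target sites."""
    targets = heapq.nsmallest(k, sites, key=lambda s: sites[s][0])
    for s in targets:
        sites[s][1].append(job)
    return targets

def on_job_start(job, started_at, targets, sites):
    """When one copy starts, dequeue the duplicates at the other k-1 sites."""
    for s in targets:
        if s != started_at and job in sites[s][1]:
            sites[s][1].remove(job)
```

Replicating the job into k queues lets it catch the earliest backfilling hole among those sites.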
K-Dual queue model
- 2 queues are maintained at each site – one for local jobs and the other for remote jobs
- Remote jobs are executed only when they don't affect the start times of the local jobs
- Local jobs are given priority during backfilling
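The "remote jobs must not affect local start times" rule amounts to a conservative backfilling test. A minimal sketch, assuming we know the current scheduling hole (how many processors are free and how long they stay free before the next local reservation); all names are illustrative:

```python
def admit_remote(free_procs, free_until, job_procs, job_runtime, now):
    """Admit a remote job only if it fits inside the current hole, i.e. it
    would finish before the processors are needed by a reserved local job."""
    fits_width = job_procs <= free_procs
    fits_length = now + job_runtime <= free_until
    return fits_width and fits_length
```

This is exactly why local jobs keep priority: a remote job can only occupy capacity that local jobs could not use anyway.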
Results – Benefits of new schemes
[Figure: slowdown comparison – "45% improvement", "15% improvement"]
Results – Usefulness of the K-Dual scheme
Grouping jobs submitted at lightly loaded sites and heavily loaded sites
Metascheduler with AppLeS Local Schedulers
Goals
- To overcome deficiencies of using plain AppLeS agents
- To have global policies:
  - resolving competing claims of applications
  - improving the response times of individual apps.
  - taking care of load dynamics
- Work done as part of the GrADS project
  - Grid Application Development Software
  - Collaboration between different universities
Initial GrADS Architecture
[Figure: the user submits the problem (matrix size, block size) to the grid routine / application manager; the resource selector obtains resource characteristics from MDS and NWS; the performance modeler combines resource characteristics and problem characteristics to produce the final schedule – a subset of the resources]
Performance Modeler
[Figure: the grid routine / application manager passes all resources and the problem parameters to the performance modeler; inside it, a scheduling heuristic proposes candidate resources to a simulation model, which returns their execution costs; the output is the final schedule – a subset of the resources]
The scheduling heuristic passes on only those candidate schedules that have "sufficient" memory; this is determined by calling a function in the simulation model.
Simulation Model
Simulation of the ScaLAPACK right-looking LU factorization
More about the application:
- Iterative – each iteration corresponds to a block
- A parallel application in which columns are block-cyclically distributed
- Right-looking LU – based on Gaussian elimination
Gaussian Elimination – Review

for each column i, zero it out below the diagonal by adding multiples of row i to later rows:

for i = 1 to n-1
  for j = i+1 to n              /* for each row j below row i */
    A(j, i) = A(j, i) / A(i, i)
    for k = i+1 to n
      A(j, k) = A(j, k) - A(j, i) * A(i, k)
[Figures: matrix diagrams of the elimination at step i – the pivot A(i,i), the multiplier column A(i+1:n, i), the pivot row A(i, i+1:n), and the trailing submatrix A(i+1:n, i+1:n) that gets updated; finished multipliers accumulate to the left of column i]
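The pseudocode above transcribes directly to Python. This sketch omits pivoting, which any real implementation needs:

```python
def gaussian_eliminate(A):
    """In-place LU via Gaussian elimination without pivoting, following the
    slide's loops. Multipliers overwrite the strictly lower triangle (L);
    the upper triangle becomes U."""
    n = len(A)
    for i in range(n - 1):
        for j in range(i + 1, n):          # each row below the diagonal
            A[j][i] /= A[i][i]             # multiplier, stored in L's slot
            for k in range(i + 1, n):      # update the trailing row entries
                A[j][k] -= A[j][i] * A[i][k]
    return A
```

For [[2, 1], [4, 4]] the multiplier is 2 and the updated trailing entry is 4 - 2*1 = 2.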
Need for blocking – BLAS
- Basic Linear Algebra Subroutines
- The memory hierarchy is exploited efficiently by the higher-level BLAS
- 3 levels:

BLAS level                Example                              Memory refs.  Flops   Flops / memory refs.
Level-1 (vector)          y = y + ax;  z = y . x               3n            2n      2/3
Level-2 (matrix-vector)   y = y + Ax;  A = A + (alpha) x y^T   n^2           2n^2    2
Level-3 (matrix-matrix)   C = C + AB                           4n^2          2n^3    n/2
Converting BLAS2 to BLAS3
- Use blocking for optimized matrix-matrix multiplies (BLAS3)
- Matrix multiplies via delayed updates:
  - save several updates to the trailing matrix
  - apply the saved updates in the form of a single matrix multiply
Modified GE using BLAS3 (courtesy: Dr. Jack Dongarra)

for ib = 1 to n-1 step b     /* process matrix b columns at a time */
  end = ib + b - 1
  /* Apply the BLAS2 version of GE to get A(ib:n, ib:end) factored.
     Let LL denote the strictly lower triangular portion of A(ib:end, ib:end) */
  A(ib:end, end+1:n) = LL^(-1) * A(ib:end, end+1:n)    /* update next b rows of U */
  A(end+1:n, end+1:n) = A(end+1:n, end+1:n) - A(end+1:n, ib:end) * A(ib:end, end+1:n)
                               /* apply delayed updates with a single matrix multiply */
[Figure: block partitioning at step ib – the completed parts of L and U, the b-column panel A(ib:end, ib:end) and A(end+1:n, ib:end), and the trailing matrix A(end+1:n, end+1:n)]
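A runnable sketch of the blocked algorithm in pure Python (again without pivoting, so the input should be e.g. diagonally dominant; ScaLAPACK additionally pivots and distributes the matrix):

```python
def blocked_ge(A, b):
    """Blocked Gaussian elimination mirroring the slide: factor a b-column
    panel with BLAS2-style GE, triangular-solve the next b rows of U, then
    apply the delayed updates to the trailing matrix in one multiply."""
    n = len(A)
    for ib in range(0, n, b):
        end = min(ib + b, n)
        # BLAS2 GE on the panel A[ib:n, ib:end]
        for i in range(ib, end):
            for j in range(i + 1, n):
                A[j][i] /= A[i][i]
                for k in range(i + 1, end):
                    A[j][k] -= A[j][i] * A[i][k]
        # A[ib:end, end:n] = LL^{-1} A[ib:end, end:n]  (unit lower triangular solve)
        for i in range(ib, end):
            for j in range(i + 1, end):
                for k in range(end, n):
                    A[j][k] -= A[j][i] * A[i][k]
        # delayed update: A[end:n, end:n] -= A[end:n, ib:end] @ A[ib:end, end:n]
        for j in range(end, n):
            for k in range(end, n):
                A[j][k] -= sum(A[j][t] * A[t][k] for t in range(ib, end))
    return A
```

The result is bit-for-bit the same factors as the unblocked loop; the point of blocking is that the last step is a matrix multiply, which a BLAS3 routine executes with far better cache reuse.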
Operations

So the LU application in each iteration involves:
- Block factorization – floating point operations on the (ib:n, ib:end) panel
- Broadcast for the multiply – message size approximately equals n * block_size
- Each process does its own multiply – the remaining columns divided by the number of processors
Back to the simulation model

double getExecTimeCost(int matrix_size, int block_size, candidate_schedule) {
  for (i = 0; i < number_of_blocks; i++) {
    /* Find the proc. owning the column. Note its speed and its connections
       to the other procs. */
    tfact += ...     /* simulate block factorization; depends on {processor speed,
                        machine load, flop count of the factorization} */
    tbcast += max(bcast times for each proc.)
                     /* ScaLAPACK follows a split-ring broadcast; simulate the
                        broadcast algorithm for each proc.; depends on {elements of
                        the matrix to be broadcast, connection bandwidth and latency} */
    tupdate += max(matrix multiplies across all procs.)
                     /* depends on {flop count of the matrix multiply,
                        processor speed, load} */
  }
  return (tfact + tbcast + tupdate);
}
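A toy version of such a simulation model. The cost formulas here are illustrative stand-ins for the ones the slide alludes to, and the proc fields ('flops', 'bw', 'lat') are assumptions, not GrADS API:

```python
def exec_time_cost(n, b, procs):
    """Illustrative stand-in for getExecTimeCost: procs is a list of dicts
    with assumed fields 'flops' (flop rate), 'bw' (words/s) and 'lat'
    (seconds); columns are assigned block-cyclically."""
    tfact = tbcast = tupdate = 0.0
    for step in range(n // b):
        owner = procs[step % len(procs)]        # block-cyclic column owner
        cols_left = n - step * b
        tfact += cols_left * b * b / owner['flops']       # panel factorization
        msg = cols_left * b                               # ~n * block_size words
        tbcast += max(p['lat'] + msg / p['bw'] for p in procs)
        upd_flops = 2.0 * cols_left * cols_left * b / len(procs)
        tupdate += max(upd_flops / p['flops'] for p in procs)
    return tfact + tbcast + tupdate
```

The scheduling heuristic would call this for each candidate schedule and keep the cheapest.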
Initial GrADS Architecture (contd.)
[Figure: the same architecture with the remaining components – the app launcher receives the problem parameters, app. location and final schedule and starts the application, which is observed by the contract monitor]
Contract Monitor Architecture
[Figure: the contract monitor forks an Autopilot manager; the application registers its sensors with the manager; the contract monitor obtains sensor information – e.g. information about a variable x – through the manager]
Performance Model Evaluation
GrADS Limitations
Each application schedules itself with no view of the other applications; hence a metascheduler that has global knowledge of all applications is needed
Metascheduler
- To ensure that applications are scheduled based on correct resource information
- To accommodate as many new applications as possible
- To improve the performance contracts of new applications
- To minimize the impact of new applications on executing applications
- To employ policies to migrate executing applications
Modified GrADS Architecture
[Figure: the initial architecture (user, grid routine / application manager, resource selector, performance modeler, contract developer, app launcher, contract monitor, application, MDS, NWS) extended with the metascheduler components: permission service, RSS, contract negotiator, rescheduler and database manager]
Database Manager
- A persistent service listening for requests from clients
- Maintains a global clock
- Has event-notification capabilities – clients can express interest in various events
- Stores various information:
  - applications' states
  - initial machines
  - resource information
  - final schedules
  - locations of the various daemons
  - average number of contract violations
Database Manager (contd.)

When an application stops or completes, the database manager calculates the percentage completion of the application:
- time_diff: (current_time – time when the application instance started)
- avg_ratio: average of (actual costs / predicted costs)
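The slide names the two inputs but not the exact formula; one plausible combination, shown purely as an illustration (the rescaling by avg_ratio is an assumption):

```python
def percent_completed(time_diff, avg_ratio, predicted_total):
    """Hypothetical sketch: rescale the predicted total time by the observed
    actual/predicted ratio, then take the elapsed fraction, capped at 100%."""
    adjusted_total = avg_ratio * predicted_total
    return min(100.0, 100.0 * time_diff / adjusted_total)
```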
Permission Service
- After collecting resource information from NWS, the GrADS apps. contact the permission service (PS)
- The PS makes decisions based on the problem requirements and the resource characteristics
- If the resources have enough capacity, permission is given
- If not, the permission service:
  - waits for resource-consuming applications that will end soon, or
  - preempts resource-consuming applications to accommodate short applications
Permission Service (pseudo code)
[Pseudo code shown on two slides]
Permission Service – determining resource-consuming applications
- For each currently executing GrADS app., contact the DBM and obtain NWS resource information
- Determine the change in resources caused by app. i
- Add the change to the current resource characteristics to obtain the resource parameters in the absence of app. i
Determining Remaining Execution Time

Whenever a metascheduler component wants to determine the remaining execution time (r.e.t.) of an app., it contacts the app.'s contract monitor:
- Retrieves the average of the ratios between actual times and predicted times
- Uses {average, predicted time, percentage completion} to determine the r.e.t.
Determining r.e.t. (pseudo code)
[Pseudo code shown on two slides]
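Combining the three quantities the slide lists {average ratio, predicted time, percentage completion}, a plausible sketch of the r.e.t. computation (the exact formula is not given on the slide):

```python
def remaining_exec_time(avg_ratio, predicted_total, pct_completed):
    """Hypothetical sketch: scale the predicted total by the observed
    actual/predicted ratio, then subtract the completed fraction."""
    adjusted_total = avg_ratio * predicted_total
    return adjusted_total * (1.0 - pct_completed / 100.0)
```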
Contract Negotiator

Main functionalities:
- Ensure apps. make decisions based on updated resource information
- Improve the performance of current apps., possibly by stopping and later continuing executing big apps.
- Reduce the impact caused by current apps. on executing apps.

When a contract is approved, the application starts using the resources. When a contract is rejected, the application goes back to obtain new resource characteristics and generates a new schedule.

Enforces an ordering of applications whose application-level schedules use the same resources:
- approves the contract of one application,
- waits for that application to start using the resources,
- rejects the contract of the other
Contract Negotiator (pseudo code)
- Ensuring the app. has made its scheduling decision based on correct resource information
- Improving the performance of the current app. by preempting an executing large app.
Contract Negotiator – 3 scenarios
- t1 – average completion time of the current app. and the big app. when the big app. is preempted, the current app. is accommodated, and the big app. is then continued
- t2 – average completion time of the current app. and the big app. when the big app. is allowed to complete and the current app. is then accommodated
- t3 – average completion time of the current app. and the big app. when both applications are executed simultaneously

if (t1 < 25% of min(t2, t3))   case 1
else if (t3 > 1.2 * t2)        case 2
else                           case 3
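The decision rule transcribes directly (the 25% and 1.2 thresholds are as given on the slide):

```python
def choose_scenario(t1, t2, t3):
    """Pick among the three contract-negotiator scenarios by comparing the
    average completion times t1, t2, t3 defined above."""
    if t1 < 0.25 * min(t2, t3):
        return 1   # preempt the big app., run the current app., then resume it
    elif t3 > 1.2 * t2:
        return 2   # let the big app. finish, then run the current app.
    else:
        return 3   # run both applications simultaneously
```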
Contract Negotiator (pseudo code)
- Reducing the impact of the current app. on an executing app. by modifying the schedule
Application and Metascheduler Interactions
[Flowchart: the user supplies the problem parameters and an initial list of machines; the application requests permission from the permission service; if permission is denied the run aborts, otherwise the application gets new resource information, performs application-specific scheduling and contract development, and submits the contract to the contract negotiator; if the contract is rejected, the application gets new resource information and reschedules; if approved, the application is launched; on completion it exits, and if it was stopped it waits for a restart signal and resubmits the problem parameters and final schedule]
Experiments and Results – Demonstration of Permission Service
References
- Casavant, T.L. and Kuhl, J.G. "A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems". IEEE Transactions on Software Engineering, Volume 14, Issue 2, pp. 141-154, February 1988.
- Hamscher, V., Schwiegelshohn, U., Streit, A. and Yahyapour, R. "Evaluation of Job-Scheduling Strategies for Grid Computing". Proceedings of the First IEEE/ACM International Workshop on Grid Computing, Lecture Notes in Computer Science, pp. 191-202, 2000. ISBN 3-540-41403-7.
- Subramani, V., Kettimuthu, R., Srinivasan, S. and Sadayappan, P. "Distributed Job Scheduling on Computational Grids using Multiple Simultaneous Requests". Proceedings of the 11th IEEE Symposium on High Performance Distributed Computing (HPDC 2002), July 2002.

References (contd.)
- Vadhiyar, S., Dongarra, J. and Yarkhan, A. "GrADSolve – RPC for High Performance Computing on the Grid". Euro-Par 2003, 9th International Euro-Par Conference, Proceedings, Springer, LNCS 2790, pp. 394-403, August 26-29, 2003.
- Vadhiyar, S. and Dongarra, J. "Metascheduler for the Grid". Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing, pp. 343-351, July 2002, Edinburgh, Scotland.
- Vadhiyar, S. and Dongarra, J. "GrADSolve – A Grid-based RPC System for Parallel Computing with Application-level Scheduling". Journal of Parallel and Distributed Computing, Volume 64, pp. 774-783, 2004.
- Petitet, A., Blackford, S., Dongarra, J., Ellis, B., Fagg, G., Roche, K. and Vadhiyar, S. "Numerical Libraries and The Grid: The GrADS Experiments with ScaLAPACK". Journal of High Performance Applications and Supercomputing, Vol. 15, No. 4 (Winter 2001), pp. 359-374.
Metascheduler Components
[Figure: applications interact with the metascheduler, which comprises the permission service, database manager, contract negotiator and rescheduler]
- Permission service – receives requests from applications for permission to execute on the Grid; decisions are based on resource capacities; can stop an executing resource-consuming application
- Database manager – storing and retrieval of the states of the applications
- Contract negotiator – receives application-level schedules from the applications; can accept or reject contracts; acts as a queue manager; ensures scheduling based on correct information; improves performance contracts; minimizes impact
- Rescheduler – receives requests for migration; reschedules executing applications to escape from heavy load or to use free resources
Taxonomy of Scheduling for Distributed Heterogeneous Systems – Casavant and Kuhl (1988)

Taxonomy
- Local vs. global
  - Local – scheduling processes to time slices on a single processor
  - Global – deciding which processor a job should go to
- Approximate vs. heuristic
  - Approximate – stop when you find a "good" solution; uses the same formal computational model. The ability to succeed depends on:
    - the availability of a function to evaluate a solution
    - the time required to evaluate a solution
    - the ability to judge according to some metric value
    - a mechanism to intelligently prune the solution space
  - Heuristic – works on assumptions about the impact of "important" parameters; the assumptions and the amount of impact cannot always be quantified
Also…
- Flat characteristics:
  - Adaptive vs. non-adaptive
  - Load balancing
  - Bidding – e.g. Condor
  - Probabilistic – random searches
  - One-time assignment vs. dynamic reassignment
Evaluation – Subramani et al.
Experiments and Results – Practical Experiments
- 5 applications were integrated into GrADS – ScaLAPACK LU, QR and eigensolver, PETSc CG, and a heat equation solver
- Integration involved developing performance models and instrumenting the applications with SRS
- 50 problems with different arrival rates:
  - Poisson distribution with different mean arrival rates for job submission
  - uniform distributions for problem types and problem sizes
- Different statistics were collected with the metascheduler enabled or disabled
Experiments and Results – Practical Experiments: Total Throughput Comparison
Experiments and Results – Practical Experiments: Performance Contract Violations
[Figure: measured time / expected time plotted against the maximum allowed measured time / expected time]
Contract violation: measured/expected > maximum allowed measured/expected
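The violation test in code form:

```python
def contract_violated(measured, expected, max_ratio):
    """The slide's criterion: a performance contract is violated when
    measured time / expected time exceeds the maximum allowed ratio."""
    return measured / expected > max_ratio
```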