VO-Ganglia Grid Simulator
Catalin Dumitrescu, Mike Wilde, Ian Foster
Computer Science Department
The University of Chicago
Talk Overview
➢Part I: The Grid-enabled Monitoring Tool
➢Part II: From Monitoring to Simulation
➢Part III: Features / Extended Model
➢Shortcomings
➢Future Work / Conclusions
VO-Ganglia / Grid-enabled Mon
➢P2P Reporting
  ✗Implicit hierarchical infrastructures
➢Interface with Other Monitoring Tools
  ✗Nagios, MDS 2
➢Grid/Globus Specific Metrics
  ✗Gatekeeper Information / Cluster RM Status
➢Per VO Monitoring Support
  ✗Collected metrics are aggregated and also reported per VO
➢Resource Management
➢Preference Specifications
➢Usage Policy Enforcement
Best Snapshot (1)
Best Snapshot (2)
Why Continue on This Path?
➢Implemented Ideas
  ●VO-based Metric Reporting
  ●Usage Policy Metric Incorporation
  ●Distributed Infrastructure for Usage Policy
➢Time Spent on Development
  ●Enhanced Monitoring ~ 3 months
  ●Policy ~ 6 months
  ●Simulator ~ 3 months
➢Are Other Alternatives Around?
  ➢MonALISA
  ➢Standard Ganglia
➢Difficult to Always Find Acceptable Grid Testbeds
➢Deployment Takes Time
➢Computing Time Is an Issue in Production Environments
➢What Do Some Well-Known Testbeds Offer Today?
  ➢Grid3: many clusters with similar software and Globus
  ➢PlanetLab: individual machines with similar characteristics
From Monitoring to Simulation
Features / Implemented Model
➢CPU Management / Task Assignment Policies
➢Disk Management / Space Assignment Policies
➢Network Management / Maximum Capacity (so far)
➢Usage Policy Specification Interface
➢Data File Management (replica selection problem)
Implementation Details
➢Before:
  ✗Metric collection by means of specific collectors
➢Now:
  ✗Special modules that generate metrics about different loads
  ✗Similar to a discrete simulator but integrated with a real tool
➢“How exactly?”
  ✗Periodic invocations (instead of monitoring collectors)
  ✗State management for workloads, data file migration, CPU and disk allocations, network usage
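The idea of replacing collectors with periodically invoked, stateful load generators can be illustrated with a minimal Python sketch. It is not the VO-Ganglia code (which the slides say is written in Perl); the class `CpuLoadModule`, the function `run`, the metric name `cpu_busy`, and the 15-second period are all hypothetical names and values chosen for illustration.

```python
import random


class CpuLoadModule:
    """Hypothetical module that synthesizes a CPU metric instead of measuring one."""

    def __init__(self, total_cpus, seed=0):
        self.total_cpus = total_cpus
        self.busy = 0                    # simulated state kept between invocations
        self.rng = random.Random(seed)   # seeded so runs are repeatable

    def step(self, now):
        # Randomly finish a job, do nothing, or start a job,
        # while staying within the simulated capacity.
        delta = self.rng.choice([-1, 0, 1])
        self.busy = max(0, min(self.total_cpus, self.busy + delta))
        return {"time": now, "metric": "cpu_busy", "value": self.busy}


def run(modules, period, steps):
    """Invoke every module once per period of simulated time and collect the reports."""
    reports = []
    for tick in range(steps):
        now = tick * period
        for module in modules:
            reports.append(module.step(now))
    return reports


reports = run([CpuLoadModule(total_cpus=4)], period=15, steps=5)
```

Each report has the same shape a collector would produce, so in this sketch the downstream reporting path would not need to know whether a value was measured or generated.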
Running Examples
Talk Overview
➢Part I: The Grid-enabled Monitoring Tool
➢Part II: From Monitoring to Simulation
➢Part III: Features / Extended Model
➢Shortcomings
➢Future Work / Conclusions
Distributed Simulations
➢Idea: Is it possible to run several simulators on different machines and configure each instance to report to a set of specified neighbors?
➢Advantages:
  ✗Simplicity in connecting several local simulators working on different data
  ✗Support for metric distribution and visualization
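A rough Python sketch of the neighbor-reporting idea follows. It runs everything in one process and uses direct method calls as a stand-in for whatever network transport the real tool uses; the `Simulator` class and its methods are hypothetical names, and reports travel one hop only.

```python
class Simulator:
    """Hypothetical simulator instance that reports metrics to configured neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []   # instances this simulator reports to
        self.seen = {}        # known metrics, keyed by (origin, metric)

    def report(self, metric, value):
        # Record the local value, then push it to every configured neighbor.
        self.seen[(self.name, metric)] = value
        for neighbor in self.neighbors:
            neighbor.receive(self.name, metric, value)

    def receive(self, origin, metric, value):
        # Store the value without forwarding it further (one hop only).
        self.seen[(origin, metric)] = value


a, b, c = Simulator("a"), Simulator("b"), Simulator("c")
a.neighbors = [b]
b.neighbors = [a, c]
c.neighbors = [b]

a.report("cpu_busy", 3)
b.report("cpu_busy", 1)
```

After these two calls, `b` knows the values from `a` and itself, while `c` knows only the value from `b`, because `a` does not list `c` as a neighbor.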
Running Examples
[...]
Commitment Usage Policy
for each Gi with EPi, BPi, BEi do
    # Case 1: fill BPi + BEi
    if (Sum(BAj) == 0) & (BAi < BPi) & (Qi has jobs) then
        schedule a job from some Qi to the least loaded site
    # Case 2: BAi < BPi (resources available)
    else if (Sum(BAk) < TOTAL) & (BAi < BPi) & (Qi has jobs) then
        schedule a job from some Qi to the least loaded site
    # Case 3: fill EPi (resource contention)
    else if (Sum(BAk) == TOTAL) & (BAi < EPi) & (Qi exists) then
        if (j exists such that BAj >= EPj) then
            stop scheduling jobs for VOj
        # Need to fill with extra jobs?
        if (BAi < EPi + BEi) then
            schedule a job from some Qi to the least loaded site
    # ??
    if (EAi < EPi) & (Qi has jobs) then
        schedule additional backfill jobs
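The three main cases of the pseudocode above can be sketched in Python as follows. The slide does not define its symbols, so the meanings in the comments are assumptions, as is reading `Sum(BAj)` in Case 1 as the usage of the *other* groups. The final backfill step (marked `# ??` on the slide) is left out, `>=` replaces `==` in the contention test to tolerate rounding, and all numbers in the usage lines are invented.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    ep: int          # EPi, assumed: longer-term share of TOTAL
    bp: int          # BPi, assumed: short-term (burst) share of TOTAL
    be: int          # BEi, assumed: extra best-effort headroom
    ba: int = 0      # BAi, assumed: share currently in use
    queued: int = 0  # number of jobs waiting in Qi


def decide(i, groups, total):
    """Return (action, throttled) for group i; action is 'schedule' or 'wait'."""
    g = groups[i]
    if g.queued == 0:
        return "wait", []
    others = sum(x.ba for k, x in enumerate(groups) if k != i)
    used = others + g.ba
    # Case 1: no other group is using resources.
    if others == 0 and g.ba < g.bp:
        return "schedule", []
    # Case 2: free capacity remains and group i is below its burst share.
    if used < total and g.ba < g.bp:
        return "schedule", []
    # Case 3: contention; throttle groups at or above their EP,
    # then schedule if group i is still below EPi + BEi.
    if used >= total and g.ba < g.ep:
        throttled = [x.name for k, x in enumerate(groups)
                     if k != i and x.ba >= x.ep]
        if g.ba < g.ep + g.be:
            return "schedule", throttled
    return "wait", []


# Hypothetical two-VO setup under full contention (TOTAL = 100).
groups = [Group("VO1", ep=60, bp=80, be=10, ba=30, queued=2),
          Group("VO2", ep=40, bp=60, be=10, ba=70, queued=1)]
first = decide(0, groups, total=100)
second = decide(1, groups, total=100)
```

In this made-up scenario VO1 is below its assumed long-term share, so it may schedule while VO2, which is above its own, is throttled and has to wait.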
Usage Policy Example
[Figure: usage policy example for VO1 and VO2, with labeled values 99%, 80%, 20%, 60%, 90%]
Commitment Policy in Practice
Current Issues
➢RRD / Disk Access
➢Perl / Interpreted Language Speed
➢Result Interpretation
➢Result Validation in Real Contexts
Future Work
➢“What Is Next?”
  ✗More work on resource usage policy analysis
  ✗“Export” ideas from VO-Ganglia into real practice
Conclusions
➢“Why Is VO-Ganglia So ‘Cool’ for Me?”
  ✗Some creative ideas
  ✗Easy to use
  ✗“Possibility to run on my laptop”
  ✗Provisioning tools for
    ✔Workload generation
    ✔Result formatting
➢“Why Did I Invest More Than a Year in Developing It?”
Questions / Suggestions?