SimMatrix: SIMulator for MAny-Task computing execution fabRIc at eXascales
Ke Wang
Data-Intensive Distributed Systems Laboratory
Computer Science Department
Illinois Institute of Technology
February 14th, 2012
Acknowledgements
• DataSys Laboratory
• Dr. Ioan Raicu
• Juan Carlos Hernández Munuera, MS 2011
• Hui Jin, Tonglin Li
• Paper submission:
– Ke Wang, Ioan Raicu. “SimMatrix: Exploring Many-Task Computing through Simulations at Exascales”, under review at ACM HPDC 2012
Outline
• Introduction & Motivation
• Long-Term Aims and Contributions
• SimMatrix Architecture
• Implementation
• Evaluation
• Related Work
• Contributions
• Future Work & Conclusion
Distributed Systems
Manycore Computing
[Figure: number of cores and manufacturing process by year, 2004 to 2018. Source: Pat Helland, Microsoft, The Irresistible Forces Meet the Movable Objects, November 9th, 2007]
• Today (2011): Multicore Computing
– O(10) cores commodity architectures
– O(100) cores proprietary architectures
– O(1000) GPU hardware threads
• Near future (~2018): Manycore Computing
– ~1000 cores/threads commodity architectures
Exascale Computing
[Figure: Top500 Performance Development, http://top500.org/static/lists/2011/11/TOP500_201111_Poster.pdf]
• Today (2012): 10 Petaflop Computing
– O(100K) nodes (100X in the last 10 years)
– O(1M) cores (1000X in the last 10 years)
• Near future (~2018): Exaflop Computing
– ~1M nodes (10X)
– ~1B processor-cores/threads (1000X)
Major Challenges of Exascale Computing
• Concurrency
– Parallel programmability
• Resilience
– MTTF decreases, MPI suffers
• I/O and Memory
– Minimizing data movement
• Heterogeneity
– Accelerators, GPUs, MIC
• Energy
– 20MW limitation
MTC: Many-Task Computing
[Figure: problem space by number of tasks (1, 1K, 1M) and input data size (Low, Med, Hi), with regions HPC (Heroic MPI Tasks), HTC/MTC (Many Loosely Coupled Tasks), MapReduce/MTC (Data Analysis, Mining), and MTC (Big Data and Many Tasks)]
• Bridge the gap between HPC and HTC
• Applied in clusters, grids, and supercomputers
• Loosely coupled apps with HPC orientations
• Many activities coupled by file system ops
• Many resources over short time periods
MTC Middleware
• Falkon
– Fast and Lightweight Task Execution Framework
– http://datasys.cs.iit.edu/projects/Falkon/index.html
• Swift
– Parallel Programming System
– http://www.ci.uchicago.edu/swift/index.php
Long-Term Aims
• Address major exascale computing challenges:
– Concurrency
– Resilience
– I/O and Memory
– Heterogeneity
• Explore techniques to enable MTC at exascales
• Design, Analyze, and Implement a distributed data-aware execution fabric (MATRIX) supporting HPC/MTC workloads
• Integrate MATRIX with parallel programming systems (e.g. Swift, Charm++, MapReduce) and with the FusionFS distributed file system
• Prove that MTC applications can scale to exascales
This Work’s Contributions
• Explore techniques to enable MTC to scale to exascales
– Design, Analyze, and Implement a discrete-event simulator (SimMatrix) enabling the study of MATRIX at extremely large scales (e.g. exascales)
– Identify work stealing as a viable technique to achieve load balance at exascales
– Provide evidence that work stealing is scalable by identifying optimal parameters affecting the performance of work stealing
Overview: Job Scheduling Systems
• Efficiently manage the distributed computing power of workstations, servers, and supercomputers in order to maximize job throughput and system utilization.
– Load balancing is critical
• Different scheduling strategies
– Centralized scheduling hinders scalability
– Hierarchical scheduling has long job turnaround times
– Distributed scheduling is a promising approach at exascales
• Work Stealing: a distributed scheduling strategy
– Starved processors steal tasks from overloaded ones
– Various parameters affect performance:
• Number of tasks to steal
• Number of neighbors
• Static or dynamic random neighbors
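A minimal sketch of a single work-stealing step, written in Python for brevity rather than taken from SimMatrix. The function name `steal_from_neighbors`, the `queues` layout, the choice of the most loaded neighbor as the victim, and the "steal half" amount are all illustrative assumptions.

```python
from collections import deque


def steal_from_neighbors(queues, thief, neighbors):
    """Move half of the most loaded neighbor's tasks to an idle thief.

    queues: dict mapping node id -> deque of pending tasks (hypothetical layout).
    Returns the number of tasks moved (0 if every neighbor has at most one task).
    """
    # Pick the neighbor with the longest queue as the victim (an assumption).
    victim = max(neighbors, key=lambda n: len(queues[n]))
    count = len(queues[victim]) // 2  # "steal half" policy
    for _ in range(count):
        # Take from the tail of the victim's queue, append to the thief's.
        queues[thief].append(queues[victim].pop())
    return count


queues = {0: deque(), 1: deque(range(8)), 2: deque(range(3))}
moved = steal_from_neighbors(queues, thief=0, neighbors=[1, 2])
print(moved, len(queues[0]), len(queues[1]))  # 4 4 4
```

Node 0 starts starved, node 1 is overloaded, and after one step both hold four tasks.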
SimMatrix Architecture
[Figure 1 labels: Client, Submit tasks, Dispatcher, Arbitrary Node]
Figure 1: Simulation architectures. The left part is the centralized one, with a single dispatcher connecting all nodes; the right part is the homogeneous distributed topology, with each node having the same number of cores and neighbors.
Simulations
• Continuous time simulations
– Abandoned the idea of creating a separate thread per simulated node: on our 48-core system with 256GB of memory, we were limited to 32K threads
• Discrete event simulations
– The only viable approach (today) to explore scheduling techniques at exascales (millions of nodes and billions of cores)
– Created a unique object per simulated node, and converted any behavior to an event
At the Heart of SimMatrix: Global Event Queue
Figure 2: Event State Transition Diagram
• All events are inserted into the queue, sorted in ascending order of occurrence time
• Handle the first event, advance the simulation time, and update the event queue
• Implemented as a red-black tree based “TreeSet” in Java, which ensures Θ(log n) time for insert & remove
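The event loop described above can be sketched as follows. This is an illustration, not SimMatrix code: it is written in Python, and a binary heap (`heapq`) stands in for the Java `TreeSet`, since both give logarithmic insert and remove-minimum. The class name `EventQueue` and the event names are hypothetical.

```python
import heapq
import itertools


class EventQueue:
    """Global event queue ordered by occurrence time (earliest first)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps
        self.now = 0.0                 # current simulation time

    def schedule(self, delay, name):
        # Insert an event that fires `delay` time units after the current time.
        heapq.heappush(self._heap, (self.now + delay, next(self._seq), name))

    def run(self, handler):
        # Repeatedly handle the earliest event and advance simulated time.
        while self._heap:
            time, _, name = heapq.heappop(self._heap)
            self.now = time
            handler(self, name)


log = []


def handler(queue, name):
    log.append((queue.now, name))
    if name == "task_start":
        # Hypothetical event: a task that runs for 5 time units.
        queue.schedule(5.0, "task_end")


q = EventQueue()
q.schedule(2.0, "task_start")
q.schedule(1.0, "poll")
q.run(handler)
print(log)  # [(1.0, 'poll'), (2.0, 'task_start'), (7.0, 'task_end')]
```

Simulated time jumps directly from one event to the next, which is why no thread per simulated node is needed.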
Simulator Features
• Node load information
– Nested hash maps provide extremely fast performance at large scales
• Dynamic Task Submission
– Aims to reduce the memory footprint
• Dynamic Poll Interval
– Exponential backoff to reduce the number of messages and increase the speed of simulation
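A minimal sketch of an exponentially backed-off poll interval, in Python. The slide does not give the backoff factor, the starting interval, or any cap, so the doubling factor and the `initial` and `maximum` values below are assumptions, as is the function name `next_poll_interval`.

```python
def next_poll_interval(current, found_work, initial=1.0, maximum=64.0):
    """Return the next poll interval for an idle node.

    Doubles the interval after a failed steal attempt and resets it once work
    is found. The doubling factor, `initial`, and `maximum` are assumed values.
    """
    if found_work:
        return initial
    return min(current * 2, maximum)


interval = 1.0
history = []
for found in [False, False, False, True, False]:
    interval = next_poll_interval(interval, found)
    history.append(interval)
print(history)  # [2.0, 4.0, 8.0, 1.0, 2.0]
```

Each failed attempt halves how often the node polls, so an idle node generates fewer messages and fewer simulation events.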
Implementation
• SimMatrix is developed in Java
– Sun 64-bit JDK version 1.6.0_22
– 1500 lines of code
– Code accessible at:
• http://datasys.cs.iit.edu/projects/SimMatrix/index.html
• SimMatrix has no other dependencies
Experiment Environment
• Fusion system:
– fusion.cs.iit.edu
– 48 AMD Opteron cores at 1.93GHz
– 256GB RAM
– 64-bit Linux kernel 2.6.31.5
– Sun 64-bit JDK version 1.6.0_22
Metrics
• Throughput
– Number of tasks finished per second, calculated as total-number-of-tasks / simulation-time.
• Efficiency
– The ratio between the ideal simulation time of completing a given workload and the real simulation time. The ideal simulation time is the average task execution time multiplied by the number of tasks per core.
• Load Balancing
– We adopted the coefficient of variation of the number of tasks finished by each node as a measure of load balancing: the standard deviation divided by the average of the number of tasks finished by each node. The smaller the coefficient of variation, the better the load balancing.
• Scalability
– Total number of tasks, number of nodes, and number of cores supported.
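The first three metrics can be computed as in the following Python sketch. The function names and the sample numbers are made up for illustration, and the use of the population standard deviation (`pstdev`) rather than the sample one is an assumption.

```python
from statistics import mean, pstdev


def throughput(total_tasks, simulation_time):
    """Tasks finished per second."""
    return total_tasks / simulation_time


def efficiency(avg_task_time, tasks_per_core, simulation_time):
    """Ideal simulation time divided by the real simulation time."""
    ideal_time = avg_task_time * tasks_per_core
    return ideal_time / simulation_time


def coefficient_of_variation(tasks_per_node):
    """Standard deviation over mean of per-node finished-task counts."""
    return pstdev(tasks_per_node) / mean(tasks_per_node)


# Hypothetical run: 4 single-core nodes, 400 tasks of 10 s each, 1250 s total.
finished = [90, 110, 100, 100]  # tasks finished by each node
print(throughput(400, 1250.0))                        # 0.32
print(efficiency(10.0, 100, 1250.0))                  # 0.8
print(round(coefficient_of_variation(finished), 4))   # 0.0707
```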
Workloads
• Synthetic workloads:
– Uniform distributions with different average task lengths, such as 10s (ave_10), 100s (ave_100), 1000s (ave_1000), 5000s (ave_5000), 10000s (ave_10000), and 100000s (ave_100000); also all tasks of 1 sec each (all_1)
• Realistic application workloads:
– General MTC workload from a 2008-2009 trace of 173M tasks; average task length 64±486s (mtc_64), using a Gamma distribution
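A sketch of how task lengths for these workloads might be drawn, in Python. Two assumptions are made that the slide does not state: a uniform workload with average A is taken as uniform on [0, 2A], and "64±486s" is read as mean 64 s with standard deviation 486 s. Under that reading the Gamma shape is (mean/std)² and the scale is std²/mean. All function names are hypothetical.

```python
import random


def gamma_params(mean, std):
    """Shape k and scale theta of a Gamma distribution with given mean and std."""
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    return shape, scale


def uniform_task_length(rng, average):
    # Assumption: "uniform with average A" means uniform on [0, 2A].
    return rng.uniform(0, 2 * average)


def mtc_task_length(rng, mean=64.0, std=486.0):
    # Assumption: 64±486s is mean ± standard deviation.
    shape, scale = gamma_params(mean, std)
    return rng.gammavariate(shape, scale)


rng = random.Random(42)  # fixed seed so runs are repeatable
shape, scale = gamma_params(64.0, 486.0)
print(round(shape, 4), round(scale, 2))  # 0.0173 3690.56
sample = [uniform_task_length(rng, 10) for _ in range(5)]
```

A shape well below 1 gives a heavily skewed distribution: most tasks are very short and a few are very long, which fits a standard deviation much larger than the mean.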
Validation
Validate SimMatrix against the state-of-the-art MTC systems (e.g. Falkon), to ensure that the simulator can accurately predict the performance of current petascale systems.
Comparing Work Stealing to Falkon’s Naïve Distributed Scheduler
Fine-grained workloads:
• Efficiency increases from 2% to 99.3%
Coarse-grained workloads:
• Efficiency increases from 99% to 99.999%
Scalability: 1M Nodes and 10B Tasks
Memory consumption
• <13 KB/task
• <200 GB
CPU Time
• <90 us/task
• <260 hours
Scalability: 1M Nodes and 10B Tasks
Efficiency
• 90%+
Coefficient of variation
• <0.06
• Load imbalance of <600 tasks from 10K tasks per node
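The last two figures are consistent with each other if the 0.06 bound is read as the coefficient of variation defined under Metrics and the "<600 tasks" imbalance is read as one standard deviation of per-node task counts. That reading is an interpretation, not something the slide states. With mean $\mu$ tasks per node, standard deviation $\sigma$, and coefficient of variation $c_v$:

$$\mu = \frac{10^{10}\ \text{tasks}}{10^{6}\ \text{nodes}} = 10^{4}\ \text{tasks per node}, \qquad \sigma = c_v \cdot \mu < 0.06 \times 10^{4} = 600\ \text{tasks}$$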
Work Stealing Parameters: Number of Tasks to Steal
Stealing half of a neighbor’s work is the best strategy!
[Figure: efficiency (0% to 100%) versus number of nodes for steal_1, steal_2, steal_log, steal_sqrt, and steal_half]
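The five steal-count policies named in the chart legend could be expressed as in this Python sketch. It assumes each count is a function of the victim's queue length and that steal_log uses base 2; neither detail is given on the slide, and `tasks_to_steal` is a hypothetical name.

```python
import math


def tasks_to_steal(policy, victim_queue_length):
    """Number of tasks to take from a victim holding `victim_queue_length` tasks."""
    n = victim_queue_length
    if n <= 0:
        return 0
    counts = {
        "steal_1": 1,
        "steal_2": 2,
        "steal_log": int(math.log2(n)),  # base 2 is an assumption
        "steal_sqrt": math.isqrt(n),
        "steal_half": n // 2,
    }
    # Never take more than the victim has.
    return min(counts[policy], n)


for policy in ["steal_1", "steal_2", "steal_log", "steal_sqrt", "steal_half"]:
    print(policy, tasks_to_steal(policy, 1024))
# steal_1 1
# steal_2 2
# steal_log 10
# steal_sqrt 32
# steal_half 512
```

For a victim with 1024 queued tasks, steal_half moves far more work per message than the other policies.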
Work Stealing Parameters: Number of Neighbors (Static)
Requires a linear number of neighbors for good performance!
[Figure: efficiency (0% to 100%) versus number of nodes for nb_2, nb_log, nb_sqrt, nb_eighth, nb_quar, and nb_half static neighbors]
Work Stealing Parameters: Number of Neighbors (Dynamic Random)
An increasing number of neighbors is needed for 90%+ efficiency, with the largest scales requiring a square root number of neighbors (e.g. 1K neighbors from 1M nodes)!
[Figure: efficiency (0% to 100%) versus number of nodes for nb_1, nb_2, nb_log, and nb_sqrt dynamic random neighbors]
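A Python sketch of dynamic random neighbor selection under the neighbor-count policies in the legend. It assumes nb_log uses base 2 and that a node redraws its neighbor set uniformly at random from all other nodes; the function names are hypothetical.

```python
import math
import random


def neighbor_count(policy, num_nodes):
    """Neighbors each node may steal from, as a function of system size."""
    counts = {
        "nb_1": 1,
        "nb_2": 2,
        "nb_log": int(math.log2(num_nodes)),  # base 2 is an assumption
        "nb_sqrt": math.isqrt(num_nodes),
    }
    # A node cannot have more neighbors than there are other nodes.
    return min(counts[policy], num_nodes - 1)


def pick_dynamic_neighbors(rng, node, num_nodes, count):
    """Draw a fresh random neighbor set, excluding the node itself."""
    candidates = rng.sample(range(num_nodes - 1), count)
    # Shift ids at or above `node` by one so `node` is never selected.
    return [c + 1 if c >= node else c for c in candidates]


rng = random.Random(7)  # fixed seed so runs are repeatable
print(neighbor_count("nb_sqrt", 1_000_000))  # 1000
k = neighbor_count("nb_sqrt", 1024)          # 32 neighbors for 1024 nodes
neighbors = pick_dynamic_neighbors(rng, node=5, num_nodes=1024, count=k)
```

With static neighbors the set would be drawn once per node and reused; with dynamic neighbors `pick_dynamic_neighbors` would be called again on each steal attempt.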
Work Stealing Parameters: Optimal Parameters Generality
The same optimal parameters achieve 90%+ efficiency across many different workloads!
[Figure: efficiency (0% to 100%) versus number of nodes for workloads ave_10, 64+/-486, ave_100, and ave_1000]
Work Stealing: Throughput
Centralized scheduling has a severe bottleneck, especially for fine-grained workloads. Distributed scheduling scales well: for coarse-grained workloads, there is no obvious upper bound.
[Figure: throughput (tasks/sec) versus number of nodes for Centralized(ave_5000), Distributed(ave_5000), Centralized(all_1), and Distributed(all_1)]
[Figure: average number of messages per task versus number of nodes]
Load Balancing Visualization: 1024 Nodes and ave_5000 Workload
[Figure panels:]
• Square Root Dynamic Neighbors: Good Load Balancing
• Square Root Static Neighbors: Starvation
• Quarter Static Neighbors: Good Load Balancing
• 2 Static Neighbors: Starvation
Summary Plot for Distributed Scheduling
Steady state utilization is ~100% at exascales
Related Work
• Real Job Scheduling Systems:
– Condor (University of Wisconsin), Bradley et al, 2012
– PBS (NASA Ames), Corbatto et al, 2012
– LSF Batch (Platform Computing of Toronto), 2011
– Falkon (University of Chicago), Raicu et al, SC07
• Job Scheduling System Simulators:
– simJava (University of Edinburgh), Wheeler et al, 2004
– GridSim (University of Melbourne, Australia), Buyya et al, 2010
• Load Balancing:
– Neighborhood averaging scheme, Sinha et al, 1993
– Charm++ (UIUC), Zheng et al, 2011
• Scalable Work Stealing:
– Dinan et al, SC09
– Blumofe et al, Scheduling multithreaded computations by work stealing, 1994
Contributions
• Designed, Analyzed, and Implemented a discrete-event simulator (SimMatrix) enabling the study of MTC workloads at exascales
• Identified work stealing as a viable technique to achieve load balance at exascales
• Provided evidence that work stealing is scalable by finding optimal parameters affecting the performance of work stealing
– Number of tasks to steal is half
– Dynamic random neighbors strategy is required
– There must be a square root number of neighbors
Future Work
• Explore work stealing for manycore processors with 1000 cores
• Enhance the network topology model to allow complex networks
• Insight from SimMatrix will be used to develop MATRIX, a distributed task execution fabric
– MATRIX will employ work stealing for distributed load balancing
– MATRIX will be integrated with other projects, such as Swift (a data-flow parallel programming system) and FusionFS (a distributed file system)
Conclusion
• Exascale systems bring great opportunities for unraveling significant scientific mysteries
• There are significant challenges in achieving exascales, such as concurrency, resilience, I/O and memory, heterogeneity, and energy
• MTC requires a highly scalable and distributed task/job management system at large scales
– Distributed scheduling is likely an efficient way to achieve load balancing, leading to high job throughput and system utilization
• Work stealing is a scalable method to achieve load balance at exascales given the optimal parameters
More Information
• More information:
– http://datasys.cs.iit.edu/~kewang/
– http://datasys.cs.iit.edu/projects/SimMatrix/
• Contact:
– [email protected]
• Questions?