![Page 1: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/1.jpg)
RAMSES: Robust Analytical Models for Science at Extreme Scale
Presenter: Raj Kettimuthu (Argonne)
PI: Ian Foster (Argonne)
Co-PIs: Gagan Agrawal (Ohio State), Nagi Rao (ORNL), Brad Settlemyer (LANL), Brian Tierney (LBL), and Don Towsley (UMass)
![Page 2: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/2.jpg)
Project Overview
§ Tools: Develop easy-to-use tools to provide end-users with actionable advice
§ Estimation: Develop and apply data-driven estimation methods: differential regression, surrogate models, etc.
§ Modeling: Develop, evaluate, and refine component and end-to-end models
§ Experiments: Conduct extensive, automated experiments to test models and build database
[Diagram labels: Experiments, Database, Modeling, Estimation, Advisor, Estimators, Evaluators, Tester, Tools]
![Page 3: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/3.jpg)
Exemplar Science Workflows
§ Five science workflows
§ Span a broad range of DOE science domains and modeling problems
§ File Transfer
§ Light Source Workflows
– Tomographic Reconstruction, Diffuse Scattering
§ Distributed MapReduce
§ In-situ Analysis
§ Exascale Simulations
![Page 4: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/4.jpg)
TCP Throughput Profiles
[Figure: Throughput (Gbps) vs. RTT (ms), with a concave region and a convex region marked]
§ The most common TCP throughput profile is a convex function of RTT
§ Observed dual-mode profiles on emulated connections with 0-366 ms RTT
– CUBIC, STCP: smaller RTT gives a concave region, larger RTT gives a convex region
§ Concave regions are very desirable
– Throughput does not decay as fast, rate of decrease slows down as RTT
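One way to locate the concave and convex regions of a measured profile is to look at the sign of the second difference of throughput with respect to RTT. The sketch below is only an illustration of that idea: the function name `classify_regions` and the sample numbers are made up for this example and are not RAMSES measurements.

```python
def classify_regions(rtts_ms, throughputs_gbps, tol=1e-9):
    """Label each interior sample 'concave', 'convex', or 'linear'.

    Uses the second divided difference, which works for unevenly
    spaced RTT values. Negative curvature means concave.
    """
    labels = []
    for i in range(1, len(rtts_ms) - 1):
        x0, x1, x2 = rtts_ms[i - 1], rtts_ms[i], rtts_ms[i + 1]
        y0, y1, y2 = (throughputs_gbps[i - 1],
                      throughputs_gbps[i],
                      throughputs_gbps[i + 1])
        slope_left = (y1 - y0) / (x1 - x0)
        slope_right = (y2 - y1) / (x2 - x1)
        curvature = (slope_right - slope_left) / (x2 - x0)
        if curvature > tol:
            labels.append("convex")
        elif curvature < -tol:
            labels.append("concave")
        else:
            labels.append("linear")
    return labels


# Hypothetical dual-mode profile (illustrative values only).
rtts = [0, 50, 100, 200, 300, 366]
tput = [9.5, 9.3, 8.6, 5.0, 3.4, 2.8]
labels = classify_regions(rtts, tput)
```

With these made-up values the first two interior points are labeled concave and the last two convex, which matches the dual-mode shape described above. Real measurements are noisy, so smoothing before differencing would likely be needed.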
![Page 5: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/5.jpg)
Models of single TCP connections
[Figure panels: STCP, CUBIC]
• Models account for increase/decrease rules, RTT, link capacity, max receive window size
• Validation against measurements
• Useful for selecting best version and troubleshooting
• Future directions: other versions (e.g., UDT), multiple connections, accounting for I/O interactions
![Page 6: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/6.jpg)
UDP-Based Transport: UDT
§ For dedicated 10G links, UDT provides higher throughput than CUBIC (Linux default)
§ The TCP and UDT throughput transition point depends on connection parameters
– RTT, loss rate, host and NIC parameters, IP and UDP parameters
§ Disk-to-disk transfers (xdd) have a lower transfer rate
[Figure legend: xdd-read, xdd-write, CUBIC single stream, UDT]
![Page 7: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/7.jpg)
Data Driven Models for File Transfer
§ Combines historical data with a correction term for current external load
§ Takes three pieces of input
  § Signature for a given transfer
    – Concurrency level
    – Total known concurrency at source (“known load at source”)
    – Total known concurrency at destination (“known load at destination”)
    – File size
  § Historical data
    – Transfer concurrency, known loads, and observed throughput for the source-destination pair
  § Signatures and observed throughputs from the most recent transfers for the source-destination pair
§ It produces an estimated throughput as an output.
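The slide does not give the estimator itself, so the following is only a minimal sketch of how the three inputs could be combined. It assumes a nearest-neighbor baseline over historical signatures and a multiplicative correction derived from the most recent transfers; the names `Signature`, `baseline`, and `estimate_throughput`, the distance measure, and the multiplicative form are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Signature:
    concurrency: int
    load_src: int        # total known concurrency at source
    load_dst: int        # total known concurrency at destination
    file_size_gb: float


def distance(a, b):
    """Hypothetical L1 distance between two transfer signatures."""
    return (abs(a.concurrency - b.concurrency)
            + abs(a.load_src - b.load_src)
            + abs(a.load_dst - b.load_dst)
            + abs(a.file_size_gb - b.file_size_gb))


def baseline(sig, history, k=3):
    """Mean observed throughput of the k most similar historical transfers.

    history is a non-empty list of (Signature, throughput) pairs for one
    source-destination pair, with positive throughputs.
    """
    nearest = sorted(history, key=lambda rec: distance(sig, rec[0]))[:k]
    return mean(tp for _, tp in nearest)


def estimate_throughput(sig, history, recent, k=3):
    """Historical baseline scaled by a correction for current external load.

    The correction is the mean ratio of observed to baseline throughput
    over the most recent transfers (an assumed form, not from the slide).
    """
    base = baseline(sig, history, k)
    if not recent:
        return base
    correction = mean(tp / baseline(s, history, k) for s, tp in recent)
    return base * correction
```

If recent transfers ran at half of what history predicts, the estimate for the new transfer is halved as well; with no recent transfers the sketch falls back to the historical baseline.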
![Page 8: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/8.jpg)
Data Driven Models for File Transfer
§ Transfer Scheduling Algorithms
– SEAL: Schedule transfers to minimize average transfer slowdown
– STEAL: Minimize slowdown for best-effort transfers and maximize bandwidth utilization for batch transfers
| Destination | ≤1GB | >1GB, ≤10GB | >10GB | Overall |
|---|---|---|---|---|
| Gordon | 4.26 | 9.3 | 6.67 | 8.31 |
| Mason | 3.55 | 9.4 | 8.22 | 8.76 |
| Yellowstone | 2.78 | 8.0 | 8.1 | 6.84 |
| Blacklight | 5.96 | 4.41 | 5.27 | 4.93 |
| Darter | 7.70 | 4.03 | 2.63 | 4.73 |
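The slowdown objective named in the scheduling bullets is not defined on this slide. The sketch below assumes a common definition, (wait time + transfer time) divided by the transfer time the job would need on its own, and shows why ordering matters on a single shared link. It is not the SEAL or STEAL algorithm; `sequential_average_slowdown` is a hypothetical helper.

```python
def sequential_average_slowdown(unloaded_times_s):
    """Average slowdown when transfers run one after another.

    Each entry is the time a transfer would take alone. Slowdown is
    assumed to be (wait + transfer time) / transfer time.
    """
    clock = 0.0   # time at which the next transfer can start
    total = 0.0
    for t in unloaded_times_s:
        total += (clock + t) / t
        clock += t
    return total / len(unloaded_times_s)


short_first = sequential_average_slowdown([10, 100])
long_first = sequential_average_slowdown([100, 10])
```

Running the 10 s transfer first gives an average slowdown of about 1.05, while running the 100 s transfer first gives 6.0, because the short transfer then waits ten times its own duration.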
![Page 9: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/9.jpg)
SEAL Evaluation – Turnaround Time 60% Load
![Page 10: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/10.jpg)
Modeling In-situ Analysis
§ How often should we perform the analyses?
§ How often should the analysis output be written?
§ Analysis parameters
– Time (Initialization, Auxiliary, Output)
– Memory (Fixed, Auxiliary, Output)
– Minimum interval between consecutive steps
– Importance
– Threshold time for analyses
§ System parameters
– I/O bandwidth
– Rate of computation
– Available memory
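To make the scheduling question concrete, here is a deliberately simplified sketch that picks how many times to run each analysis so that total analysis time stays within the threshold. It uses only per-run time, importance, and minimum interval; memory, output cost, and I/O bandwidth are ignored. The brute-force search, the `Analysis` type, and the example numbers are assumptions for illustration, not the formulation used in the project.

```python
from itertools import product
from typing import NamedTuple


class Analysis(NamedTuple):
    time_s: float        # assumed cost of one run of the analysis
    importance: float    # weight given to each run
    min_interval: int    # minimum simulation steps between runs


def best_frequencies(analyses, total_steps, threshold_s):
    """Return (counts, score) maximizing importance-weighted runs.

    Exhaustively tries every combination of run counts whose total
    time fits within threshold_s. Suitable only for small inputs.
    """
    max_counts = [total_steps // a.min_interval for a in analyses]
    best, best_score = None, -1.0
    for counts in product(*(range(m + 1) for m in max_counts)):
        cost = sum(c * a.time_s for c, a in zip(counts, analyses))
        if cost > threshold_s:
            continue
        score = sum(c * a.importance for c, a in zip(counts, analyses))
        if score > best_score:
            best, best_score = counts, score
    return best, best_score


# Hypothetical analyses: a cheap one and an expensive, more important one.
analyses = [Analysis(1.0, 1.0, 100), Analysis(10.0, 3.0, 250)]
counts, score = best_frequencies(analyses, total_steps=1000, threshold_s=25.0)
```

For these made-up inputs the search runs the cheap analysis 10 times and the expensive one once. A larger problem would need an integer-programming solver rather than enumeration.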
[Figure axes: Problem Size; Network bandwidth / Process count]
![Page 11: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/11.jpg)
Results: Scheduling Analyses within Threshold
| Total threshold (sec) | R1 (Radius of gyration) | R2 (Membrane density profile 2D histogram) | R3 (Protein density profile 2D histogram) | % within threshold |
|---|---|---|---|---|
| 200 | 10 | 4 | 7 | 94.59 |
| 100 | 10 | 2 | 3 | 85.99 |
| 60 | 10 | 1 | 2 | 86.01 |
| 20 | 10 | 1 | 0 | 86.11 |
| 10 | 10 | 0 | 0 | 0.3 |
Table: Analysis frequencies, analysis times, and corresponding thresholds for 1 billion atoms rhodopsin simulation (1000 steps) in LAMMPS on 32768 cores (2048 nodes) of Mira.
Simulation: Rhodopsin protein benchmark, which consists of a protein embedded in a membrane and solvated with water and ions using LAMMPS.
Observation: More than 80% of the allowed threshold is used for analyses when the threshold is at least 20 s.
[Figure: Time and Memory for R1, R2, R3]
![Page 12: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/12.jpg)
Two In-Situ Modes
Time Sharing Mode: Minimizes memory consumption
Space Sharing Mode: Enhances resource utilization when simulation reaches its scalability bottleneck
§ Model computational part (MapReduce-like processing)
§ Model memory
– Data locality between simulation and analytics (initial work only)
![Page 13: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/13.jpg)
Performance Modeling with Disk Model for K-means with MATE (File Size = 1 GB, K = 50, Number of Iterations = 1)
Modeling Computational Component in MATE/Smart
![Page 14: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/14.jpg)
Modeling Computation Time for Parallel Tomographic Reconstruction
§ Computation
– Number of intersected rays, and horizontal and vertical lines
– t × col^2 × (|sin(θ)| + |cos(θ)|)
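The expression above can be evaluated per projection angle to see how the estimated work varies with θ. The sketch below is a direct transcription under stated assumptions: the slide does not define `t`, so it is treated here as a plain multiplicative factor, and `angle_cost` and `total_cost` are hypothetical helper names.

```python
import math


def angle_cost(t, col, theta_deg):
    """Evaluate t * col^2 * (|sin(theta)| + |cos(theta)|) for one angle.

    t is treated as an unspecified scale factor (not defined on the
    slide); col is the number of columns; theta is given in degrees.
    """
    theta = math.radians(theta_deg)
    return t * col ** 2 * (abs(math.sin(theta)) + abs(math.cos(theta)))


def total_cost(t, col, angles_deg):
    """Sum the per-angle estimate over a set of projection angles."""
    return sum(angle_cost(t, col, a) for a in angles_deg)


# Hypothetical sweep over 0..179 degrees for a 2048-column dataset.
per_angle = [angle_cost(1.0, 2048, a) for a in range(180)]
```

Under this expression the estimate is smallest at 0° and 90° and largest at 45° and 135°, where |sin(θ)| + |cos(θ)| reaches √2.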
[Figure: Horizontal, Vertical, and Total, plotted over 0 to 170]
[Figure: Estimated, Real, and Error Ratio, plotted over 1 to 171]
[Diagram: P0, P1, P2, … Pn and T0, T1, T2, … Tn]
![Page 15: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/15.jpg)
Estimated Execution time vs. Real Reconstruction Time
RAMSES Meeting
[Figure: Real Time, Estimated Exec Time (wrt 2K), and Estimated Computation; axis ticks run from 128 to 2048]
![Page 16: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/16.jpg)
Questions?
![Page 17: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/17.jpg)
Additive increase and additive decrease (AIAD) for optimal stream
For every step c (a fixed number of epochs), do the following:
![Page 18: RAMSES: Robust Analytical Models for Science at Extreme …](https://reader031.vdocuments.us/reader031/viewer/2022012512/618bba0e8394de544e3aedc9/html5/thumbnails/18.jpg)
Tubes (ANL) to DMZ (UChicago)