adding storage simulation capacities to the simgrid toolkit · i farfrom hdparm results i default...
TRANSCRIPT
Adding Storage Simulation Capacitiesto the SimGrid Toolkit
A. Lebre, A. Legrand, F. Suter, P. Veyre
Inria, Ecole des Mines de Nantes/LINACNRS/INRIA/University of Grenoble
IN2P3 Computing Center, CNRS
CCGrid 2015Shenzen, ChinaMay 06, 2015
Storage Simulation with the SimGrid Toolkit 1/17
Data is Everywhere!
Age of Data-Intensive Computational Sciences
I Data is the new source of scientific resultsI Fourth paradigm, Data deluge, Big Data, . . .I ↗ Volume, ↗ Velocity, ↗ Variety,
↗ Veracity, ↗ Value
Storage becomes more and more important
I Not only for historical big playersI E.g., High Energy Physics and LHC data
processing on data grids
I But in every scientific fieldI And on any large scale distributed infrastructure
I Clusters, Clouds, Grids, . . .
Storage Simulation with the SimGrid Toolkit 2/17
Data is Everywhere!
Age of Data-Intensive Computational Sciences
I Data is the new source of scientific resultsI Fourth paradigm, Data deluge, Big Data, . . .I ↗ Volume, ↗ Velocity, ↗ Variety,
↗ Veracity, ↗ Value
Storage becomes more and more important
I Not only for historical big playersI E.g., High Energy Physics and LHC data
processing on data grids
I But in every scientific fieldI And on any large scale distributed infrastructure
I Clusters, Clouds, Grids, . . .
Storage Simulation with the SimGrid Toolkit 2/17
Why Simulate Storage?
Storage: a performance driver to understand
I Independent of scale and type of the computing infrastructure
I As much important as computing and networkingI Simulation is a classical approach in performance evaluation
I Accuracy, Scalability, and Versatility are the keys
Specifics and concerns of storage subsystems may vary
I Data Centers ; Hierarchial (mass) storage subsystemsI Different types of media involved
I Supercomputers ; Large scale dedicated storage networkI High-speed network interconnect
I (MapReduce) Clusters ; Specific and tuned file systemI Reliable, scalable, and simple
I Grids and Clouds ; Set of services offered by multiple data centersI Hidden underlying infrastructures
Storage Simulation with the SimGrid Toolkit 3/17
Why Simulate Storage?
Storage: a performance driver to understand
I Independent of scale and type of the computing infrastructure
I As much important as computing and networkingI Simulation is a classical approach in performance evaluation
I Accuracy, Scalability, and Versatility are the keys
Specifics and concerns of storage subsystems may vary
I Data Centers ; Hierarchial (mass) storage subsystemsI Different types of media involved
I Supercomputers ; Large scale dedicated storage networkI High-speed network interconnect
I (MapReduce) Clusters ; Specific and tuned file systemI Reliable, scalable, and simple
I Grids and Clouds ; Set of services offered by multiple data centersI Hidden underlying infrastructures
Storage Simulation with the SimGrid Toolkit 3/17
Simulating Storage with SimGrid
What is SimGrid?I 15-years old project for the simulation
of distributed systemsI but lacking of a storage component
for about 10 years
I Open source, sustainable, widely used
I Available on http://simgrid.org
Main StrengthsI Versatility: simulates Grids, Clouds,
HPC, and P2P systems
I Fast and scalable simulation kernel
I Tractable models: fluid models andMax-Min fairness sharing
I (In)validation studies: simulationresults can be trusted
SURF
MSG SMPI SIMDAG
SIMIX
372435work
remaining
variable
530530
50664
245245
Concurrent
{Condition
...
...
processes
variables
App. spec. astask graph
...
x1
x2
x3
x3
+
xn
...
+
+ xn
...
...
Variables
x1
App. spec. as concurrent code
... ...
Activities
ResourceCapacities
6 CL2
6 CP1
6 CL1
6 CLm
Our claim
I Building a simulatorfrom scratch should beavoided
Storage Simulation with the SimGrid Toolkit 4/17
Simulating Storage with SimGrid
What is SimGrid?I 15-years old project for the simulation
of distributed systemsI but lacking of a storage component
for about 10 years
I Open source, sustainable, widely used
I Available on http://simgrid.org
Main StrengthsI Versatility: simulates Grids, Clouds,
HPC, and P2P systems
I Fast and scalable simulation kernel
I Tractable models: fluid models andMax-Min fairness sharing
I (In)validation studies: simulationresults can be trusted
SURF
MSG SMPI SIMDAG
SIMIX
372435work
remaining
variable
530530
50664
245245
Concurrent
{Condition
...
...
processes
variables
App. spec. astask graph
...
x1
x2
x3
x3
+
xn
...
+
+ xn
...
...
Variables
x1
App. spec. as concurrent code
... ...
Activities
ResourceCapacities
6 CL2
6 CP1
6 CL1
6 CLm
Our claim
I Building a simulatorfrom scratch should beavoided
Storage Simulation with the SimGrid Toolkit 4/17
Disclaimer
This talk is end-to-end-study-free
I Problem ; Idea ; Implementation ; Evaluation ; Problem solved!
But not contribution-free
I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators
I Leveraging a sound and reliable simulation kernel
I Performance analysis of various types of disks ; Derived models
Our objective
I Convince you to use our proposal to conduct your storage-relatedsimulation studies
Storage Simulation with the SimGrid Toolkit 5/17
Disclaimer
This talk is end-to-end-study-free
I Problem ; Idea ; Implementation ; Evaluation ; Problem solved!
But not contribution-free
I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators
I Leveraging a sound and reliable simulation kernel
I Performance analysis of various types of disks ; Derived models
Our objective
I Convince you to use our proposal to conduct your storage-relatedsimulation studies
Storage Simulation with the SimGrid Toolkit 5/17
Outline
Introduction
Adding Storage to SimGridConcepts and ModelsImplementation Highlights
Checking Modeling Assumptions
Added Value of Using SimGrid
Conclusions and Future Work
Storage Simulation with the SimGrid Toolkit 6/17
Concepts and Models
Basic Concepts
I File descriptorsI Description: Name (= full path) + size [+ user-level properties]
I Remark: no UNIX info, no contents
I Life cycle: Simulated entity created by open, destroyed by closeI Local operations: open, close, read, write, seek, tell, move, and deleteI Remote operations: move and copy
I Storage volumesI Description: Name + type + capacity + file list + mount point +
attach point + simulation modelI Remark: inert file list and no navigation in tree
I Life cycle: Instantiated at parsing timeI Operations: get file list and get [total, used, available] capacity
Fluid models: Tractable and fastI Assumptions (to be experimentally confirmed)
I Linearity, negligible latency, fair sharing
Maximize mina∈A ρaunder constraints{∑
a ∈ A using resource r ρa 6 Cr ,
Storage Simulation with the SimGrid Toolkit 7/17
Implementation Highlights
Comprehensive platform description
I Scalable XML format
Seamless remote operations
I I/O operations ; network transfersI in a store-and-forward mode
<storage_type id="SATA-II_HDD" size="500GB"
content_type="txt_unix"
content="unix_content.txt"
model="linear">
<model_prop id="r_bw" value="92MBps"/>
<model_prop id="w_bw" value="62MBps"/>
</storage_type>
<storage id="Disk1" typeId="SATA-II_HDD"
attach="bob"/>
<storage id="Disk2" typeId="SATA-II_HDD"
attach="alice"
content_type="txt_windows"
content="windows_content.txt" />
<host id="bob" power="1Gf">
<mount id="Disk1" name="/home"/>
<mount id="Disk2" name="/windows"/>
</host>
<host id="alice" power="1Gf">
<mount id="Disk2" name="c:"/>
</host>
<link id="link1" bandwidth="125MBps"
latency="50us"/>
<route src="bob" dst="alice"
symmetrical="YES">
<link_ctn id="link1"/>
</route>
Storage Simulation with the SimGrid Toolkit 8/17
Implementation Highlights
Comprehensive platform description
I Scalable XML format
Seamless remote operations
I I/O operations ; network transfersI in a store-and-forward mode
<storage_type id="SATA-II_HDD" size="500GB"
content_type="txt_unix"
content="unix_content.txt"
model="linear">
<model_prop id="r_bw" value="92MBps"/>
<model_prop id="w_bw" value="62MBps"/>
</storage_type>
<storage id="Disk1" typeId="SATA-II_HDD"
attach="bob"/>
<storage id="Disk2" typeId="SATA-II_HDD"
attach="alice"
content_type="txt_windows"
content="windows_content.txt" />
<host id="bob" power="1Gf">
<mount id="Disk1" name="/home"/>
<mount id="Disk2" name="/windows"/>
</host>
<host id="alice" power="1Gf">
<mount id="Disk2" name="c:"/>
</host>
<link id="link1" bandwidth="125MBps"
latency="50us"/>
<route src="bob" dst="alice"
symmetrical="YES">
<link_ctn id="link1"/>
</route>
Storage Simulation with the SimGrid Toolkit 8/17
Outline
Introduction
Adding Storage to SimGrid
Checking Modeling AssumptionsExperimental SetupIndependent AccessesConcurrent Accesses
Added Value of Using SimGrid
Conclusions and Future Work
Storage Simulation with the SimGrid Toolkit 9/17
Experimental Setup
TestbedI Grid’5000 experimental platform (http://www.grid5000.fr)
Name Model Interface Size Max. Bandwidthgriffon Hitachi HDP72503 SATA-II 320 GiB 79 MiB/secgranduc Seagate ST9146802SS SAS 146 GiB 84.7 MiB/secedel C400-MTFDDAA SATA/SSD 128 GiB 244.8 MiB/sec
Methodology
I Randomized benchmarks with FIO 2.0.8 managed with execoI Additional dd benchmark on granduc to cope with faulty raid controller
I Synchronous, non-buffered I/O operationsI Independent: From 32kiB to 2GiB with a fixed block size of 32KiBI Concurrent: 1 to 15 operations
I for 10, 50, 100, 500, 1024, and 2048 MiB files
Feel free to check and/or reproduce our results
I Everything is available online (http://dx.doi.org/10.6084/m9.figshare.1175156)
I Engines, raw data, analysis scripts, graphs and article sources
Storage Simulation with the SimGrid Toolkit 10/17
Modeling the Behavior of SATA-II Disks
I Top: Size vs. DurationI Confirms the linearity assumptionI But heteroscedastic behavior
I Variability proportional to sizeI Negligible latency
I Middle: Size vs. BandwidthI Independent of file sizeI Variability ; Random variables
I Bottom: Bandwidth distributionI Single mode but not following any
well-known distribution
Griffon (SATA II), Read Griffon (SATA II), Write
0
50
100
150
0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)
Dur
atio
n (
in s
econ
ds)
67.6 MiB/s 50.4 MiB/s
Griffon (SATA II), Read Griffon (SATA II), Write
0
25
50
75
0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)
Effe
ctiv
e B
andw
idth
(in
MiB
/sec
ond)
Griffon (SATA II), Read Griffon (SATA II), Write
0
100
200
300
400
40 60 80 20 40 60 80Effective Bandwidth (in MiB/second) Distribution
Cou
nt
Properties of derived model
I Linear w.r.t. bandwidth with no latencyI Modeling the bandwidth-dependent variability
I Inject sample distribution and draw random variable upon access
Storage Simulation with the SimGrid Toolkit 11/17
Modeling the Behavior of SSD Disks
I Top: Size vs. DurationI Linear with very little variability
I Middle: Size vs. BandwidthI Far from hdparm resultsI Default ext4 config prevents
getting maximum performance
I Bottom: Bandwidth distributionI Regular but not following any
well-known distribution
Edel (SSD), Read Edel (SSD), Write
0
5
10
15
0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)
Dur
atio
n (
in s
econ
ds)
152.4 MiB/s 132.2 MiB/s
Edel (SSD), Read Edel (SSD), Write
0
50
100
150
0 500 1000 1500 2000 0 500 1000 1500 2000File Size (in MiB)
Effe
ctiv
e B
andw
idth
(in
MiB
/sec
ond)
Edel (SSD), Read Edel (SSD), Write
0
100
200
300
400
130 140 150 160 120 130 140 150Effective Bandwidth (in MiB/second) Distribution
Cou
nt
Properties of derived model (similar to SATA-II)
I Linear w.r.t. bandwidth with no latencyI Modeling the bandwidth-dependent variability
I Inject sample distribution and draw random variable upon access
Storage Simulation with the SimGrid Toolkit 12/17
Modeling Concurrent Accesses
Performance improvements on SSD
I Significant and non-linear for readsI When having more than one write
I Likely because of bad ext4 setup
On SAS and SATA-II
I Fixed bandwidth for writesI Linear decay fro reads
I Explained by arm movements
●
Edel (SSD), Read Edel (SSD), Write
Granduc (SAS), Read Granduc (SAS), Write
Griffon (SATA II), Read Griffon (SATA II), Write
0
50
100
150
200
250
0
50
100
150
200
250
0
25
50
75
0
25
50
75
0
25
50
75
100
0
25
50
75
100
0 5 10 15 0 5 10 15Number of concurrent operations
Agg
rega
ted
Ban
dwid
th (
MiB
/s)
Properties of derived model
I Modify resource capacity as concurrency increases
I Reevaluation each time a transfer begins or ends
I Easy to implement in SimGrid’s kernel
Storage Simulation with the SimGrid Toolkit 13/17
Outline
Introduction
Adding Storage to SimGrid
Checking Modeling Assumptions
Added Value of Using SimGridBuild (and Trust) your own SimulatorDesign (and Plug) your own Storage Model
Conclusions and Future Work
Storage Simulation with the SimGrid Toolkit 14/17
Build (and Trust) your own Simulator
RationaleI Developing a full DES from scratch is counterproductive!
I Already there: open, fast, and scalable kernel
I Better focus on the applicative part of the simulatorI With confidence on lower layers: (in)validated and reliable models
I Leverage versatilityI Mixing concepts 6= stacking features
Examples of added value
I Versatility ; Study more performance drivers w/o oversimplificationI Storage study + network interconnect + CPU heterogeneity
I (In)validation studies ; get realistic results, not just some resultsI Leverage predictive value in performance studies
I Scalable does not necessarily means inaccurateI Both can be obtained simultaneously
Storage Simulation with the SimGrid Toolkit 15/17
Design (and Plug) your own Storage Model
There is more than disks to modelI Tape libraries ; Access time (arm movements) + I/O time
I Combination of models
I Parallel/Distributed File Systems ; Disks + management layerI File system simulator + disks modelsI Model experienced throughput
I Storage on unknown infrastructures (Clouds) ; Black boxesI Model with bandwidth vs. #requests matrices
How to design and plug a new model?
I Designing and plugging a fluid model is pretty straightforwardI Behavior for a single operation + Sharing policy
I Instantiation is more complex (yet crucial)I Benchmarking and analysis procedures available online
I Contributions are welcomed!
Storage Simulation with the SimGrid Toolkit 16/17
Conclusion and Future Work
Conclusions
I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators
I Leveraging a sound and reliable simulation kernel
I Thorough Performance analysis of various types of disksI Derived Fluid models ; tractable, fast, and accurate
I Only a first step . . .
Future Work
I Extend API to handle block storage, handle cache policiesI Integrate other resource models
I Only after thorough (in)validation studies
I Study other performance metrics (e.g., energy consumption)I Welcome contributions from external users
I Now I hope you are convinced to use SimGrid ©
Storage Simulation with the SimGrid Toolkit 17/17
Conclusion and Future Work
Conclusions
I Comprehensive description of storage-related conceptsI Original API to develop SimGrid-based simulators
I Leveraging a sound and reliable simulation kernel
I Thorough Performance analysis of various types of disksI Derived Fluid models ; tractable, fast, and accurate
I Only a first step . . .
Future Work
I Extend API to handle block storage, handle cache policiesI Integrate other resource models
I Only after thorough (in)validation studies
I Study other performance metrics (e.g., energy consumption)I Welcome contributions from external users
I Now I hope you are convinced to use SimGrid ©
Storage Simulation with the SimGrid Toolkit 17/17