In-situ visualization
: integrating visualization with simulation
KISTI Supercomputing Center
Gibeom Gu
Contents
• HPC environment
  • Trends in Top500
  • Hardware organization
• In-situ visualization
  • Co-processing & post-processing
  • In-situ processing
  • Challenges & issues
• Case study
  • ParaView co-processing toolkit
  • libsim library (VisIt)
• Conclusion
Trends in Top500
A sustained exaFLOPS system is expected around 2018.
Exascale hardware in a nutshell
• Number of nodes, network: no dramatic changes expected
• System size / complexity: expected to grow
• Node architecture: expected to undergo dramatic changes
  • Massively parallel
  • Multiple processor types
  • Multiple (programmable) memory types (scratchpad)
  • Generally more heterogeneous / hierarchical than today
• Memory-to-FLOPS ratio: expected to get worse
“Programming environments at the Exascale”, CRAY
Hardware organization
[Figure: a GPU cluster of O(100) nodes (each with a multi-core CPU and multiple GPUs) and a computing system of O(1000) nodes (each with a multi-core CPU and an accelerator), both attached to storage. Labels: "Dump simulation data", "Visualization".]
Hardware organization
[Figure: a GPU cluster of O(100) nodes (each with a multi-core CPU and multiple GPUs) and a computing system of O(1000) nodes (each with a multi-core CPU and an accelerator), both attached to storage. Labels: "Dump simulation data", "Visualization", and "Parallel network lines: O(100) Gbps" on the links.]
Hardware organization
[Figure: a large cluster of O(1000) nodes (each with a multi-core CPU and multiple GPUs) attached to storage.]
• Visualization of datasets ranging in size from 500 billion cells (2 TB per timestep) to 2 trillion cells
• Number of CPU cores involved: 8,000 to 32,000
• Performance:
  • Disk I/O: 2+ min on 16,000 cores
  • Contouring: ~10 sec
  • Rendering: 1 to 10 sec
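The figures above can be turned into a rough bandwidth estimate. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: that the 2 TB timestep is the one read on 16,000 cores, that "TB" means 10^12 bytes, and that "2+ min" is taken as its lower bound of 120 seconds. The variable names are illustrative.

```python
# Back-of-the-envelope estimate from the slide's numbers (assumptions noted inline).
DATASET_BYTES = 2e12           # 2 TB per timestep, decimal terabytes assumed
IO_SECONDS = 120.0             # "2+ min" taken as a lower bound on disk I/O time
CONTOUR_SECONDS = 10.0         # "~10 sec"
RENDER_SECONDS = (1.0, 10.0)   # rendering range from the slide
CORES = 16_000

# Because the I/O time is a lower bound, these bandwidths are upper bounds.
aggregate_bw = DATASET_BYTES / IO_SECONDS   # bytes per second across all cores
per_core_bw = aggregate_bw / CORES          # bytes per second per core

# Share of the total pipeline time spent on disk I/O, for fast and slow rendering.
io_share = [IO_SECONDS / (IO_SECONDS + CONTOUR_SECONDS + r) for r in RENDER_SECONDS]

print(f"aggregate: {aggregate_bw / 1e9:.1f} GB/s, per core: {per_core_bw / 1e6:.2f} MB/s")
print(f"I/O share of pipeline time: {min(io_share):.0%} to {max(io_share):.0%}")
```

Under these assumptions disk I/O accounts for well over four fifths of the time per timestep, which is the motivation for avoiding the disk round trip.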
Large-scale simulation & visualization
• High-performance computers with millions of processing elements
  • Writing results to disk is a bottleneck
  • Major performance hit that prevents interactive data exploration
• Solution: directly visualize the progress of the simulation without saving the entire data to disk
  • Live connection to the simulation code
  • Peek at any memory arrays and mesh structures
  • Confirm the correct simulation setup and iterations
In-situ visualization
[Figure: Simulation & Visualization. Source: A Study of In-Situ Visualization for Petascale Combustion Simulations]
Co-processing & Post-processing
• Post-processing
  • Simulations can take many days (or weeks) to finish
  • Output is dumped to the storage system and studied at a later time
  • Disk I/O is the slowest operation and does not scale well
• Disadvantages
  • Datasets are often under-sampled on disk
  • Many time steps are never archived
Co-processing & Post-processing
• Co-processing
  • Dedicated visualization machine connected to the supercomputer with a fast network
  • Simulation output is directly transferred to the visualization machine for immediate processing
• Disadvantages
  • Visualization hardware is traditionally smaller than supercomputers
In-situ processing
• In-situ processing
  • Enables interactive data analysis and visualization
  • Extracts features of interest for offline analysis
• Most effective way to reduce data output
  • Data reduction
  • Feature extraction
  • Quality assessment
In-situ processing
• Data reduction
  • Subsampling: timestep skipping, lower mesh resolution
  • Quantization: reduced precision
    • (Non-)uniform scalar quantization, vector quantization, ...
  • Transform-based compression: DCT, wavelet, ...
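Two of the reduction techniques listed above, subsampling and uniform scalar quantization, can be sketched in a few lines. This is a minimal illustration on a 1-D field in plain Python, not code from any of the tools discussed; the function names, the stride of 4, and the 8-bit code width are assumptions chosen for the example.

```python
import math

def keep_timestep(step, every):
    """Timestep skipping: keep only every `every`-th step."""
    return step % every == 0

def subsample(values, stride):
    """Lower resolution: keep every `stride`-th sample of a 1-D field."""
    return values[::stride]

def quantize(values, bits=8):
    """Uniform scalar quantization to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct approximate values from the integer codes."""
    return [lo + c * scale for c in codes]

# Hypothetical field: 1000 samples of a sine wave.
field = [math.sin(0.01 * i) for i in range(1000)]
kept_steps = [s for s in range(10) if keep_timestep(s, every=5)]

coarse = subsample(field, stride=4)            # 4x fewer samples
codes, lo, scale = quantize(coarse, bits=8)    # 8 bits instead of 64 per sample
restored = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(coarse, restored))
print(len(field), len(coarse), max_err <= scale / 2 + 1e-9)
```

With uniform quantization the reconstruction error per sample is bounded by half a quantization step, which is what the final check prints.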
In-situ processing
• Feature extraction
  • Feature: a particular physical structure, pattern, or event of interest (vortex, shock, eddy, critical point, etc.)
  • Feature extraction can significantly reduce storage requirements
    • Based on computer vision, image processing, machine learning, etc.
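To make the storage argument concrete, here is a small sketch of one simple kind of feature extraction: thresholding a 2-D scalar field and grouping the cells above the threshold into connected regions, so that only a short descriptor per region needs to be stored. The grid values, the threshold of 0.5, and the descriptor fields are assumptions for illustration; real feature detectors for vortices or shocks are considerably more involved.

```python
from collections import deque

def extract_features(grid, threshold):
    """Return one descriptor per 4-connected region of cells >= threshold."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    features = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from this unvisited cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [y for y, _ in cells]
            xs = [x for _, x in cells]
            features.append({
                "cells": len(cells),
                "bbox": (min(ys), min(xs), max(ys), max(xs)),
                "peak": max(grid[y][x] for y, x in cells),
            })
    return features

# Hypothetical 4x5 scalar field with two regions above the threshold.
grid = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.8, 0.0, 0.0],
    [0.1, 0.7, 0.2, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.9],
]
features = extract_features(grid, threshold=0.5)
for feature in features:
    print(feature)
```

Instead of 20 cell values, the output is two small records, which is the kind of reduction the slide refers to.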
In-situ processing
• Quality assessment
  • Identify and quantify the loss of data quality
  • Full-reference models
    • Mean square error, PSNR, ...
    • Not applicable to large-scale applications
  • Reduced-reference approach
    • Only important statistical information extracted from the original data is used for quality evaluation
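The two families of quality measures above can be contrasted in a short sketch. MSE and PSNR are the standard full-reference definitions; the reduced-reference part is only one possible illustration, assuming the "important statistical information" is a small summary (mean, standard deviation, minimum, maximum) computed from the original data before it is discarded. The function names and sample values are hypothetical.

```python
import math
import statistics

def mse(original, degraded):
    """Full reference: mean square error, needs both complete arrays."""
    return sum((a - b) ** 2 for a, b in zip(original, degraded)) / len(original)

def psnr(original, degraded):
    """Full reference: peak signal-to-noise ratio in dB, peak = value range."""
    peak = max(original) - min(original)
    error = mse(original, degraded)
    return math.inf if error == 0 else 10.0 * math.log10(peak ** 2 / error)

def summary(values):
    """Small statistical summary that can be kept after the data is gone."""
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }

def reduced_reference_gap(reference_summary, degraded):
    """Reduced reference: compare the degraded data against the summary only."""
    current = summary(degraded)
    return {key: abs(current[key] - reference_summary[key]) for key in reference_summary}

original = [0.0, 1.0, 2.0, 3.0]
degraded = [0.0, 1.0, 2.0, 5.0]

print(mse(original, degraded), psnr(original, degraded))
print(reduced_reference_gap(summary(original), degraded))
```

The full-reference functions need the entire original array, which is why they do not carry over to large-scale runs; the reduced-reference check needs only the four numbers in the summary.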
Challenges & Issues
• Visualization must interact directly with the simulation
  • Simulation and visualization codes must share the same data structures to avoid replication
  • Not all simulation codes can share data seamlessly with the visualization codes
• Visualization workload balancing is difficult
  • Parallel visualization is optimized for the visualization algorithm itself
  • Data partition and distribution are dictated by the simulation code
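The point about sharing data structures to avoid replication can be illustrated with a zero-copy view. The sketch below uses Python's standard `array` and `memoryview` as stand-ins for a simulation buffer and a visualization-side handle; the function and variable names are hypothetical and the "visualization" is reduced to a min/max query.

```python
from array import array

# Simulation-owned buffer of eight double-precision values.
pressure = array("d", [0.0] * 8)

# Visualization-side handle: a view onto the same memory, not a copy.
view = memoryview(pressure)

def simulation_step(field, step):
    """Stand-in solver: overwrite the field in place."""
    for i in range(len(field)):
        field[i] = step * 10.0 + i

def visualize(field_view):
    """Stand-in visualization: read the range directly from shared memory."""
    return min(field_view), max(field_view)

simulation_step(pressure, step=1)
lo, hi = visualize(view)   # sees the new values without any transfer
print(lo, hi, view.obj is pressure)
```

The view reflects each in-place update immediately, so memory use does not double; the cost is that the visualization must accept whatever layout the simulation chose.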
Challenges & Issues
• Supercomputer time is expensive
  • Most scientists are reluctant to spend their supercomputer time on visualization calculations
  • Visualization calculations must incur a low cost
Case study
- Co-processing with ParaView
- libsim library (VisIt)
Co-processing with ParaView
• ParaView (Kitware, ASC, SNL, LANL)
  • Open-source, multi-platform data analysis and visualization application
  • Supports distributed computation models to process large data sets
  • Open, flexible, and intuitive user interface
  • Extensible architecture based on open standards
Co-processing with ParaView
• ParaView co-processing toolkit
  • Integrates core data processing with the simulation to enable scalable data analysis
• Adaptor
  • Passes a VTK data set or composite data set
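The adaptor idea can be sketched without ParaView itself. The code below is a hypothetical, pure-Python illustration of the pattern only: every class, function, and field name is invented for the example and none of it is the ParaView co-processing API. A plain dictionary stands in for the VTK data set that a real adaptor would build.

```python
class Adaptor:
    """Hypothetical adaptor: wraps solver arrays as a data set and passes it on."""

    def __init__(self, pipeline, every):
        self.pipeline = pipeline   # analysis callback (stand-in for a vis pipeline)
        self.every = every         # run the pipeline every `every` steps

    def coprocess(self, step, time, coords, temperature):
        if step % self.every != 0:
            return None            # skipped steps pay no conversion cost
        # Stand-in for building a VTK data set from the simulation's arrays.
        dataset = {
            "step": step,
            "time": time,
            "points": coords,
            "point_data": {"temperature": temperature},
        }
        return self.pipeline(dataset)

def max_temperature_pipeline(dataset):
    """Hypothetical pipeline: reduce the data set to a single statistic."""
    values = dataset["point_data"]["temperature"]
    return {"step": dataset["step"], "max_temperature": max(values)}

adaptor = Adaptor(max_temperature_pipeline, every=5)
coords = [0.0, 0.5, 1.0]
results = []
for step in range(11):                       # stand-in simulation loop
    temperature = [step + x for x in coords]
    output = adaptor.coprocess(step, step * 0.1, coords, temperature)
    if output is not None:
        results.append(output)
print(results)
```

The design point is that the simulation calls one function per timestep and everything visualization-specific stays behind the adaptor.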
libsim library (VisIt)
• VisIt (LLNL)
  • Interactive parallel visualization and graphical analysis tool for viewing scientific data on Unix and PC platforms
    • Visualization of scalar, vector, and tensor data sets
    • Qualitative and quantitative visualization and analysis
    • Supports multiple mesh types
    • Parallel and distributed architecture for visualizing terascale data sets
    • Interfaces with C++, Python, and Java
    • Extensible with dynamically loaded plug-ins
libsim library (VisIt)
• libsim library for VisIt
  • Lets VisIt connect to simulation code and operate in situ on its data arrays
    • Add functions to the simulation that let VisIt connect
    • Add functions to the simulation that expose arrays as data VisIt will process
    • Link the simulation with libsim
    • Run the simulation and connect with VisIt
    • The user can then perform any of VisIt’s operations on the simulation data
    • Advance the simulation and watch the plots update
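The workflow in the steps above (expose arrays, accept a connection, respond to commands, advance, update plots) can be mimicked in a small control loop. This is a hypothetical mock written for illustration: `InSituBridge`, its methods, and the command names are invented here and are not libsim's API. A queue of strings stands in for commands arriving from a connected viewer.

```python
from collections import deque

class InSituBridge:
    """Hypothetical stand-in for the in-situ library (not libsim's API)."""

    def __init__(self):
        self.commands = deque()     # commands sent by the (mock) viewer
        self.array_providers = {}   # name -> callback returning live data
        self.plots = []             # record of plot updates

    def expose_array(self, name, provider):
        self.array_providers[name] = provider

    def send(self, command):
        self.commands.append(command)

    def poll(self):
        return self.commands.popleft() if self.commands else None

    def update_plots(self, step):
        # Pull current data through the callbacks; nothing is written to disk.
        snapshot = {name: max(get()) for name, get in self.array_providers.items()}
        self.plots.append((step, snapshot))

class Simulation:
    def __init__(self):
        self.step = 0
        self.density = [1.0, 2.0, 3.0]

    def advance(self):
        self.step += 1
        self.density = [v + 1.0 for v in self.density]

sim = Simulation()
bridge = InSituBridge()
bridge.expose_array("density", lambda: sim.density)   # expose arrays as data
for command in ("step", "step", "halt"):              # mock viewer input
    bridge.send(command)

while True:                                           # simulation main loop
    command = bridge.poll()
    if command is None or command == "halt":
        break
    if command == "step":
        sim.advance()
        bridge.update_plots(sim.step)                 # plots follow the simulation

print(bridge.plots)
```

The callbacks are the key part: the library pulls data from the simulation on demand, so the simulation does not have to push or copy arrays ahead of time.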
• New features
• Species
• Vector, tensor data
• AMR meshes
• CSG meshes
• Users don’t allocate memory
• Additional error checking
• Write in C, Fortran, or Python
• Windows support
[Figure: diagram with components labeled Simulation, libsim, Glue code, and VisIt runtime, exchanging data and commands.]
Conclusion
• Extreme-scale simulation
  • PetaFLOPS systems are available now, and an exaFLOPS system is expected around 2018
  • Storage and network I/O bottleneck, memory bottleneck
• It is desirable to render data in situ for monitoring and steering a simulation
  • Direct interaction between visualization and simulation
  • Implementing the in-situ approach requires advanced visualization techniques
• Several open-source tools are already available
References
① http://www.scidac.gov
② http://www.top500.org
③ Hongfeng Yu, et al., A Study of In-Situ Visualization for Petascale Combustion Simulations, 2009.
④ Jean M. Favre, In-situ Visualization Computational Steering, 2011.
⑤ Brad Chamberlain, Programming models and programming environments at the Exascale, 2010.
⑥ Kwan-Liu Ma, et al., In-Situ Processing and Visualization for Ultra Scale Simulations, Journal of Physics: Conference Series, vol. 78, 2007.
⑦ Brad Whitlock, et al., Parallel In Situ Coupling of Simulation with a Fully Featured Visualization System, EGPGV, 2011.
⑧ Kenneth Moreland, et al., In-Situ visualization with the ParaView Coprocessing Library, SAND 2010-6270P, 2010.
⑨ Jean M. Favre, Simulations go Live, a.k.a. In-situ visualization.