high-performance inverse modeling with reverse … inverse modeling with reverse monte carlo...
TRANSCRIPT
![Page 1: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/1.jpg)
High-Performance Inverse Modelingwith Reverse Monte Carlo Simulations
Abhinav Sarje1
Xiaoye Sherry Li Alexander Hexemer1Computational Research Advanced Light Source
Lawrence Berkeley National Laboratory
09.10.14
International Conference for Parallel Processing 2014, Minneapolis
![Page 2: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/2.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Background and Motivation
• X-ray scattering is an important tool for scientists to probe structuralproperties of materials at nano-scales.
• SAXS, Small Angle X-ray scattering, is a widely used type.
• SAXS is primarily used for non-crystalline/amorphous materials.
• Data challenge: Currently about 25-50 TB / month of raw data fromONE beamline, and growing nearly exponentially. A Synchrotron has10s-100 beamlines.
• Beamline scientists have not had HPC resources available. Need tobridge the gap.
• Many materials are too complex to use sophisticated modelingtechniques.
• Reverse Monte Carlo simulation works well with SAXS data.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 3: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/3.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
The General RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 4: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/4.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Faster Updates of Fourier Transform
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 5: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/5.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
A Scaled RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 6: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/6.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
A Scaled RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 7: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/7.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
A Scaled RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 8: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/8.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
A Scaled RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 9: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/9.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
A Scaled RMC Modeling Algorithm
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 10: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/10.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Autotuned Model Temperature
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 11: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/11.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Distributed-Memory Parallelization over Multiple Tiles
• MPI for distributed memory level.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 12: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/12.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Distributed-Memory Parallelization over Multiple Tiles
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 13: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/13.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Distributed-Memory Parallelization of a Single Tile
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 14: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/14.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Subtile Implementations
• Two main types of kernels:1 Data parallel.2 Reduction.
• Multicore CPU implementation with OpenMP.
• Graphics Processor (GPU) implementation with Nvidia CUDA.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 15: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/15.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Validation with Model Reconstruction
ActualModels
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 16: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/16.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Validation with Model Reconstruction
ActualModels
Start Models
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 17: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/17.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Validation with Model Reconstruction
ActualModels
Start Models
ReconstructedModels
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 18: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/18.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Model Reconstruction
Simulated and Experimental Patterns Computed Model
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 19: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/19.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Environment
• All computations are in double precision.
1 Single node platform:• Dual-socket 2.2 GHz 8-core Intel Xeon E5-2660 (Sandy Bridge)• 16 cores and 32 hardware threads with 2 NUMA regions, and 64 GB RAM• Nvidia K20X (Kepler) graphics processor.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 20: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/20.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Number of iterations
101
102
103
104
105
106
107
108
5 10
20
40
80
160
320
500
Tim
e [m
s]
Number of iterations [x1000]
TotalFFT Update
ReductionMiscMPI
Multicore CPU
101
102
103
104
105
106
107
108
5 10
20
40
80
160
320
500
Tim
e [m
s]
Number of iterations [x1000]
TotalFFT Update
ReductionMiscMPI
Kepler GPU
• Tile size of 512 × 512.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 21: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/21.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Scaling with number of tiles
101
102
103
104
105
106
107
108
1 2 4 8 16
32
64
Tim
e [m
s]
Number of tiles
TotalFFT Update
ReductionMiscMPI
Multicore CPU
101
102
103
104
105
106
107
108
1 2 4 8 16
32
64
Tim
e [m
s]
Number of tiles
TotalFFT Update
ReductionMiscMPI
Kepler GPU
• Total of 10,000 iterations.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 22: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/22.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Scaling with tile size
101
102
103
104
105
106
107
108
64
128
256
512
102
4
204
8
409
6
Tim
e [m
s]
Tile size
TotalFFT Update
ReductionMiscMPI
Multicore CPU
101
102
103
104
105
106
107
108
64
128
256
512
102
4
204
8
409
6
Tim
e [m
s]
Tile size
TotalFFT Update
ReductionMiscMPI
Kepler GPU
• Total of 40,000 iterations.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 23: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/23.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Environment
1 Multicore CPU cluster:• Cray XE6 supercomputer (‘Hopper’ at NERSC/LBNL)• Each node is a dual-socket 2.1 GHz 12-core AMD MagnyCours processors• 24 cores per node with 4 NUMA domains, and 32 GB RAM
2 GPU cluster:• Cray XK7 supercomputer (‘Titan’ at OLCF/ORNL)• Each node is a 2.2 GHz 16-core AMD Opteron 6274 (Interlagos)• 16 cores per node with 2 NUMA domains, and 32 GB RAM• One Nvidia K20X (Kepler) GPU on each node.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 24: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/24.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Weak scaling
104
105
106
107
24
48
96
192
384
768
153
6
307
2
614
4
122
88
Tim
e [m
s]
Number of Cores
t=2p, s=2048t=p, s=2048t=p, s=512
t=p/2, s=512
Hopper
104
105
106
1 2 4 8 16
32
64
128
256
512
102
4
Tim
e [m
s]
Number of Nodes (GPUs)
t=2p, s=2048t=p, s=2048t=2p, s=512t=p, s=512
Titan
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 25: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/25.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Experimental Performance: Strong scaling
104
105
106
107
108
109
1010
24
48
96
192
384
768
153
6
307
2
614
4
122
88
Tim
e [m
s]
Number of Cores
t=1024, s=2048t=1024, s=512t=128, s=2048t=128, s=512
Hopper
103
104
105
106
107
108
1 2 4 8 16
32
64
128
256
512
Tim
e [m
s]
Number of Nodes (GPUs)
t=1024, s=2048t=1024, s=512t=128, s=512
Titan
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 26: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/26.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Conclusions
1 Synchrotron beamlines need HPC for data processing to be efficient.
2 Monte-Carlo and Reverse Monte-Carlo simulations are highly usedmethods in scientific computing, and same parallelization techniquescan be applied.
3 RMC works when all other sophisticated methods fail.
4 Reduction operations are nearly one order of magnitude slower onGPUs compared to multicore CPUs. Better caching and lesssynchronizations on CPUs are two primary factors.
5 Even with large number of reduction kernels in the application, at scalethe overall performance on GPUs is about an order of magnitudebetter than on multicore CPUs!
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 27: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/27.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Future Work
1 Non-binary model reconstruction.
2 3-D structures reconstruction.
3 Time-series fitting.
4 Realtime?
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 28: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/28.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Acknowledgements
• Supported by the Director, Office of Science, of the U.S. Department ofEnergy under Contract No. DE-AC02-05CH11231, and DOE EarlyCareer Research grant awarded to Alexander Hexemer.
• Resource usage at National Energy Research Scientific Computing Center(NERSC) at the Lawrence Berkeley National Laboratory, supported bythe Office of Science of the U.S. Department of Energy under ContractNo. DE-AC02- 05CH11231.
• Resource usage at Oak Ridge Leadership Computing Facility (OLCF) at theOak Ridge National Laboratory, supported by the Office of Science of theU.S. Department of Energy under Contract No. DE-AC05-00OR22725.
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov
![Page 29: High-Performance Inverse Modeling with Reverse … Inverse Modeling with Reverse Monte Carlo Simulations Abhinav Sarje1 Xiaoye Sherry Li Alexander Hexemer 1Computational Research Advanced](https://reader031.vdocuments.us/reader031/viewer/2022021817/5aa2daf47f8b9a84398d9551/html5/thumbnails/29.jpg)
Introduction The RMC Methodology A High-Performance RMC Parallel RMC Performance Conclusions
Thank you!
High-Performance Inverse Modeling with Reverse Monte Carlo Simulations. ICPP’14 asarje @ lbl.gov