gpu-accelerated design optimization on the cloud
![Page 1: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/1.jpg)
GPU-Accelerated
Design Optimization
on the Cloud
Krishnan Suresh
Associate Professor
Mechanical Engineering
![Page 2: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/2.jpg)
Design Optimization
Reduce weight subject to constraints
(GE/GrabCAD)
A structure subject to loading
Design Optimization
![Page 3: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/3.jpg)
Domains
(OptiStruct)
(Generico)
![Page 4: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/4.jpg)
Big Players, Big $
• ANSYS
• Abaqus
• Altair/OptiStruct
• Nastran
• SolidWorks
• …
$10 billion investment annually (technavio.com)
![Page 5: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/5.jpg)
Design Optimization on the Cloud
Browser-driven design optimization
![Page 6: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/6.jpg)
Design Optimization on the Cloud
Client
• No software/hardware investment
• Pay as you go
• Anywhere, anytime
• …
Service Provider
• Easier maintenance
• Larger market
• …
![Page 7: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/7.jpg)
3D-Printing
Democratization of fabrication
![Page 8: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/8.jpg)
Design Optimization to 3D Printing
Democratization of design
![Page 9: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/9.jpg)
Catch?
![Page 10: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/10.jpg)
Design Optimization
Design Space → Finite Element Analysis (FEA) → Optimal? → (No) Change Design → FEA
10^5 ~ 10^7 dof
Solve Kd = f
K: Sparse SPD
100’s of iterations!
![Page 11: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/11.jpg)
Design Optimization Cost
A naïve port to the cloud will not work!
• OptiStruct (commercial code)
• Xeon E5 2697, 92 GB
• 20 hours!
![Page 12: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/12.jpg)
Cloud Based Design Optimization
Fast Limited Memory FEA
Pareto Optimization
GPU Acceleration
WebGL
![Page 13: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/13.jpg)
Fast Limited Memory FEA
Cloud Based Design Optimization
![Page 14: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/14.jpg)
FEA Bottleneck: Kd = f
Design Space → Finite Element Analysis (FEA) → Optimal? → (No) Change Topology → FEA
10^5 ~ 10^7 dof
Solve Kd = f
K: Sparse SPD
100’s of iterations!
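The solve of Kd = f with a sparse SPD matrix is typically handled by an iterative method such as conjugate gradients (CG), where each iteration costs one product Kd. The following is a minimal CG sketch under stated assumptions: the function name `conjugate_gradient` and the small tridiagonal stand-in matrix are illustrative and not taken from the talk.

```python
import numpy as np

def conjugate_gradient(matvec, f, tol=1e-10, max_iter=1000):
    """Solve K d = f for SPD K, given only a function that computes K @ v."""
    d = np.zeros_like(f, dtype=float)
    r = f - matvec(d)              # initial residual
    p = r.copy()                   # initial search direction
    rs_old = r @ r
    f_norm = np.linalg.norm(f)
    for iteration in range(max_iter):
        if np.sqrt(rs_old) <= tol * f_norm:
            return d, iteration
        Kp = matvec(p)             # the single K-times-vector product per iteration
        alpha = rs_old / (p @ Kp)
        d += alpha * p
        r -= alpha * Kp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return d, max_iter

# Stand-in SPD matrix (assumption): tridiagonal (-1, 2, -1), stored densely for brevity.
n = 50
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)
d, iterations = conjugate_gradient(lambda v: K @ v, f)
```

Because the solver only needs a `matvec` callback, the matrix never has to be stored explicitly, which is the property the assembly-free approach relies on.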
![Page 15: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/15.jpg)
Kd = f (GTC)
• Fine-grained Parallel Preconditioners
• CULA
• MAGMA
• Accelerating Iterative Linear Solvers
• Efficient AMG on Hybrid GPU Clusters
• Preconditioning for Large-Scale Linear Solvers
• …
![Page 16: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/16.jpg)
• Exploit mesh congruency
• Exploit physics behavior
• K constantly changing
• …
Design Optimization
Kd = f
![Page 17: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/17.jpg)
Exploit Mesh Congruency (GTC 2014)
Kd = f
Model → Discretize → Assemble/Solve → Post-process
Mesh-aware SpMV Acceleration: Congruence
![Page 18: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/18.jpg)
Element Congruency
62350 elements
2780 distinct
95.5% congruent
Observation: Large meshes contain many similar elements!
Elements are ‘rigid-body/scaling’ congruent
⇒ Identical element stiffness Ke
Only store Ke of distinct elements
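One way to find the distinct elements is to give every element a signature that is unchanged by the congruence transformation and then group equal signatures. The sketch below is a simplification and not the talk's implementation: it detects translation-only congruence (no rotation or scaling) and assumes consistent node ordering. The function name `distinct_elements` and the toy mesh are hypothetical.

```python
import numpy as np

def distinct_elements(nodes, elements, decimals=9):
    """Group elements whose node coordinates differ only by a translation.

    nodes:    (num_nodes, dim) coordinates
    elements: (num_elems, nodes_per_elem) node indices
    Returns (representatives, elem_to_distinct): one representative element
    index per class, and the class id of every element.
    """
    coords = nodes[elements]                    # (E, nodes_per_elem, dim)
    relative = coords - coords[:, :1, :]        # move each first node to the origin
    # Round to absorb floating-point noise; "+ 0.0" turns -0.0 into 0.0.
    keys = (np.round(relative, decimals) + 0.0).reshape(len(elements), -1)
    _, representatives, elem_to_distinct = np.unique(
        keys, axis=0, return_index=True, return_inverse=True)
    return representatives, elem_to_distinct.ravel()

# Toy mesh (assumption): 4 x 3 quads; the first column is twice as wide as the rest.
x = np.array([0.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.0, 2.0, 3.0])
xs, ys = np.meshgrid(x, y)
nodes = np.column_stack([xs.ravel(), ys.ravel()])
elements = np.array([[j * 5 + i, j * 5 + i + 1, (j + 1) * 5 + i + 1, (j + 1) * 5 + i]
                     for j in range(3) for i in range(4)])

representatives, elem_to_distinct = distinct_elements(nodes, elements)
congruent_fraction = 1.0 - len(representatives) / len(elements)
```

With this definition of `congruent_fraction`, the numbers on the slide are consistent: 1 - 2780/62350 is about 95.5%.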
![Page 19: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/19.jpg)
Implication: SpMV
Kd: Sparse Matrix-Vector Multiplication (SpMV)
Critical operation in ALL iterative solvers
Classic: $Kd \equiv \left( \sum_{i=1}^{N} K_e \right) d$
Assembly-free: $Kd \equiv \sum_{i=1}^{N} \left( K_e d_e \right)$
Only store Ke of distinct elements + Assembly Free
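The assembly-free product can be sketched as a gather, a small dense multiply per element, and a scatter-add. This is an illustrative NumPy version, not the talk's GPU kernel. It assumes one degree of freedom per node (elasticity would have three), and the names `assembly_free_matvec`, `Ke_distinct` and `elem_to_distinct` are hypothetical.

```python
import numpy as np

def assembly_free_matvec(Ke_distinct, elem_to_distinct, elements, d):
    """Return K @ d as the sum over elements of Ke @ d_e, never forming K.

    Assumes one dof per node, so node indices double as dof indices.
    """
    out = np.zeros_like(d, dtype=float)
    for c, Ke in enumerate(Ke_distinct):
        elems = elements[elem_to_distinct == c]   # elements that share this Ke
        f_e = d[elems] @ Ke.T                     # row e holds Ke @ d_e
        np.add.at(out, elems, f_e)                # scatter-add into the result
    return out

# Toy mesh (assumption): a chain of 2-node bar elements, all congruent.
num_elems = 1000
elements = np.column_stack([np.arange(num_elems), np.arange(1, num_elems + 1)])
Ke_distinct = np.array([[[1.0, -1.0], [-1.0, 1.0]]])   # a single distinct Ke
elem_to_distinct = np.zeros(num_elems, dtype=int)

d = np.random.default_rng(0).standard_normal(num_elems + 1)
Kd_free = assembly_free_matvec(Ke_distinct, elem_to_distinct, elements, d)

# Reference: assemble K explicitly (dense here, for checking only).
K = np.zeros((num_elems + 1, num_elems + 1))
for nodes_e in elements:
    K[np.ix_(nodes_e, nodes_e)] += Ke_distinct[0]
Kd_assembled = K @ d
```

Only the distinct element matrices and the connectivity are stored, so memory grows with the number of distinct elements rather than with the number of nonzeros in K.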
![Page 20: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/20.jpg)
Experiment
10^6 elements
1 distinct element
[Bar chart: time for one Kd (SpMV) in msec, for Assembled, AF-CPU and AF-GPU; labeled values 770 and 37; annotations: Naïve Kd, Mesh-aware Kd, CPU]
- Same number of FLOPS!
- Reduced memory
![Page 21: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/21.jpg)
Physics Aware Deflation
Kd = f
Model → Discretize → Assemble/Solve → Post-process
![Page 22: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/22.jpg)
Physics Aware Deflation
$\tilde{K}_0 = W^T K W$
$\tilde{K}_0 \ll K$
Agglomeration/Grouping
Treat each group as rigid body
Deflated CG
Kd = f
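A deflated CG solver can be sketched with the standard projector form: with coarse matrix W^T K W, define Q = W (W^T K W)^{-1} W^T and P = I - K Q, run CG on P K d̂ = P f, and recover d = Q f + P^T d̂. The code below is a minimal dense sketch of that textbook construction, not the talk's implementation. It assumes a scalar 1D problem in which the rigid-body mode of each group is a constant vector, so W has one column of ones per group; the name `deflated_cg` is hypothetical.

```python
import numpy as np

def deflated_cg(K, W, f, tol=1e-10, max_iter=500):
    """Solve K d = f by CG on the deflated system P K d_hat = P f."""
    K0 = W.T @ K @ W                          # small coarse matrix W^T K W

    def coarse_solve(v):                      # Q v = W K0^{-1} W^T v
        return W @ np.linalg.solve(K0, W.T @ v)

    def project(v):                           # P v = v - K Q v
        return v - K @ coarse_solve(v)

    d_hat = np.zeros_like(f, dtype=float)
    r = project(f)
    p = r.copy()
    rs_old = r @ r
    f_norm = np.linalg.norm(f)
    iterations = 0
    while np.sqrt(rs_old) > tol * f_norm and iterations < max_iter:
        PKp = project(K @ p)
        alpha = rs_old / (p @ PKp)
        d_hat += alpha * p
        r -= alpha * PKp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
        iterations += 1
    # d = Q f + P^T d_hat, where P^T v = v - Q K v
    d = coarse_solve(f) + d_hat - coarse_solve(K @ d_hat)
    return d, iterations

# Toy problem (assumption): tridiagonal SPD K, dofs agglomerated in groups of 8.
n, group_size = 64, 8
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
groups = np.arange(n) // group_size
W = np.eye(n // group_size)[groups]           # one column per group, ones on its dofs
f = np.ones(n)
d, iterations = deflated_cg(K, W, f)
```

The coarse matrix here is 8 x 8 against a 64 x 64 fine matrix, which mirrors the size relation stated on the slide.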
![Page 23: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/23.jpg)
Example
3.15 million DOF
![Page 24: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/24.jpg)
Fast Limited Memory FEA
Pareto Optimization
1. Mesh Congruency
2. AF Deflation
Cloud Based Design Optimization
![Page 25: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/25.jpg)
Design Optimization
K Matrix: Constantly changing
• Update K?
• Update deflation?
Assembly-free: $Kd \equiv \sum_{i=1}^{N} \left( K_e d_e \right)$
Skip deleted finite elements
SpMV accelerates further
$\tilde{K}_0 = W^T K W$
$\tilde{K} = \tilde{K}_0 - W^T (\Delta K_e) W$
$\tilde{K} \ll K$
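When an element is deleted, the coarse matrix does not need to be rebuilt: subtracting the deleted element's contribution, projected through W, gives the same result. The sketch below checks this on a toy chain of bar elements. It assumes a one-hot grouping matrix W and one dof per node; the helper names `downdate_coarse_matrix` and `assemble` are hypothetical.

```python
import numpy as np

def downdate_coarse_matrix(K0_tilde, W, Ke, elem_dofs):
    """Return K0_tilde - W^T (dKe) W for one deleted element.

    dKe is nonzero only on the element's dofs, so only those rows of W are used.
    """
    W_e = W[elem_dofs]                        # (dofs per element, number of groups)
    return K0_tilde - W_e.T @ Ke @ W_e

# Toy setup (assumption): 12 bar elements, 13 dofs, dofs agglomerated in groups of 4.
num_elems, group_size = 12, 4
num_dofs = num_elems + 1
Ke = np.array([[1.0, -1.0], [-1.0, 1.0]])
elements = np.column_stack([np.arange(num_elems), np.arange(1, num_elems + 1)])
groups = np.arange(num_dofs) // group_size
W = np.eye(groups.max() + 1)[groups]

def assemble(active):
    """Dense K from the active elements only (reference for checking)."""
    K = np.zeros((num_dofs, num_dofs))
    for e in np.flatnonzero(active):
        K[np.ix_(elements[e], elements[e])] += Ke
    return K

active = np.ones(num_elems, dtype=bool)
K0_tilde = W.T @ assemble(active) @ W         # initial coarse matrix

deleted = 3                                   # element joining dofs 3 and 4 (two groups)
active[deleted] = False
K_tilde = downdate_coarse_matrix(K0_tilde, W, Ke, elements[deleted])
K_tilde_recomputed = W.T @ assemble(active) @ W
```

The update touches only the rows of W that belong to the deleted element, so its cost is independent of the mesh size.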
![Page 26: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/26.jpg)
Example: Design Optimization
• OptiStruct (commercial)
• Xeon E5 2697, 92 GB
• 20 hours!
• Pareto
• I7 4770, 8 GB
• 42 mins
![Page 27: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/27.jpg)
Framework
Fast Limited Memory FEA
GPU Acceleration
Pareto Optimization
![Page 28: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/28.jpg)
Mesh Aware SpMV on GPU
![Page 29: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/29.jpg)
Deflation on GPU
Restriction: $W^T d$
Prolongation: $W\mu$
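When W comes from agglomeration, neither operation needs W as a stored matrix: restriction is a per-group sum and prolongation is a per-dof lookup, which map naturally to a segmented reduction and a gather on a GPU. The NumPy sketch below illustrates this under a simplifying assumption, namely a one-hot W with one column per group (a rigid-body basis in 3D would carry several columns per group). The function names `restrict` and `prolong` are hypothetical.

```python
import numpy as np

def restrict(group_of_dof, d, num_groups):
    """mu = W^T d for one-hot W: sum the entries of d within each group."""
    return np.bincount(group_of_dof, weights=d, minlength=num_groups)

def prolong(group_of_dof, mu):
    """W mu for one-hot W: copy each group's value to all of its dofs."""
    return mu[group_of_dof]

# Toy grouping (assumption): 6 dofs in 3 groups.
group_of_dof = np.array([0, 0, 1, 1, 1, 2])
d = np.arange(6, dtype=float)
mu = restrict(group_of_dof, d, num_groups=3)
d_coarse = prolong(group_of_dof, mu)

# Reference with W stored explicitly, for checking only.
W = np.eye(3)[group_of_dof]
```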
![Page 30: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/30.jpg)
Example: Design Optimization
• OptiStruct (commercial)
• Xeon E5 2697, 92 GB
• 20 hours!
• Pareto
• I7 4770, 8 GB
• 42 mins
• Pareto
• GTX 480, 1.5 GB
• 6 mins
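The three run times can be turned into speedup ratios with simple arithmetic. The snippet below only restates the figures on the slide; the dictionary keys are labels chosen for illustration. The runs use different codes and different hardware, so the ratios are end-to-end comparisons rather than a measure of any single kernel.

```python
# Run times from the slide, converted to minutes.
runs_minutes = {
    "OptiStruct, Xeon E5 2697": 20 * 60,
    "Pareto, I7 4770": 42,
    "Pareto, GTX 480": 6,
}

baseline = runs_minutes["OptiStruct, Xeon E5 2697"]
speedup_vs_optistruct = {name: baseline / minutes
                         for name, minutes in runs_minutes.items()}

# GPU run relative to the CPU run of the same code.
gpu_vs_cpu = runs_minutes["Pareto, I7 4770"] / runs_minutes["Pareto, GTX 480"]
```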
![Page 31: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/31.jpg)
Cloud Based Design Optimization
Fast Limited Memory FEA
Pareto Optimization
GPU Acceleration
WebGL
![Page 32: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/32.jpg)
WebGL & Three.js
WebGL
• JavaScript API for 3D graphics in browsers
• www.khronos.org
• Almost all browsers
Three.js
• Higher-level library
• www.threejs.org
• Almost all browsers
![Page 33: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/33.jpg)
Finally …
![Page 34: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/34.jpg)
A Pilot Service
www.cloudtopopt.com
• Entry level server
  • E3-1270 V3
  • 8 GB
• Limited to 150,000 degrees of freedom
• 300+ users
![Page 35: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/35.jpg)
www.cloudtopopt.com
![Page 36: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/36.jpg)
www.cloudtopopt.com
![Page 37: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/37.jpg)
Plans
• Port to HPC provider
• NSF funding
• Launch startup
www.cloudtopopt.com
![Page 38: GPU-Accelerated Design Optimization on the Cloud](https://reader034.vdocuments.us/reader034/viewer/2022052620/628e50ec8b15824d7b4b1973/html5/thumbnails/38.jpg)
Acknowledgements
Praveen Yadav Shiguang Deng Amir M. Mirzendehdel Chaman Singh Alireza Taheri Bian Xiang
Anirudh Krishnakumar Anirban Niyogi Victor Cavalcanti Cameron Gilanshah Yibo Hu Alex Buehler
Funding
• NSF
• Air Force
• Luvata
• Autodesk
• Sandia National Lab
[email protected] www.cloudtopopt.com