© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Tosh Tambe – AWS Strategic Alliances – Design / Engineering & HPC
Judd Kaiser – Program Mgr., Cloud Computing, ANSYS, Inc.
October 2015
CMP202
Engineering Simulation and Analysis in the Cloud
What to Expect from the Session
• Overview of HPC usage in Design / Engineering
Simulation & Analysis (CAE)
• Understanding challenges of HPC users in CAE
• Customer Example: How ANSYS re-engineered their
CAE HPC solution for cloud deployment
Design & Engineering in Manufacturing
Data & Process (PLM)
Conceptual Design
Engineering Design (CAD)
Simulation & Analysis (CAE)
Tooling Design (CAM)
Production
Engineering / Design Simulation (CAE)
Anatomy of a CAE solution
Client Server
Data
Identity / Access
Job Management
Mapping HPC Applications
Coupling
• Tightly Coupled (MPI/HPC)
• Loosely Coupled (Grid/HTC)
Data
• Data-Intensive: Requires high-IOPS storage, or has very large datasets
• Data-Light: Less dependence on high-IOPS, with smaller datasets
Quadrants
• Grid Computing (“Pleasingly parallel”)
• Grid with IO
• Cluster Computing
• Cluster with IO (Data-intensive HPC)
Example workloads
• Financial simulations
• Molecular modeling
• Contextual search
• Alt-coin mining
• Animation rendering
• Semiconductor verification
• Image processing / GIS
• Genomics
• Seismic processing
• High energy physics
• Metagenomics
• Human Brain Project
• Fluid dynamics
• Weather forecasting
• Materials simulations
• Crash simulations
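The two axes of this slide (coupling and data intensity) define four quadrants. A minimal Python sketch of that mapping follows; the function name and the boolean encoding of the axes are assumptions made for illustration, and the assignment of labels to axis combinations is a reading of the slide, not something it states line by line.

```python
def hpc_quadrant(tightly_coupled: bool, data_intensive: bool) -> str:
    """Return the quadrant label for a workload on the slide's two axes.

    tightly_coupled: True for MPI/HPC-style jobs, False for Grid/HTC-style jobs.
    data_intensive:  True when the job needs high-IOPS storage or very large datasets.
    """
    if tightly_coupled:
        return "Cluster with IO" if data_intensive else "Cluster Computing"
    return "Grid with IO" if data_intensive else "Grid Computing"


if __name__ == "__main__":
    # A tightly coupled solver with modest I/O needs lands in "Cluster Computing".
    print(hpc_quadrant(tightly_coupled=True, data_intensive=False))
```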
Design Exploration with CAE
Toyota Motor Corporation’s TPS process
Source: Durward K. Sobek II, Allen C. Ward and Jeffrey K. Liker – sloanreview.mit.edu
Scaling Out Simulation: Platform Strategy
Simulation Trends
• Multiphysics Simulation
• Systems Engineering
• Robust Design
• Simulation Democratization
Desktop Platform
Enterprise Platform
• HPC
• Data Mgmt.
• Process Mgmt.
ANSYS EKM
“Friends don’t let friends build data centers.”
04/14/2014 – Charles Phillips, CEO, Infor
HPC on the cloud
Cloud-Based Strategies for Delivery
• Immediate access (training/demo)
• Short term use
• Burst to the cloud for HPC
• Cloud for short term projects
• End-to-end simulation
Application streaming
Software as a service
ANSYS Enterprise Cloud
Key Engineering Challenges
• HPC compute
• Interactive graphics
• Data management
• Solution deployment
Scale-out HPC
Amazon EC2 instances: Families and Generations
General-purpose: M3, M4, T2
Compute-optimized: CC2, C3, C4
Memory-optimized: M2, CR1, R3
Dense-storage: HS1, D2
I/O-optimized: HI1, I2
GPU: CG1, G2
Micro: T1, T2
Amazon EC2 Instances: Types and Sizes
c4.large
• c: Instance family
• 4: Instance generation
• large: Instance size
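The naming scheme (family letters, generation number, then size after the dot) can be sketched as a small parser. This is an illustration only: the regular expression covers names of the shape used in this deck (c4.large, g2.2xlarge, cc2.8xlarge) and is an assumption, not an official grammar for instance type names.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str       # e.g. "c" for compute-optimized
    generation: int   # e.g. 4
    size: str         # e.g. "large"


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as "c4.large" into family, generation, and size."""
    match = re.fullmatch(r"([a-z]+)(\d+)\.([a-z0-9]+)", name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, size = match.groups()
    return InstanceType(family, int(generation), size)


if __name__ == "__main__":
    print(parse_instance_type("c4.large"))
```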
Performance Factors: CPU
Intel Xeon E5-2670 (Sandy Bridge) CPUs
• Available on M3, CC2, CR1, and G2 instance types
Intel Xeon E5-2680 v2 (Ivy Bridge) CPUs
• Available on C3, R3, and I2 instance types
• 2.8 GHz in C3, Turbo enabled up to 3.6 GHz
• Supports Enhanced Advanced Vector Extensions (AVX) instructions
Intel Xeon E5-2666 v3 (Haswell – AVX2) CPUs
• Available on C4, D2, and M4 instance types
• 2.9 GHz in C4, Turbo enabled up to 3.5 GHz (with Intel Turbo Boost)
• Supports AVX2 instructions
C4: CPU-Optimized Haswell Instance Type
• 2.9 GHz Intel Xeon E5-2666v3 (Haswell) CPU
• Turbo enabled to 3.5 GHz
• Multiple instance sizes with 2, 4, 8, 16, 36 vCPUs
• From 3.75 GiB to 60 GiB RAM
• Optimized for use with Amazon Elastic Block Store (SSD) for
higher IOPS
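As a rough illustration of choosing among the C4 sizes, the sketch below picks the smallest size that satisfies a job's vCPU and memory needs. The slide gives only the vCPU counts and the 3.75 GiB to 60 GiB range; the memory figure attached to each intermediate size is taken from the published C4 specifications, and the function name is hypothetical.

```python
# C4 sizes as (name, vCPUs, memory in GiB), smallest first.
# The per-size memory values come from the published C4 specifications,
# not from the slide, which states only the overall range.
C4_SIZES = [
    ("c4.large", 2, 3.75),
    ("c4.xlarge", 4, 7.5),
    ("c4.2xlarge", 8, 15.0),
    ("c4.4xlarge", 16, 30.0),
    ("c4.8xlarge", 36, 60.0),
]


def smallest_c4_for(vcpus_needed: int, mem_gib_needed: float) -> str:
    """Return the smallest C4 size meeting both requirements."""
    for name, vcpus, mem_gib in C4_SIZES:
        if vcpus >= vcpus_needed and mem_gib >= mem_gib_needed:
            return name
    raise ValueError("no single C4 instance is large enough")


if __name__ == "__main__":
    # A job needing 4 cores but 20 GiB of RAM is memory-bound on C4.
    print(smallest_c4_for(vcpus_needed=4, mem_gib_needed=20))
```

When memory rather than core count drives the choice, as in the usage line above, a memory-optimized family may be a better fit than a larger C4.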
Performance Factors: Networks
AWS proprietary 10 Gb networking
• Highest performance in .8xlarge instance sizes
• Full bisection bandwidth in placement groups
Enhanced networking
• Available on D2, C3, C4, M4, R3, I2
• Over 1M PPS performance, reduced instance-to-instance
latencies, consistent performance
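To show how a placement group enters a launch request, the sketch below only builds the request arguments and makes no AWS call. With boto3, a dictionary of this shape could be passed as `ec2.run_instances(**request)` after the group has been created with `ec2.create_placement_group(GroupName=..., Strategy="cluster")`. The AMI ID, group name, and function name are hypothetical placeholders.

```python
def cluster_launch_request(ami_id: str, group_name: str, node_count: int) -> dict:
    """Build keyword arguments for an EC2 RunInstances call in a placement group.

    All nodes are requested together (MinCount == MaxCount) so that the
    cluster either launches in full inside the placement group or not at all.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "ImageId": ami_id,                       # hypothetical AMI ID supplied by the caller
        "InstanceType": "c4.8xlarge",            # the .8xlarge sizes get the highest network performance
        "MinCount": node_count,
        "MaxCount": node_count,
        "Placement": {"GroupName": group_name},  # placement group created beforehand
    }


if __name__ == "__main__":
    request = cluster_launch_request("ami-00000000", "cae-solver-group", 8)
    print(request)
```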
ANSYS Enterprise Cloud: HPC on AWS
Auto-scaling HPC provisions resources on demand, using machine
configurations optimized for specific workloads.
• Scale on demand
• Match compute instances to workloads
• Optimize steady state (reserved) and on-demand AWS spend
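The split between steady-state (reserved) and on-demand spend comes down to a break-even utilization. The sketch below uses a deliberately simplified model in which a reserved instance is billed at an effective hourly rate for every hour of the period, used or not; the rates in the usage example are hypothetical, not AWS prices.

```python
def break_even_utilization(on_demand_rate: float, reserved_rate: float) -> float:
    """Fraction of the period an instance must run for reserved to cost less.

    reserved_rate is the effective hourly cost of the reservation, paid for
    every hour in the period (a simplifying assumption).
    """
    if on_demand_rate <= 0 or reserved_rate < 0:
        raise ValueError("rates must be positive")
    return reserved_rate / on_demand_rate


def cheaper_option(hours_used: float, hours_in_period: float,
                   on_demand_rate: float, reserved_rate: float) -> str:
    """Compare the two costs for one instance over one billing period."""
    on_demand_cost = hours_used * on_demand_rate
    reserved_cost = hours_in_period * reserved_rate
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Hypothetical rates: 1.00 per hour on demand, 0.60 per hour effective reserved.
    print(break_even_utilization(1.00, 0.60))
    print(cheaper_option(hours_used=500, hours_in_period=720,
                         on_demand_rate=1.00, reserved_rate=0.60))
```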
CycleCloud: Managing Workload Lifecycles
Provisioning
• On-demand
• Spot pricing
• Multi-provider
Configuration
• Chef (Puppet)
• Cluster-Init
Monitoring
• Auto-scaling
• Job tracking
• Error handling
Teardown
• Reporting
• Usage tracking
• Auditing
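The auto-scaling item in this lifecycle can be illustrated with a simple sizing rule: add enough nodes to cover the cores requested by queued jobs, up to a cap. This is a generic sketch and not CycleCloud's actual algorithm; the function and parameter names are assumptions.

```python
import math


def nodes_to_add(queued_cores: int, idle_cores: int,
                 cores_per_node: int, max_new_nodes: int) -> int:
    """Return how many nodes to provision for the current job queue.

    queued_cores:  total cores requested by jobs waiting in the queue
    idle_cores:    cores already running but unused
    cores_per_node: cores on each node of the chosen instance type
    max_new_nodes: upper bound on nodes added in one scaling step
    """
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    shortfall = max(0, queued_cores - idle_cores)
    wanted = math.ceil(shortfall / cores_per_node)
    return min(wanted, max_new_nodes)


if __name__ == "__main__":
    # 100 queued cores, 10 idle, 36-core nodes: a 90-core shortfall.
    print(nodes_to_add(queued_cores=100, idle_cores=10,
                       cores_per_node=36, max_new_nodes=10))
```

The cap keeps a burst of submissions from provisioning an unbounded cluster; teardown of idle nodes would be the mirror-image rule.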
Parallel Solver Performance
Solve real problems…
…and solve as many
designs as you like.
Interactive Graphics
Performance Factors: Accelerators
NVIDIA GPUs!
• For computing and for remote graphics
• In EC2 CG1 and G2 instances
• GPU accelerators augment CPU-
based computing by offloading
specialized processing
• Performance gains depend on
application-level support
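One way to see why gains depend on application-level support is Amdahl's law: only the fraction of runtime the application can actually offload is accelerated. The sketch below applies that standard formula; the fractions and GPU speedup in the usage example are hypothetical, not measured ANSYS results.

```python
def overall_speedup(offloaded_fraction: float, gpu_speedup: float) -> float:
    """Amdahl's law: overall speedup when part of the runtime is accelerated.

    offloaded_fraction: share of the original runtime the application can offload (0..1)
    gpu_speedup:        how much faster the offloaded part runs on the GPU
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    remaining = (1.0 - offloaded_fraction) + offloaded_fraction / gpu_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Hypothetical: half of the solver runtime offloaded, running 10x faster on the GPU.
    print(round(overall_speedup(0.5, 10.0), 2))
```

Even with a fast accelerator, a solver that offloads only half of its work stays below a 2x overall gain.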
ANSYS Enterprise Cloud: Graphics
Remote rendering delivers 3D graphics performance and large memory,
providing a high-end workstation experience in the cloud.
• Rendering on Linux g2.2xlarge
• r3 application server running Windows, up to 244 GB of RAM
Large Memory or GPU Instance? Why Not Both?
Technical User
Thin viewer
DCV protocol over HTTP(S)
DCV Proxy
R3
G2
OpenGL graphics offloading
HW Acceleration and Compression
Large memory models
G2 multiplexing
Leverage latest NVIDIA GRID API
• 3 to 10 times less bandwidth
• Lower latency in pixel capture
Optimized network usage
• Dynamic image compression, with quality boost for still images
Memory rightsizing for each problem / model size
24+ FPS with the most demanding use cases
• Fully interactive collaboration sessions
Workstation class responsiveness even across continental links
Collab.
Data Management
• Amazon RDS for database
• Takes care of data management tasks such as snapshots and
backups
• Amazon EBS for work-in-progress data
• In the cluster file system – match capacity to customer
requirements
• Using SSD – good performance
• Amazon S3 for archive data
• Low cost, high redundancy
• High-quality backup solution
• Multi-part upload allows rapid data transfer
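Multipart upload speeds up transfers because a large object is split into parts that can be sent in parallel. The sketch below only plans the byte ranges and performs no upload. The two limits are S3's documented ones (5 MiB minimum for every part but the last, at most 10,000 parts); the 64 MiB default part size and the function name are assumptions.

```python
import math
from typing import List, Tuple

MIN_PART_SIZE = 5 * 1024 ** 2   # S3 minimum for every part except the last
MAX_PARTS = 10_000              # S3 maximum number of parts per upload


def plan_parts(total_bytes: int, part_size: int = 64 * 1024 ** 2) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover an object of total_bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")
    # Grow the part size when needed so the plan never exceeds MAX_PARTS.
    part_size = max(part_size, math.ceil(total_bytes / MAX_PARTS))
    return [
        (offset, min(part_size, total_bytes - offset))
        for offset in range(0, total_bytes, part_size)
    ]


if __name__ == "__main__":
    mib = 1024 ** 2
    for offset, length in plan_parts(150 * mib):
        print(offset // mib, length // mib)
```

Each planned range can then be handed to a separate worker, so throughput scales with the number of concurrent parts rather than with a single connection.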
Solution Deployment
• We deploy a full solution in a dedicated customer account
• Set of CloudFormation templates
• Use CloudFormation parameters and Lambda features to
customize the deployment in multiple regions
• Also use Cycle to deploy elastic infrastructure using Chef
• Full deployment in less than three hours, as compared
with months for on-premises deployment
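Reusing one template across regions with different parameter values can be sketched as below. The code builds one request per region in the shape that CloudFormation's CreateStack call accepts (with boto3, `cloudformation.create_stack(**request)` on a client for that region) and makes no AWS call itself. The stack name, template URL, and parameter keys are hypothetical, not the ones ANSYS uses.

```python
from typing import Dict, List


def stack_requests(stack_name: str, template_url: str,
                   regions: List[str], parameters: Dict[str, object]) -> Dict[str, dict]:
    """Build one CreateStack request per region from a shared template."""
    # CloudFormation expects parameters as a list of key/value records with string values.
    cfn_parameters = [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(parameters.items())
    ]
    return {
        region: {
            "StackName": f"{stack_name}-{region}",
            "TemplateURL": template_url,
            "Parameters": list(cfn_parameters),
        }
        for region in regions
    }


if __name__ == "__main__":
    requests = stack_requests(
        "enterprise-cloud",                      # hypothetical stack name
        "https://example.com/template.json",     # hypothetical template location
        ["us-east-1", "eu-west-1"],
        {"SolverNodes": 8, "InstanceType": "c4.8xlarge"},  # hypothetical parameter keys
    )
    print(requests["eu-west-1"]["StackName"])
```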
Summary
• Workloads such as CAE benefit from HPC on AWS
• Several AWS instance families are ideal for HPC
• Pick the optimal ones
• HPC benefits from AWS networking services
• Placement groups, enhanced networking
• Focus on good user experience
• Data and user management are key
• Make deployment easy for users
• AWS CloudFormation and AWS Lambda
• Leverage the AWS partner ecosystem for technology building blocks
Remember to complete
your evaluations!
Thank you!