© 2013 ANSYS, Inc. May 5, 2015


Understanding Hardware Selection to Speedup Your FEA Simulations

Wim Slagter, PhD

ANSYS, Inc.


Agenda

• Why Talk About Hardware
• HPC Terminology
• ANSYS Workflow
• Hardware Considerations
• Additional Resources



Most Users Constrained by Hardware

Source: HPC Usage survey with over 1,800 ANSYS respondents


Problem Statement

I am not achieving the performance and throughput I was expecting from my hardware & software

Image courtesy of Intel Corporation


Building A Balanced System Is The Key To Improving Your Experience

If Your System Is Slow, So Are Your Engineers & Analysts

• Processors
• Memory
• Storage
• Networks


What Hardware Configuration to Select?

The right combination of hardware and software leads to maximum efficiency

• CPUs?
• GPUs?
• SMP vs. DMP
• HDD vs. SSD
• Interconnects?
• Clusters?



HPC Hardware Terminology

• Machine 1 (or Node 1): GPU, Processor 1 (or Socket 1), Processor 2 (or Socket 2)
• Interconnect (GigE or InfiniBand)
• Machine N (or Node N): GPU, Processor 1 (or Socket 1), Processor 2 (or Socket 2)


Shared Memory Parallel

• Shared memory parallel (SMP) systems share a single global memory image that may be distributed physically across multiple cores, but is globally addressable.

• OpenMP is the industry standard for this.

Machine 1 (or Node 1)

Processor 1 (or Socket 1)


Distributed Memory Parallel

• Distributed memory parallel processing (DMP) assumes that physical memory for each process is separate from all other processes.

• Parallel processing on such a system requires some form of message passing software to exchange data between the cores.

• MPI (Message Passing Interface) is the industry standard for this.

Machine 1 (or Node 1)

Processor 1 (or Socket 1)
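The difference between the two models can be sketched in a few lines of Python. This is an illustration only, not ANSYS code: both variants are emulated with threads, the function names and the summing workload are assumptions, and a queue stands in for the message-passing layer that MPI would provide. In the shared-memory variant every worker writes into one globally addressable list; in the message-passing variant each worker owns a private chunk and explicitly sends its partial result to the master.

```python
import queue
import threading


def shared_memory_sum(data, workers):
    """SMP style: all workers write into one shared result list."""
    partial = [0] * workers  # globally addressable by every thread

    def work(rank):
        partial[rank] = sum(data[rank::workers])

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def message_passing_sum(data, workers):
    """DMP style: each worker owns private data and sends one message."""
    mailbox = queue.Queue()  # stands in for the message-passing layer

    def work(private_chunk):
        mailbox.put(sum(private_chunk))  # explicit "send" to the master

    chunks = [list(data[r::workers]) for r in range(workers)]  # private copies
    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(workers))  # master "receives"


numbers = list(range(100))
print(shared_memory_sum(numbers, 4), message_passing_sum(numbers, 4))
```

Both functions return the same total; what differs is how the partial results reach the master, which is why interconnect speed matters for DMP once the workers sit on different machines.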



Typical HPC Growth Path

• Desktop User
• Workstation and/or Server Users
• Cluster Users
• Cloud Solution


Remote Visualization

• Ideal for
– remote users submitting jobs from a Windows machine to a Linux cluster, or local users submitting jobs to a Linux cluster
– users who do not have enough power (memory or graphics) on their local workstation to build large meshes or view graphics

• ANSYS 15.0 supports the following remote visualization applications
– NICE Desktop Cloud Visualization (DCV) 2012.2 (Linux server + Linux/Windows client)
– OpenText Exceed onDemand 8 SP3 (Linux server + Linux/Windows client)
– RealVNC Enterprise Edition 5.0.4 with VirtualGL (Linux server + Linux/Windows client)
– Microsoft Remote Desktop (on a Windows cluster)

• Hardware requirements for remote visualization servers:
– GPU-capable video cards
– large amounts of RAM, so that multiple users can run ANSYS applications and pre/post-processing at the same time


ANSYS Remote Solve Manager (RSM)

The Remote Solve Manager (RSM) is a GUI-based job queuing system that distributes simulation tasks to (shared) computing resources.

Compute targets shown: Desktop, Server, Cluster (with 3rd-party scheduler)

RSM enables tasks to be
• run in background mode on the local machine
• sent to a remote compute machine
• broken into a series of jobs for parallel processing across a variety of computers

RSM as a scheduler
• Submits to RSM itself.
• Unit recognition: jobs (e.g. a run of a solver such as CFX, Fluent or Mechanical)

RSM as a transport mechanism
• Submits through RSM to a high-level scheduler such as LSF, PBS Pro, Windows HPC Server 2008 R2 / 2012, and Univa Grid Engine (at R15.0).
• Unit recognition: cores


RSM Usage Scenarios

Submission from a client to a centralized (shared) compute resource, allowing
• background queuing on a centralized machine
• multiple users to share a common, usually large-memory/fast machine (compared to the client machine)

Submission from a client to a centralized (shared) compute resource with a job scheduler, allowing
• background queuing on a centralized machine that submits to a job scheduler (e.g. LSF)
• multiple users to run multi-node jobs on shared compute resources

Submission from a client to multiple (shared) compute resources, allowing
• background queuing on a centralized machine that submits to other machines (compute servers)
• multiple users to share user workstations (often at night) using the RSM “Limit Times for Job Submission” feature


RSM Enhancements at R15.0

• Improved robustness and scalability
• Added support for Univa Grid Engine
• Added support for Mechanical/MAPDL restart
• Non-root users on Linux can now use the RSM wizard
• Enriched support for RSM customization
• Added component override for design point update
• Improved efficiency of design point updates

Example: Parametric Optimization of Intake Manifold (initial vs. optimized design)

Design objectives:
• Equal fresh and exhaust gas mass flow distribution to each cylinder
• Minimize the overall pressure drop

Input parameters:
• Radii of 3 fillets near inlet (8 design points)

Result: ~5.0x speed-up over sequential execution
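The intake manifold figures (8 design points, ~5.0x speed-up over sequential execution) can be turned into a parallel efficiency. The sketch below assumes that all 8 design points were updated concurrently, which the slide does not state; the function name is illustrative.

```python
def parallel_efficiency(speedup, concurrent_jobs):
    """Return the speedup divided by the number of jobs run at the same time."""
    if concurrent_jobs <= 0:
        raise ValueError("concurrent_jobs must be positive")
    return speedup / concurrent_jobs


design_points = 8        # from the slide
observed_speedup = 5.0   # "~5.0x speed-up over sequential execution"

# ASSUMPTION: all 8 design points ran simultaneously.
efficiency = parallel_efficiency(observed_speedup, design_points)
print(f"Parallel efficiency: {efficiency:.1%}")
```

Under that assumption roughly five eighths of the ideal speed-up is realized; the remainder would go to job submission, file transfer and other overheads.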


Guidelines:

• Know your hardware lifecycle.
• Have a goal in mind for what you want to achieve.
• Use licensing productively.
• Use ANSYS-provided processes effectively.




Understanding the effect of clock speed

Generally, ANSYS applications scale with clock frequency

• Cost/performance argues for high clock (but maybe not top bin)

[Figure: Impact of CPU Clock on Application Performance. Improvement due to clock (higher is better) for the ANSYS/FLUENT models eddy_417K, aircraft_2M, turbo_500K, sedan_4M and truck_14M at 2.66 GHz, 2.93 GHz and 3.47 GHz, plotted alongside the clock ratio. Processor: Xeon X5600 Series; Hyper-Threading: OFF; Turbo: ON; active cores: 12/node; memory speed: 1333 MHz. The performance measure is improvement relative to a CPU clock of 2.66 GHz.]

ANSYS DMP benchmarks (8 core)

• Clock effect is highest for sparse solver

Using higher clock speed is always helpful to realize productivity gains
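The clock ratio in the benchmark chart is simply each clock frequency divided by the 2.66 GHz baseline. A small sketch (variable names are illustrative) computes it for the three clocks listed; the ratio is the improvement one would see only if run time depended on nothing but clock frequency, so measured improvements would normally be expected to sit at or below it.

```python
BASELINE_GHZ = 2.66  # reference clock from the benchmark slide


def clock_ratio(clock_ghz, baseline_ghz=BASELINE_GHZ):
    """Return the clock frequency relative to the baseline frequency."""
    return clock_ghz / baseline_ghz


ratios = {ghz: clock_ratio(ghz) for ghz in (2.66, 2.93, 3.47)}
for ghz, ratio in ratios.items():
    print(f"{ghz:.2f} GHz -> clock ratio {ratio:.2f}")
```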


Generation to Generation - ANSYS Mechanical

Current processors are up to 1.98X faster than processors that are 3 years old

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. Results are estimated by Intel using the SPEC benchmark software cited and are provided for informational purposes only. Copyright © 2013, Intel Corporation.

Configuration: ANSYS Mechanical: Xeon 1280v3 (16GB, 4xSSD RAID0), Xeon 1660v2 (32GB, 4xSSD RAID0), Xeon E5-2687W v2 (128GB, 4xSSD RAID0). Intel internal measurements as of August 2013. Refer to backup for additional details.

* Other names and brands may be claimed as the property of others.


Which Intel Processor Might Meet Your Needs - ANSYS Mechanical

Intel® Xeon® processor E5-2687W is up to 3.77X faster than an entry-level workstation with a single processor


Understanding the effect of memory bandwidth - Is 24 Cores Equal to 24 Cores?

• X5570: 3 nodes x (2 sockets x 4 cores) = 24 cores
• X5670: 2 nodes x (2 sockets x 6 cores) = 24 cores


Consider memory per core!
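A rough calculation shows why two 24-core layouts need not perform alike. The sketch below assumes that both processor types offer the same peak memory bandwidth per socket; the 32 GB/s value and all names are assumptions for illustration, not figures from the slides.

```python
# ASSUMPTION: both processor types have the same peak memory bandwidth per
# socket; 32 GB/s is an illustrative value, not a number from the slides.
SOCKET_BANDWIDTH_GB_S = 32.0


def describe(nodes, sockets_per_node, cores_per_socket):
    """Return (total cores, peak memory bandwidth per active core in GB/s)."""
    total_cores = nodes * sockets_per_node * cores_per_socket
    per_core = SOCKET_BANDWIDTH_GB_S / cores_per_socket
    return total_cores, per_core


layout_a = describe(nodes=3, sockets_per_node=2, cores_per_socket=4)
layout_b = describe(nodes=2, sockets_per_node=2, cores_per_socket=6)

for name, (cores, bandwidth) in (("3 x (2 x 4)", layout_a), ("2 x (2 x 6)", layout_b)):
    print(f"{name}: {cores} cores, {bandwidth:.2f} GB/s per core")
```

Both layouts total 24 cores, but with fewer cores sharing each socket the first layout leaves more memory bandwidth for every core.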


Understanding the effect of memory bandwidth - Is 16 Cores Equal to 16 Cores?

• X5570: 2 x (2 x 4) = 16 cores
• X5670: 2 x (2 x 4) = 16 cores

Using fewer cores per node can be helpful to realize productivity gains


Understanding the effect of memory bandwidth- ANSYS Mechanical

Consider memory per core!


Understanding the effect of memory speed

• We can see here the effect of memory speed.
• This has implications for how you build your hardware.
• Some processor types have slower memory speeds by default.
• On other processors, non-optimally filling the memory channels can slow the memory speed.

Using higher memory speed can be helpful to realize productivity gains


Turbo Boost (Intel) / Turbo Core (AMD)- ANSYS Mechanical

• Effect of Turbo Boost on the SMP benchmarks using 1, 2, 4 and 8 out of 8 physical cores of 1 node

• Turbo Boost most efficient for the lower core counts

[Figure: Impact of Turbo Boost - Speed, plotted against # of cores]

Using Turbo Boost / Core can be helpful to realize productivity gains - particularly for lower core counts


Hyper-threading

Hyper-threading is NOT recommended


Recap: Processor Hardware Tips

• Faster cores mean faster solution
• Faster memory means faster solution
• Memory bandwidth is an important factor for (linear) scalability
• Turbo Boost/Turbo Core modes do give some benefit, especially at low core counts per node.
• In general, hyper-threading should not be used because of licensing implications.



Understanding the effect of the interconnect

• Need fast interconnects to feed fast processors
– Two main characteristics for each interconnect: latency and bandwidth
– Distributed ANSYS is highly bandwidth bound

+--------- D I S T R I B U T E D A N S Y S S T A T I S T I C S ------------+
Release: 14.5 Build: UP20120802 Platform: LINUX x64
Date Run: 08/09/2012 Time: 23:07
Processor Model: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
Total number of cores available : 32
Number of physical cores available : 32
Number of cores requested : 4 (Distributed Memory Parallel)
MPI Type: INTELMPI

Core Machine Name Working Directory
----------------------------------------------------
0 hpclnxsmc00 /data1/ansyswork
1 hpclnxsmc00 /data1/ansyswork
2 hpclnxsmc01 /data1/ansyswork
3 hpclnxsmc01 /data1/ansyswork

Latency time from master to core 1 = 1.171 microseconds
Latency time from master to core 2 = 2.251 microseconds
Latency time from master to core 3 = 2.225 microseconds

Communication speed from master to core 1 = 7934.49 MB/sec (same machine)
Communication speed from master to core 2 = 3011.09 MB/sec (QDR InfiniBand)
Communication speed from master to core 3 = 3235.00 MB/sec (QDR InfiniBand)
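The latency and bandwidth figures in the statistics block can be combined in a simple first-order cost model: transfer time = latency + message size / bandwidth. The sketch below uses the reported values; the model itself, the message sizes and the names are assumptions for illustration and ignore effects such as contention and protocol overhead.

```python
def transfer_time_s(message_mb, latency_us, bandwidth_mb_s):
    """First-order estimate of the time to send one message, in seconds."""
    return latency_us * 1e-6 + message_mb / bandwidth_mb_s


# (latency in microseconds, bandwidth in MB/s) from the statistics block
links = {
    "core 1 (same machine)": (1.171, 7934.49),
    "core 2 (QDR InfiniBand)": (2.251, 3011.09),
    "core 3 (QDR InfiniBand)": (2.225, 3235.00),
}

for size_mb in (0.001, 1.0, 100.0):  # illustrative message sizes
    for name, (latency_us, bandwidth) in links.items():
        seconds = transfer_time_s(size_mb, latency_us, bandwidth)
        print(f"{size_mb:>7} MB to {name}: {seconds * 1e3:.4f} ms")
```

For large messages the bandwidth term dominates the total, which is consistent with the statement that Distributed ANSYS is bandwidth bound.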


Understanding the effect of the interconnect- ANSYS Mechanical

V13sp-5 model: turbine geometry; 2,100 K DOF; SOLID187 FEs; static, nonlinear; one iteration; direct sparse solver; Linux cluster (8 cores per node)

[Figure: Interconnect Performance. Rating (runs/day) at 8, 16, 32, 64 and 128 cores for Gigabit Ethernet and DDR InfiniBand.]


Understanding the effect of the interconnect- ANSYS Mechanical

3 million DOF using the direct sparse solver; SOLID95 elements, worst case for a direct solver

[Figure: TrueScale versus GigE, in-core memory. Wall time (secs) at 16, 32, 64 and 128 cores for TrueScale and GigE.]

Using faster interconnects can be helpful to realize productivity gains - particularly at higher core/node counts


Understanding the effect of the interconnect - ANSYS Mechanical

GigE (Gigabit Ethernet)
– 1 Gbits/sec (100 MB/sec)

10 GigE
– 10 Gbits/sec (1000 MB/sec)

Myrinet (Myricom, Inc)
– 2 Gbits/sec (250 MB/sec)
– Myri 10G: 10 Gbits/sec (4th generation Myrinet)

InfiniBand (many vendors/speeds)
– SDR/DDR/QDR
– 1x, 4x, 12x
– http://en.wikipedia.org/wiki/List_of_device_bandwidths

Not recommended!!

Bare minimum!!

RECOMMENDATION: Over 1000 MB/s, especially when running on more than 4 nodes
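The MB/sec figures follow from the line rates by dividing by 8 bits per byte. The sketch below converts the raw rates and compares them with the 1000 MB/s recommendation; the function and variable names are illustrative. Note that the raw conversion gives 125 MB/s for GigE and 1250 MB/s for 10 GigE, so the 100 and 1000 MB/sec quoted on the slide are presumably rounded-down practical figures.

```python
RECOMMENDED_MIN_MB_S = 1000  # from the slide recommendation


def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert a raw line rate in Gbit/s to MB/s (8 bits per byte, 1 GB = 1000 MB)."""
    return gbit_per_s * 1000 / 8


line_rates_gbit = {"GigE": 1, "Myrinet": 2, "10 GigE": 10}

for name, rate in line_rates_gbit.items():
    raw_mb_s = gbit_to_mbyte_per_s(rate)
    verdict = "above" if raw_mb_s > RECOMMENDED_MIN_MB_S else "not above"
    print(f"{name}: {raw_mb_s:.0f} MB/s raw, {verdict} {RECOMMENDED_MIN_MB_S} MB/s")
```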


Recap: Interconnect Tips

10 GigE and InfiniBand are recommended for HPC clusters.
• Currently only InfiniBand is recommended for large clusters.
• QDR should be more than adequate for small to medium clusters; use FDR for large clusters.

For more than 1 node you will see a performance decrease using GigE.
• Mechanical users should not use GigE at all if their jobs span more than one node.



Understanding the effect of I/O- ANSYS Mechanical

SP-5 (in-core) R14.5 Benchmark Results

Rating (jobs/day) by #machines x #cores, with memory used in parentheses:

Storage                    1X1 (29GB)   1X2 (33GB)   1X4 (35.6GB)   1X8 (40.8GB)   1X16 (47.8GB)
4XSSD-RAID-0-SATA-3Gb/s    89           145          180            301            419
2XSSD-RAID-0-SATA-3Gb/s    89           146          180            275            384
SSD-SATA-6Gb/s             88           144          180            283            368
HD(7.2K RPM)-SATA-6Gb/s    88           124          118            95             52
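The benchmark ratings can be compared against the hard-disk result at each core count. The sketch below assumes that the chart's data labels appear in the same order as its legend (4x SSD, 2x SSD, single SSD, hard disk); that mapping is inferred from the extracted text, and the names are illustrative.

```python
configs = ["1X1", "1X2", "1X4", "1X8", "1X16"]

# ASSUMPTION: data labels map to the legend entries in the order listed.
ratings = {
    "4XSSD-RAID-0-SATA-3Gb/s": [89, 145, 180, 301, 419],
    "2XSSD-RAID-0-SATA-3Gb/s": [89, 146, 180, 275, 384],
    "SSD-SATA-6Gb/s": [88, 144, 180, 283, 368],
    "HD(7.2K RPM)-SATA-6Gb/s": [88, 124, 118, 95, 52],
}


def ratio_vs_hdd(storage):
    """Return the jobs/day ratio of a storage option to the hard disk."""
    hdd = ratings["HD(7.2K RPM)-SATA-6Gb/s"]
    return [value / base for value, base in zip(ratings[storage], hdd)]


for storage in ratings:
    cells = ", ".join(
        f"{cfg}: {r:.2f}x" for cfg, r in zip(configs, ratio_vs_hdd(storage))
    )
    print(f"{storage}: {cells}")
```

Under that mapping the storage options are nearly indistinguishable on one core, while the gap to the hard disk widens steadily as more cores are used.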



Is Your Hardware Ready for HPC?- ANSYS Mechanical

[Figure: I/O [Mb/s] (100 to 1200; storage options 1x SAS, 2x SAS, 1x SSD, 2x SSD) versus RAM [Gb] (8, 16, 32, 48, 64, 96, 128) for model sizes of 0.2 Mdof, 2 Mdof, 4 Mdof and > 6 Mdof.]



DMP Outperforming SMP

Model: 6 million degrees of freedom; plasticity, contact, bolt pretension; 4 load steps


DMP: Good Performance at High Core Counts

• Intel Xeon E5-2690 processors (2.9 GHz, 16 cores total)
• 128 GB of RAM

[Figures: two charts plotted against number of cores]
• 10.7 million degrees of freedom; static, linear, structural; 1 load step
• 1 million degrees of freedom; harmonic, linear, structural; 4 frequencies


ANSYS Mechanical 14.5: DMP Enabling Scalability at High Core Counts

[Figure: Solution Scalability. Speedup (axis 0 to 25) over a horizontal axis running from 0 to 64 in steps of 8.]

Minimum time to solution more important than scaling

V14sp-5 model: turbine geometry; 2.1 million DOF; static, nonlinear analysis; 1 loadstep, 7 substeps, 25 equilibrium iterations; 8-node Linux cluster (with 8 cores per node)
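The remark that minimum time to solution matters more than scaling can be illustrated with Amdahl's law, a textbook model in which a fixed fraction of the work runs in parallel. This is not how the benchmark curve was produced, and the 97% parallel fraction below is an assumption chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only part of the work is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


PARALLEL_FRACTION = 0.97  # ASSUMPTION for illustration, not a measured value

for cores in (8, 16, 32, 64):
    speedup = amdahl_speedup(PARALLEL_FRACTION, cores)
    print(f"{cores:>2} cores: speedup {speedup:5.2f}, efficiency {speedup / cores:.0%}")
```

In this model the efficiency falls as cores are added, yet the speedup still rises, so the shortest time to solution is reached at the highest core count even though scaling is far from linear.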

ANSYS Mechanical 15.0: Faster Performance at Higher Core Counts

Improved scaling by an enhanced domain decomposition method. Speedup over R14.5:

Model               8 cores   16 cores   32 cores
Engine (9 MDOF)     1.3x      1.6x       1.8x
Stent (520 KDOF)    1.7x      1.8x       2.2x
Clutch (160 KDOF)   2.7x      3.8x       3.9x
Bracket (45 KDOF)   2.4x      4.0x       5.0x

8-node Linux cluster (with 8 cores and 48 GB of RAM per node, InfiniBand DDR)


ANSYS Mechanical 15.0: Faster Performance at Higher Core Counts

Continually improving core solver rating up to 80 cores

Courtesy of HP


ANSYS Mechanical 15.0

HPC & Solver Technology Improvements

• Improved Scalability of Distributed solver at higher core counts

• NEW Subspace eigen solver supports Shared and Distributed Parallel technology

• NEW MSUP Harmonic method for unsymmetric systems, e.g. vibro-acoustics

Example models: Coupled Acoustic, 1.2 M DOF, Full Harmonic Response; 2.09 MDOFs, first 20 modes



ANSYS Software on NVIDIA GPUs: Some Basics

GPUs are accelerators and can significantly speed up your simulations
• GPUs work hand in hand with CPUs

Most ANSYS GPU acceleration is user-transparent
• Only requirement is to inform ANSYS of how many GPUs to use

Schematic of a CPU with an attached GPU accelerator
• CPU begins/ends job, GPU manages heavy computations


GPU Accelerator Capability- ANSYS Mechanical

Supports majority of ANSYS structural mechanics solvers:

• Covers both sparse direct and PCG iterative solvers

• Only a few minor limitations

Ease of use:

• Requires at least one supported GPU card to be installed

• No rebuild, no additional installation steps

Performance:

• Offers significantly faster time to solution

• Should never slow down your simulation

V14sp-5 Model


Influence of GPU Accelerator on Speedup

[Chart data labels: 5.9x, 3.7x, 2.4x]

ANSYS Mechanical Model – Impeller
• Impeller geometry of ~2M DOF, solid FEs
• Normal modes analysis using cyclic symmetry
• ANSYS Mechanical SMP and Block-Lanczos solver
• 4 cores + GPU = 2.4x speedup vs. 4 cores

ANSYS Mechanical Model – Speaker
• Speaker geometry of ~0.7M DOF, solid FEs
• Vibroacoustic harmonic analysis for one frequency
• ANSYS Mechanical distributed sparse solver
• 4 cores + GPU = 2.7x speedup vs. 4 cores


NVIDIA-GPU Solution Fit for ANSYS Mechanical

GPUs accelerate the solver part of the analysis; consequently, problems with high solver workloads benefit the most from GPUs

• Characterized by both high DOF and high factorization requirements

• Models with solid elements (such as castings) and more than 500K DOF experience good speedups

Better performance when run in DMP mode than in SMP mode

GPU and system memories both play important roles in performance

• Sparse solver:

– Bulkier and/or higher-order FE models are good candidates and will be accelerated

– If the model exceeds 5M DOF, then either add another GPU with 5-6 GB of memory (Tesla K20 or K20X) or use a single GPU with 12 GB memory (Tesla K40 or Quadro K6000).

• PCG/JCG solver:

– The memory saving (MSAVE) option should be turned off to enable GPU acceleration

– Models with lower Level of Difficulty value (Lev_Diff) are better suited for GPUs
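The rules of thumb on this slide can be condensed into a small checklist. The Python sketch below is only an illustration: the function `gpu_fit_notes` and its parameters are hypothetical names, the thresholds (500K DOF, 5M DOF) are copied from the slide, and nothing here calls ANSYS.

```python
def gpu_fit_notes(dof, solid_elements, solver, msave_on=False):
    """Return advisory notes based on the rules of thumb on this slide.

    Hypothetical helper: the names and wording are illustrative only.
    solver is one of "sparse", "pcg", "jcg".
    """
    notes = []
    # Solid-element models above 500K DOF are reported to speed up well.
    if solid_elements and dof > 500_000:
        notes.append("solid model above 500K DOF: good speedup expected")
    else:
        notes.append("small or non-solid model: GPU benefit may be limited")
    # Sparse solver: very large models need more GPU memory.
    if solver == "sparse" and dof > 5_000_000:
        notes.append("above 5M DOF: add a second 5-6 GB GPU or use one 12 GB GPU")
    # PCG/JCG: the memory saving option must be off for GPU use.
    if solver in ("pcg", "jcg") and msave_on:
        notes.append("turn MSAVE off to enable GPU acceleration")
    return notes


for note in gpu_fit_notes(6_000_000, True, "sparse"):
    print(note)
```

The Lev_Diff guidance is left out because the slide gives no numeric threshold for it.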

Page 57

GPU Achievements - ANSYS Mechanical 15.0 Supporting Newest GPUs

[Figure: bar charts of ANSYS Mechanical jobs/day for Tesla K20 and K40; higher is better]

Simulation productivity (with an HPC license)

• 2 CPU cores: 93 jobs/day

• 2 CPU cores + Tesla K20: 324 jobs/day (3.5X)

• 2 CPU cores + Tesla K40: 363 jobs/day (3.9X)

Simulation productivity (with an HPC Pack)

• 8 CPU cores: 275 jobs/day

• 7 CPU cores + Tesla K20: 576 jobs/day (2.1X)

• 7 CPU cores + Tesla K40: 600 jobs/day (2.2X)

V14sp-5 Model

• Turbine geometry

• 2.1 million DOF

• SOLID187 elements

• Static, nonlinear analysis

• One iteration

• Sparse direct solver

Distributed ANSYS Mechanical 15.0 with Intel Xeon E5-2697 v2 2.7 GHz CPU; Tesla K20 GPU and a Tesla K40 GPU with boost clocks.
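The speedup factors on this slide follow directly from the jobs/day values. The short Python sketch below recomputes them; the pairing of each value with its configuration is inferred from the quoted ratios, and the dictionary names are illustrative.

```python
def speedup(baseline_jobs_per_day, accelerated_jobs_per_day):
    """Ratio of accelerated throughput to baseline throughput."""
    return accelerated_jobs_per_day / baseline_jobs_per_day


# Jobs/day values as read from the two charts on this slide.
hpc_license = {"2 cores": 93, "2 cores + K20": 324, "2 cores + K40": 363}
hpc_pack = {"8 cores": 275, "7 cores + K20": 576, "7 cores + K40": 600}

for results, base in ((hpc_license, "2 cores"), (hpc_pack, "8 cores")):
    for label, jobs in results.items():
        factor = speedup(results[base], jobs)
        print(f"{label}: {jobs} jobs/day, {factor:.1f}x vs. {base}")
```

Note that the HPC Pack comparison trades one CPU core for the GPU (8 cores vs. 7 cores + GPU), so both runs use the same number of HPC tasks.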

Page 58

GPU Achievements - ANSYS Mechanical 15.0 Supporting Newest GPUs

[Figure: bar charts of ANSYS Mechanical jobs/day; higher is better]

Simulation productivity (with an HPC license)

• 2 CPU cores: 59 jobs/day

• 2 CPU cores + Tesla K20: 165 jobs/day (2.8X)

Simulation productivity (with an HPC Pack)

• 8 CPU cores: 180 jobs/day

• 7 CPU cores + Tesla K20: 270 jobs/day (1.5X)

V14sp-6 Model

• 4.9 million DOF

• Static, nonlinear analysis

• One iteration

• Sparse direct solver

Distributed ANSYS Mechanical 15.0 with Intel Xeon E5-2697 v2 2.7 GHz CPU; Tesla K40 GPU with boost clocks.

Page 59

GPU Achievements - ANSYS Mechanical 15.0 Supporting Newest GPUs

Distributed ANSYS Mechanical 15.0 on Windows workstation with 16 Intel Xeon E5-2670 cores @ 2.7 GHz; 128 GB RAM; SSD.

Page 60

GPUs can offer significantly faster time to solution

Lower core counts favor a single GPU

Higher core counts favor multiple GPUs

Courtesy of HP

GPU Achievements - ANSYS Mechanical 15.0 Supporting Newest GPUs

Page 61

Intel Xeon Phi coprocessors are now supported

• Use ‘-acc intel’ to activate this capability

• Xeon Phi models 7120, 5110, 3120 are supported

• Multiple cards

Note:

• Supported by sparse solver (symmetric matrices only)

• Linux only (no Windows support yet)

• Only SMP is supported

GPU Achievements - ANSYS Mechanical 15.0 Supporting Xeon Phi
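As a rough illustration of how the accelerator option might be passed on the command line, the Python sketch below assembles an argument list. Only `-acc intel` comes from this slide; the executable name `ansys150`, the `-np` core-count flag, and the helper `build_command` are assumptions for illustration and should be checked against the launcher documentation for the installed release.

```python
import shlex


def build_command(executable, cores, accelerator=None):
    """Assemble a solver command line as a list of arguments.

    Hypothetical helper. "-acc <name>" is taken from the slide;
    "-np <cores>" is an assumed core-count flag.
    """
    args = [executable, "-np", str(cores)]
    if accelerator is not None:
        args += ["-acc", accelerator]
    return args


# "ansys150" is a placeholder executable name, not confirmed by the slide.
cmd = build_command("ansys150", 8, accelerator="intel")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting issues.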

Page 62

Significant speedups can be achieved with Xeon Phi card

• Shared Memory Sparse Solver on Linux

[Figure: "Xeon Phi Acceleration (SMP)" bar chart; y-axis: Speedup (0 to 8); x-axis: 1 core, 2 cores, 4 cores, 8 cores, 16 cores; series: CPU cores only, CPU cores + Xeon Phi; bar labels: 3.3x, 4.3x, 5.1x, 6.0x, 6.8x]

V14sp-5 Model

Turbine geometry

2.1 million DOF

SOLID187 elements

Static, nonlinear analysis

One iteration

Sparse direct solver

GPU Achievements - ANSYS Mechanical 15.0 Supporting Xeon Phi

Linux workstation (16 Intel Xeon E5-2670 cores @ 2.6 GHz, 1 7120A Xeon Phi, 64 GB RAM).

Page 63

GPU Achievements - ANSYS 15.0 License Scheme for GPUs – NEW!

Licensing Examples:

• 1 x ANSYS HPC Pack: Total 8 HPC Tasks (4 GPUs Max)

– Example of Valid Configurations: 6 CPU Cores + 2 GPUs, 4 CPU Cores + 4 GPUs, ...

• 2 x ANSYS HPC Pack: Total 32 HPC Tasks (16 GPUs Max)

– Example of Valid Configurations: 24 CPU Cores + 8 GPUs (Total Use of 2 Compute Nodes), ...

(Applies to all schemes: HPC, HPC Pack, HPC Workgroup, HPC Enterprise)
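The licensing examples can be read as a simple counting rule. The Python sketch below encodes only the two cases shown on this slide and assumes that each CPU core and each GPU consumes one HPC task, which is consistent with the listed examples but is an inference rather than a statement of the license terms. The names `PACK_LIMITS` and `is_valid_configuration` are hypothetical.

```python
# packs -> (total HPC tasks, max GPUs), as listed on the slide.
PACK_LIMITS = {1: (8, 4), 2: (32, 16)}


def is_valid_configuration(packs, cpu_cores, gpus):
    """Check a CPU/GPU mix against the slide's HPC Pack examples.

    Assumption: one HPC task per CPU core and one per GPU.
    """
    if packs not in PACK_LIMITS:
        raise ValueError("only 1 or 2 HPC Packs are covered by the slide")
    total_tasks, max_gpus = PACK_LIMITS[packs]
    return (
        cpu_cores >= 1
        and 0 <= gpus <= max_gpus
        and cpu_cores + gpus <= total_tasks
    )


print(is_valid_configuration(1, 6, 2))   # slide example: 6 CPU cores + 2 GPUs
print(is_valid_configuration(2, 24, 8))  # slide example: 24 CPU cores + 8 GPUs
```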

Page 64

ANSYS 15.0 License Scheme for GPUs - Implication of New HPC Pack Licensing

• With R14.5, you could run up to 8 CPU cores and 1 GPU.

• With R15.0, you can run up to 7 CPU cores and 1 GPU, or 6 CPU cores and 2 GPUs, etc.

Results Courtesy of MicroConsult Engineering, GmbH

Leda

BGA

Page 65

HDD vs. SSD

Maximizing Performance – Putting it Together

The right combination of hardware and software leads to maximum efficiency

SMP vs. DMP

Interconnects?

Clusters?

GPUs?

CPUs?

Page 66

#1 Rule: Avoid waiting for I/O to complete

• Always check if job is I/O bound or compute bound

– Check output file for CPU and Elapsed times

• When Elapsed time >> main thread CPU time → I/O bound

– Consider adding more RAM or a faster hard drive configuration

• When Elapsed time ≈ main thread CPU time → Compute bound

– Consider moving the simulation to a machine with newer, faster processors

– Consider using Distributed ANSYS (DMP) instead of SMP

– Consider running on more CPU cores or possibly using GPU(s)

Total CPU time for main thread : 159.8 seconds

. . .

. . .

Elapsed Time (sec) = 398.000 Date = 03/21/2013

Maximizing Performance – ANSYS Mechanical
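The check described on this slide can be automated by reading the two timing lines from the solver output. The Python sketch below assumes the output contains lines in the format shown above. The function name `classify_job` and the 1.5 ratio threshold are illustrative choices, since the slide only says "much greater than" and "approximately equal".

```python
import re

# Patterns modeled on the two sample output lines shown on the slide.
CPU_RE = re.compile(r"Total CPU time for main thread\s*:\s*([\d.]+)\s*seconds")
ELAPSED_RE = re.compile(r"Elapsed Time \(sec\)\s*=\s*([\d.]+)")


def classify_job(output_text, io_ratio=1.5):
    """Return (label, elapsed/cpu ratio) for a solver output listing.

    io_ratio is an assumed cutoff, not a value given by ANSYS.
    """
    cpu_match = CPU_RE.search(output_text)
    elapsed_match = ELAPSED_RE.search(output_text)
    if cpu_match is None or elapsed_match is None:
        raise ValueError("timing lines not found in output")
    cpu = float(cpu_match.group(1))
    elapsed = float(elapsed_match.group(1))
    ratio = elapsed / cpu
    label = "I/O bound" if ratio > io_ratio else "compute bound"
    return label, ratio


sample = """
Total CPU time for main thread : 159.8 seconds
Elapsed Time (sec) = 398.000 Date = 03/21/2013
"""
label, ratio = classify_job(sample)
print(f"{label} (elapsed/CPU = {ratio:.2f})")
```

For the sample values on the slide, elapsed time is roughly 2.5 times the main thread CPU time, which points to an I/O bound run.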

Page 67

Maximizing Performance – ANSYS Mechanical

How to improve an I/O bound simulation

– First consider adding more RAM

• Always the best option for optimal performance

• Allows the operating system to cache file data in memory

– Next consider improving the I/O configuration

• Need fast hard drives to feed fast processors

– Consider SSDs: higher bandwidths and extremely low seek times

– Consider RAID configurations

RAID 0 – for speed

RAID 1, 5 – for redundancy

RAID 10 – for speed and redundancy

Page 68

Example of an I/O bound simulation

[Figure: "Benefits of SSD and RAM" bar chart; y-axis: Relative Speedup (0 to 7); x-axis: 2 cores, HDD; 8 cores, HDD; 8 cores, SSD; series: 16 GB RAM, 128 GB RAM; bar labels: 0.8x, 2.9x, 2.7x, 5.9x, 5.9x]

Maximizing Performance – ANSYS Mechanical

Adding RAM gives biggest gains & allows good scaling

Lack of RAM and slow HDD ruin scaling

Single SSD helps allow some scaling. Not as helpful as RAM, but cheaper

• 2.1 million DOF

• Nonlinear static analysis

• Direct sparse solver (DSPARSE)

• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total)

• One 10k rpm HDD, one SSD

• Windows 7

Page 69

Maximizing Performance – ANSYS Mechanical

How to improve a compute bound simulation

– First consider using newer, faster processors

• New CPU architecture and faster clock speeds always help

– Next consider using parallel processing

• DMP virtually always recommended over SMP

• More computations performed in parallel with DMP

• Significantly higher speedups achieved using DMP

• DMP can take advantage of all resources on a cluster

• Whole new class of problems can be solved!!

– Last consider using GPU acceleration

• Can help accelerate critical, time-consuming computations

Page 70

Example of a compute bound simulation

Maximizing Performance – ANSYS Mechanical

[Figure: "Benefits of DMP and GPU" bar chart; y-axis: Relative Speedup (0 to 12); x-axis: 2 cores; 8 cores; 8 cores, GPU; series: Xeon x5675, Xeon E5-2670; bar labels: 1.8x, 4.0x, 11.0x]

Maximum performance found by adding GPU

Using newer Xeons gives big gain

Using 8 cores gives faster performance

• 2.1 million DOF

• Nonlinear static analysis

• Direct sparse solver (DSPARSE)

• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total)

• 128 GB RAM

• 1 Tesla K20c

• Windows 7

Page 71

Balanced System for Overall Optimum Performance

Maximizing Performance – ANSYS Mechanical

[Figure: "Balanced Performance" bar chart; y-axis: Relative Speedup (0 to 30); x-axis: 2 cores; 8 cores; 8 cores + GPU; 8 cores + GPU + SSD; series: IO Bound; bar labels: 1.0x, 2.7x, 5.2x, 12.5x]

• 2.1 million DOF

• Nonlinear static analysis

• Direct sparse solver (DSPARSE)

• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total)

• 16 GB RAM

• SSD and SATA disks

• 1 Tesla K20c

• Windows 7

Page 72

Balanced System for Overall Optimum Performance

Maximizing Performance – ANSYS Mechanical

• 2.1 million DOF

• Nonlinear static analysis

• Direct sparse solver (DSPARSE)

• 2 Intel Xeon E5-2670 (2.6 GHz, 16 cores total)

• 128 GB RAM

• SSD and SATA disks

• 1 Tesla K20c

• Windows 7

[Figure: "Balanced Performance" bar chart; y-axis: Relative Speedup (0 to 30); x-axis: 2 cores; 8 cores; 8 cores + GPU; 8 cores + GPU + SSD; series IO Bound, bar labels: 1.0x, 2.7x, 5.2x, 12.5x; series Compute Bound, bar labels: 5.7x, 12.0x, 24.8x, 27.3x]
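To see how much of the gain comes from memory alone, the two series on this chart can be divided configuration by configuration. The Python sketch below assumes that the first four bar labels belong to the IO Bound series (16 GB RAM on the previous slide) and the last four to the Compute Bound series (128 GB RAM). That pairing is inferred from the slide layout, and the variable names are illustrative.

```python
# Relative speedups as read from the chart labels on this slide.
configs = ["2 cores", "8 cores", "8 cores + GPU", "8 cores + GPU + SSD"]
io_bound = [1.0, 2.7, 5.2, 12.5]         # assumed: 16 GB RAM runs
compute_bound = [5.7, 12.0, 24.8, 27.3]  # assumed: 128 GB RAM runs


def ram_gain(io_bound_speedup, compute_bound_speedup):
    """Factor gained by moving the same configuration to more RAM."""
    return compute_bound_speedup / io_bound_speedup


gains = {
    name: round(ram_gain(io, cb), 1)
    for name, io, cb in zip(configs, io_bound, compute_bound)
}
for name, gain in gains.items():
    print(f"{name}: {gain}x from additional RAM")
```

Under these assumptions the SSD matters far less once the job fits in memory: it takes the I/O bound run from 5.2x to 12.5x, but the compute bound run only from 24.8x to 27.3x.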

Page 73

• Why Talking About Hardware

• HPC Terminology

• ANSYS Work-flow

• Hardware Considerations

• Additional resources

Agenda

Page 74

Watch recorded webinars by clicking below:

• Understanding Hardware Selection for ANSYS 15.0

• How to Speed Up ANSYS 15.0 with GPUs

• Intel Technologies Enabling Faster, More Effective Simulation

• Why HPC for ANSYS Mechanical and CFD

Click on webinars related to HPC/IT for more and upcoming ones!

Additional Resources

Page 75

Additional Resources - ANSYS IT Webcast Series

On-demand webinars:

• Understanding Hardware Selection for ANSYS 15.0

• How to Speed Up ANSYS 15.0 with GPUs

• Cloud Hosting of ANSYS: Gompute On-Demand Solutions

• Simplified HPC Clusters for ANSYS Users

• Intel Technologies Enabling Faster, More Effective Simulation

• Accelerating Time-to-Results with Parallel I/O

• Extreme Scalability for High-Fidelity CFD Simulations

• Methodology and Tools for Compute Performance at Any Scale

• Understanding Hardware Selection for Structural Mechanics

• Optimizing Remote Access to Simulation

• Scalable Storage and Data Management for Engineering Simulation

http://www.ansys.com/Support/Platform+Support/IT+Solutions+for+ANSYS+Webcast+Series

Page 76

Additional Resources

ANSYS Platform Support

• http://www.ansys.com/Support/Platform+Support

– Platform Support Policies

– Supported Platforms

– Supported Hardware

– Tested Systems

– ANSYS Benchmarks

Page 77

ANSYS Partner Solutions

– http://www.ansys.com/About+ANSYS/Partner+Programs/HPC+Partners

• Reference configurations

• Performance data

• White papers

• Sales contact points

Performance Data

– http://www.ansys.com/benchmarks

Additional Resources

Page 78

Additional Resources

The Manual

• Sections on best practices and parallel processing for various solvers

• Performance Guide for Mechanical

• Installation walkthroughs for installing the products, parallel processing, licensing and RSM (remote solve manager)

ANSYS Advantage

• Online Magazine

Page 79

• Connect with Me

[email protected]

• Connect with ANSYS, Inc.

– LinkedIn ANSYSInc

– Twitter @ANSYS_Inc

– Facebook ANSYSInc

• Follow our Blog

– ansys-blog.com

Thank You!