Boost Your Productivity Through HPC - for FEA & CFD

Duraivelan Dakshinamoorthy, Ph.D

[email protected]

© 2012 ANSYS, Inc. June 21, 2012


Agenda

• Introduction

• Parallel Scalability

• FEA

• CFD

• HPC advancements


Why HPC?


Why HPC for ANSYS Users: Enhanced insight and productivity

It’s all about getting better insight into product behavior quicker!

HPC enables high-fidelity

• Include details - for reliable results

• “Getting it right the first time”

• Innovate with confidence

HPC enables design exploration & optimization

• Consider multiple design ideas

• Optimize the design

• Ensure performance across range of conditions


HPC – A Software Development Imperative

• Clock Speed – Leveling off

• Core Counts – Growing

• Exploding (GPUs)

• Future performance depends on highly scalable parallel software

Source: http://www.lanl.gov/news/index.php/fuseaction/1663.article/d/20085/id/13277


ANSYS HPC Leadership

A History of HPC Performance

2010 - 2011

► Ideal scaling to 3072 cores (fluids)

► Hybrid parallelization (fluids)

► Network-aware partitioning (fluids)

► Large finite antenna arrays (HFSS 14)

► GPU acceleration with DMP (structures)

Today’s multi-core / many-core hardware evolution makes HPC a software development imperative. ANSYS is committed to maintaining performance leadership.


ANSYS Mechanical Scaling

Impressive speed and scaling improvements in 13.0 release

Focus on resolving bottlenecks in the distributed memory solvers (DANSYS)

Sparse direct solver

• Parallelized equation ordering

• 40% faster w/ updated Intel MKL

Preconditioned Conjugate Gradient (PCG) iterative solver

• Parallelized preconditioning step

Support of unsymmetric eigensolver


ANSYS Mechanical Scaling

[Two charts: scaling versus number of cores]

• 10.7 million degrees of freedom; static, linear, structural; 1 load step

• 1 million degrees of freedom; harmonic, linear, structural; 4 frequencies

• Intel Xeon E5-2690 processors (2.9 GHz, 16 cores total)

• 128 GB of RAM

• SUSE Linux Enterprise Server


ANSYS Mechanical Scaling

6 million degrees of freedom; plasticity, contact, bolt pretension; 4 load steps

1 HPC Pack


What about GPU Computing?

CPUs and GPUs work in a collaborative fashion

Multi-core processors (CPU)

• Typically 4-8 cores

• Powerful, general purpose

Many-core processors (GPU)

• Typically hundreds of cores

• Great for highly parallel code, within memory constraints

[Diagram: CPU and GPU connected by a PCI Express channel]


ANSYS Mechanical SMP – GPU Speedup

[Charts: solver kernel speedups; overall speedups]

• Intel Xeon 5560 processors (2.8 GHz, 8 cores total)

• 32 GB of RAM

• Windows XP SP2 (64-bit)

• Tesla C2050 (ECC ON; WDDM driver)


Distributed ANSYS – GPU Speedup @ 14.0

Cores   GPU   Speedup
2       no    2.25
4       no    4.29
2       yes   11.36
4       yes   11.51

Vibroacoustic harmonic analysis of an audio speaker

• Direct sparse solver

• Quarter-symmetry model with 700K DOF:

– 657424 nodes

– 465798 elements

– higher-order acoustic fluid elements (FLUID220/221)

Distributed ANSYS Results (baseline is 1 core):

• With GPU, ~11x speedup on 2 cores!

• 15-25% faster than SMP with same number of cores

Windows workstation: Two Intel Xeon 5530 processors (2.4 GHz, 8 cores total), 48 GB RAM, NVIDIA Quadro 6000
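
One way to read these results is to separate what the extra cores contribute from what the GPU contributes. The following is a minimal sketch of that arithmetic using only the four Distributed ANSYS speedups listed on this slide (baseline is 1 core); the name `gpu_gain` is illustrative and not part of any ANSYS tool.

```python
# Distributed ANSYS speedups relative to 1 core, keyed by (cores, gpu_enabled).
# Values are the four entries listed on this slide.
speedup = {
    (2, False): 2.25,
    (4, False): 4.29,
    (2, True): 11.36,
    (4, True): 11.51,
}


def gpu_gain(cores):
    """Extra speedup attributable to the GPU at a fixed core count."""
    return speedup[(cores, True)] / speedup[(cores, False)]


for cores in (2, 4):
    print(f"{cores} cores: GPU adds a factor of {gpu_gain(cores):.2f}")
```

With a GPU, going from 2 to 4 cores barely changes the overall speedup (11.36 versus 11.51), so the relative benefit of the GPU is largest at the lower core count.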

[Chart: speedup (0 to 12) on 2 and 4 cores for SMP, DANSYS, SMP+GPU and DANSYS+GPU]


ANSYS Mechanical 14.0 Performance for Tesla C2075

[Bar chart: ANSYS Mechanical times in seconds (lower is better) for 1, 2, 4, 6, 8 and 12 cores, grouped as 1 socket and 2 sockets]

Cores   Xeon 5670 2.93 GHz Westmere (dual socket)   Xeon 5670 2.93 GHz Westmere + Tesla C2075   GPU speedup
1       1848                                        444                                         4.2x
2       1192                                        342                                         3.5x
4       846                                         314                                         2.7x
6       564                                         273                                         2.1x
8       516                                         270                                         1.9x
12      399                                         -                                           -

V13sp-5 model: turbine geometry, 2,100 K DOF, SOLID187 FEs, static, nonlinear, one iteration, direct sparse

Results from HP Z800 Workstation, 2 x Xeon X5670 2.93GHz, 48GB memory, CentOS 5.4 x64; Tesla C2075, CUDA 4.0.17
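
The speedup labels on this chart can be checked against the bar values. The sketch below assumes the pairing of times to core counts that makes the ratios match the printed labels (CPU-only times for 1, 2, 4, 6, 8 and 12 cores; CPU + Tesla C2075 times for 1, 2, 4, 6 and 8 cores); that pairing is inferred from the numbers, not stated on the slide.

```python
# Solve times in seconds. The mapping of bar values to core counts is an
# assumption inferred from the chart labels, not stated explicitly.
cpu_seconds = {1: 1848, 2: 1192, 4: 846, 6: 564, 8: 516, 12: 399}
gpu_seconds = {1: 444, 2: 342, 4: 314, 6: 273, 8: 270}


def gpu_speedup(cores):
    """Ratio of the CPU-only time to the CPU + GPU time at the same core count."""
    return cpu_seconds[cores] / gpu_seconds[cores]


for cores in sorted(gpu_seconds):
    print(f"{cores} core(s): {gpu_speedup(cores):.1f}x")
```

Under this pairing the GPU benefit shrinks as more CPU cores are used, which is consistent with the labels running from 4.2x down to 1.9x.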


ANSYS Mechanical HPC Case Study Enabling Enhanced Productivity

“Parallel processing makes it possible to evaluate five to 10 design iterations per day, enabling Cognity to rapidly improve their design...”

- Rae Younger, Managing Director, Cognity Limited

Application: Stress analysis of hydraulic deflection housing

Software: ANSYS Mechanical

HPC Solution: Critical to meet delivery-time requirements for this project

Business Solution: Ability to complete the design in approx. 70% less time than would have been required


ANSYS Mechanical HPC Case Study Enabling Enhanced Productivity

“By optimizing our solver selection and workstation configuration, and including GPU acceleration, we’ve been able to dramatically reduce turnaround time — from over two days to just an hour. This enables the use of simulation to examine multiple design ideas and gain more value out of our investment in simulation.”

- Berhanu Zerayohannes, Senior Mechanical Engineer, NVIDIA

Application: Deflection and bending of 3-D glasses

Software: ANSYS Mechanical

HPC Solution: From 60 hours per simulation to 47 minutes (77x speedup)

Business Solution: Ability to ensure robust performance of the 3-D glasses via examining multiple design ideas

Copyright 2011 NVIDIA Corporation. All rights reserved.


ANSYS CFX Partitioning

Optimize parallel partitioning in multi-core clusters (CFX) β

• Partitioner determines number of connections between partitions and optimizes partition-to-host assignments

Re-use previous results to initialize calculations on large problems (CFX) β

• Large case interpolation for cases with >~100M nodes

Clean up of coupled partitioning option for multi-domain cases (CFX)

• Eliminates ‘isolated’ partition spots

Dramatically reduced partitioning times for cases with fluid-solid interfaces and very large numbers of regions

[Diagram: partitions P1 to P8 distributed across Compute Node 1 and Compute Node 2]

Partitioning step finds adjacency amongst partitions; partitions with max adjacency are grouped on same compute nodes
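
The idea of grouping adjacent partitions on the same compute node can be illustrated with a small brute-force sketch. This is not the CFX partitioner's algorithm; the connection counts, the function name `best_node_assignment` and the two-node, four-partitions-per-node layout are all hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical number of connections between pairs of partitions.
adjacency = {
    ("P1", "P5"): 3, ("P5", "P3"): 2, ("P3", "P6"): 4, ("P1", "P6"): 1,
    ("P2", "P7"): 3, ("P7", "P4"): 2, ("P4", "P8"): 4, ("P2", "P8"): 1,
    ("P6", "P2"): 1,
}
partitions = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]


def best_node_assignment(adjacency, partitions, per_node):
    """Split partitions over two nodes so the fewest connections cross nodes.

    Exhaustive search; only practical for a handful of partitions.
    """
    def cross_node_connections(group):
        members = set(group)
        return sum(count for (a, b), count in adjacency.items()
                   if (a in members) != (b in members))

    # Pin the first partition to node 1 to avoid testing mirrored splits.
    first, rest = partitions[0], partitions[1:]
    candidates = ((first,) + combo for combo in combinations(rest, per_node - 1))
    best = min(candidates, key=cross_node_connections)
    node1 = sorted(best)
    node2 = sorted(set(partitions) - set(best))
    return node1, node2, cross_node_connections(best)


node1, node2, crossing = best_node_assignment(adjacency, partitions, per_node=4)
print("Compute Node 1:", node1)
print("Compute Node 2:", node2)
print("Connections crossing nodes:", crossing)
```

A production partitioner would use a graph-partitioning heuristic rather than exhaustive search, but the objective (fewest connections between compute nodes) is the same.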


ANSYS CFX Scalability - Not just for large meshes

Problem Description   Name of Definition file       Type               Number of nodes   Turbulence model   Other physical model
Case 1                perf_Airfoil_R12.def          External Aero      9933000           KE                 Compressible, Total Energy
Case 2                perf_AirliftReactor_R12.def   Multiphase         474993            KE                 Inhomogeneous
Case 3                perf_IndyCar_R12.def          External Aero      483460            KE                 Incompressible
Case 4                perf_Internal_R12.def         Multiphase         943175            KE                 Thermal Energy
Case 5                perf_LeMansCar_R12.def        External Aero      1864025           KE                 Thermal Energy
Case 6                perf_Pump_R12.def             Rotating Machine   1305718           KE                 Frozen Rotor


ANSYS CFX Test Case 1 – Airfoil

[Chart: speed-up versus number of cores (0 to 50), measured performance compared with linear scaling]

Mesh Count – 9930000 nodes


ANSYS CFX Test Case 2 - Air Lift Reactor

[Chart: speed-up versus number of cores (0 to 10), measured performance compared with linear scaling]

Mesh Count – 474993 nodes


ANSYS CFX Test Case 3 - Indy Car

[Chart: speed-up versus number of cores (0 to 10), measured performance compared with linear scaling]

Mesh Count – 483460 nodes


ANSYS CFX Test Case 4 – Internal

[Chart: speed-up versus number of cores (0 to 20), measured performance compared with linear scaling]

Mesh Count – 943175 nodes


ANSYS CFX Test Case 5 – Le Mans

[Chart: speed-up versus number of cores (0 to 35), measured performance compared with linear scaling]

Mesh Count – 1864025 nodes


ANSYS CFX Test Case 6 – Pump

[Chart: speed-up versus number of cores (0 to 20), measured performance compared with linear scaling]

Mesh Count – 1305718 nodes


ANSYS Fluent Parallel Scalability

Consistently improved scalability across releases

Sedan, 4M cells

[Charts: solver ratings versus number of cores]

• Intel Harpertown: releases 6.3.0 and 12.0.0

• Xeon X5560 @ 2.80GHz (Nehalem EP): releases 12.0.0 and 13.0.0

• Intel Westmere: releases 13.0.0 and 14.0.0


ANSYS Fluent Parallel Scalability

Consistently improved scalability across releases

Truck, 111M cells

[Charts: solver ratings versus number of cores]

• Intel Harpertown: releases 6.3.0 and 12.0.0

• SGI ICE 8400EX, Intel 6-core: releases 12.0.0 and 13.0.0

• Intel Westmere hex-core 2.93 GHz: releases 13.0.0 and 14.0.0


Hybrid Parallelisation

Hybrid parallel: fast shared memory communication (OpenMP) within a machine to speed up overall solver performance; distributed memory (MPI) between machines


Including Monitors

Scalability with Monitors

• Scalability to higher core counts

• Simulations with monitors including plotting and printing

Hex-core mesh, F1 car, 130 million cells, monitor-enabled

[Chart: example data for scaling with R14 monitors, annotated at 3072 cores]

Monitor support optimizations maintain scalability expectations


ANSYS Fluent Read and Write Times

File I/O has typically been a bottleneck for large grids

Consistently reduced I/O times across releases

[Charts: Time to Read 111M Truck Case; Time to Write 111M Truck Case; Time to Read 200M Cavity Case]


ANSYS Fluent Auto-Partitioning

Auto partitioning is now very quick

Less than 10s to process 800M cells!

Serial pre-partitioning step no longer required

Time to Partition 200M Cavity Case over 768 cores (time in seconds; cavity case, 768 cores)

Cells   200M    400M    600M    800M
Time    2.914   4.706   6.617   9.86

Time to Partition 111M Truck Case (time in seconds; truck_111m)

Cores   192     384     768     1536
Time    5.307   4.542   6.177   8.109
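
As a rough check on the cavity-case figures (2.914 s, 4.706 s, 6.617 s and 9.86 s for 200M, 400M, 600M and 800M cells on 768 cores), the sketch below converts them to a partitioning throughput. The helper name is illustrative only.

```python
# Partitioning time in seconds for the cavity case on 768 cores,
# keyed by mesh size in millions of cells (values from this slide).
partition_seconds = {200: 2.914, 400: 4.706, 600: 6.617, 800: 9.86}


def million_cells_per_second(mesh_millions):
    """Partitioning throughput for a given mesh size."""
    return mesh_millions / partition_seconds[mesh_millions]


for mesh in sorted(partition_seconds):
    print(f"{mesh}M cells: {million_cells_per_second(mesh):.1f} million cells/s")

# Quadrupling the mesh from 200M to 800M cells costs less than 4x the time.
growth = partition_seconds[800] / partition_seconds[200]
print(f"Time growth for 4x the cells: {growth:.2f}x")
```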


Fluids I/O

Fluent and CFX use a “singular” file structure

• This means there is one global set of files and every process writes to them.

This methodology falls down at a large number of cores where the file I/O becomes a bottleneck

• CFX deals with this by using inline compression (cdat)

• Fluent has both inline compression (cdat) and at v12.x introduced support for a Parallel File (pdat).

Parallel file system support in ANSYS Fluent

– ~10x - 20x speedup for data write

– Eliminates scaling bottleneck for data intensive simulations on large clusters (e.g., transient flows)

[Diagram: Serial I/O versus Parallel I/O in ANSYS Fluent]


Fluids I/O

Asynchronous I/O for Linux Fluent

Mesh   File   Location   Async I/O   Time
15M    Cas    NFS        OFF         217s
15M    Cas    NFS        ON          62s
15M    Dat    NFS        OFF         113s
15M    Dat    NFS        ON          8s
30M    Cas    NFS        OFF         207s
30M    Cas    NFS        ON          75s
30M    Dat    NFS        OFF         144s
30M    Dat    NFS        ON          10s

Total write time 3-5x quicker over NFS

Even larger speed-ups on bigger cases and local disk (up to 10x)
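
The "3-5x" figure can be reproduced by adding the case (Cas) and data (Dat) write times for each mesh. The sketch below uses only the eight timings listed on this slide; the function name `total_speedup` is illustrative.

```python
# Write times in seconds over NFS, keyed by (mesh, file type),
# with asynchronous I/O switched OFF and ON (values from this slide).
WRITE_SECONDS = {
    ("15M", "Cas"): {"OFF": 217, "ON": 62},
    ("15M", "Dat"): {"OFF": 113, "ON": 8},
    ("30M", "Cas"): {"OFF": 207, "ON": 75},
    ("30M", "Dat"): {"OFF": 144, "ON": 10},
}


def total_speedup(mesh):
    """Speedup of the combined Cas + Dat write time when async I/O is ON."""
    off = sum(t["OFF"] for (m, _), t in WRITE_SECONDS.items() if m == mesh)
    on = sum(t["ON"] for (m, _), t in WRITE_SECONDS.items() if m == mesh)
    return off / on


for mesh in ("15M", "30M"):
    print(f"{mesh}: total write time {total_speedup(mesh):.2f}x quicker")
```

Both meshes land between 4x and 5x for the combined write, with the data file benefiting far more than the case file.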


Parallel Scaling for Complex Physics: Innovation in Fluent 14.5 for Discrete Phase Performance

[Figure: contours of particle weights, showing the original partitions and the new partition boundary]

Time in seconds (100x100 mesh, 30000 particles, 2 machines, 8 cores each)

                   MPI     Hybrid
Original           180.7   78.1
Banded Partition   159.5   47.6

Hybrid particle tracking balances load within a machine, while the enhanced partitioning spreads it across machines

Time in seconds (DEM, 1 time step, 44k cells, 600k particles, 12x Intel Westmere 2.93GHz)

       Original   Hybrid
Time   33.47      8.89


HPC Fluids Demonstration Case

The 50:50:50 Method

• 50 design points in the design space (EXTENT)

• 50 million cells used in CFD simulation of each design point (ACCURACY)

• 50 hours total elapsed time to simulate all the design points (SPEED)

“One – Click” – Entire design space is simulated and post-processed completely automatically after the initial baseline case setup

• To Demonstrate 50:50:50 Method

– Volvo XC60 vehicle model

– Four shape parameters

– RBF Morph (Integrated within FLUENT) to define shape parameters

– Grid morphing in parallel

• ANSYS WorkBench (Framework to Automate Process)

– To drive shape parameters

– To create DOE

– To perform Goal Driven Optimization


HPC Fluids Demonstration Case

[Flowchart, STEP 1 to STEP 5, with the following boxes:]

• Prepare Meshed Model for Baseline Vehicle Shape

• CFD Solver Setup, Define Shape Parameters

• Generate DOE using Input Shape Parameters

• Collate Data, Perform Optimization

• Morph Vehicle Shape

• Run CFD Simulation

Mesh Morpher Integrated within FLUENT

Solver (FLUENT), Optimizer (DX) & Post Processor (CFD Post) Integrated within ANSYS WorkBench

[Images of morphed vehicle shapes: Boat tail angle; Long roof drop angle; Front Spoiler]

HPC Fluids Demonstration Case

Task times in seconds:

Task                                                   768 Cores   384 Cores   288 Cores   240 Cores   144 Cores

Baseline Case (i.e. Design Point 1)
  Read volume mesh of baseline case into the CFD
  solver and apply solver settings                     225         340         365         481         228
  CFD Solution                                         6979        11153       14409       17256       27246
  Writing CFD data file                                681         538         558         600         532

Each Subsequent Design Point
  Morph vehicle shape                                  84          59          65          69          100
  CFD Solution                                         1284        1754        2208        2630        4100
  Writing CFD data file                                734         559         572         621         532

Total Run Time (Wall Clock) Needed for
All 50 Design Points (Hours)                           30.80       35.63       42.98       50.28       72.19
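
The total run times follow from the per-task timings if the baseline case is run once and the "each subsequent design point" tasks are repeated for the remaining 49 design points. That reading is an assumption about how the totals were built, but the sketch below shows it reproduces the reported hours to within 0.01 h for every core count.

```python
# Task times in seconds from the table; columns are 768, 384, 288, 240, 144 cores.
CORES = (768, 384, 288, 240, 144)
BASELINE = {
    "read mesh and apply settings": (225, 340, 365, 481, 228),
    "CFD solution": (6979, 11153, 14409, 17256, 27246),
    "write data file": (681, 538, 558, 600, 532),
}
SUBSEQUENT = {
    "morph vehicle shape": (84, 59, 65, 69, 100),
    "CFD solution": (1284, 1754, 2208, 2630, 4100),
    "write data file": (734, 559, 572, 621, 532),
}
REPORTED_HOURS = (30.80, 35.63, 42.98, 50.28, 72.19)


def total_hours(column, design_points=50):
    """Wall-clock hours: one baseline run plus (design_points - 1) morphed runs."""
    first = sum(times[column] for times in BASELINE.values())
    each = sum(times[column] for times in SUBSEQUENT.values())
    return (first + (design_points - 1) * each) / 3600.0


for i, n in enumerate(CORES):
    print(f"{n} cores: {total_hours(i):.2f} h (reported {REPORTED_HOURS[i]:.2f} h)")
```

On this reading, 240 cores is roughly the smallest configuration in the table that meets the 50-hour target of the 50:50:50 method.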


HPC Fluids Demonstration Case

Compute Cluster Details

1. Intel’s Endeavor Cluster

2. Intel Xeon X5670 (dual socket)

3. Clock speed 2.93 GHz

4. Six cores per socket (12 cores per node)

5. 24 GB RAM @ 1333 MHz, SMT ON, Turbo ON

6. QDR Infiniband

7. RHEL Server Release 6.1


ANSYS Fluids HPC Case Study Enabling Enhanced Insight

“Petrobras relies on ANSYS software for its superior parallel scalability, together with advanced multiphase models and dynamic meshing.”

- Carlos Alberto Capela Moraes, Technical Consultant, CENPES (Petrobras R&D Center)

Application: Transient multiphase simulation of sand transportation

Software: ANSYS CFD

HPC Solution: Consider more detailed, accurate and complete 3-D flow assurance simulations than ever before

Business Solution: Ability to understand critical scenarios and complex operations of upstream processing systems


ANSYS Fluids HPC Case Study Enabling Enhanced Insight

“ANSYS HPC technology has ensured that we can test and implement changes quickly and competitively. This allows us to turn around simulation results for multiple designs between race qualifications on Fridays and Saturdays...”

- Nathan Sykes, CFD Team Leader, Red Bull Racing

Application: Car aerodynamics, braking, exhaust systems

Software: ANSYS Fluent

HPC Solution: Ability to obtain high-fidelity insight into car performance in shorter turnaround times

Business Solution: ANSYS HPC crucial to Red Bull Racing Formula One Championships in 2010 & 2011


ANSYS Fluids HPC Case Study Enabling Enhanced Insight

“ANSYS HPC technology is enabling Cummins to use larger models with greater geometric details and more-realistic treatment of physical phenomena...”

- John Horsley, Engineer, Cummins Turbo Technologies

Application: Turbochargers of diesel engines

Software: ANSYS CFX

HPC Solution: High-fidelity results; 12 times faster; ability to simultaneously evaluate 5 full-stage compressor or turbine designs in a few hours

Business Solution: Ability to bring new products to market in less time while substantially reducing expenses


HPC for High Fidelity CFD

EURO/CFD

• Model sizes up to 200M cells (ANSYS Fluent)

• 2011 cluster of 700 cores

– 64-256 cores per simulation

[Chart: increase of spatial-temporal accuracy and complexity of physical phenomenon, with model sizes of 3 million cells (6 days), 10 million (5 days), 25 million (4 days) and 50 million (2 days)]

Physical phenomena shown: Supersonic, Multiphase, Radiation, Compressibility, Conduction/Convection, Transient, Optimisation / DOE, Dynamic Mesh, LES, Combustion, Aeroacoustic, Fluid Structure Interaction


HPC Advancements in Licensing

[Chart: parallel enabled (cores) versus packs per simulation]

Packs per Simulation       1   2    3     4     5
Parallel Enabled (Cores)   8   32   128   512   2048

Scalable licensing

• ANSYS HPC (per-process)

• ANSYS HPC Pack

– Each simulation consumes one or more Packs

– Parallel enabled increases quickly with added Packs

• ANSYS HPC Workgroup

– 128 to 2048 parallel shared across any number of simulations on a single server

• ANSYS HPC Enterprise

– Similar to HPC Workgroup but deploy and use anywhere in the world

Single solution for multiphysics and any level of fidelity
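
The core counts shown on this slide for one to five HPC Packs (8, 32, 128, 512, 2048) fit the pattern 2 x 4^n. This is only an observation from those five values, not a statement of licensing terms; the function name `cores_enabled` is illustrative.

```python
def cores_enabled(packs):
    """Cores enabled by `packs` HPC Packs on one simulation.

    Assumes the 2 * 4**n pattern that fits the five values on the slide.
    """
    if packs < 1:
        raise ValueError("at least one pack is required")
    return 2 * 4 ** packs


for packs in range(1, 6):
    print(f"{packs} pack(s): {cores_enabled(packs)} cores")
```

Each added pack multiplies the enabled core count by four, which is what "parallel enabled increases quickly with added Packs" refers to.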


HPC Advancements Through Partnerships

• ANSYS maintains close technical collaboration with the leaders in HPC

• This mutual commitment ensures that you get the most possible value from your overall HPC investment

• Some current examples:

– Optimized performance on multicore processors from Intel, with R&D focused on Intel’s Many Integrated Core (MIC)

• Over 60% performance boost for the latest Intel® Xeon® E5-2600 processor (Sandy Bridge) family compared to previous Intel (Westmere) generation

– GPU computing accelerates ANSYS Mechanical today, with very active R&D engagement with NVIDIA across full portfolio

– ANSYS and HP – Tuning Performance and Productivity at any scale

– ANSYS and IBM – Optimized cluster and storage architectures for ANSYS

– ANSYS and Cray – Support for extreme scalability of ANSYS CFD on the Cray XE, up to 1000’s of cores


HPC Advancements in Processor Technology - ANSYS Mechanical Parallel Scalability on Xeon E5

Intelligent Performance for structural/thermal simulation

[Chart: ANSYS Mechanical 14 relative performance (higher is better): 6 core Xeon X5675 = 1, 8 core E5-2680 = 1.63]

The memory capacity of the Intel® Xeon® processor E5-2600 product family allows even the largest of workloads to be handled in-core, significantly improving run times.

Intel® AVX delivers a significant speedup in factoring and solving the assembled FE equation matrix, which is floating-point intensive.

Users will usually see a significant reduction in simulation runtimes, even for the largest models, due to the additional cores and larger memory capacity. This allows customers to run larger, higher-fidelity models in more iterations within set time and cost constraints to improve their product quality and enable innovation.

Data Source: Intel approved/published results as of February 1, 2012.


Leading Performance for fluid flow simulation

The memory bandwidth of the Intel® Xeon® processor E5-2600 product family allows excellent scalability and per core performance.

Support for higher speed memory DIMMs, added on-core capacity for memory loads, as well as a larger cache size are key to extending performance and scalability.

Higher memory bandwidth has a pronounced impact with fully coupled solver applications, which are the most memory intensive. Sedan_4m is shown as an example of fully coupled solver performance. Truck_14m is representative of segregated solver performance. The horizontal line at 1.63 represents the geomean speedup over 6 standard benchmarks.
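For readers unfamiliar with the term, a geomean (geometric mean) of speedups can be computed as sketched below. This is a minimal illustration, not the procedure used to produce the slide: the function name `geomean` is an assumption, and only the two per-benchmark values charted on this slide (Sedan_4m and Truck_14m) are used as input. The slide's geomean covers six benchmarks whose individual values are not given here, so this two-value result is not expected to match it.

```python
import math


def geomean(values):
    """Geometric mean of positive ratios, computed via logs for stability."""
    vals = list(values)
    if not vals:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in vals):
        raise ValueError("speedups must be positive")
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


# Only the two speedups charted on this slide; the other four benchmark
# values behind the slide's geomean are not in the deck.
charted_speedups = {"Sedan_4m": 1.86, "Truck_14m": 1.53}
print(f"geomean of charted cases: {geomean(charted_speedups.values()):.3f}")
```

The geometric mean is the usual choice for averaging ratios such as speedups, because it treats a 2x gain and a 0.5x loss as cancelling out, which an arithmetic mean does not.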

Data Source: Approved/published results as of February 1, 2012.

ANSYS Fluent 14 Relative Performance (higher is better)

Sedan_4m: 6-core Xeon X5675 = 1; 8-core Xeon E5-2680 = 1.86

Truck_14m: 6-core Xeon X5675 = 1; 8-core Xeon E5-2680 = 1.53

Each chart also marks the geomean.

HPC Advancements in Processor Technology - ANSYS Fluent Parallel Scalability on Xeon E5


Good scalability and more operations per clock make obtaining results on Intel® Xeon® E5 1.68x faster than on Intel Xeon 5600 platforms. For the end user, this means faster turnaround or solving larger tasks with the same resources, along with lower TCO.
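To separate the effect of the extra cores from per-core improvements, the quoted speedup can be divided by the core-count ratio. The sketch below does this for the 1.68x figure and the 6-core and 8-core parts named on this slide. It assumes both platforms were run with all cores busy and the same number of sockets, and it ignores clock-frequency differences; the function name `per_core_gain` is hypothetical.

```python
def per_core_gain(speedup, old_cores, new_cores):
    """Speedup that remains after normalizing by the increase in core count."""
    if speedup <= 0 or old_cores <= 0 or new_cores <= 0:
        raise ValueError("speedup and core counts must be positive")
    core_ratio = new_cores / old_cores
    return speedup / core_ratio


# 1.68x overall, going from a 6-core to an 8-core processor (slide values).
gain = per_core_gain(1.68, 6, 8)
print(f"core-count ratio: {8 / 6:.2f}, per-core gain: {gain:.2f}")
```

Under these assumptions, cores alone would account for about 1.33x, leaving roughly 1.26x attributable to per-core improvements.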

ANSYS CFX 14 Relative Performance (higher is better)

LeMansCar: 6-core Xeon X5675 = 1; 8-core Xeon E5-2680 = 1.76

Stage Compressor: 6-core Xeon X5675 = 1; 8-core Xeon E5-2680 = 1.49

Each chart also marks the geomean.

HPC Advancements in Processor Technology - ANSYS CFX Parallel Scalability on Xeon E5

Data Source: Approved/published results as of February 1, 2012.


Upcoming HPC Advancements

• ANSYS focus on HPC is ongoing…

– Architecting for extreme scalability
  • Performance at 100’s and 10,000’s of cores for FEA and CFD, respectively
  • Innovative mechanical solvers: Multilevel PCG, 2D parallel DSPARSE fronts
  • GP-GPUs for radiation, UDFs, DEM and possibly other CFD solvers
  • Hybrid distributed/shared memory and vector processing paradigms

– Scalability across all components and full simulation process
  • Meshing, setup, solver, I/O, visualization, optimization…
  • Distributed parallel meshing integrated with solver
  • Parallel for linear dynamics, including mode superposition-based analyses

– Ongoing optimization and performance tuning
  • Dynamic load balancing; optimized resource mapping, compiler evaluation

– Usability
  • Multi-component parallel execution environment, job scheduler support
  • Hardware fault tolerance, system performance tracking and debugging

• All to achieve next-generation capability / performance!


Information Available

ANSYS HPC Partner Solutions – http://www.ansys.com/About+ANSYS/Partner+Programs/Strategic+HPC+Partnerships

– Reference configurations

• Performance data

• White papers

• Sales contact points

Performance Data – http://www.ansys.com/benchmarks


Information Available

ANSYS Platform Support • http://www.ansys.com/Support/Platform+Support

– Platform Support Policies

– Supported Platforms

– Supported Hardware

– Tested systems

ANSYS Resource Library • http://www.ansys.com/demoroom/

– Search for HPC!

ANSYS Advantage • Online Magazine


Information Available

Customer Portal • http://www1.ansys.com/customer/

– Knowledge Resources

– Installation and Systems FAQs

Customer Support • http://www1.ansys.com/customer/

• Portal, Email or Phone

Global ANSYS network providing Comprehensive Support


“Take Home” Points / Discussion

ANSYS HPC performance enables scaling for high-fidelity

– What could you learn from a 10M (or 100M) cell / DOF model?

– What could you learn if you had time to consider 10 x more design ideas?

– Scaling applies to “all physics”, “all hardware” (desktop and cluster)

ANSYS continually invests in software development for HPC

• Committed to leading edge scalability, performance and usability

• Maximized value from your HPC investment

Comments / Questions / Discussion


THANK YOU