Team 180: CFD SpeedIT on GPUs in the Cloud
An UberCloud Experiment

UberCloud Case Study 180
http://www.TheUberCloud.com
October 23, 2015
Welcome!
The UberCloud* Experiment started in July 2012 with a discussion about cloud adoption in technical computing and a list of technical and cloud computing challenges and potential solutions. We decided to explore these challenges further, hands-on, and the idea of the UberCloud Experiment was born, thanks also to the excellent support from INTEL, which generously sponsors these experiments.

We found that small and medium enterprises in digital manufacturing in particular would strongly benefit from technical computing in HPC centers and in the cloud. By gaining on-demand access from their desktop workstations to additional compute resources, they see three major benefits: the agility gained by shortening product design cycles through shorter simulation times; the superior quality achieved by simulating more sophisticated geometries and physics and by running many more iterations in search of the best product design; and the cost benefit of paying only for what is really used. Benefits like these increase a company's innovation and competitiveness, and make technical computing, and more specifically technical computing as a service in the cloud, very attractive.

But how far away are we from an ideal cloud model for engineers and scientists? In the beginning, we didn't know. We were simply facing challenges like security, privacy, and trust; conservative software licensing models; slow data transfer; uncertain cost and ROI; availability of best-suited resources; and lack of standardization, transparency, and cloud expertise. However, in the course of this experiment, as we followed each of the 175 teams closely and monitored their challenges and progress, we have gained excellent insight into these roadblocks, how our teams have tackled them, and how we are now able to reduce or even fully resolve them.
This case study from Team 180 is about benchmarking SpeedIT Flow™, one of the fastest implicit single-phase Computational Fluid Dynamics (CFD) flow solvers currently available. In contrast to other solutions, the Semi-Implicit Method for Pressure Linked Equations (SIMPLE) and the Pressure Implicit with Operator Splitting (PISO) algorithms have been implemented on the graphics processing unit (GPU). The software is particularly useful for accelerating time-consuming external and internal flow simulations, such as aerodynamics studies in the car and aviation industries. SpeedIT has been benchmarked in the cloud, on Ohio Supercomputer Center resources, and its performance compared for CPUs versus GPUs on four different applications: AeroCar, SolarCar, Motorbike, and DrivAer.

We want to thank the team members for their continuous commitment and voluntary contribution to this experiment, and thus to our technical computing community. And we want to thank our main Compendium sponsor INTEL for generously supporting the 180 UberCloud experiments. Now, enjoy reading!

Wolfgang Gentzsch & Burak Yenier
The UberCloud, October 2015

*) UberCloud is the online community and marketplace where engineers and scientists discover, try, and buy Computing Power as a Service, on demand. Engineers and scientists can explore and discuss how to use this computing power to solve their demanding problems, and identify the roadblocks and solutions with a crowd-sourcing approach, jointly with our engineering and scientific community. Learn more about the UberCloud at: http://www.TheUberCloud.com.
Please contact UberCloud [email protected] before distributing this material in part or in full.
© Copyright 2015 UberCloud™. UberCloud is a trademark of TheUberCloud Inc.
Team 180: CFD SpeedIT on GPUs in the Cloud
MEET THE TEAM
End-User: Andrzej Kosior, Vratis Ltd.
Resource Provider: Alan Chalker, Ohio Supercomputer Center (OSC).
HPC Cloud Experts: Karen Tomko and David Hudak, OSC.
Team Mentor: Wolfgang Gentzsch, UberCloud.
USE CASE
SpeedIT Flow™ is one of the fastest Computational Fluid Dynamics (CFD) implicit single-phase flow
solvers currently available. In contrast to other solutions, the Semi-Implicit Method for Pressure
Linked Equations (SIMPLE) and the Pressure Implicit with Operator Splitting (PISO) algorithms have
been implemented entirely on the graphics processing unit (GPU). The software is particularly useful
for accelerating time-consuming external and internal flow simulations, such as aerodynamics
studies in the car and aviation industries.
SpeedIT Flow's robust solver technology empowers users by providing double-precision accuracy on
fully unstructured meshes of up to 11 million cells. Our implementation was validated on standard
steady and unsteady industry-relevant problems with a RANS turbulence model. Its computational
efficiency has been evaluated against modern CPU configurations using OpenFOAM1.
Graphics processing units open up completely new possibilities for significant cost savings, because
simulation time can be reduced on hardware that is often less expensive than server-class CPUs.
Almost every PC contains a graphics card that supports either CUDA or OpenCL, and SpeedIT Flow
supports both platforms. If there are multiple GPUs in a system, independent computing tasks,
such as those in parameter sweep studies, can be solved simultaneously. When cases are solved on
the GPU, the CPU resources are free and can be used for other tasks such as pre- and post-processing.
1 This offering is neither approved nor endorsed by OpenCFD Limited, the producer of the OpenFOAM software
and owner of the OPENFOAM® and OpenCFD® trademarks.
“Flexible licensing which is not dependent on the number of CPU cores but on the number of GPUs reduces the cost of software licensing.”
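The multi-GPU parameter-sweep scenario described above can be sketched in a few lines. This is a minimal illustration, assuming the common CUDA convention of pinning a process to one device via CUDA_VISIBLE_DEVICES; the solver binary name `SpeedIT_Flow` is a hypothetical placeholder, so the actual launch line is left as a comment.

```python
# Sketch: solving independent parameter-sweep cases concurrently,
# one per GPU. The solver binary name is hypothetical; in real use
# the commented subprocess call would launch it.
from concurrent.futures import ThreadPoolExecutor

N_GPUS = 2  # e.g. two GPUs in one workstation

def assign(cases, n_gpus=N_GPUS):
    """Round-robin the sweep cases over the available GPUs."""
    return [(case, i % n_gpus) for i, case in enumerate(cases)]

def run_case(case, gpu_id):
    # Pinning via CUDA_VISIBLE_DEVICES makes the solver see only
    # this one device, so concurrent cases do not contend.
    env = {"CUDA_VISIBLE_DEVICES": str(gpu_id)}
    # subprocess.run(["SpeedIT_Flow", case], env=env)  # hypothetical binary
    return case, gpu_id

def sweep(cases):
    # One worker thread per GPU; each thread would block on its
    # external solver process, leaving the CPU cores free.
    with ThreadPoolExecutor(max_workers=N_GPUS) as pool:
        futures = [pool.submit(run_case, c, g) for c, g in assign(cases)]
        return [f.result() for f in futures]
```

With four sweep cases and two GPUs, `assign` maps cases 0 and 2 to GPU 0 and cases 1 and 3 to GPU 1, so both devices stay busy while the host cores remain available for pre- and post-processing.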
The three most important GPU characteristics affecting the performance of numerical simulations
are the amount of RAM, the number of streaming processors, and the memory bandwidth. Major GPU
manufacturers compete by optimizing these characteristics, and every new family of GPU cards
outperforms the previous one. Server-class GPUs usually have more RAM, higher memory
bandwidth, and more streaming processors than desktop and laptop GPUs, which have weaker
specifications but lower power consumption. While the latter can accelerate cases of roughly
1-2 million cells, much larger cases fit only on server-class cards, which are quite expensive.
This is where cloud computing can help. Together with the Ohio Supercomputer Center (OSC)
and UberCloud, we have prepared a preconfigured, tested, and validated environment where SpeedIT
Flow is ready to be used. The customer pays for actual consumption only. This pay-per-use model is
particularly important for small companies and research groups with limited budgets, and for bigger
companies that want to reduce their simulation costs. This technology may also be particularly
interesting for HPC centers such as OSC, because our software offers better utilization of their
resources: computations can run on the CPUs and GPUs concurrently. Moreover, the power
efficiency per simulation, which is an important factor in high-performance computing, is
comparable for a dual-socket multicore CPU and a GPU.
Figure 1: SpeedIT Flow running from the command line.
Running simulations with SpeedIT Flow is easy. First, the OpenFOAM case must be converted to a
GPU-friendly format2. As this step is not computationally demanding, it can be run by the user on a
local computer, or on the cluster just before the simulation. Next, our solver is executed from the
command line; the output file is generated in a user-friendly format and can be viewed in
ParaView, an open-source, multi-platform data analysis and visualization application.
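As a sketch, this two-step workflow can be captured in a small driver. The executable names `ofConverter` and `SpeedIT_Flow`, their argument order, and the output location are hypothetical placeholders (see the converter repository and the SpeedIT Flow documentation for the real command lines); the function only builds the command lines, so the shape of the workflow is visible without running anything.

```python
# Sketch of the two-step workflow: (1) convert the OpenFOAM case to
# the GPU-friendly format, (2) run the solver from the command line.
# Executable names and argument order are hypothetical placeholders.
from pathlib import Path

def build_workflow(case_dir):
    """Return the command lines for the conversion and solver steps."""
    case = Path(case_dir)
    converted = case / "speedit"  # assumed output location
    return [
        # Step 1: cheap conversion, can run on a local workstation.
        ["ofConverter", str(case), str(converted)],
        # Step 2: GPU solver run on the cluster; the results can then
        # be inspected in ParaView.
        ["SpeedIT_Flow", str(converted)],
    ]

for cmd in build_workflow("motorBike"):
    print(" ".join(cmd))
```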
RESULTS
Four OpenFOAM test cases (Figure 2) were run on two different clusters at OSC3: Oakley, with Intel
Xeon X5650 processors, and Ruby, with Intel Xeon E5-2670 v2 processors. The results were compared
to SpeedIT Flow running on Ruby with an NVIDIA Tesla K40 GPU.
2 The converter is available for download from https://github.com/VratisLtd/OpenFOAM_to_SpeedIT_converters
3 Full specifications can be found at https://www.osc.edu/supercomputing/hpc
Figure 2: Test cases. From top left: AeroCar and SolarCar - geometries from 4-ID Network; motorBike - geometry from OpenFOAM tutorial; DrivAer - geometry from Institute of Aerodynamics and Fluid Mechanics at TUM. All geometry providers are kindly acknowledged.
Figure 3: Scaling OpenFOAM results for different test cases on two OSC clusters (Oakley with Intel Xeon X5650 processors and Ruby with Intel Xeon E5-2670 v2 processors), and on a single GPU (NVIDIA Tesla K40), using SpeedIT Flow.
Results of OpenFOAM scaling over different numbers of cores are given in Figure 3. The graphs
show that SpeedIT Flow is capable of running CFD simulations on a single GPU in times comparable
to those obtained using 16-20 cores of a modern server-class CPU. Figure 4 shows the electric
energy consumption per simulation, together with scaling, for the biggest test case. Again, the
electric energy consumption is comparable to that needed by computations on multicore CPUs.
Figure 4: Number of simulations per day (left) and per day per Watt (right) of the DrivAer test case
computed on a single node of Oakley (12 cores of Intel Xeon X5650) and Ruby (20 cores of Intel
Xeon E5-2670 v2) using OpenFOAM and a single GPU (NVIDIA Tesla K40) using SpeedIT Flow.
CONCLUSIONS
SpeedIT Flow has been developed over the past two years as the result of projects aimed at
accelerating CFD. The biggest challenge was to implement a Navier-Stokes solver efficiently in CUDA
and OpenCL. Supported by our team of engineers, and also by the GPU vendors, we were able to
complete this task. Once this was done, tests showed that our software uses the graphics card
resources optimally, which means that the performance of our solver will rise with every new
generation of graphics cards.
This gives the end-user an alternative way to reduce turnaround times. By taking advantage of GPUs,
which are available in most systems, simulation time can be reduced, bringing significant cost
savings to the production pipeline. Finally, flexible licensing that depends not on the number of
CPU cores but on the number of GPUs reduces software licensing costs.
With our software, resource providers such as OSC and private or public cloud providers can utilize
their hardware more efficiently. On a cluster with GPUs, CFD simulations can run on CPUs and GPUs
at the same time. For example, on a node with two GPUs and two ten-core CPUs, three simulations
could run: two on the GPUs and one on the eighteen unused cores. As shown in our tests, the
turnaround times and the power consumption per simulation are comparable.
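The node-partitioning arithmetic above can be written out explicitly; the only assumption, taken from the example itself, is that each GPU simulation occupies one host core to drive the device.

```python
# Partitioning a node between GPU and CPU simulations, following the
# example above: each GPU run occupies one host core to drive the
# device, and a single CPU run takes all remaining cores.
def partition_node(n_gpus, n_cpus, cores_per_cpu):
    total_cores = n_cpus * cores_per_cpu
    driver_cores = n_gpus              # one host core per GPU run
    cpu_run_cores = total_cores - driver_cores
    return {"gpu_runs": n_gpus, "cpu_runs": 1, "cpu_run_cores": cpu_run_cores}

# Two GPUs and two ten-core CPUs -> two GPU runs plus one
# OpenFOAM run on the remaining eighteen cores.
print(partition_node(n_gpus=2, n_cpus=2, cores_per_cpu=10))
```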
SpeedIT Flow is a new solver that completely changes the paradigm of running CFD calculations. It is
an attractive solution for both individual users and HPC centers, and it can also be an alternative to
new investments in hardware. As computations on GPUs are comparable to those on modern CPUs,
instead of buying a new CPU-based system the end-user could simply equip an older one with GPUs.
This solution should be cheaper than, and as effective as, new hardware.
Case Study Author – Andrzej Kosior, Vratis Ltd.
Thank you for your interest in the free and voluntary UberCloud Experiment. If you, as an end-user, would like to participate in this Experiment to explore hands-on the end-to-end process of on-demand Technical Computing as a Service in the Cloud for your business, then please register at: http://www.theubercloud.com/hpc-experiment/

If you, as a service provider, are interested in promoting your services on the UberCloud Marketplace, then please send us a message at https://www.theubercloud.com/help/

1st Compendium of case studies, 2013: https://www.theubercloud.com/ubercloud-compendium-2013/
2nd Compendium of case studies, 2014: https://www.theubercloud.com/ubercloud-compendium-2014/
3rd Compendium of case studies, 2015: https://www.theubercloud.com/ubercloud-compendium-2015/
HPCwire Readers Choice Award 2013: http://www.hpcwire.com/off-the-wire/ubercloud-receives-top-honors-2013-hpcwire-readers-choice-awards/
HPCwire Readers Choice Award 2014: https://www.theubercloud.com/ubercloud-receives-top-honors-2014-hpcwire-readers-choice-award/

In any case, if you wish to be informed about the latest developments in technical computing in the cloud, then please register at http://www.theubercloud.com/. It's free.