Page 1: Simulations of ice-flow and its interaction with the lower ...elmerfem.org/elmerice/wiki/lib/exe/fetch.php?media=... · this year’s 13th International Workshop on OpenMP (IWOMP)

CSC – Finnish research, education, culture and public administration ICT knowledge center

Elmer/Ice advanced workshop 2017

Grenoble, France

Page 2

Parallel strategies in Elmer/Ice

Including pre- and postprocessing in parallel

Thomas Zwinger

Lecturers (in alphabetical order): Jaakko Leinonen, Atte Sillanpää, Thomas Zwinger

Page 3

Parallel concept of Elmer

[Workflow diagram: MESHING (Gmsh), PARTITIONING, ASSEMBLY, SOLUTION, VISUALIZATION]

Page 4

Parallel concept of Elmer

[Diagram of libraries: MPI, Trilinos, HUTIter, MATC, BLAS, LAPack, UMFPack, cPardiso, Hypre, SuperLU, CholMOD, MUMPS, ScaLAPack, BLACS]

Page 5

Parallel Concept of Elmer

• Domain decomposition

• Additional pre-processing step (splitting)

• Every domain is running its "own" ElmerSolver

• Parallel process communication: Message Passing Interface (MPI)

Page 6

Parallel Concept of Elmer

Page 7

Pre-processing in Elmer/Ice (focus parallel runs)

• Ice sheet = flat geometry

• Meshing tools (e.g., Gmsh) have difficulties with aspect ratios deviating from unity

• Creating footprint mesh in 2D

• Extruding it to 3D

Page 8

Internal Extrusion

• Implemented as an internal strategy in Elmer (2013)

  o Juha, Peter & Rupert

• First partition a 2D mesh, then extrude into 3D

• Implemented also for partitioned meshes

  o Extruded lines belong to the same partition by construction!

• Deterministic, i.e. element and node numbering determined by the 2D mesh

  o Complexity: O(N)

• Necessary, if using parallel runs with StructuredMeshMapper (see later)
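The idea of partitioning first and extruding afterwards can be sketched in a few lines of Python. This is only an illustration, not Elmer's implementation: the function names and the layer-by-layer numbering scheme are assumptions chosen to show why every vertical line of nodes stays in one partition and why the cost is linear in the number of 3D nodes.

```python
def node_id(i, k, n2d):
    """Hypothetical deterministic 3D node number for footprint node i on layer k."""
    return k * n2d + i


def extrude_footprint(nodes_2d, partition_of_node, levels):
    """Extrude footprint nodes (x, y) into `levels` node layers with z in [0, 1].

    Each 3D node inherits the partition of the footprint node below it, so a
    whole extruded line belongs to one partition by construction. The loop
    visits every 3D node exactly once, i.e. the work is O(N).
    """
    if levels < 2:
        raise ValueError("need at least two node layers")
    nodes_3d = []
    partition_3d = []
    for k in range(levels):
        z = k / (levels - 1)
        for i, (x, y) in enumerate(nodes_2d):
            nodes_3d.append((x, y, z))
            partition_3d.append(partition_of_node[i])
    return nodes_3d, partition_3d
```

With this numbering, the node above footprint node `i` on layer `k` is always found at `node_id(i, k, len(nodes_2d))`, so the 3D numbering follows directly from the 2D mesh.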

Page 9

Internal extrusion

Extruded Mesh Levels = 11

By default z in [0,1]

Else give:

Extruded Max Coordinate = Real 10000.0

Extruded Min Coordinate = Real -100.0
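As a small worked example of these keywords, the following Python sketch computes the vertical positions of the node layers. It assumes that `Extruded Mesh Levels` counts node layers and that, without a density function, the layers are spaced uniformly; the function name `extruded_levels` is hypothetical.

```python
def extruded_levels(levels, z_min=0.0, z_max=1.0):
    """Return uniformly spaced layer coordinates between z_min and z_max.

    Assumption: `levels` is the number of node layers, so there are
    levels - 1 element layers in between.
    """
    if levels < 2:
        raise ValueError("need at least two node layers")
    step = (z_max - z_min) / (levels - 1)
    return [z_min + k * step for k in range(levels)]


# Default range [0, 1] with 11 levels, then the range given on the slide.
default_z = extruded_levels(11)
custom_z = extruded_levels(11, z_min=-100.0, z_max=10000.0)
```

Under these assumptions the custom range is divided into ten element layers of 1010 length units each.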

Page 10

Internal extrusion

Extruded Mesh Levels = 11

Extruded Mesh Density = Variable Coordinate 1

Real MATC "0.2+sin(pi*tx)"

Any functional dependence is OK as long as it is positive!

The optimal division is found iteratively using a Gauss-Seidel type of iteration; large variations make the iterations converge slowly.

Page 11

Internal Extrusion

• StructuredMeshMapper to impose geometry on a topological prism (3D)

• Define functions at bottom:

Bottom Surface = Variable "Coordinate 1"

Real MATC "0.1*cos(5*tx)"

• And surface:

Top Surface = Variable "Coordinate 1"

Solver 2

Equation = "MapCoordinate"

Procedure = "StructuredMeshMapper" "StructuredMeshMapper"

Active Coordinate = Integer 2

End
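What such a mapping does to one vertical line of nodes can be illustrated with a short Python sketch. It assumes a plain linear stretch of the extruded coordinate between the bottom and the upper surface; the bottom function is the one from the slide, while the flat upper surface `top` and the function name `map_column` are hypothetical, since the slide gives no expression for the upper surface.

```python
import math


def bottom(x):
    """Bottom surface from the slide: 0.1*cos(5*x)."""
    return 0.1 * math.cos(5.0 * x)


def top(x):
    """Hypothetical flat upper surface, used only for this illustration."""
    return 1.0


def map_column(x, zetas, bottom_fn, top_fn):
    """Map extruded coordinates zeta in [0, 1] onto [bottom(x), top(x)].

    Assumption: a linear stretch, so relative layer positions are preserved.
    """
    b = bottom_fn(x)
    t = top_fn(x)
    return [b + zeta * (t - b) for zeta in zetas]


column = map_column(0.0, [0.0, 0.5, 1.0], bottom, top)
```

Because the mapping acts along extruded lines, it relies on each line being available as a whole, which is consistent with extruded lines staying within one partition.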

Page 12

ElmerSolver parallel

• Same executable: ElmerSolver

• Depending on platform/MPI: mpirun -np N

$ mpirun -np 6 ElmerSolver

• Needs information for different processes, which SIF to load: ELMERSOLVER_STARTINFO

• User defined functions/routines usually do not need special rewriting for MPI
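A minimal Python sketch of preparing such a run is given below. It assumes that ELMERSOLVER_STARTINFO only needs to contain the name of the SIF on its first line (some setups add further lines, so treat the one-line layout as an assumption); the file name `case.sif` and both helper functions are hypothetical.

```python
from pathlib import Path


def write_startinfo(sif_name, directory="."):
    """Write ELMERSOLVER_STARTINFO so every MPI process knows which SIF to load."""
    path = Path(directory) / "ELMERSOLVER_STARTINFO"
    path.write_text(sif_name + "\n")
    return path


def mpirun_command(n_procs, executable="ElmerSolver"):
    """Build the launch command shown on the slide as an argument list."""
    return ["mpirun", "-np", str(n_procs), executable]
```

The returned list can be handed to `subprocess.run(..., check=True)` from the case directory; the number of processes should match the number of mesh partitions.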

Page 13

Improvements on block preconditioners

Page 14

Massive parallel Stokes problem

• Solving Stokes equations:

• No connection between pressure and density, hence a saddle point problem:
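The formulas of this slide are not part of the transcript. For orientation, the standard incompressible Stokes equations and the generic saddle-point system of their discretization read as follows; the symbols (viscosity η, strain rate ε̇, velocity u, pressure p, density ρ, gravity g, discrete gradient block B) are the usual textbook notation and are an assumption about what the slide showed.

```latex
\begin{aligned}
-\nabla\cdot\bigl(2\eta\,\dot{\boldsymbol{\varepsilon}}(\mathbf{u})\bigr) + \nabla p &= \rho\,\mathbf{g},\\
\nabla\cdot\mathbf{u} &= 0,
\end{aligned}
\qquad\Longrightarrow\qquad
\begin{pmatrix} A & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ 0 \end{pmatrix}
```

The zero block on the diagonal reflects that the continuity equation contains no pressure term, which is what makes the system a saddle-point problem.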

Page 15

Massive parallel Stokes problem

• Needs stabilization to avoid null-space:

• Yet, this matrix usually has a bad condition number:

  o Direct parallel solver (expensive): MUMPS, cPardiso (N² algorithm)

  o Good pre-conditioner + iterative solvers (N log(N) algorithm)

Page 16

Block pre-conditioner

• Using following form:

  o M is a unit matrix weighted by viscosity

• Yet, we need the inverse of P, which requires exact solutions of linear systems of the blocks A and M

• Those systems are

  o way smaller sub-problems

  o better conditioned

  o solvable using efficient linear algebra (= Krylov subspace methods)
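The form of P is not contained in the transcript. A commonly used choice, assumed here for illustration, is a block upper-triangular matrix built from the velocity block A, the discrete gradient block B^T (notation assumed) and the pressure block M. Applying its inverse to a residual (r_u, r_p) then reduces to two smaller solves:

```latex
P = \begin{pmatrix} A & B^{T} \\ 0 & M \end{pmatrix},
\qquad
P \begin{pmatrix} z_u \\ z_p \end{pmatrix} = \begin{pmatrix} r_u \\ r_p \end{pmatrix}
\;\Longleftrightarrow\;
\begin{cases}
M z_p = r_p, \\
A z_u = r_u - B^{T} z_p .
\end{cases}
```

Under this assumption, the pressure system with M is solved first, and its result enters the right-hand side of the velocity system with A. Sign conventions for the pressure block differ between formulations.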

Page 17

ParStokes

• The previously described method is implemented in the ParStokes solver (Mika Malinen, CSC)

• ParStokes is nowadays part of the Elmer distribution

  o No extra compilation is needed

  o Interface in SIF has been tremendously simplified

  o Still needs extra dummy routines to provide the solution space for the preconditioning blocks (pressure, velocity), but their syntax is simplified: basically only a procedure call, if happy with default settings

  o Full documentation in the ElmerModelsManual (chapter 24)

  o Explained on the elmerice Wiki

  o Test case: elmerfem/fem/tests/ParStokes_ISMIP_HOM_A010

Page 18

(New) direct parallel solver: cPardiso

• Solver very similar to MUMPS

• Included in MKL (comes with Intel compiler suite), not open source (to my knowledge)

• cPardiso can run mixed MPI/OpenMP codes, which Elmer is

  o Also OK to just use the MPI side, as we can test in a session on Friday

• Simply include these lines into Solver section:

Linear System Solver = Direct

Linear System Direct Method = "cPardiso"

Page 19

Future developments

Page 20

Multi-threaded and SIMD version of advection/diffusion solver

• Modern CPUs have 12+ cores

• Modern CPUs have improved vector units

• Multi-threading (OpenMP) might replace MPI for workstations

• Optimizing for that, using SIMD instructions within the code

Page 21

Optimization of Elmer(/Ice) using OpenMP Threading and SIMD

• Article in the proceedings, which appear in the Lecture Notes in Computer Science (vol 10468), Springer

• The following slides are an excerpt of a presentation shown at this year's 13th International Workshop on OpenMP (IWOMP) at Stony Brook University, NY, USA

• Presentation permission granted by Intel Corp.

Byckling, M., J. Kataja, M. Klemm and T. Zwinger, 2017. OpenMP SIMD Vectorization and Threading of the Elmer Finite Element Software, Proceedings 13th International Workshop on OpenMP, Springer Lecture Notes in Computer Science, 123-137, doi:10.1007/978-3-319-65578-9_9

Page 22

Suggested exercise

Page 23

Bueler profile parallel runs

• Based on the analytic equilibrium profile (SIA) from Bueler

• Training accounts + remote desktop on taito.csc.fi

1. Creating the footprint mesh

  o Partitioning of mesh

2. Setting up diagnostic parallel run using

  o Internal extrusion

  o Free surface imposed by external routine buelerprofile.f90

3. Running the case

  o With classic Navier-Stokes solver (MUMPS + cPardiso)

  o With ParStokes solver

Page 24

Bueler profile parallel runs

• Three different kinds of footprint meshes

1. footprint_bueler.geo for small runs on laptop (ParStokes might not scale)

2. footprint_bueler_f.geo for node-size runs on supercomputer (< 24 partitions)

3. footprint_bueler_ff.geo for (massive) parallel runs

• Create mesh from gmsh:

$ gmsh -2 footprint_bueler.geo

• Convert mesh into Elmer mesh

$ ElmerGrid 14 2 footprint_bueler.msh -autoclean

Add -metis 24 4 at the end to create a 24 partition parallel mesh
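The two meshing commands can be scripted, for example to switch between the footprint variants. The Python sketch below only assembles the argument lists from the commands shown on the slide; the helper names are hypothetical, and the trailing `4` after the partition count is copied unchanged from the slide.

```python
def gmsh_command(geo_file):
    """2D meshing of the footprint geometry, as on the slide."""
    return ["gmsh", "-2", geo_file]


def elmergrid_command(msh_file, partitions=None):
    """Convert a Gmsh mesh (format 14) to an Elmer mesh (format 2).

    If `partitions` is given, append the -metis option from the slide to
    create a partitioned mesh.
    """
    cmd = ["ElmerGrid", "14", "2", msh_file, "-autoclean"]
    if partitions is not None:
        cmd += ["-metis", str(partitions), "4"]
    return cmd


serial_cmd = elmergrid_command("footprint_bueler.msh")
parallel_cmd = elmergrid_command("footprint_bueler.msh", partitions=24)
```

Each list can be passed to `subprocess.run(cmd, check=True)`; building lists instead of shell strings avoids quoting issues with file names.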

Page 25

Bueler profile parallel runs

• Using GetEmergenceVelocity after the Stokes solution to display the corresponding equilibrium accumulation-ablation function of the diagnostic solution

• Use footprint_f.geo

Page 26

Bueler profile parallel runs

MUMPS       cPardiso    ParStokes

> 30 mins   > 30 mins   132.66 s

Page 27

End of session