

Applied Math Issues in FACETS

Speaker: Lois Curfman McInnes, ANL

Core: Alexander Pletzer, Tech-X
– John Cary, Johan Carlsson, Tech-X: Core solver
– Srinath Vadlamani, Tech-X: Turbulent flux computation via FMCFM
– Ammar Hakim, Mahmood Miah, Tech-X: FACETS infrastructure
– Allen Malony, Alan Morris, Sameer Shende, Paratools: Performance analysis
– Greg Hammett, PPPL: Suggesting stable time stepping schemes
– Alexei Pankin, Lehigh University: Providing core transport benchmark against ASTRA

Edge: Tom Rognlien, LLNL
– John Cary et al., Tech-X: FACETS integration
– Ron Cohen, LLNL: Edge physics, scripting
– Hong Zhang, ANL: Nonlinear solvers
– Satish Balay, ANL: Portability and systems issues via TOPS, no FACETS funding
– Maxim Umansky, LLNL: BOUT physics, no FACETS funding
– Sean Farley, LSU: Math grad student, summer 2008 at ANL + ongoing: BOUT/PETSc interface
– Mike McCourt, IIT and Cornell: Applied math grad student, summer 2007 at ANL: UEDGE/PETSc interface

Coupling: Don Estep, CSU
– Du Pham, Simon Tavener, CSU: Analysis of stability and accuracy issues in coupling
– Ron Cohen, Tom Rognlien, LLNL: Physics issues in coupling

FACETS SciDAC Review, May 14, 2009


Nonlinear PDEs pervade FACETS components

• Initial focus: Fully implicit Newton methods in
  – Core (via new core solver, Tech-X)
  – Edge (via UEDGE and BOUT, LLNL)
• Discussion emphasizes
  – PDE representation of physics
  – Parallelization and performance analysis
  – Stability and accuracy issues in coupling
  – Collaborations with SciDAC CETs and Institutes
• Future work
  – Core-edge coupling as we move to implicit coupling
  – Possibly kinetic models in edge physics via Edge Simulation Laboratory (ESL)
  – Possibly wall and sources components


TOPS provides enabling technology to FACETS; FACETS motivates enhancements to TOPS

TOPS: Towards Optimal Petascale Simulations (PI: David Keyes, Columbia)

TOPS Overview
– TOPS develops, demonstrates, and disseminates robust, quality-engineered solver software for high-performance computers
– TOPS institutions: ANL, LBNL, LLNL, SNL, Columbia U, Southern Methodist U, U of California - Berkeley, U of Colorado - Boulder, U of Texas - Austin


Overall scope of TOPS

• Design and implementation of “solvers”
  – Linear solvers: $Ax = b$
  – Eigensolvers: $Ax = \lambda Bx$
  – Nonlinear solvers (with sensitivity analysis): $F(x, p) = 0$
  – Time integrators (with sensitivity analysis): $f(\dot{x}, x, t, p) = 0$
  – Optimizers: $\min_u \phi(x, u)$ s.t. $F(x, u) = 0$, $u \ge 0$
• Software integration
• Performance optimization


Primary emphasis of TOPS numerical software

[Figure: dependency diagram of TOPS solver components (linear solver, time integrator, nonlinear solver, sensitivity analyzer); arrows indicate dependence. Package labels in the diagram: SUNDIALS, Trilinos; TAO, Trilinos; PARPACK, SuperLU, Trilinos; hypre, PETSc, SuperLU, Trilinos; PETSc, Trilinos.]


Nonlinear PDEs in Core and Edge Components

Dominant computation of each can be expressed as nonlinear PDE: Solve $F(u) = 0$, where $u$ represents the fully coupled vector of unknowns

Core: 1D conservation laws:

$$\frac{\partial q}{\partial t} + \nabla \cdot F = s$$

where
– q = {plasma density, electron energy density, ion energy density}
– F = fluxes, including neoclassical diffusion, electron/ion temperature gradient induced turbulence, etc.
– s = particle and heating sources and sinks

Challenges: highly nonlinear fluxes

Edge: 2D conservation laws: Continuity, momentum, and thermal energy equations for electrons and ions:

$$\frac{\partial n_{e,i}}{\partial t} + \nabla \cdot (n_{e,i} v_{e,i}) = S_{e,i}$$

where $n_{e,i}$ and $v_{e,i}$ are electron and ion densities and mean velocities

$$\frac{\partial (m_{e,i} n_{e,i} v_{e,i})}{\partial t} + m_{e,i} n_{e,i} v_{e,i} \cdot \nabla v_{e,i} = -\nabla p_{e,i} + q n_{e,i} (E + v_{e,i} \times B / c) - \nabla \cdot \Pi_{e,i} - R_{e,i} + S^{m}_{e,i}$$

where $m_{e,i}$, $p_{e,i}$, $T_{e,i}$ are masses, pressures, temperatures; $q$, $E$, $B$ are particle charge, electric and magnetic fields; $\Pi_{e,i}$, $R_{e,i}$, $S^{m}_{e,i}$ are viscous tensors, thermal forces, source

$$\frac{3}{2} n_{e,i} \frac{\partial T_{e,i}}{\partial t} + \frac{3}{2} n_{e,i} v_{e,i} \cdot \nabla T_{e,i} + p_{e,i} \nabla \cdot v_{e,i} = -\nabla \cdot \mathbf{q}_{e,i} - \Pi_{e,i} \cdot \nabla v_{e,i} + Q_{e,i}$$

where $\mathbf{q}_{e,i}$, $Q_{e,i}$ are heat fluxes and volume heating terms

Also neutral gas equation

Challenges: extremely anisotropic transport, extremely strong nonlinearities, large range of spatial and temporal scales
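As a sketch of how a time-dependent conservation law turns into a nonlinear system of the form "solve F(u) = 0", the block below assumes backward Euler (BDF1) in time with step $\Delta t$. The residual symbol $\mathcal{R}$ and the time index $n$ are introduced here for illustration only; they avoid a clash with the flux $F$ and do not appear in the slides.

```latex
\frac{\partial q}{\partial t} + \nabla \cdot F(q) = s
\quad \Longrightarrow \quad
\mathcal{R}(q^{n+1}) \equiv \frac{q^{n+1} - q^{n}}{\Delta t}
  + \nabla \cdot F(q^{n+1}) - s = 0
```

Each time step then requires one nonlinear solve for $q^{n+1}$, which is where a Newton-type method enters.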


FACETS/TOPS collaboration focuses on nonlinearly implicit methods

• Popular nonlinear solution approaches
  – Explicit Methods
    • Splitting of coupled variables
      – Often by equation or by coordinate direction
      – Motivated by desire to solve complicated problems with limited computer resources
  – Semi-Implicit Methods
    • Maintain some variable couplings
  – Fully Implicit Methods
    • Maintain all variable couplings
    • For example, preconditioned Newton-Krylov methods
• Implicit algorithms have demonstrated efficient and scalable solution for many magnetic fusion energy problems


Newton-Krylov methods are efficient and robust

• Newton: Solve $F'(u^{l-1})\,\delta u^{l} = -F(u^{l-1})$; update $u^{l} = u^{l-1} + \lambda\,\delta u^{l}$
• Krylov: Projection methods for solving linear systems, $Ax = b$, using the Krylov subspace $K_j = \mathrm{span}(r_0, Ar_0, A^2 r_0, \ldots, A^{j-1} r_0)$
  – Popular methods: GMRES, TFQMR, BiCGStab, CG, etc.
• Preconditioning: In practice, typically needed
  – Transform $Ax = b$ into an equivalent form, $B^{-1}Ax = B^{-1}b$ or $(AB^{-1})(Bx) = b$, where the inverse action of $B$ approximates that of $A$, but at a smaller cost
• Matrix-free: Newton-like convergence without the cost of computing/storing the true Jacobian, $F'(u)$
  – Krylov: Compute only Jacobian-vector products, $F'(u)\,v$
  – Preconditioning: Typically use a ‘cheaper’ approximation for $F'(u)$ or its inverse action
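The Newton and Krylov ingredients above can be put together in a few lines. The following is a minimal, unpreconditioned sketch in Python with NumPy, not the PETSc implementation: the function names (`gmres`, `newton_krylov`), the finite-difference step rule, and the 2x2 test system are all assumptions made for illustration. It uses a full Newton step ($\lambda = 1$) where a production solver would typically apply a line search.

```python
import numpy as np


def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted GMRES (zero initial guess) that only needs matrix-vector products."""
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros_like(b)
    m = min(max_iter, b.size)
    Q = np.zeros((b.size, m + 1))  # orthonormal basis of the Krylov subspace
    H = np.zeros((m + 1, m))       # upper Hessenberg matrix from the Arnoldi process
    Q[:, 0] = b / beta
    for j in range(m):
        w = matvec(Q[:, j])
        for i in range(j + 1):     # modified Gram-Schmidt orthogonalization
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        rhs = np.zeros(j + 2)
        rhs[0] = beta
        Hj = H[: j + 2, : j + 1]
        y = np.linalg.lstsq(Hj, rhs, rcond=None)[0]  # small least-squares problem
        converged = np.linalg.norm(Hj @ y - rhs) <= tol * beta
        if converged or H[j + 1, j] < 1e-14:
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, : j + 1] @ y


def newton_krylov(F, u0, tol=1e-10, max_newton=20, eps=1e-7, lam=1.0):
    """Matrix-free Newton: the Jacobian is never formed, only applied to vectors."""
    u = np.array(u0, dtype=float)
    for it in range(max_newton):
        Fu = F(u)
        if np.linalg.norm(Fu) < tol:
            return u, it

        def jac_vec(v):
            # Finite-difference approximation: F'(u) v ~ (F(u + h v) - F(u)) / h
            h = eps * (1.0 + np.linalg.norm(u)) / max(np.linalg.norm(v), 1e-30)
            return (F(u + h * v) - Fu) / h

        du = gmres(jac_vec, -Fu)   # solve F'(u) du = -F(u)
        u = u + lam * du           # update with a fixed step length lam
    return u, max_newton


def F(u):
    # Hypothetical test system with a root at (sqrt(2), sqrt(2))
    return np.array([u[0] ** 2 + u[1] ** 2 - 4.0, u[0] - u[1]])


u, newton_its = newton_krylov(F, [1.0, 2.0])
```

In this sketch the only access to the Jacobian is through `jac_vec`, which costs one extra residual evaluation per Krylov iteration; a preconditioner would enter as an additional operator applied inside `gmres`.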


PETSc provides parallel Newton-Krylov solvers via SNES

• PETSc: Portable, Extensible Toolkit for Scientific computation
  – Targets parallel solution of large-scale PDE-based problems
• SNES: Scalable Nonlinear Equations Solvers
  – Emphasizes Newton-Krylov methods
  – Uses high-level abstractions for matrices, vectors, linear solvers
    • Easy to customize and extend
    • Supports matrix-free methods
    • Facilitates algorithmic experimentation
  – Jacobians available via application, Finite Differences (FD) and Automatic Differentiation (AD)


Core and Edge components use PETSc flexibility via SNES

Solve $F(u) = 0$: Fully implicit matrix-free Newton-Krylov methods

$F'(u^{l-1})\,\delta u^{l} = -F(u^{l-1})$, $u^{l} = u^{l-1} + \lambda\,\delta u^{l}$

– Can choose from among a variety of algorithms and parallel data structures
– UEDGE now has access to many more parallel solver options

[Figure: software layering. Application code: UEDGE + Core Solver Drivers (+ Timestepping + Parallel Partitioning); application or PETSc for Jacobian (finite differencing). PETSc code: Nonlinear Solvers (SNES), Krylov Solvers, Preconditioners, Matrices, Vectors. The legend marks the options originally used by UEDGE.]


Challenges in nonlinear solvers for core

• Plasma core is the region well inside the separatrix
• Transport along field lines >> perpendicular transport, leading to homogenization in poloidal direction
• Core satisfies set of 1D conservation laws: $\partial q / \partial t + \nabla \cdot F = s$
  – q = {plasma density, electron energy density, ion energy density}
  – F = highly nonlinear fluxes including neoclassical diffusion, electron/ion temperature gradient induced turbulence, etc.
  – s = particle and heating sources and sinks
    • New FACETS capability: get s from NUBEAM via core/sources coupling

[Figure: hot plasma core]


Implicit core solver applies nested iteration with parallel flux computation

• Extremely nonlinear fluxes lead to stiff profiles (can be numerically challenging)
  – Implicit time stepping for stability
  – Coarse-grain solution easier to find
  – Nested iteration used to obtain fine-grain solution
  – Flux computation typically very expensive, but problem dimension relatively small
  – Parallelization of flux computation across “workers”; “manager” solves nonlinear equations on 1 proc using PETSc/SNES
• Runtime flexibility in assembly of time integrator (including any diagonally implicit Runge-Kutta scheme) for improved accuracy

[Figure label: Nonlinear solve]
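The nested iteration idea (solve on a coarse grid first, then use the interpolated result as the initial guess on the next finer grid) can be sketched on a small stand-in problem. The example below is an assumption-laden illustration, not the FACETS core solver: it uses a hypothetical steady 1D model problem, $-u'' = \lambda e^{u}$ with zero boundary values, a dense Jacobian, and the grid sizes 8, 16, 32.

```python
import numpy as np

LAM = 1.0  # strength of the nonlinearity in the hypothetical model problem


def residual(u, h):
    """Residual of -u'' = LAM * exp(u) at interior nodes; u includes boundary values."""
    return (-u[:-2] + 2.0 * u[1:-1] - u[2:]) / h**2 - LAM * np.exp(u[1:-1])


def newton(u, h, tol=1e-10, max_it=50):
    """Newton's method with a dense tridiagonal Jacobian (fine for a small 1D problem)."""
    u = u.copy()
    n = u.size - 2
    for it in range(max_it):
        r = residual(u, h)
        if np.linalg.norm(r, np.inf) < tol:
            return u, it
        off = np.full(n - 1, 1.0 / h**2)
        J = (np.diag(2.0 / h**2 - LAM * np.exp(u[1:-1]))
             - np.diag(off, 1) - np.diag(off, -1))
        u[1:-1] -= np.linalg.solve(J, r)
    raise RuntimeError("Newton did not converge")


def nested_iteration(levels=(8, 16, 32)):
    """Solve on successively finer grids, reusing each solution as the next guess."""
    x = np.linspace(0.0, 1.0, levels[0] + 1)
    u = np.zeros_like(x)
    counts = []
    for n in levels:
        x_new = np.linspace(0.0, 1.0, n + 1)
        u = np.interp(x_new, x, u)      # prolong the previous solution
        x = x_new
        u, its = newton(u, 1.0 / n)
        counts.append(its)              # Newton iterations spent on this level
    return x, u, counts


x, u_nested, counts = nested_iteration()
u_cold, cold_its = newton(np.zeros(33), 1.0 / 32)  # fine grid from a zero guess
```

Here `counts[-1]` is the number of Newton iterations spent on the finest grid with nested iteration, and `cold_its` is the number needed from a zero initial guess. For a mildly nonlinear problem like this one the saving is small; the robustness benefit matters most when the fine-grid solve would not converge from a poor guess.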


Flexibility of FACETS framework allows users to explore time stepping schemes with no change to source code

Nested iteration improves robustness of nonlinear solver

• Explicit method is unstable
• Crank-Nicolson is marginally stable
• Use BDF1 for stability and accuracy
• Other schemes, e.g., various IMEX-SSP schemes, can be coded at runtime

[Figure: profiles versus radial coordinate, with annotations “Stable to ETG modes”, “Sharp kink develops”, and “(unstable)”]

Ref: A. Pletzer, "Benchmarking the parallel FACETS core solver," poster presented at the 50th Annual Meeting of the Division of Plasma Physics, Dallas, TX, November 17-21, 2008.
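The qualitative ranking of the schemes above can be illustrated on the standard linear test equation $y' = -\lambda y$, which is an assumption introduced here as a stand-in for one stiff mode and is not the nonlinear core transport problem from the slides. With $z = \lambda \Delta t$, each scheme multiplies the solution by a fixed amplification factor per step.

```python
def amplification_factors(z):
    """Per-step growth factor for y' = -lam * y, with z = lam * dt > 0."""
    return {
        "explicit_euler": 1.0 - z,
        "crank_nicolson": (1.0 - 0.5 * z) / (1.0 + 0.5 * z),
        "bdf1": 1.0 / (1.0 + z),
    }


# A stiff mode: the time step is 1000 times the mode's decay time.
stiff = amplification_factors(1000.0)
for name, g in stiff.items():
    print(f"{name:15s} |G| = {abs(g):.4g}")
```

For this stiff mode the explicit Euler factor has magnitude far above 1 (growth), the Crank-Nicolson factor is negative with magnitude just below 1 (an oscillation that is barely damped), and the BDF1 factor is close to 0 (strong damping).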


Participation of Paratools identified performance bottleneck in core solver

Paratools (A. Malony et al.) is affiliated with the SciDAC Performance Engineering Research Institute (PERI)

• Load imbalance responsible for lack of scalability at high processor count (128)

• Also, careful profiling identifies redundant flux computation at low processor count (8)


Challenges in nonlinear solvers for edge

• UEDGE Issues
  – Strong nonlinearities
  – Parallel Jacobian
• UEDGE Features
  – Multispecies plasma; variables $n_{i,e}$, $u_{\parallel i,e}$, $T_{i,e}$ for particle density, parallel momentum, and energy balances
  – Reduced Navier-Stokes or Monte Carlo neutrals
  – Multi-step ionization and recombination
  – Finite volume discretization; non-orthogonal mesh
  – Steady-state or time dependent


More complete parallel Jacobian data enables robust solution for problems with strong nonlinearities

• New capability: Computing parallel Jacobian matrix using matrix coloring for finite differences
  – More complete parallel Jacobian data enables more robust parallel preconditioners
• Impact
  – Enables inclusion of neutral gas equation (difficult for highly anisotropic mesh, not possible in prior parallel UEDGE approach)
  – Useful for cross-field drift cases

[Figure: UEDGE parallel partitioning along poloidal distance. Panels: “Previous parallel UEDGE Jacobian (Block Jacobi only)”, with missing Jacobian elements marked, and “Recent progress: Complete parallel Jacobian data”. 5 equations: ion density, ion velocity, gas density diffusion, electron temp, ion temp. 8 proc: Matrix-free Newton with GMRES: Block Jacobi stagnates; complete Jacobian data enables convergence.]
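The idea behind matrix coloring for finite-difference Jacobians is that columns which never share a nonzero row can be perturbed together, so the number of residual evaluations depends on the number of colors and not on the problem size. The sketch below assumes a hypothetical 1D nearest-neighbour residual (tridiagonal Jacobian, so 3 colors suffice) and a serial dense matrix; it is an illustration of the technique, not the UEDGE or PETSc code.

```python
import numpy as np


def residual(u):
    """Hypothetical nonlinear stencil: entry i depends only on u[i-1], u[i], u[i+1]."""
    up = np.concatenate(([0.0], u, [0.0]))  # zero boundary values
    return up[:-2] - 2.0 * up[1:-1] + up[2:] + up[1:-1] ** 2


def colored_fd_jacobian(f, u, n_colors=3, eps=1e-7):
    """Finite-difference Jacobian of a tridiagonal-coupled residual via column coloring."""
    n = u.size
    f0 = f(u)
    evals = 1
    J = np.zeros((n, n))
    for color in range(n_colors):
        cols = np.arange(color, n, n_colors)  # columns 3 apart never share a row
        d = np.zeros(n)
        d[cols] = eps                         # perturb all columns of this color at once
        df = (f(u + d) - f0) / eps
        evals += 1
        for j in cols:
            rows = np.arange(max(j - 1, 0), min(j + 2, n))  # nonzero rows of column j
            J[rows, j] = df[rows]
    return J, evals


u = np.linspace(0.1, 1.0, 10)
J, evals = colored_fd_jacobian(residual, u)
# Analytic Jacobian of the hypothetical residual, used only as a check
J_exact = (np.diag(-2.0 + 2.0 * u)
           + np.diag(np.ones(9), 1) + np.diag(np.ones(9), -1))
```

With 3 colors the full 10x10 Jacobian is recovered from 4 residual evaluations (one base evaluation plus one per color), and that count would stay the same for a much longer vector with the same stencil.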


Computational experiments explore efficient and robust edge solvers

Matrix-free Newton with GMRES, 8-proc case for LU preconditioner:

• 57% time: UEDGE parallel setup (17 sec)
• 43% time: parallel nonlinear solver (13 sec)
  – 8%: Create Jacobian data structure, determine parallel coloring, scaling
  – 6%: Compute Jacobian: FD approx via coloring, including 40 f(u) computations
  – 4%: Compute f(u) for RHS + line search
  – 25%: Linearized Newton solve: GMRES/LU via MUMPS (hold Jacobian/PC fixed for 5 Newton iterations)

Problem size: 24,576 (128x64 mesh with 3 unknowns per mesh point)
Computational environment: Jazz @ ANL: Myrinet2000 network, 2.4 GHz Pentium Xeon procs with 1-2 GB of RAM
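The figures quoted above are mutually consistent, which a few lines of arithmetic confirm. The variable names below are illustrative; the numbers are the ones on the slide.

```python
# Problem size from the mesh dimensions and unknowns per mesh point
nx, ny, unknowns_per_point = 128, 64, 3
problem_size = nx * ny * unknowns_per_point

# Wall-clock split between setup and nonlinear solve (seconds)
setup_s, solve_s = 17.0, 13.0
total_s = setup_s + solve_s
setup_pct = round(100.0 * setup_s / total_s)
solve_pct = round(100.0 * solve_s / total_s)

# Breakdown of the nonlinear-solver share (percent of total time):
# Jacobian setup, Jacobian computation, f(u) evaluations, linearized Newton solve
solver_breakdown = [8, 6, 4, 25]

print(problem_size, setup_pct, solve_pct, sum(solver_breakdown))
```

The four solver sub-items add up to the 43% attributed to the nonlinear solver, with the linearized Newton solve as the largest single share.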


New work with BOUT uses both SUNDIALS (integrators) and PETSc (preconditioners)

BOUT (BOUndary Turbulence), LLNL

• Motivation and physics
  – Radial transport driven by plasma turbulence; BOUT (C++) to provide fundamental edge model
    • 2D UEDGE approximates turbulent diffusion
    • 3D BOUT models turbulence in detail
  – Ion and electron fluids; electromagnetic
  – Full tokamak edge cross-section
• Numerics and tools
  – Finite-difference; 2D parallel partitioning
  – Time dependent; implicit PVODE/CVODE
  – Can couple turbulent fluxes to UEDGE
• Current status within FACETS
  – Parallel BOUT/PETSc/SUNDIALS verified against original BOUT
  – Transitioning to BOUT++
  – Experimenting with preconditioners

[Figure: BOUT edge density turbulence, $\tilde{n}_i / n_i$]


Preliminary investigation of model problems reveals stability issues arising from coupling

Simple model problem

Explicit coupling
• Implicit Euler for each component solve
• “Nonoverlapping” coupling strategy
• 512 cells in each component

Weak instability
• There is a weak instability for equal diffusion constants
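A model problem of this kind is easy to set up for experimentation. The sketch below is one possible reading, not the coupling scheme analyzed by the CSU team: it assumes two 1D heat equations on adjacent intervals, implicit Euler inside each component, and a lagged (explicit) Dirichlet/Neumann exchange at the interface. The function names, grid size, time step, and initial condition are all hypothetical, and the sketch makes no claim to reproduce the weak instability reported on the slide.

```python
import numpy as np


def implicit_euler_matrix(n_nodes, D, dt, h):
    """Interior rows of (I - dt * D * Laplacian); boundary rows are set by the caller."""
    A = np.zeros((n_nodes, n_nodes))
    r = D * dt / h**2
    for i in range(1, n_nodes - 1):
        A[i, i - 1], A[i, i], A[i, i + 1] = -r, 1.0 + 2.0 * r, -r
    return A


def coupled_step(uL, uR, D1, D2, dt, h):
    """One explicitly coupled step: each component is advanced with implicit Euler."""
    # Left component: Dirichlet interface value lagged from the right component.
    A = implicit_euler_matrix(uL.size, D1, dt, h)
    A[0, 0] = 1.0
    A[-1, -1] = 1.0
    b = uL.copy()
    b[0] = 0.0          # fixed outer boundary
    b[-1] = uR[0]       # interface value from the previous step
    uL_new = np.linalg.solve(A, b)

    # Right component: Neumann interface flux taken from the new left solution.
    flux = D1 * (uL_new[-1] - uL_new[-2]) / h
    A = implicit_euler_matrix(uR.size, D2, dt, h)
    A[0, 0], A[0, 1] = -D2 / h, D2 / h   # D2 * du/dx = flux at the interface
    A[-1, -1] = 1.0
    b = uR.copy()
    b[0] = flux
    b[-1] = 0.0         # fixed outer boundary
    uR_new = np.linalg.solve(A, b)
    return uL_new, uR_new


n = 33                                   # nodes per component (the slide uses 512 cells)
h = 1.0 / (n - 1)
xL = np.linspace(0.0, 1.0, n)
xR = np.linspace(1.0, 2.0, n)
uL, uR = np.sin(0.5 * np.pi * xL), np.sin(0.5 * np.pi * xR)
for _ in range(20):
    uL, uR = coupled_step(uL, uR, D1=1.0, D2=1.0, dt=1e-3, h=h)
```

Varying `D1`, `D2`, `dt`, and the exchange order in a sketch like this is a cheap way to explore how the interface treatment affects stability before moving to the full component codes.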


Numerical analysis tasks over the next year

• Devise and analyze a sequence of model problems
  – The model problems should have increasing complexity
    • Two coupled heat equations in one dimension with various coupling strategies
    • Coupled one-dimensional and two-dimensional heat equations
    • Add strong inhomogeneous behavior “parallel” to the interface boundary
    • Add complications: rapid changes in diffusion in the interior, nonlinear diffusion, multirate time integration
  – Conduct numerical studies using “manufactured” solutions with realistic behavior for various coupling strategies
  – Carry out rigorous stability analysis for various coupling strategies and general solutions
  – Carry out analogous tests for FACETS codes
• Extend a posteriori analysis techniques to finite volume methods for coupled problems
  – Apply to nonlinear problems with realistic discretizations by computing stability


FACETS motivates new PETSc capabilities that benefit the general community

• New features included in Dec 2008 release of PETSc-3.0
  – SNES: limit Newton updates based on application-defined criteria for maximum allowable step
    • Needed by UEDGE
  – MatFD: parallel interface to matrix coloring for sparse finite difference Jacobian estimation
    • Needed by UEDGE
• New research: FACETS core-edge coupling inspires support for strong coupling between models in nonlinear solvers
  – multi-model algebraic system specification
  – multi-model algebraic system solution
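To illustrate what an application-defined limit on the Newton update might look like, the sketch below scales a proposed step so that no unknown changes by more than a chosen fraction of its current magnitude. This is a hypothetical criterion written in plain Python; it is not the PETSc interface and not the rule used by UEDGE.

```python
def limit_step(u, du, max_rel_change=0.5, floor=1e-12):
    """Scale du uniformly so that |du_i| <= max_rel_change * max(|u_i|, floor) for all i."""
    scale = 1.0
    for ui, dui in zip(u, du):
        allowed = max_rel_change * max(abs(ui), floor)
        if abs(dui) > allowed:
            scale = min(scale, allowed / abs(dui))
    return [scale * dui for dui in du]


# A raw Newton step that would drive the first unknown (e.g. a density) negative
u = [1.0, 2.0]
du = [-2.0, 0.5]
limited = limit_step(u, du)
u_new = [ui + di for ui, di in zip(u, limited)]
```

Scaling the whole step by one factor keeps the Newton direction intact while preventing overshoot into unphysical values such as negative densities or temperatures.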


FACETS/TOPS work inspires new research for SciDAC CS/math teams

• General Challenge: How to make sound choices during runtime among available implementations and parameters, suitably compromising among
  – accuracy, performance, algorithmic robustness, etc.
• FACETS Challenge: How to select and parameterize preconditioned Newton-Krylov algorithms at runtime based on problem instance and computational environment?
• Research in Computational Quality of Service (CQoS)
  – Goal: Develop general-purpose infrastructure for dynamic component adaptivity, i.e., composing, substituting, and reconfiguring running component applications in response to changing conditions
  – Collaboration among SciDAC math/CS teams
    • Center for Technology for Advanced Scientific Component Software (TASCS), Paratools, Performance Engineering Research Institute (PERI), and TOPS
  – FACETS-specific capabilities can leverage this infrastructure


FACETS collaborations on ‘solvers’ with SciDAC math/CS teams & CSU are essential

• TOPS, CSU, Paratools, PERI, and TASCS provide enabling technology to FACETS
  – TOPS: Parallel solvers via numerical libraries
  – CSU: Insights into stability/accuracy in coupling
  – PERI/Paratools: Performance analysis and tuning
  – TASCS: Component technology (ref: T. Epperly)
• FACETS motivates new work by CSU, TOPS, Paratools, PERI, and TASCS
  – New CSU research on stability & accuracy issues
  – New TOPS library features + algorithmic research
  – New capabilities in TASCS/PERI/Paratools for