Fast Iterative Solution of Models of Incompressible Flow
Howard Elman, University of Maryland
1
In collaboration with:
• Victoria Howle, Sandia National Laboratories
• David Kay, University of Sussex
• Daniel Loghin, University of Birmingham
• Milan Mihajlovic, University of Manchester
• John Shadid, Sandia National Laboratories
• Robert Shuttleworth, University of Maryland
• David Silvester, University of Manchester
• Ray Tuminaro, Sandia National Laboratories
• Andy Wathen, University of Oxford
2
Outline
1. General approach: Block preconditioners for Navier-Stokes problems
2. Performance in an applied setting: MPSalsa
3. Application: Microfluidics
4. Ongoing / future research
3
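Goal: Robust general solution algorithms
• Easy to implement
• Derived from subsidiary building blocks
• Adaptable to a variety of scenarios (steady / evolutionary / Stokes / Boussinesq)

General Statement of Problem: Incompressible Navier-Stokes Equations

$$\alpha\, u_t - \nu \nabla^2 u + (u \cdot \operatorname{grad})\, u + \operatorname{grad} p = f, \qquad -\operatorname{div} u = 0$$

α=0 → steady state problem
α=1 → evolutionary problem

Discretization and linearization → Matrix equation Ax=b:

$$\begin{pmatrix} F & B^T \\ B & -C \end{pmatrix} \begin{pmatrix} \delta u \\ \delta p \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}$$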
4
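General Approach to Preconditioning

Solving Ax=b:

$$\begin{pmatrix} F & B^T \\ B & -C \end{pmatrix} \begin{pmatrix} \delta u \\ \delta p \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}$$

Use preconditioner of form

$$Q = \begin{pmatrix} Q_F & B^T \\ 0 & -Q_S \end{pmatrix}$$

Solve right-preconditioned system

$$[A Q^{-1}]\,[\hat{x}] = b, \qquad x = Q^{-1} \hat{x}$$

using Krylov subspace method (GMRES)

$$A Q^{-1} = \begin{pmatrix} F & B^T \\ B & -C \end{pmatrix} \begin{pmatrix} Q_F & B^T \\ 0 & -Q_S \end{pmatrix}^{-1} = \begin{pmatrix} F Q_F^{-1} & (F Q_F^{-1} - I)\, B^T Q_S^{-1} \\ B Q_F^{-1} & (B Q_F^{-1} B^T + C)\, Q_S^{-1} \end{pmatrix}$$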
5
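General Approach to Preconditioning

$$A Q^{-1} = \begin{pmatrix} F & B^T \\ B & -C \end{pmatrix} \begin{pmatrix} Q_F & B^T \\ 0 & -Q_S \end{pmatrix}^{-1} = \begin{pmatrix} F Q_F^{-1} & (F Q_F^{-1} - I)\, B^T Q_S^{-1} \\ B Q_F^{-1} & (B Q_F^{-1} B^T + C)\, Q_S^{-1} \end{pmatrix}$$

With $Q_F = F$:

$$A Q^{-1} = \begin{pmatrix} I & 0 \\ B F^{-1} & (B F^{-1} B^T + C)\, Q_S^{-1} \end{pmatrix}$$

With, in addition, $Q_S = S$:

$$A Q^{-1} = \begin{pmatrix} I & 0 \\ B F^{-1} & I \end{pmatrix}$$

Eigenvalues 1 → Convergence in two steps

F ~ convection-diffusion operator
S = Schur complement matrix, $S = B F^{-1} B^T + C$

Seek approximation to inverses of F and S

Key point: Build using methods for scalar operators, use existing (multigrid) code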
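The two-step claim can be checked numerically on a small dense problem. The sketch below is only an illustration: the matrices are random stand-ins (not a real discretization), the sizes `n_u` and `n_p` are hypothetical, and a least-squares fit over span{b, Tb} is used in place of a GMRES implementation, since that is the subspace GMRES minimizes over after two steps.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_p = 8, 3  # hypothetical sizes of the velocity and pressure spaces

# Random stand-ins for the discrete operators (assumptions, not a discretization).
F = 4.0 * np.eye(n_u) + 0.5 * rng.standard_normal((n_u, n_u))
B = rng.standard_normal((n_p, n_u))
C = 0.1 * np.eye(n_p)
A = np.block([[F, B.T], [B, -C]])

# Ideal choices Q_F = F and Q_S = S = B F^{-1} B^T + C.
S = B @ np.linalg.solve(F, B.T) + C
Q = np.block([[F, B.T], [np.zeros((n_p, n_u)), -S]])

T = A @ np.linalg.inv(Q)            # right-preconditioned operator A Q^{-1}
N = T - np.eye(n_u + n_p)

# (T - I)^2 = 0: every eigenvalue is 1 and the minimal polynomial has degree 2.
nilpotent_error = np.linalg.norm(N @ N)

# Minimize ||b - T x_hat|| over x_hat in span{b, T b}.
b = rng.standard_normal(n_u + n_p)
K = np.column_stack([b, T @ b])
y, *_ = np.linalg.lstsq(T @ K, b, rcond=None)
x_hat = K @ y
relative_residual = np.linalg.norm(b - T @ x_hat) / np.linalg.norm(b)

x = np.linalg.solve(Q, x_hat)       # recover x = Q^{-1} x_hat
print(nilpotent_error, relative_residual)
```

In practice Q_F and Q_S only approximate F and S, so the eigenvalues cluster near 1 rather than coincide with it, and more than two iterations are needed.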
6
Two Strategies for Preconditioning S

$$Q = \begin{pmatrix} Q_F & B^T \\ 0 & -Q_S \end{pmatrix}$$

1. Pressure Convection-Diffusion Preconditioner

$$Q_S^{-1} \equiv M_p^{-1} F_p A_p^{-1}$$

$A_p$ = Discrete pressure Poisson operator
$F_p$ = Discrete convection-diffusion operator on pressure space
$M_p$ = Pressure mass matrix

2. Least Squares Commutator

$$Q_S^{-1} \equiv (B M_u^{-1} B^T)^{-1} (B M_u^{-1} F M_u^{-1} B^T) (B M_u^{-1} B^T)^{-1}$$

Comments:
• main cost: pressure Poisson solve
• PCD (1): requires (user) specification of auxiliary operators
• LSC (2): user independent
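As a rough illustration of how the two formulas act on a vector, here is a dense NumPy sketch. The function names `pcd_apply` and `lsc_apply` are hypothetical, direct solves stand in for the multigrid Poisson solves, and the test matrices are random stand-ins. The sanity case F = M_u (with A_p = B M_u^{-1} B^T and F_p = M_p) is chosen because both formulas then reduce to the exact inverse of S = B F^{-1} B^T.

```python
import numpy as np

def pcd_apply(r, M_p, F_p, A_p):
    """Apply Q_S^{-1} r = M_p^{-1} F_p A_p^{-1} r (one Poisson-type solve)."""
    return np.linalg.solve(M_p, F_p @ np.linalg.solve(A_p, r))

def lsc_apply(r, B, F, M_u):
    """Apply Q_S^{-1} r = L^{-1} (B M_u^{-1} F M_u^{-1} B^T) L^{-1} r,
    with L = B M_u^{-1} B^T (two Poisson-type solves)."""
    Minv_Bt = np.linalg.solve(M_u, B.T)       # M_u^{-1} B^T
    L = B @ Minv_Bt
    z = np.linalg.solve(L, r)                 # first Poisson-type solve
    z = B @ np.linalg.solve(M_u, F @ (Minv_Bt @ z))
    return np.linalg.solve(L, z)              # second Poisson-type solve

# Random stand-ins (assumptions): diagonal mass matrices, generic full-rank B.
rng = np.random.default_rng(4)
n_u, n_p = 6, 2
M_u = np.diag(rng.uniform(1.0, 2.0, n_u))
M_p = np.diag(rng.uniform(1.0, 2.0, n_p))
B = rng.standard_normal((n_p, n_u))
r = rng.standard_normal(n_p)

# Sanity case: F = M_u, so S = B M_u^{-1} B^T and both formulas give S^{-1} r.
F = M_u.copy()
A_p = B @ np.linalg.solve(M_u, B.T)
S = B @ np.linalg.solve(F, B.T)
z_pcd = pcd_apply(r, M_p, M_p, A_p)
z_lsc = lsc_apply(r, B, F, M_u)
z_exact = np.linalg.solve(S, r)
```

The sketch also makes the cost comment concrete: PCD needs one Poisson-type solve per application, LSC needs two.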
7
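Derivation of these Methods

1. PCD: start with commutator of operators

$$\nabla (-\nu \nabla^2 + w \cdot \nabla)_p \approx (-\nu \nabla^2 + w \cdot \nabla)_u \nabla$$

Requires pressure convection-diffusion operator

Discrete analogue:

$$(M_u^{-1} B^T)(M_p^{-1} F_p) \approx (M_u^{-1} F)(M_u^{-1} B^T)$$

$$\Rightarrow \quad B F^{-1} B^T \approx Q_S \equiv \underbrace{B M_u^{-1} B^T}_{A_p} F_p^{-1} M_p$$

2. LSC: define $F_p$ to minimize

$$\left\| (M_u^{-1} F)(M_u^{-1} B^T) - (M_u^{-1} B^T)(M_p^{-1} F_p) \right\|_{M_u}$$

$$\Rightarrow \quad Q_S^{-1} \equiv (B M_u^{-1} B^T)^{-1} (B M_u^{-1} F M_u^{-1} B^T) (B M_u^{-1} B^T)^{-1}$$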
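One way to see how the LSC formula follows from the minimization is through the weighted normal equations. The steps below are a sketch under assumptions not stated on the slide: $M_u$ and $M_p$ are symmetric positive definite, $B$ has full row rank, and the minimization is taken column by column. The shorthand $X$ and $Y$ is introduced here only for the derivation.

$$
\begin{aligned}
& X = M_u^{-1} B^T M_p^{-1}, \qquad Y = M_u^{-1} F M_u^{-1} B^T, \qquad \min_{F_p} \| Y - X F_p \|_{M_u} \\
& X^T M_u X \, F_p = X^T M_u Y \\
& M_p^{-1} (B M_u^{-1} B^T) M_p^{-1} F_p = M_p^{-1} (B M_u^{-1} F M_u^{-1} B^T) \\
& F_p = M_p (B M_u^{-1} B^T)^{-1} (B M_u^{-1} F M_u^{-1} B^T) \\
& Q_S^{-1} = M_p^{-1} F_p A_p^{-1} = (B M_u^{-1} B^T)^{-1} (B M_u^{-1} F M_u^{-1} B^T) (B M_u^{-1} B^T)^{-1}, \qquad A_p = B M_u^{-1} B^T
\end{aligned}
$$

The last line substitutes this $F_p$ into the PCD form $Q_S^{-1} = M_p^{-1} F_p A_p^{-1}$, which is why no user-supplied pressure operators are needed.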
8
Properties of these Methods

Implementation:
To implement in GMRES: need action of

$$Q^{-1} = \begin{pmatrix} Q_F & B^T \\ 0 & -Q_S \end{pmatrix}^{-1}$$

• Convection-diffusion solve for $Q_F^{-1}$
• Poisson solve(s) for $Q_S^{-1}$
Both approximated using “off-the-shelf” algebraic MG

Convergence properties:
• PCD: convergence rate independent of discretization mesh size
• LSC: some dependence on mesh size, but often faster
• Both: mild dependence on Reynolds number (steady-state), no dependence on Re (transient)
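The action of Q^{-1} amounts to a block back-substitution: first the pressure block, then the velocity block. The sketch below is a minimal illustration; `apply_Q_inverse`, `solve_QF` and `solve_QS` are hypothetical names, and exact dense solves stand in for the algebraic multigrid approximations.

```python
import numpy as np

def apply_Q_inverse(r_u, r_p, B, solve_QF, solve_QS):
    """Solve [[Q_F, B^T], [0, -Q_S]] [u; p] = [r_u; r_p] by back-substitution.

    solve_QF and solve_QS are callables that apply (approximate) inverses
    of Q_F and Q_S to a vector.
    """
    p = -solve_QS(r_p)               # second block row: -Q_S p = r_p
    u = solve_QF(r_u - B.T @ p)      # first block row: Q_F u = r_u - B^T p
    return u, p

# Small dense demo with random stand-in matrices (assumptions).
rng = np.random.default_rng(3)
n_u, n_p = 6, 2
Q_F = 4.0 * np.eye(n_u) + 0.3 * rng.standard_normal((n_u, n_u))
Q_S = 2.0 * np.eye(n_p) + 0.1 * rng.standard_normal((n_p, n_p))
B = rng.standard_normal((n_p, n_u))
r_u, r_p = rng.standard_normal(n_u), rng.standard_normal(n_p)

u, p = apply_Q_inverse(
    r_u, r_p, B,
    solve_QF=lambda v: np.linalg.solve(Q_F, v),
    solve_QS=lambda v: np.linalg.solve(Q_S, v),
)
```

Passing the solves as callables keeps the block structure separate from whichever scalar solver is plugged in.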
9
Preliminary Performance Results
[Figure: results for two test problems, Step (Newton system) and Cavity (Picard system); E., Silvester, & Wathen]
10
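Relation to SIMPLE

SIMPLE: Semi-Implicit Method for Pressure-Linked Equations (Patankar & Spalding, 1972)

$$\begin{pmatrix} F & B^T \\ B & 0 \end{pmatrix} = \begin{pmatrix} F & 0 \\ B & -B F^{-1} B^T \end{pmatrix} \begin{pmatrix} I & F^{-1} B^T \\ 0 & I \end{pmatrix} \approx \begin{pmatrix} Q_F & 0 \\ B & -B \hat{F}^{-1} B^T \end{pmatrix} \begin{pmatrix} I & \hat{F}^{-1} B^T \\ 0 & I \end{pmatrix}$$

$Q_F$: approximate convection-diffusion solve
$\hat{F}$: diagonal part of F. N.B. Does not take convection into account
Many variants (SIMPLEC: $\hat{F}$ = diag(row-sum(F)))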
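The block factorization and its SIMPLE approximation can be compared directly on a small dense example. This is a sketch with random stand-in matrices; `block_lu` is a hypothetical helper, and Q_F is taken equal to F so that the only approximation is the diagonal replacement of F^{-1}.

```python
import numpy as np

rng = np.random.default_rng(1)
n_u, n_p = 6, 2  # hypothetical sizes
F = 5.0 * np.eye(n_u) + 0.3 * rng.standard_normal((n_u, n_u))
B = rng.standard_normal((n_p, n_u))
A = np.block([[F, B.T], [B, np.zeros((n_p, n_p))]])

def block_lu(Q_F, Fhat_inv):
    """Return the two block factors built from Q_F and an approximation of F^{-1}."""
    L = np.block([[Q_F, np.zeros((n_u, n_p))], [B, -B @ Fhat_inv @ B.T]])
    U = np.block([[np.eye(n_u), Fhat_inv @ B.T], [np.zeros((n_p, n_u)), np.eye(n_p)]])
    return L, U

# Exact factorization: Q_F = F and Fhat^{-1} = F^{-1} reproduce A.
L, U = block_lu(F, np.linalg.inv(F))
exact_error = np.linalg.norm(L @ U - A)

# SIMPLE-style approximation: Fhat = diag(F).
Fhat_inv = np.diag(1.0 / np.diag(F))
Ls, Us = block_lu(F, Fhat_inv)
simple_product = Ls @ Us
simple_error = np.linalg.norm(simple_product - A)
```

With these choices the product matches A in the second block row, and the whole error sits in the (1,2) block, where B^T is replaced by F F̂^{-1} B^T.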
11
Benchmarking using MPSalsa
MPSalsa (Shadid, Salinger, Hennigan, Pawlowski, Smith, Wilkes, O’Rourke)

General purpose parallel code
• models low Mach number, incompressible and variable density fluid flows
• coupled with heat transport, multi-component species transport
• discretizes using biquadratic Petrov-Galerkin (Galerkin least squares) finite elements on unstructured grids
• offers Krylov subspace solvers with ILU/domain decomposition

Task:
• Integrate and test block preconditioner within MPSalsa
• Build using existing Sandia software
12
Benchmark Problems
1. 2D Driven Cavity
2. 3D Driven Cavity
3. 2D flow over a diamond obstruction
   Inflow-outflow b.c., unstructured grid
13
Benchmark Problems
4. 3D flow over a cube obstruction
14
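Criteria used in Numerical Experiments

Solving nonlinear algebraic system

$$\begin{pmatrix} F(u) & B^T \\ \hat{B} & -C \end{pmatrix} \begin{pmatrix} u \\ p \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}$$

using Newton’s method. Stop when iterate $(u, p)$ satisfies the nonlinear residual test

$$\left\| \begin{pmatrix} f \\ g \end{pmatrix} - \begin{pmatrix} F(u) & B^T \\ \hat{B} & -C \end{pmatrix} \begin{pmatrix} u \\ p \end{pmatrix} \right\| \le 10^{-4} \left\| \begin{pmatrix} f \\ g \end{pmatrix} \right\|$$

Jacobian system:

$$\begin{pmatrix} F & B^T \\ \hat{B} & -C \end{pmatrix} \begin{pmatrix} \delta u \\ \delta p \end{pmatrix} = \begin{pmatrix} r_f \\ r_g \end{pmatrix}$$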
15
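Criteria used in Numerical Experiments

Solve system using Pressure Convection-Diffusion (PCD) preconditioned GMRES

Stop GMRES iteration when

$$\left\| \begin{pmatrix} r_f \\ r_g \end{pmatrix} - \begin{pmatrix} F & B^T \\ \hat{B} & -C \end{pmatrix} \begin{pmatrix} \delta u^{(k)} \\ \delta p^{(k)} \end{pmatrix} \right\| \le 10^{-5} \left\| \begin{pmatrix} r_f \\ r_g \end{pmatrix} \right\|$$

Report average over Newton run:
• iterations
• CPU times

Computations done on Sandia National Laboratories’ Institutional Computing Cluster, with up to 64 dual Intel 3.6GHz Xeon processors with 2GB RAM each.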
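The two stopping tests nest as follows: an outer Newton loop checks the nonlinear residual against 10^-4 times the right-hand side norm, and each inner linear solve is required to reduce its residual by 10^-5. The sketch below illustrates this on a toy nonlinear system that is purely an assumption (it is not a flow problem); a direct solve stands in for preconditioned GMRES, and the function name `newton` and its parameters are hypothetical.

```python
import numpy as np

def newton(residual, jacobian, x0, rhs_norm, outer_tol=1e-4, inner_tol=1e-5, max_steps=20):
    """Newton iteration with a relative nonlinear-residual stopping test."""
    x = x0.copy()
    for step in range(max_steps):
        r = residual(x)
        if np.linalg.norm(r) <= outer_tol * rhs_norm:   # outer (nonlinear) test
            return x, step
        J = jacobian(x)
        # Stand-in for preconditioned GMRES: a direct solve, which trivially
        # satisfies the inner test ||r - J dx|| <= inner_tol * ||r||.
        dx = np.linalg.solve(J, r)
        assert np.linalg.norm(r - J @ dx) <= inner_tol * np.linalg.norm(r)
        x = x + dx
    raise RuntimeError("Newton iteration did not meet the stopping test")

# Toy problem (an assumption): A0 x + 0.1 x^3 = b, with the cube taken entrywise.
n = 5
A0 = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
residual = lambda x: b - (A0 @ x + 0.1 * x**3)
jacobian = lambda x: A0 + np.diag(0.3 * x**2)

x, steps = newton(residual, jacobian, np.zeros(n), rhs_norm=np.linalg.norm(b))
final_relative_residual = np.linalg.norm(residual(x)) / np.linalg.norm(b)
```

Averaging the inner iteration counts over the Newton steps gives the kind of per-run figure reported on the results slides.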
16
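Results: 2D Cavity

| Re | Mesh size | PCD Iters | PCD Time | SIMPLE Iters | SIMPLE Time | 1-level DD Iters | 1-level DD Time | Procs |
|---|---|---|---|---|---|---|---|---|
| 10 | 64 x 64 | 19.4 | 17.2 | 41.8 | 32.9 | 79.4 | 19.4 | 1 |
| 10 | 128 x 128 | 21.2 | 28.4 | 66.0 | 78.9 | 220.6 | 79.8 | 4 |
| 10 | 256 x 256 | 23.0 | 69.3 | 104.3 | 229.2 | 467.2 | 619.4 | 16 |
| 10 | 512 x 512 | 23.2 | 257.2 | 164.0 | 619.4 | 1356.8 | 2901.9 | 64 |
| 100 | 64 x 64 | 35.0 | 28.7 | 52.0 | 50.8 | 86.5 | 26.4 | 1 |
| 100 | 128 x 128 | 34.9 | 59.5 | 71.8 | 87.9 | 300.3 | 130.2 | 4 |
| 100 | 256 x 256 | 41.3 | 102.1 | 109.8 | 410.5 | 528.8 | 593.1 | 16 |
| 100 | 512 x 512 | 41.0 | 345.7 | 169.4 | 941.2 | NC | NC | 64 |
| 1000 | 64 x 64 | NC | NC | NC | NC | NC | NC | 1 |
| 1000 | 128 x 128 | 126.4 | 570.9 | 142.0 | 1220.4 | 352.5 | 275.8 | 4 |
| 1000 | 256 x 256 | 126.6 | 1207.6 | 251.6 | 3494.2 | 839.5 | 2009.6 | 16 |
| 1000 | 512 x 512 | 143.2 | 2563.2 | 401.2 | 7598.2 | NC | NC | 64 |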
17
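Results: 3D Cavity

| Re | Mesh size | PCD Iters | PCD Time | SIMPLE Iters | SIMPLE Time | 1-level DD Iters | 1-level DD Time | Procs |
|---|---|---|---|---|---|---|---|---|
| 10 | 32 x 32 x 32 | 28.0 | 802.3 | 30.5 | 1205.6 | 67.0 | 634.6 | 1 |
| 10 | 64 x 64 x 64 | 28.4 | 865.2 | 50.8 | 2034.1 | 159.8 | 1507.5 | 8 |
| 10 | 128 x 128 x 128 | 31.1 | 1249.0 | 280.8 | 12490.5 | 356.2 | 4529.3 | 64 |
| 50 | 32 x 32 x 32 | 40.2 | 946.9 | 33.3 | 1302.6 | 62.2 | 615.5 | 1 |
| 50 | 64 x 64 x 64 | 47.8 | 1061.6 | 52.5 | 2457.6 | 162.6 | 1533.2 | 8 |
| 50 | 128 x 128 x 128 | 50.1 | 2101.2 | 291.2 | 14987.2 | 385.5 | 6460.9 | 64 |
| 100 | 32 x 32 x 32 | 56.0 | 1232.7 | 40.8 | 1884.4 | 67.0 | 730.7 | 1 |
| 100 | 64 x 64 x 64 | 62.1 | 1697.8 | 61.6 | 3184.4 | 159.8 | 2131.6 | 8 |
| 100 | 128 x 128 x 128 | 64.2 | 3019.2 | 299.1 | 17184.2 | 356.2 | 6953.9 | 64 |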
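Results: 2D Flow over Diamond Obstruction

| Re | Unknowns | PCD Iters | PCD Time | SIMPLE Iters | SIMPLE Time | 1-level DD Iters | 1-level DD Time | Procs |
|---|---|---|---|---|---|---|---|---|
| 10 | 62K | 21.7 | 138.8 | 52.8 | 502.2 | 110.8 | 186.6 | 1 |
| 10 | 256K | 22.6 | 192.7 | 83.6 | 1203.9 | 282.6 | 1054.9 | 4 |
| 10 | 1M | 25.6 | 252.3 | 130.8 | 1845.3 | 890.2 | 6187.4 | 16 |
| 10 | 4M | 29.7 | 397.5 | 212.6 | 5834.6 | NC | NC | 64 |
| 25 | 62K | 34.9 | 248.0 | 66.5 | 760.5 | 101.7 | 198.8 | 1 |
| 25 | 256K | 40.4 | 384.6 | 104.7 | 1920.3 | 273.8 | 1118.6 | 4 |
| 25 | 1M | 43.6 | 445.9 | 160.8 | 2985.2 | 864.5 | 6226.0 | 16 |
| 25 | 4M | 49.1 | 736.6 | 402.1 | 8241.3 | NC | NC | 64 |
| 40 | 62K | 64.6 | 565.8 | 74.8 | 1278.7 | 70.4 | 267.2 | 1 |
| 40 | 256K | 68.9 | 975.2 | 113.6 | 2718.9 | 203.9 | 1269.3 | 4 |
| 40 | 1M | 72.7 | 1039.2 | 260.9 | 7535.0 | 770.0 | 6933.5 | 16 |
| 40 | 4M | 78.3 | 1528.6 | 410.1 | 11992.2 | NC | NC | 64 |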
19
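Results: 3D Flow over Cube Obstruction

| Re | Unknowns | PCD Iters | PCD Time | SIMPLE Iters | SIMPLE Time | 1-level DD Iters | 1-level DD Time | Procs |
|---|---|---|---|---|---|---|---|---|
| 10 | 270K | 20.7 | 997.7 | 45.2 | 1897.1 | 67.2 | 859.8 | 1 |
| 10 | 2.1M | 21.7 | 1507.5 | 79.3 | 4593.2 | 151.2 | 2004.0 | 8 |
| 10 | 16.8M | 24.7 | 1997.7 | 118.7 | 19907.1 | 667.2 | 20908.0 | 64 |
| 50 | 270K | 35.9 | 1209.7 | 49.2 | 2109.2 | 69.4 | 889.2 | 1 |
| 50 | 2.1M | 38.7 | 1797.7 | 84.9 | 3201.3 | 132.4 | 2676.1 | 8 |
| 50 | 16.8M | 44.7 | 2397.7 | 140.2 | 28156.1 | 637.2 | 18646.0 | 64 |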
Graphical Depiction of these Results
[Figure: three plots of CPU time versus number of unknowns, comparing pressure convection-diffusion, SIMPLE, and domain decomposition. Panels: 3D Cavity, Re=50 (32^3, 64^3, 128^3); 3D Flow over Obstacle, Re=50 (270K, 2.1M, 16.8M unknowns); 2D Flow over Obstacle, Re=40 (62K, 256K, 1M, 4M unknowns).]
21
Implementation Issues
1. Solving subsidiary scalar problems (convection-diffusion and Poisson equations) using “off-the-shelf” algebraic multigrid software ML (smoothed aggregation).
2. Solving these systems “inexactly”.
3. Other components of the code built using Sandia tools (Trilinos, Meros, Epetra, Aztec, CHACO, NOX), which handle nonlinear and Krylov subspace solvers and all parallelism.
22
Application: Topology of MicroFluidics Devices
High level problem statement:
• Mix two liquids at low Re
• Flow driven by electrokinetic means: induced charge electro-osmosis (ICEO), via charge on interior obstacles
• Goal: choose shape and topology of obstructions to optimize “mixing metric”

[Figure: device schematic with labels: embedded electrodes, load, mix]

Collaboration with SNL’s Thermal/Fluid Science & Engineering Group (M. P. Kanouff, J. Templeton)
23
Computational Procedure

Given topology of device (38 parameters):

• Electric field on obstacles obtained by solving the Laplace equation for the electric potential; the tangential component of E (the gradient of the potential) defines the velocity b.c. along obstructions

• Solve incompressible NS equations

• Use computed velocity u to obtain mass fraction m of solute

$$-D \nabla^2 m + (u \cdot \operatorname{grad})\, m = 0$$

• Calculate mixing metric = measure of extent of mixing

$$M = \int_V (m - \bar{m})^2 \, dV$$
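A discrete version of the mixing metric can be sketched as a weighted sum over mesh cells. Several things here are assumptions: the reference value is taken to be the volume-weighted mean of m, the integral is not normalized by the total volume, and the function name `mixing_metric` and the cellwise quadrature are illustrative only.

```python
import numpy as np

def mixing_metric(m, cell_volumes):
    """Approximate the integral of (m - m_bar)^2 over the domain by a cellwise sum.

    m            : mass fraction per cell
    cell_volumes : volume (quadrature weight) per cell
    m_bar is ASSUMED to be the volume-weighted mean of m.
    """
    m = np.asarray(m, dtype=float)
    w = np.asarray(cell_volumes, dtype=float)
    m_bar = np.sum(w * m) / np.sum(w)
    return float(np.sum(w * (m - m_bar) ** 2))

# Perfectly mixed field: the metric vanishes.
mixed = mixing_metric([0.5, 0.5, 0.5, 0.5], [0.25, 0.25, 0.25, 0.25])

# Fully segregated field on a unit-volume domain: half the cells at 0, half at 1.
segregated = mixing_metric([0.0, 0.0, 1.0, 1.0], [0.25, 0.25, 0.25, 0.25])
```

Smaller values indicate better mixing, which is why the optimization minimizes M.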
24
Computational Procedure

Minimize M with respect to 38 design parameters

Optimization performed using derivative-free asynchronous parallel pattern search, via APPSPACK (Gray, Griffin, Hough, Kolda, Torczon)

[Figure: optimization loop]

Software environment: SUNDANCE (K. Long)

Results: Use PCD-Preconditioned GMRES

| CPU time | Iteration Counts |
|---|---|
| 20898.2 | 66.3 |
| 20488.9 | 67.3 |
| 20515.5 | 60.4 |
| 20173.8 | 69.2 |
| 20643.1 | 68.2 |
| 20923.9 | 66.1 |
| 21874.1 | 67.1 |
| 20831.1 | 62.1 |
| 21765.1 | 64.0 |
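To give a feel for derivative-free pattern search, here is a minimal serial compass search. It is not APPSPACK and does not use its API: APPSPACK is asynchronous and parallel, whereas this sketch polls the coordinate directions one at a time. The function name `compass_search`, its parameters, and the quadratic test objective are all assumptions for illustration.

```python
import numpy as np

def compass_search(f, x0, step=0.5, tol=1e-6, max_evals=10_000):
    """Minimize f by polling +/- coordinate directions; halve the step on failure."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    evals = 1
    n = x.size
    directions = np.vstack([np.eye(n), -np.eye(n)])
    while step > tol and evals < max_evals:
        improved = False
        for d in directions:
            trial = x + step * d
            f_trial = f(trial)
            evals += 1
            if f_trial < fx:             # accept the first improving poll point
                x, fx, improved = trial, f_trial, True
                break
        if not improved:
            step *= 0.5                  # no poll point improved: contract the pattern
    return x, fx

# Toy objective standing in for the mixing metric M(design parameters).
target = np.array([1.0, -2.0])
objective = lambda x: float(np.sum((x - target) ** 2))

x_best, f_best = compass_search(objective, np.zeros(2))
```

Only function values are used, which suits a setting where each evaluation of M requires a full flow solve and derivatives with respect to the design parameters are not available.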
26
Examples of Flow Fields Computed
[Figure: computed flow fields, labeled with their mixing metrics]
• M = 0.0233216
• M = 0.000923394
• M = 0.000811796
• M = 0.032451
• Original: M = 0.0287106
27
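Ongoing Efforts

1. Extension of these ideas to spectral element methods
   Build using additive Schwarz methods with fast diagonalization methods on subdomains

2. Use of these ideas for stability analysis of flows: solve

$$\begin{pmatrix} F & B^T \\ B & 0 \end{pmatrix} \begin{pmatrix} w \\ q \end{pmatrix} = \lambda \begin{pmatrix} M_u & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} w \\ q \end{pmatrix}$$

3. Extension of approach to handle thermal / chemical effects
   E.g. Boussinesq model

$$\begin{pmatrix} F_u & G & B^T \\ H & F_T & 0 \\ B & 0 & 0 \end{pmatrix} \begin{pmatrix} \delta u \\ \delta T \\ \delta p \end{pmatrix} = \begin{pmatrix} f \\ g \\ h \end{pmatrix}$$

4. Uncertainty quantification: solution algorithms for problems posed with uncertainty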
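For the stability eigenproblem in item 2, the mass matrix on the right is singular, so only some eigenvalues are finite. One way to expose them on a small dense example is to restrict the problem to the null space of B (the discretely divergence-free velocities). The sketch below uses random stand-in matrices and is an illustration of that reduction only, not of how large-scale problems would be solved with the preconditioners discussed here.

```python
import numpy as np

rng = np.random.default_rng(2)
n_u, n_p = 7, 3  # hypothetical sizes
F = 3.0 * np.eye(n_u) + 0.5 * rng.standard_normal((n_u, n_u))
B = rng.standard_normal((n_p, n_u))
M_u = np.diag(rng.uniform(1.0, 2.0, n_u))      # stand-in for the velocity mass matrix

# Orthonormal basis Z of null(B), taken from the SVD of B.
_, _, Vt = np.linalg.svd(B)
Z = Vt[n_p:].T                                  # shape (n_u, n_u - n_p)

# Reduced problem (Z^T F Z) y = lambda (Z^T M_u Z) y carries the finite eigenvalues.
lam, Y = np.linalg.eig(np.linalg.solve(Z.T @ M_u @ Z, Z.T @ F @ Z))

A = np.block([[F, B.T], [B, np.zeros((n_p, n_p))]])
M = np.block([[M_u, np.zeros((n_u, n_p))],
              [np.zeros((n_p, n_u)), np.zeros((n_p, n_p))]])

# Recover (w, q) for each eigenvalue and check the original block equation.
max_residual = 0.0
for k in range(lam.size):
    w = Z @ Y[:, k]
    # F w - lambda M_u w is orthogonal to null(B), so it lies in range(B^T).
    rhs = lam[k] * (M_u @ w) - F @ w
    q, *_ = np.linalg.lstsq(B.T.astype(complex), rhs.astype(complex), rcond=None)
    v = np.concatenate([w, q])
    max_residual = max(max_residual, np.linalg.norm(A @ v - lam[k] * (M @ v)))
```

The reduced problem has n_u - n_p eigenvalues, which is the number of finite eigenvalues of the block problem when B has full row rank.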