Domain-specific languages and automated
code generation for scientific computing
Garth N. Wells
Department of Engineering, University of Cambridge
Software Frameworks for Challenging Computational Problems, University of Crete
14 January 2013
Collaborators
Martin S. Alnæs, Johan Hake, Anders Logg, Marie E. Rognes, Kristian B. Ølgaard
http://www.eng.cam.ac.uk/~gnw20
Outline
• Examples of expressive computing for PDEs
• Domain-specific languages for scientific computing
• FEniCS libraries for solving PDEs
• FEniCS examples
• Scientific software community building
Availability
All code available under GNU licenses:
http://www.fenicsproject.org
Book available under a Creative Commons license:
http://www.fenicsproject.org/book
Reaction-diffusion equation
Differential format
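$$-\nabla^2 u + u = f \quad \text{in } \Omega$$
$$\nabla u \cdot n = 0 \quad \text{on } \partial\Omega$$
Variational format: find $u \in V \subset H^1(\Omega)$ such that
$$a(u, v) = L(v) \quad \forall\, v \in V$$
Bilinear and linear forms
$$a(u, v) := \int_\Omega \nabla u \cdot \nabla v + u v \, dx$$
$$L(v) := \int_\Omega f v \, dx$$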
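The step from the differential to the variational format can be written out. This is a sketch that assumes u and v are smooth enough for integration by parts: multiply the equation by a test function v, integrate over Ω and apply the divergence theorem.

```latex
\int_\Omega \left( -\nabla^2 u + u \right) v \, dx
  = \int_\Omega \nabla u \cdot \nabla v + u v \, dx
    - \int_{\partial\Omega} (\nabla u \cdot n) \, v \, ds
  = \int_\Omega f v \, dx
```

The boundary integral vanishes because ∇u · n = 0 on ∂Ω, which leaves a(u, v) = L(v).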
Reaction-diffusion equation
Complete solver – Python interface

    from dolfin import *

    # Create mesh and define function space
    mesh = UnitCubeMesh(16, 16, 16)
    V = FunctionSpace(mesh, "Lagrange", 1)

    # Define variational problem
    u = TrialFunction(V)
    v = TestFunction(V)
    f = Expression("sin(x[0])*sin(x[1])")
    a = dot(grad(u), grad(v))*dx + u*v*dx
    L = f*v*dx

    # Compute solution
    u = Function(V)
    solve(a == L, u)
    plot(u, interactive=True)
Stokes equations
Differential format:
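$$-\nabla^2 u + \nabla p = f \quad \text{in } \Omega$$
$$\nabla \cdot u = 0 \quad \text{in } \Omega$$
Find $(u, p) \in V \times Q$ such that
$$a((u, p), (v, q)) = L((v, q)) \quad \forall\, (v, q) \in V \times Q$$
where
$$a((u, p), (v, q)) := \int_\Omega \nabla u : \nabla v - p \, \nabla \cdot v + (\nabla \cdot u) \, q \, dx$$
$$L((v, q)) := \int_\Omega f \cdot v \, dx$$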
Stokes equations
Domain-specific language representation

    # Create mixed space (Taylor-Hood)
    V = VectorElement("Lagrange", "tetrahedron", 2)
    Q = FiniteElement("Lagrange", "tetrahedron", 1)
    TH = V * Q

    # Create trial and test functions
    (u, p) = TrialFunctions(TH)
    (v, q) = TestFunctions(TH)

    # Coefficient function appearing in L
    f = Coefficient(V)

    # Define forms
    a = inner(grad(u), grad(v))*dx - p*div(v)*dx + div(u)*q*dx
    L = dot(f, v)*dx
Nonlinear Poisson-like equation
Differential format:
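$$-\nabla \cdot \left( (1 + u^2) \nabla u \right) = f$$
Variational format: find $u \in V$ such that
$$F(u; v) = 0 \quad \forall\, v \in V$$
where the functional $F$ is linear in $v$ and nonlinear in $u$:
$$F(u; v) := \int_\Omega (1 + u^2) \nabla u \cdot \nabla v - f v \, dx$$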
Nonlinear Poisson-like equation
Domain-specific language representation

    # Function space
    V = FiniteElement("Lagrange", "tetrahedron", 2)

    # Coefficients
    u = Coefficient(V)
    f = Coefficient(V)

    # Define residual (want to solve F = 0)
    v = TestFunction(V)
    F = (1.0 + u*u)*dot(grad(u), grad(v))*dx - f*v*dx

    # Jacobian and incremental correction for a Newton solver
    du = TrialFunction(V)
    J = derivative(F, u, du)
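For reference, the Jacobian that `derivative(F, u, du)` stands for can be worked out by hand for this residual. The following is a sketch of the Gateaux derivative of F in the direction δu (written `du` in the code), assuming f does not depend on u:

```latex
J(u; \delta u, v)
  = \left. \frac{d}{d\epsilon} F(u + \epsilon \, \delta u; v) \right|_{\epsilon = 0}
  = \int_\Omega (1 + u^2) \, \nabla \delta u \cdot \nabla v
    + 2 u \, \delta u \, \nabla u \cdot \nabla v \, dx
```

A Newton iteration would then solve J(u_k; δu, v) = −F(u_k; v) for all v in V and update u_{k+1} = u_k + δu.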
Domain-specific languages for scientific computing
• Language designed to support an application domain
• Expressive, mathematical syntax
• Support high-level abstractions
• Correctness checks
• Scope for domain-specific optimisations
• Represent intention – oblivious to low-level details
Traditional development approaches
• Time consuming and error prone
• Mathematical abstraction discarded in software representation
• Often blurred boundary between method definition and implementation
• Efficiency – readability/generality paradox
• New hardware is shifting the burden back onto the developer
• Traditional programming languages are static
What DSLs can deliver: accelerated development
• Expressiveness
• Compact representations of intention
• Reduction in errors
• Address multiple low-level programming models transparently (threaded, MPI, OpenCL, CUDA, FPGA, . . .)
• Extensible (if well designed)
• Creation of auxiliary problems, e.g. Jacobians, adjoints for PDEs
What DSLs can deliver: higher performance
• Readable input code → fast execution code
• Algorithm-specific optimisations, e.g.
  • $(A^T)^T = A$
  • $\nabla u = 0$ if $u$ is constant
• Generate code representations that are not feasible by hand
• Search an algorithm space
• Target-specific low-level code
Not all domain-specific languages are equally expressive . . .
Input File for Wave Eqn

    @THORN SimpleWave
    @DERIVATIVES
    PDstandard2nd[i_] -> StandardCenteredDifferenceOperator[1,1,i],
    PDstandard2nd[i_, i_] -> StandardCenteredDifferenceOperator[2,1,i],
    PDstandard2th[i_, j_] -> StandardCenteredDifferenceOperator[1,1,i] *
                             StandardCenteredDifferenceOperator[1,1,j]
    @END_DERIVATIVES
    @TENSORS
    phi, pi
    @END_TENSORS
    @GROUPS
    phi -> "phi_group",
    pi -> "pi_group"
    @END_GROUPS
    @DEFINE PD = PDstandard2nd
    ...
    ...
    @CALCULATION "initial_sine"
    @Schedule "AT INITIAL"
    @EQUATIONS
    phi -> Sin[2 Pi (x - t)],
    pi -> -2 Pi Cos[2 Pi (x - t)]
    @END_EQUATIONS
    @END_CALCULATION
    @CALCULATION "calc_rhs"
    @Schedule "in MoL_CalcRHS"
    @EQUATIONS
    dot[phi] -> pi,
    dot[pi] -> Euc[ui,uj] PD[phi,li,lj]

http://hpc.pnl.gov/conf/wolfhpc/2011/talks/StevenBrandt.pdf
Some domain-specific languages for non-PDE applications in scientific computing
• Elemental (dense linear algebra)
• SPL/Spiral (digital signal processing)
• Tensor Contraction Engine (quantum chemistry)
• . . .
Some DSLs for solving PDEs numerically
Domain-specific languages (DSL)
• Analysa
• FreeFEM++
Domain-specific embedded languages (DSEL)
• Liszt (Scala)
• FEEL++ (C++)
• Sundance (C++)
• AceGen (Mathematica) – not open
• Unified Form Language (Python)
FEniCS Project
• Collaborative project on automating the solution of PDEs
• Modular collection of free software libraries
http://www.fenicsproject.org
Main FEniCS components
• FIAT (tabulation of basis functions)
• Unified Form Language (UFL)
• Instant (just-in-time compilation)
• FEniCS Form Compiler (FFC)
• UFC (generated code form specification)
• DOLFIN (problem solving environment)
UFL: Unified Form Language
A language embedded in Python for variational forms based on mathematical abstractions – involves both a specification and algorithms
Sub-languages
• Function spaces
• Expressions
• Forms
Algorithms
• Adjoints
• Differentiation
• Extraction based on form arity
• . . .
Alnæs, 2012; Alnæs, Logg, Rognes, Ølgaard, Wells, http://arxiv.org/abs/1211.4047
UFL: example language elements (1)
Function spaces

    P2 = VectorElement("Lagrange", "triangle", 2)
    P1 = FiniteElement("Discontinuous Lagrange", "triangle", 1)
    R = FiniteElement("Real", "triangle", 0)
    ME0 = P2*P1
    ME1 = MixedElement([P2, [P1, P1], P1, R])

Expressions

    u = Function(P2)
    I = Identity(element.cell().d)  # Identity tensor
    F = I + grad(u)                 # Deformation gradient
    C = F.T*F                       # Right Cauchy-Green tensor

    # Invariants of deformation tensors
    Ic = tr(C)
    J = det(F)

    # Stored strain energy density
    psi = (mu/2)*(Ic - 3) - mu*ln(J) + (lmbda/2)*(ln(J))**2
UFL: example language elements (2)
Forms

    M = f*dx(2) + f*ds(5)
    L = f*v*dx + g*v*ds
    a = dot(grad(u), grad(v))*dx - dot(jump(u, n), avg(grad(v)))*dS
    a = dot(grad(u), grad(v))*dx(0, {"quadrature_order": 1})

Form operations

    M = action(F, f)
    a = lhs(F)
    L = rhs(F)

Algorithms

    L = derivative(F, u, v)
    a = derivative(L, u, du)
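UFL: abstract syntax tree for H1-conforming Poisson formulation
[Figure: abstract syntax trees of the forms a and L. Form a is a single cell integral of the product of kappa and the inner product of grad(u) and grad(v). Form L has a cell integral of the product of v and f, and an exterior facet integral of -1 times the product of v and g.]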
UFL: abstract syntax tree for L2-conforming Poisson formulation
[Figure: abstract syntax tree of form a, made up of a cell integral, an exterior facet integral and an interior facet integral. Node labels include kappa, grad, dot, n, u, v, circumradius, the restrictions [+] and [-], and the indices i_8 to i_19.]
UFL: features
• Mathematical error checking
• Basic optimisations (must be floating-point safe)
  • Multiply by one, zero
  • Add zero
  • Constant folding
• Developed in Python
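The basic optimisations listed above can be illustrated with a toy embedded expression language. This is a minimal sketch in plain Python, not UFL's implementation: the classes `Const`, `Symbol`, `Sum` and `Product` and the helper functions are hypothetical names chosen for the example. Operator overloading builds the tree, and the rewrites are applied as each node is constructed.

```python
from dataclasses import dataclass
from typing import Union

Number = Union[int, float]


@dataclass(frozen=True)
class Expr:
    """Base node: arithmetic operators build (and simplify) the tree."""

    def __add__(self, other):
        return make_sum(self, as_expr(other))

    def __radd__(self, other):
        return make_sum(as_expr(other), self)

    def __mul__(self, other):
        return make_product(self, as_expr(other))

    def __rmul__(self, other):
        return make_product(as_expr(other), self)


@dataclass(frozen=True)
class Const(Expr):
    value: Number


@dataclass(frozen=True)
class Symbol(Expr):
    name: str


@dataclass(frozen=True)
class Sum(Expr):
    left: Expr
    right: Expr


@dataclass(frozen=True)
class Product(Expr):
    left: Expr
    right: Expr


def as_expr(x):
    """Wrap plain Python numbers as constant nodes."""
    return x if isinstance(x, Expr) else Const(x)


def make_sum(a, b):
    if isinstance(a, Const) and isinstance(b, Const):
        return Const(a.value + b.value)  # constant folding
    if a == Const(0):
        return b                         # add zero
    if b == Const(0):
        return a
    return Sum(a, b)


def make_product(a, b):
    if isinstance(a, Const) and isinstance(b, Const):
        return Const(a.value * b.value)  # constant folding
    if a == Const(0) or b == Const(0):
        return Const(0)                  # multiply by a symbolic zero
    if a == Const(1):
        return b                         # multiply by one
    if b == Const(1):
        return a
    return Product(a, b)


u = Symbol("u")
expr = (Const(2) * 3) + 1 * u + 0 * u
```

Here `expr` reduces to `Sum(Const(6), Symbol("u"))`: the product of constants is folded, the factor one is dropped and the term multiplied by zero disappears.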
Abstract syntax tree to concrete code: compilers
Generality Efficiency
Compiler
Some UFL compilers:
• FEniCS Form Compiler, FFC (Logg, Ølgaard, Rognes and Wells)
• Symbolic Form Compiler, SFC (Alnæs and Mardal)
• Manycore Form Compiler (Markall, Rathgeber, et al.)
Automation with FEniCS
Input
Equation (variational problem)
Output
Efficient application-specific code
Kirby and Logg 2006; Ølgaard, Logg and Wells, 2010, Logg and Wells, 2010, . . .
FFC: FEniCS Form Compiler

    // This code conforms with the UFC specification version 2.1.0+
    // and was automatically generated by FFC version 1.1.0+.
    //
    // This code was generated with the option '-l dolfin' and
    // contains DOLFIN-specific wrappers that depend on DOLFIN.
    //
    // This code was generated with the following parameters:
    //
    //   cache_dir:                      ''
    //   convert_exceptions_to_warnings: False
    //   cpp_optimize:                   False
    //   cpp_optimize_flags:             '-O2'
    //   epsilon:                        1e-14
    //   error_control:                  False
    //   form_postfix:                   True
    //   format:                         'dolfin'
    //   log_level:                      10
    //   log_prefix:                     ''
    //   no_ferari:                      True
    //   optimize:                       True

Kirby and Logg 2006; Ølgaard and Wells, 2010; Logg et al, 2012
FFC: generation-time performance optimisations
• Novel representations (Kirby and Logg, ACM TOMS 2006)
• Structure-based methods to reduce floating-point operations (Kirby, et al, SISC 2005)
• Symbolic analysis to minimise floating-point operations (Ølgaard & Wells, ACM TOMS 2010)
FFC: representations
Poisson element matrix:
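$$k_{ij} := \int_E \nabla \phi_i \cdot \nabla \phi_j \, dx$$
1. ‘Tensor contraction’ representation (affine map only)
$$k_{ij} := A_{ijkl} G_{kl}$$
where $A$ is independent of the geometry and $G$ is dependent on the geometry.
2. Quadrature
$$k_{ij} = \sum_{q=1}^{N} \sum_{\alpha_1=1}^{d} \sum_{\alpha_2=1}^{d} \sum_{\beta=1}^{d} W_q \frac{\partial X_{\alpha_1}}{\partial x_\beta} \frac{\partial \Phi_i(X_q)}{\partial X_{\alpha_1}} \frac{\partial X_{\alpha_2}}{\partial x_\beta} \frac{\partial \Phi_j(X_q)}{\partial X_{\alpha_2}} \det F$$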
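FFC: tensor contraction representation
Poisson element stiffness matrix
$$k_{ij} = A_{ijkl} G_{kl}$$
where
$$A_{ijkl} = \int_{E_0} \frac{\partial \Phi_i}{\partial X_k} \frac{\partial \Phi_j}{\partial X_l} \, dX$$
$$G_{kl} = \det F \, \frac{\partial X_k}{\partial x_m} \frac{\partial X_l}{\partial x_m}$$
• A is model specific and can be evaluated prior to run-time
• G is dependent on element geometry and is evaluated at run-time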
• Contraction can be unrolled
Kirby & Logg, ACM TOMS 2006
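The split into a precomputed reference tensor and a run-time geometry tensor can be tried out on the simplest case. The sketch below assumes linear (P1) Lagrange elements on triangles with an affine map x = F X + x0; all function names are hypothetical, and this is hand-written Python rather than the code FFC generates.

```python
# Gradients dPhi_i/dX_k of the three P1 basis functions on the
# reference triangle with vertices (0,0), (1,0), (0,1).
DPHI = [(-1.0, -1.0), (1.0, 0.0), (0.0, 1.0)]
REF_AREA = 0.5


def reference_tensor():
    """A[i][j][k][l]: integral over the reference cell of dPhi_i/dX_k * dPhi_j/dX_l.

    The integrand is constant for P1, so the integral is area times value.
    This tensor does not depend on the mesh and is computed once.
    """
    return [[[[REF_AREA * DPHI[i][k] * DPHI[j][l] for l in range(2)]
              for k in range(2)] for j in range(3)] for i in range(3)]


def geometry_tensor(vertices):
    """G[k][l] = |det F| * dX_k/dx_m * dX_l/dx_m for one physical triangle."""
    (x0, y0), (x1, y1), (x2, y2) = vertices
    # Jacobian F = dx/dX of the affine map
    f00, f01 = x1 - x0, x2 - x0
    f10, f11 = y1 - y0, y2 - y0
    det = f00 * f11 - f01 * f10
    # Inverse Jacobian: inv[k][m] = dX_k/dx_m
    inv = [[f11 / det, -f01 / det], [-f10 / det, f00 / det]]
    # abs() makes the result independent of vertex orientation
    return [[abs(det) * sum(inv[k][m] * inv[l][m] for m in range(2))
             for l in range(2)] for k in range(2)]


def element_matrix(A, G):
    """Contract the reference tensor with the geometry tensor: k_ij = A_ijkl G_kl."""
    return [[sum(A[i][j][k][l] * G[k][l] for k in range(2) for l in range(2))
             for j in range(3)] for i in range(3)]


A = reference_tensor()  # evaluated once, before any cell is visited
K = element_matrix(A, geometry_tensor([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]))
```

For the reference triangle itself G is the identity, so `K` is the familiar P1 stiffness matrix with rows (1, -0.5, -0.5), (-0.5, 0.5, 0) and (-0.5, 0, 0.5). In generated code the contraction loops would be unrolled, since the entries of A are known at generation time.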
Tensor contraction optimisations
Matrix representation of A for Poisson equation (Lagrange, k = 2):

     3   0   0  -1   1   1  -4  -4   0   4   0   0
     0   0   0   0   0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0   0   0   0   0
    -1   0   0   3   1   1   0   0   4   0  -4  -4
     1   0   0   1   3   3  -4   0   0   0   0  -4
     1   0   0   1   3   3  -4   0   0   0   0  -4
    -4   0   0   0  -4  -4   8   4   0  -4   0   4
    -4   0   0   0   0   0   4   8  -4  -8   4   0
     0   0   0   4   0   0   0  -4   8   4  -8  -4
     4   0   0   0   0   0  -4  -8   4   8  -4   0
     0   0   0  -4   0   0   0   4  -8  -4   8   4
     0   0   0  -4  -4  -4   4   0  -4   0   4   8

Exploit structures in A to reduce operation count – find complexity-reducing relationships
Kirby et al. ACM TOMS 2005, 2006
FFC: quadrature optimisations
1. Eliminate operations on zeroes a priori
2. Tabulate basis functions
3. Simplify expressions, e.g. x(y + z) + 2xy → x(3y + z)
4. Loop invariant code motion to reduce floating-point operations
Ølgaard and Wells, ACM TOMS 2010
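Loop invariant code motion, item 4 above, can be shown on a small quadrature loop. This is a hand-written Python sketch rather than FFC output: the function names, the weighted mass-matrix integrand and the three-point quadrature rule are assumptions made for the example.

```python
def assemble_naive(weights, basis, kappa, det):
    """Every factor is multiplied again inside the innermost loop."""
    n = len(basis[0])
    K = [[0.0] * n for _ in range(n)]
    for q, w in enumerate(weights):
        for i in range(n):
            for j in range(n):
                K[i][j] += w * kappa * det * basis[q][i] * basis[q][j]
    return K


def assemble_hoisted(weights, basis, kappa, det):
    """Same sum, with each product moved to the outermost loop it depends on."""
    n = len(basis[0])
    K = [[0.0] * n for _ in range(n)]
    scale = kappa * det              # independent of q, i and j
    for q, w in enumerate(weights):
        wq = w * scale               # independent of i and j
        for i in range(n):
            row = wq * basis[q][i]   # independent of j
            for j in range(n):
                K[i][j] += row * basis[q][j]
    return K


# P1 basis functions tabulated at the edge midpoints of the reference triangle
weights = [1.0 / 6.0] * 3
basis = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
K_naive = assemble_naive(weights, basis, kappa=2.0, det=1.0)
K_fast = assemble_hoisted(weights, basis, kappa=2.0, det=1.0)
```

The innermost statement drops from four multiplications to one. The two results agree up to floating-point rounding, because the order of the multiplications has changed.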
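FFC: quadrature optimisations – runtime performance
Weighted Laplace
[Figure: run time in seconds (logarithmic axis, 10^1 to 10^3) for the optimisation options none, -zeros, -simplify, -simplify-zeros, -ip, -ip-zeros, -basis and -basis-zeros, with series for the compiler flags -O0, -O2, -O2 -funroll-loops and -O3.]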
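FFC: quadrature optimisations – runtime performance
Hyperelasticity
[Figure: run time in seconds (logarithmic axis, 10^0 to 10^5) for the optimisation options none, -zeros, -simplify, -simplify-zeros, -ip, -ip-zeros, -basis and -basis-zeros, with series for the compiler flags -O0, -O2, -O2 -funroll-loops and -O3.]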
FFC: relative performance
$$a(u, v) = \int_E (f_0 f_1 \cdots f_{n_f}) \, \nabla^s u : \nabla^s v \, dx$$
| | nf = 1 flops | nf = 1 q/t | nf = 2 flops | nf = 2 q/t | nf = 3 flops | nf = 3 q/t |
|---|---|---|---|---|---|---|
| p = 1, q = 1 | 888 | 0.34 | 3060 | 0.36 | 10224 | 0.11 |
| p = 1, q = 2 | 3564 | 1.42 | 11400 | 1.01 | 35748 | 0.33 |
| p = 1, q = 3 | 10988 | 3.23 | 34904 | 1.82 | 100388 | 0.63 |
| p = 1, q = 4 | 26232 | 5.77 | 82548 | 2.87 | 254304 | 0.93 |
| p = 2, q = 1 | 888 | 1.20 | 8220 | 0.31 | 54684 | 0.09 |
| p = 2, q = 2 | 7176 | 1.59 | 41712 | 0.49 | 284232 | 0.11 |
| p = 2, q = 3 | 22568 | 2.80 | 139472 | 0.71 | 856736 | 0.17 |
| p = 2, q = 4 | 54300 | 4.36 | 337692 | 1.01 | 2058876 | 0.23 |
| p = 3, q = 1 | 3044 | 0.36 | 30236 | 0.16 | 379964 | 0.02 |
| p = 3, q = 2 | 12488 | 0.92 | 126368 | 0.26 | 1370576 | 0.03 |
| p = 3, q = 3 | 36664 | 1.73 | 391552 | 0.37 | 4034704 | 0.05 |
| p = 3, q = 4 | 92828 | 2.55 | 950012 | 0.49 | 9566012 | 0.06 |

p: order of f_i
q: order of u and v
DOLFIN: problem solving environment
• Main FEniCS user interface
• Re-usable library designed to support generated application-specific code
• Third-party linear algebra interfaces (PETSc, Trilinos, NumPy, ...)
• Distributed and shared memory parallelism
DOLFIN: interfaces
• Near identical C++ and Python interfaces
• Python interface generated largely automatically from C++ using SWIG
• Smart pointers provide robust memory management between C++ and Python interfaces
• Python interface dramatically reduces user adoption threshold
• Limited use of templates in high-level user interface makes Python wrapping tractable
• UFL, FFC and DOLFIN are seamlessly integrated in Python interface
DOLFIN: C++ Poisson demo

    #include <dolfin.h>
    #include "Poisson.h"
    using namespace dolfin;

    // ... (definitions of Source and DirichletBoundary omitted)

    int main()
    {
      // Create mesh and function space
      UnitSquareMesh mesh(32, 32);
      Poisson::FunctionSpace V(mesh);

      // Define boundary condition
      Constant u0(0.0);
      DirichletBoundary boundary;
      DirichletBC bc(V, u0, boundary);

      // Define variational forms
      Poisson::BilinearForm a(V, V);
      Poisson::LinearForm L(V);
      Source f;
      L.f = f;

      // Compute solution
      Function u(V);
      solve(a == L, u, bc);

      // Save solution in VTK format
      File file("poisson.pvd");
      file << u;

      return 0;
    }
DOLFIN: Python Poisson demo

    from dolfin import *

    # Create mesh and define function space
    mesh = UnitSquareMesh(32, 32)
    V = FunctionSpace(mesh, "Lagrange", 1)

    # Define Dirichlet boundary (x = 0 or x = 1)
    def boundary(x):
        return x[0] < DOLFIN_EPS or x[0] > 1.0 - DOLFIN_EPS

    # Define boundary condition
    u0 = Constant(0.0)
    bc = DirichletBC(V, u0, boundary)

    # Define variational problem
    u, v = TrialFunction(V), TestFunction(V)
    f = Expression("10*exp(-(pow(x[0]-0.5, 2) + pow(x[1]-0.5, 2))/0.02)")
    g = Expression("sin(5*x[0])")
    a = inner(grad(u), grad(v))*dx
    L = f*v*dx + g*v*ds

    # Compute solution
    u = Function(V)
    solve(a == L, u, bc)

    # Save solution in VTK format
    File("poisson.pvd") << u

    # Plot solution
    plot(u, interactive=True)
Reconciling high-level scripted interfaces and performance
Just-in-time compilation
DOLFIN: parallel
mpirun -np 1024 python demo.py
DOLFIN: parallel hardware
• Distributed paradigm (message passing) straightforward
• Intra-node hard
  • Changing hardware
  • Changing languages
  • Non-uniform memory access (NUMA)
  • Hard to develop good performance models to select best strategy
• Effective threading crucial on modern low memory-per-core machines
• DOLFIN currently being tested/developed on two systems in the top 10 of the Top 500 list
DOLFIN: Mira at Argonne National Laboratory (Blue Gene/Q)
DOLFIN: intra-node parallelism
Coloured mesh
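A coloured mesh lets threads assemble cells of one colour concurrently without writing to the same matrix rows. The following is a minimal greedy sketch, assuming that two cells conflict when they share a vertex (as they would for vertex-based degrees of freedom); the function name and the tiny mesh are hypothetical, and DOLFIN's own colouring may differ.

```python
def colour_cells(cells):
    """Greedy colouring: cells that share a vertex get different colours."""
    # Map each vertex to the cells that touch it
    vertex_to_cells = {}
    for c, vertices in enumerate(cells):
        for v in vertices:
            vertex_to_cells.setdefault(v, []).append(c)

    colours = [None] * len(cells)
    for c, vertices in enumerate(cells):
        # Colours already taken by neighbouring cells
        used = {colours[n] for v in vertices for n in vertex_to_cells[v]
                if colours[n] is not None}
        colour = 0
        while colour in used:
            colour += 1
        colours[c] = colour
    return colours


# Two triangles forming a square, plus one disconnected triangle
cells = [(0, 1, 2), (1, 3, 2), (4, 5, 6)]
colours = colour_cells(cells)
```

Assembly would then loop over colours sequentially and over the cells of each colour in parallel. In this example the first two triangles share an edge and receive different colours, while the third can reuse the first colour.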
DOLFIN: threaded matrix assembly – single socket
Intel Core i7-980 (6 cores), no data re-ordering
[Figure: speed-up factor (0 to 6) against number of threads (1 to 6) for Poisson and Navier-Stokes, with the ideal speed-up shown for comparison.]
DOLFIN: threaded matrix assembly – single socket
Intel Core i7-980 (6 cores) with re-ordering for data locality
[Figure: speed-up factor (1 to 6) against number of threads (1 to 6) for Poisson and Navier-Stokes, with the ideal speed-up shown for comparison.]
DOLFIN: threaded matrix assembly – dual socket NUMA
2 x Intel Xeon X5690 (12 cores) with re-ordering for data locality
[Figure: speed-up factor (2 to 12) against number of threads (2 to 12) for Poisson and Navier-Stokes, with the ideal speed-up shown for comparison.]
http://www.eng.cam.ac.uk/~gnw20
![Page 51: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/51.jpg)
DOLFIN: threaded matrix assembly – dual socket NUMA
2 x Intel Xeon X5690 (12 cores), with re-ordering but no matrix insertion
[Figure: speed-up factor against number of threads (2 to 12) for Poisson and Navier-Stokes assembly, compared with ideal scaling]
![Page 52: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/52.jpg)
DOLFIN: some ongoing developments
• Hybrid threaded/MPI computation
• Distributed mesh refinement
• Multi-domain code generation
• New code generation optimisation strategies
• Target-specific code generation
![Page 53: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/53.jpg)
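Examples: hyperelasticity
The displacement field $u^{*}$ is given by
$$u^{*} = \operatorname*{arg\,min}_{u \in V} \Pi(u)$$
where
• $\Pi(u) := \int_{\Omega} \psi(E(u)) - B \cdot u \, dx - \int_{\partial\Omega} T \cdot u \, ds$ is the total potential energy
• $\psi(E)$ is the strain energy density
• $E := (F^{T} F - I)/2$ is the Green-Lagrange strain
• $F := \nabla_{X} u + I$ is the deformation gradient.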
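A minimiser of the potential energy is a stationary point, so the problem can be restated as a variational equation. The following is a sketch of that standard step, not taken from the slides; $D_{v}$ denotes the directional (Gateaux) derivative in the direction of a test function $v$.

```latex
% Stationarity of the potential energy at the minimiser u^*
D_{v}\Pi(u^{*}) := \left.\frac{d}{d\epsilon}\,\Pi(u^{*} + \epsilon v)\right|_{\epsilon = 0} = 0
\quad \forall v \in V,

% Written out, using D_v F = \nabla_X v and hence
% D_v E = (F^T \nabla_X v + (\nabla_X v)^T F)/2:
\int_{\Omega} \frac{\partial \psi}{\partial E} :
  \tfrac{1}{2}\left(F^{T}\nabla_{X} v + (\nabla_{X} v)^{T} F\right) dx
- \int_{\Omega} B \cdot v \, dx
- \int_{\partial\Omega} T \cdot v \, ds = 0
\quad \forall v \in V.
```

The body force and traction terms are linear in the displacement, so their variation simply replaces the displacement by $v$.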
![Page 54: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/54.jpg)
Examples: hyperelasticity as a minimisation problem (1)

    V = VectorElement("Lagrange", "tetrahedron", 1)

    # Current displacement
    u = Coefficient(V)

    # Body force per unit volume and traction force (on reference config)
    B, T = Coefficient(V), Coefficient(V)

    # Kinematics
    I = Identity(V.cell().d)  # Identity tensor
    F = I + grad(u)           # Deformation gradient
    C = F.T*F                 # Right Cauchy-Green tensor

    # Invariants of deformation tensors
    J, Ic = det(F), tr(C)
![Page 55: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/55.jpg)
Examples: hyperelasticity as a minimisation problem (2)

    # Elasticity parameters
    mu, lmbda = 100, 0.3

    # Stored strain energy density (compressible neo-Hookean model)
    psi = (mu/2)*(Ic - 3) - mu*ln(J) + (lmbda/2)*(ln(J))**2

    # Total potential energy
    Pi = psi*dx - dot(B, u)*dx - dot(T, u)*ds

    # First variation of Pi
    v = TestFunction(V)
    F = derivative(Pi, u, v)

    # Compute Jacobian of F
    du = TrialFunction(V)
    a = derivative(F, u, du)
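To get a feel for the stored energy density used above, it can be evaluated pointwise for a given deformation gradient without any finite element machinery. The sketch below is plain Python rather than UFL; `det3`, `neo_hookean_energy` and the uniform 10% stretch are hypothetical names and data introduced for illustration, with the parameter values copied from the slide.

```python
import math


def det3(F):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
        - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
        + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0])
    )


def neo_hookean_energy(F, mu=100.0, lmbda=0.3):
    """psi = (mu/2)(Ic - 3) - mu ln(J) + (lmbda/2) ln(J)^2."""
    # Ic = tr(F^T F) equals the sum of squares of the entries of F
    Ic = sum(F[i][j] ** 2 for i in range(3) for j in range(3))
    J = det3(F)
    return (mu / 2) * (Ic - 3) - mu * math.log(J) + (lmbda / 2) * math.log(J) ** 2


# Undeformed state: F is the identity, so Ic = 3 and J = 1
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
psi_rest = neo_hookean_energy(identity)

# Uniform 10% stretch in every direction
stretch = [[1.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.0, 1.1]]
psi_stretched = neo_hookean_energy(stretch)
```

The energy vanishes in the undeformed state and is positive under the stretch, which is the behaviour expected of a strain energy density.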
![Page 56: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/56.jpg)
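Examples: time-dependent problems
Linear advection–diffusion equation
At time $t_{n+1}$, given $u_{n}$, $a$ and $f$, find $u_{n+1} \in V$ such that
$$F(u_{n+1}; v) = 0 \quad \forall v \in V$$
where
$$F := \int_{\Omega} \frac{u_{n+1} - u_{n}}{\Delta t} v + a \cdot \nabla u_{n+1/2} \, v + \nabla u_{n+1/2} \cdot \nabla v - f_{n+1/2} \, v \, dx$$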
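The time discretisation evaluates the spatial terms at the mid-point of the step. As a hedged illustration of what that buys, the sketch below applies the same mid-point rule to a scalar model problem, du/dt = -lam*u, which is a stand-in chosen here and not part of the slides; `midpoint_step` and `integrate` are hypothetical helper names. Halving the time step should reduce the error by roughly a factor of four, consistent with second-order accuracy.

```python
import math


def midpoint_step(u0, lam, dt):
    """One step of (u1 - u0)/dt = -lam*(u0 + u1)/2, solved for u1."""
    return u0 * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)


def integrate(lam, dt, t_end, u_init=1.0):
    """Advance from t = 0 to t = t_end with fixed step dt."""
    n_steps = round(t_end / dt)
    u = u_init
    for _ in range(n_steps):
        u = midpoint_step(u, lam, dt)
    return u


lam, t_end = 1.0, 1.0
exact = math.exp(-lam * t_end)

# Errors at two step sizes; their ratio estimates the convergence order
err_coarse = abs(integrate(lam, 0.1, t_end) - exact)
err_fine = abs(integrate(lam, 0.05, t_end) - exact)
ratio = err_coarse / err_fine
```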
![Page 57: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/57.jpg)
Examples: time-dependent problems
Linear advection–diffusion implementation

    # Function space
    V = FunctionSpace(mesh, "Lagrange", 1)

    # Advective velocity
    velocity = Constant((-100.0, 0.0))

    # Solution from previous time step
    u0 = Coefficient(V)

    # Trial and test functions
    u, v = TrialFunction(V), TestFunction(V)

    # Mid-point solution
    u_mid = 0.5*(u0 + u)

    # Variational problem posed at mid-point
    F = (u - u0)*v*dx + dt*(dot(velocity, grad(u_mid)*v)*dx
                            + dot(grad(u_mid), grad(v))*dx)

    # Extract bilinear and linear forms
    a, L = lhs(F), rhs(F)
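Because the form is linear in the unknown, it can be split by hand to see what `lhs` and `rhs` return. The sketch below follows the code on this slide, where the equation has been multiplied through by $\Delta t$ and no source term appears; the velocity is written in bold, $\mathbf{a}$, to distinguish it from the bilinear form $a(\cdot, \cdot)$, and $u_{n}$ corresponds to `u0`.

```latex
% Terms containing the unknown u_{n+1} (returned by lhs)
a(u_{n+1}, v) = \int_{\Omega} u_{n+1} v
  + \frac{\Delta t}{2}\left(\mathbf{a} \cdot \nabla u_{n+1} \, v
  + \nabla u_{n+1} \cdot \nabla v\right) dx

% Remaining terms, moved to the right-hand side with a sign change (returned by rhs)
L(v) = \int_{\Omega} u_{n} v
  - \frac{\Delta t}{2}\left(\mathbf{a} \cdot \nabla u_{n} \, v
  + \nabla u_{n} \cdot \nabla v\right) dx
```

Each time step then solves $a(u_{n+1}, v) = L(v)$ for all $v \in V$.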
![Page 58: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/58.jpg)
Examples: coupled systems of PDEs

    V = VectorElement("Lagrange", "triangle", 2)
    W = FiniteElement("Discontinuous Lagrange", "triangle", 1)
    Q = FiniteElement("Brezzi-Douglas-Marini", "triangle", 2)
    P = FiniteElement("Nedelec 1st kind H(curl)", "triangle", 2)

    # Define nested mixed space
    Z = MixedElement([[V, W], Q, P])
    ...
    U = Coefficient(Z)
    ...
    p_mid = (1 - theta)*p0 + theta*p
    ...
    # F_i for each process
    F0 = ...
    F1 = ...
    F2 = ...
    ...
    # Want to solve F = 0
    F = F0 + F1 + F2 + ...

    # Jacobian
    a = derivative(F, U, dU)
![Page 59: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/59.jpg)
Community: third-party libraries and applications
![Page 60: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/60.jpg)
Community: forums
Development repositories, bug tracker, mailing lists and answer forums are hosted at http://launchpad.net
![Page 61: Domain-specific languages and automated code generation ... · Domain-specific languages and automated code generation for scientific computing Garth N. Wells Department of Engineering,](https://reader034.vdocuments.us/reader034/viewer/2022042201/5ea17910c5a1c96b405fdc25/html5/thumbnails/61.jpg)
FEniCS’13 Workshop, University of Cambridge
18–19 March 2013