a arxiv:1409.8116v2 [math.na] 11 dec 2014 · 2.1 pseudo-spectralmethod 3...

16
PoisFFT - A Free Parallel Fast Poisson Solver V. Fuka a,* a Department of Meteorology and Environment Protection, Faculty of Mathematics and Physics, V Holešovičkách 2, Prague 8, Czech Republic 18000 Abstract A fast Poisson solver software package PoisFFT is presented. It is available as a free software licensed under the GNU GPL license version 3. The package uses the fast Fourier transform to directly solve the Poisson equation on a uniform orthogonal grid. It can solve the pseudo-spectral approximation and the second order finite difference approximation of the continuous solution. The paper reviews the mathematical methods for the fast Poisson solver and discusses the software implementation and parallelization. The use of PoisFFT in an incompressible flow solver is also demonstrated. Keywords: Poisson equation, fast Poisson solver, parallel, MPI, OpenMP 1. Introduction The Poisson equation is an elliptic partial differential equation with a wide range of applications in physics or engineering. In this paper the form 2 ϕ = g, (1) in a rectangular domain Ω R d , for d =1 ... 3, with boundary conditions specified on Ω is considered. The solution ϕ and the specified source term g are both sufficiently smooth real functions on Ω. The equation can be used for computations of electrostatic and gravitational potentials, to solve potential flow of ideal fluids or to perform pressure correction when solving the incompressible Navier-Stokes equations. There are many methods for the approximate numerical solution of the Pois- son equation. It can be discretized in many ways using the finite difference, finite volume, finite element or (pseudo-)spectral methods. In many problems the goal is to compute the most accurate approximation of the continuous problem, while in other applications it is necessary to use the The source code for this software can be found at https://github.com/LadaF/PoisFFT. * Corresponding author Email address: [email protected] (V. Fuka) Preprint submitted to Applied Mathematics and Computation 12th December 2014 arXiv:1409.8116v2 [math.NA] 11 Dec 2014

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

PoisFFT - A Free Parallel Fast Poisson SolverI

V. Fukaa,∗

aDepartment of Meteorology and Environment Protection, Faculty of Mathematics andPhysics, V Holešovičkách 2, Prague 8, Czech Republic 18000

Abstract

A fast Poisson solver software package PoisFFT is presented. It is available as afree software licensed under the GNU GPL license version 3. The package usesthe fast Fourier transform to directly solve the Poisson equation on a uniformorthogonal grid. It can solve the pseudo-spectral approximation and the secondorder finite difference approximation of the continuous solution. The paperreviews the mathematical methods for the fast Poisson solver and discussesthe software implementation and parallelization. The use of PoisFFT in anincompressible flow solver is also demonstrated.

Keywords: Poisson equation, fast Poisson solver, parallel, MPI, OpenMP

1. Introduction

The Poisson equation is an elliptic partial differential equation with a widerange of applications in physics or engineering. In this paper the form

∇2ϕ = g, (1)

in a rectangular domain Ω ⊂ Rd, for d = 1 . . . 3, with boundary conditionsspecified on ∂Ω is considered. The solution ϕ and the specified source term gare both sufficiently smooth real functions on Ω. The equation can be used forcomputations of electrostatic and gravitational potentials, to solve potential flowof ideal fluids or to perform pressure correction when solving the incompressibleNavier-Stokes equations.

There are many methods for the approximate numerical solution of the Pois-son equation. It can be discretized in many ways using the finite difference, finitevolume, finite element or (pseudo-)spectral methods.

In many problems the goal is to compute the most accurate approximationof the continuous problem, while in other applications it is necessary to use the

IThe source code for this software can be found at https://github.com/LadaF/PoisFFT.∗Corresponding authorEmail address: [email protected] (V. Fuka)

Preprint submitted to Applied Mathematics and Computation 12th December 2014

arX

iv:1

409.

8116

v2 [

mat

h.N

A]

11

Dec

201

4

2

discretization consistent with other parts of a larger solver. This is true forthose incompressible flow solvers which enforce the discrete continuity equationby solving the discrete Poisson equation[1].

The fast direct solvers of the Poisson equation have a long history. Themethod of cyclic reduction was used by Hockney [2], Buzbee et al. [3] andBuneman [4]. The Fourier analysis method was developed by Cooley et al.[5], Wilhelmson and Ericksen [6] and Swarztrauber [7, 8] for various boundaryconditions. The methods of the cyclic reduction and Fourier analysis can becombined in the FACR algorithm [8]. The discrete Fourier transforms usedwith different boundary conditions were surveyed by Schumann and Sweet [9].

The requirement of a rectangular grid or other simple type of grid can betoo restrictive for many applications, but the fast solution on a regular grid canbe often used as a preconditioner for a solution on an unstructured grid or forproblems with non-constant coefficients (e.g., ∇ · (a(x)∇ϕ) = g)[10].

Many implementations of fast Poisson solvers exist, but their software im-plementation is often not publically available, or it is a part of a larger softwarepackage and is able to solve only those cases, which are applicable in that partic-ular area [11, 12]. One of the more general ones is the package FISHPACK[13]which solves the second order finite difference approximation of the Poisson orHelmholtz equation on rectangular grids. Its origin predates FORTRAN 77 andalthough it has been updated in 2004 to version 5.0 to be compatible with For-tran 90 compilers, the non-existent parallel version, the remaining non-standardfeatures and missing interfaces to other programming languages make it obso-lete.

This paper presents the free software implementation of a fast direct solverfor the Poisson equation in 1, 2 and 3 dimensions with several types of boundaryconditions and with both the pseudo-spectral and the finite difference approx-imation. The solver is called PoisFFT. Section 2 contains the overview of themethods used to solve the Poisson equations in PoisFFT. Section 3 briefly in-troduces the software implementation and its performance is tested in Section4. In Section 5 we present the application of the Poisson solver to the solutionof the incompressible flow.

2. Solution methods

2.1. Pseudo-spectral methodThe first type of approximation PoisFFT solves is the pseudo-spectral method

approximation which is achieved by decomposing the continuous problem us-ing the Fourier series and solving it in the frequency domain. Consider the 1DPoisson problem (1) on interval [0, L]. With periodic boundary conditions anysufficiently smooth function f on this interval can be decomposed as

f(x) =

∞∑k=−∞

fke2πikx/L, (2)

2.1 Pseudo-spectral method 3

where the complex coefficients are computed as

fk =1

L

ˆ L

0

f(x)e−2πikx/Ldx. (3)

After substituting from (2) for ϕ and g we get

−(

2πk

L

)2

ϕk = gk, ∀k, (4)

which can be used to compute the solution in the basis of the Fourier modes. Themodes e2πikxx/Lx are eigenvectors of the linear operator ∂2

∂x2 with the eigenvaluesλk = −

(2πkL

)2.

In the pseudo-spectral method the solution and the source term are evaluatedon a grid of points. Consider a uniform grid of n+ 1 points

xj = jL

n, j = 0 . . . n. (5)

Due to the periodic boundary conditions we can consider only n points duringthe computations

fn = f (xn) = f(x0) = f0. (6)

The point values fj can be transformed using

fj =

n−1∑k=0

fke2πikj/n, (7)

fk =1

n

n−1∑j=0

fje−2πikj/n. (8)

Eq. (8) is the discrete Fourier transform (DFT) which is used to compute thecoefficients of the vector fj in the basis of the discrete Fourier modes e2πikj/n.Eq. (7) is the inverse discrete Fourier transform (IDFT) and is used to computethe point values fj .

The pseudo-spectral approximate solution can be computed using Eq. (4)for the coefficients of the discrete Fourier modes on the finite grid.

When the boundary conditions are not periodic, different set of basis func-tions is used. For the Dirichlet boundary conditions

f = 0 on ∂Ω (9)

2.1 Pseudo-spectral method 4

the Fourier series uses only the real sine functions

f(x) =

∞∑k=1

fk sin (πkx/L) , (10)

fk =2

L

ˆ L

0

f(x) sin (πkx/L) dx, (11)

while for the Neumann boundary conditions

∂f

∂x= 0 on ∂Ω (12)

the Fourier series uses only the real cosine functions

f(x) =f0

2+

∞∑k=1

fk cos (πkx/L) , (13)

fk =2

L

ˆ L

0

f(x) cos (πkx/L) dx, (14)

and the Fourier coefficients fk are real numbers.Also, two different kinds of the uniform grid are possible. The first type

of the grid is the regular grid, which in the case of the Dirichlet boundaryconditions contains only the internal points

xj = (j + 1)L

n+ 1, j = 0 . . . n− 1, (15)

whereas in the case of the Neumann boundary conditions it also contains theboundary nodes

xj = jL

n− 1, j = 0 . . . n− 1. (16)

The other type is the staggered grid

xj =

(j +

1

2

)L

n, j = 0 . . . n− 1. (17)

In the psudo-spectral method on a grid different discrete sine transforms(DST) and discrete cosine transforms (DCT) have to be used depending on thegrid and the boundary conditions. They are summarized in Table 1.

Finally the eigenvalues are specified in the same notation as the discretetransform in Table 3. The whole algorithm of the Poisson solver can be sum-marized as

2.1 Pseudo-spectral method 5

Table 1Real discrete Fourier transforms

DST-I fk = 2∑n−1j=0 fj sin (π(j + 1)(k + 1)/(n+ 1))

DST-II fk = 2∑n−1j=0 fj sin (π(j + 1/2)(k + 1)/n)

DST-III fk = (−1)kfn−1 + 2∑n−2j=0 fj sin (π(j + 1)(k + 1/2)/n)

DCT-I fk = f0 + (−1)kfn−1 + 2∑n−2j=1 fj cos (πjk/(n− 1))

DCT-II fk = 2∑n−1j=0 fj cos (π(j + 1/2)k/n)

DCT-III fk = f0 + 2∑n−1j=1 fj cos (πj(k + 1/2)/n)

Table 2Discrete Fourier transforms used in the Poisson solver

boundary conditions grid forward backward

periodic regular DFT IDFT

Dirichlet regular DST-I 12(n+1)DST-I

Dirichlet staggered DST-II 12nDST-III

Neumann regular DCT-I 12(n−1)DCT-I

Neumann staggered DCT-II 12nDCT-III

2.2 Finite difference method 6

Table 3The eigenvalues for the eigenvectors corresponding to the transforms in Table 2 in thepseudo-spectral approximation

boundary conditions grid eigenvalues

periodic regular λk = −(

2πkL

)2, k < [n/2]− 1

λk = −(

2π(n−k)L

)2

, k = [n/2]− 1

Dirichlet regular λk = −(π(k+1)L

)2

Dirichlet staggered λk = −(π(k+1)L

)2

Neumann regular λk = −(πkL

)2Neumann staggered λk = −

(πkL

)21. Compute the coefficients of the right-hand side g in the basis of eigen-

vectors of the discrete Laplacian using the forward discrete transforms inTable 2.

2. Compute the solution in this basis by dividing the coefficients by theeigenvalues in Table 3.

3. Transform the solution back to the basis of the grid point values using thebackward transforms in Table 2.

The efficiency of this algorithm comes from the possibility of using the FastFourier Transform (FFT) to perform the discrete transforms.

2.2. Finite difference methodIn certain application (e.g., computational fluid dynamics) it may be neces-

sary to discretize the Poisson equation by a different scheme to ensure compat-ibility with other parts of a larger solver. A common way to discretize Eq. (1)are the second order central finite differences

∂2ϕ

∂x2(xj) '

ϕj+1 − 2ϕj + ϕj−1

(4x)2 , (18)

where 4x = xj − xj−1. This scheme was used in most of the early studies onfast Poisson solvers [6, 7, 8].

Discretization (18) leads to the system of linear algebraic equations

Lϕ = g, (19)

where L is the matrix resulting from (18) and from the boundary conditions.As reviewed by Schumann and Sweet [9] the eigenvectors of matrix L are thoseused in the discrete transforms in Table 2 with eigenvalues summarized in Table4.

2.3 Multiple dimensions 7

Table 4The eigenvalues for the eigenvectors corresponding to the transforms in Table 2 in thesecond order central finite difference approximation according to Schumann and Sweet [9]

boundary conditions grid eigenvalues

periodic regular λk = −(

2 sin( kπn )∆x

)2

Dirichlet regular λk = −(

2 sin(π(k+1)2(n+1) )

∆x

)2

Dirichlet staggered λk = −(

2 sin(π(k+1)2n )

∆x

)2

Neumann regular λk = −(

2 sin( πk2(n−1) )∆x

)2

Neumann staggered λk = −(

2 sin(πk2n )∆x

)2

The algorithm for the solution of the algebraic system (19) is therefore equiv-alent to the algorithm for the pseudo-spectral approximation in Section 2.1 witha different set of eigenvalues.

2.3. Multiple dimensionsIf the boundary conditions on all domain boundaries are equal, the extension

of the algorithm to multidimensional problems is straightforward. The multidi-mensional discrete Fourier transforms are defined as separable products of theone dimensional transforms. For example, for the periodic boundary conditionsin 3D the following transform can be used

fj1j2j3 =

n1−1∑k1=0

e2πik1j1/n1

n2−1∑k2=0

e2πik2j2/n2

n3−1∑k3=0

fk1k2k3e2πik3j3/n3 , (20)

fk1k2k3 =1

n1n2n3

n1−1∑j1=0

e−2πik1j1/n1

n2−1∑j2=0

e−2πik2j2/n2

n3−1∑j3=0

fj1j2j3e−2πik3j3/n3 .

(21)

Efficient multidimensional discrete transforms are commonly included in theFFT libraries. Similar approach can be used when the boundary conditionsdiffer, as shown for the combination of the periodic and the staggered Neumannboundary conditions by Wilhelmson and Ericksen [6]. It is then necessary touse a sequence of transforms of smaller dimension.

The eigenvalues of the discrete 3D Laplace operator associated with theFourier mode fk1k2k3 are computed from the one dimensional eigenvalues as

λk1k2k3 = λk1 + λk2 + λk3 . (22)

8

x y

z

Fig. 1. Decomposition of the domain in MPI into a 2D 3×3 process grid.

3. Software implementation

The fast Poisson solver PoisFFT is a library written in Fortran 2003 withbindings to C and C++. It uses the FFTW3[14] library for the discrete Fouriertransforms and the PFFT[15] library for the MPI parallelization of FFTW3transforms. It is distributed as a free software with the GNU GPLv3 license,which also covers FFTW3 and PFFT. The solver objects in PoisFFT are avail-able in Fortran as separate derived types in the IEEE single and double precisionand as a template class in C++.

The library is parallelized using OpenMP for shared memory computers andusing MPI for distributed memory computers (e.g., clusters). Combination ofOpenMP and MPI at the same time is not implemented. It can also employ theinternal FFTW parallelization using POSIX threads. The MPI version of thelibrary uses the PFFT library to partition the 3D grid using a 2D decompositionalong the last two dimensions in Fortran (along the first two dimensions in C)as shown in Fig. 1.

The main advantage of PoisFFT should be its simplicity and generality. Theinitialization of the solver, the computation and the finalization of the solver arejust three calls to the library. The arrays containing the right hand side or thesolution can be a subset of a larger array. This situation arises in computationalfluid dynamics when ghost cells are defined around the solution domain.

PoisFFT can solve the Poisson equation in 1, 2 and 3 dimensions with any ofthe boundary conditions presented in Section 2 provided all domain boundariesuse the same boundary condition.

Because of practical applications in geophysical fluid dynamics, particularlyin the computational fluid dynamics model CLMM (Section 5), the combinationof the periodic boundary condition in the x direction and Dirichlet or Neumannboundary condition in the z direction is also implemented. In this case boundarycondition in the y direction can be periodic or identical to the condition usedin the z direction.

Because the parallel real 1D transform is not available in FFTW3 or PFFTwe parallelized this step using a global array transpose via the subroutine

9

MPI_Alltoallv() and local array transposes. This is necessary for the com-bined boundary conditions.

4. Performance evaluation

Due to the differences in the discrete Fourier transforms for different bound-ary conditions, the wall-clock times and the CPU times necessary for the solutionof the Poisson equation depend on the boundary conditions. We chose the 3Dsolvers with the periodic boundary conditions and the Neumann boundary con-ditions on the staggered grid to test the solver with the 3D complex transformand with the 3D real transform.

All tests were performed at the cluster zapat.cerit-sc.cz at the Czechsupercomputing center Cerit-SC. The computational nodes contained 2 sock-ets with 8-core Intel E5-2670 2.6 GHz processors and 128 GB RAM and wereconnected using an InfiniBand network. PoisFFT was compiled with the IntelFortran compiler 14.1 and linked with OpenMPI 1.8.

The first test examined the strong scaling on a fixed domain with 512 pointsin every dimension. The same problem was solved with an increasing numberof CPU cores. From the measured times in Fig. 2 it is clear that the OpenMPversion, which directly calls FFTW, is considerably faster than the MPI version,which calls the PFFT wrapper over FFTW and which has to perform significantamount of message passing between processes. However, OpenMP is limited tosingle node on the cluster used for the computations. One can also notice adecrease of efficiency when going from 16 to 32 cores. One node has 16 coresand the expensive network memory transfers slow the computation down whenmore nodes are needed.

For the weak scaling test we used a domain discretization in which the num-ber of points increased with the number of CPU cores. There were 256 pointsin each dimension per core. The complete problem size was therefore propor-tional to the number of cores. The results of the time measurements are in Fig.3. They are consistent with the scaling O (N ln (N)) which holds for the FFTtransforms[14].

5. Application of PoisFFT to simulation of incompressible flow

5.1. Model CLMMOriginally, PoisFFT was developed to support the atmospheric computa-

tional fluid dynamics code CLMM (Charles University Large-Eddy MicroscaleModel)[16]. This model simulates atmospheric flows using the incompressibleNavier-Stokes equations with an eddy viscosity subgrid model.

The incompressible Navier-Stokes equations are

∂u

∂t+∇ · (u⊗ u) = −∇p+ νsgs∇2u, (23)

∇ · u = 0, (24)

5.1 Model CLMM 10

0.1

1

10

1 2 4 8 16 32 64 128 256

t[s]

cores

3D Periodic

OpenMPMPI

1

10

100

1 2 4 8 16 32 64 128 256

speed

up

cores

3D Periodic

OpenMPMPI

0.1

1

10

1 2 4 8 16 32 64 128 256

t[s]

cores

3D Neumann

OpenMPMPI

1

10

100

1 2 4 8 16 32 64 128 256

speed

up

cores

3D Neumann

OpenMPMPI

Fig. 2. Strong scaling of PosFFT with 5123 points.

1

2

5

10

20

1 2 4 8 16 32 64 128 256

t[s]

cores

3D Periodic

OpenMPMPI

3.7 log10(10 x)

1

2

5

10

20

1 2 4 8 16 32 64 128 256

t[s]

cores

3D Neumann

OpenMPMPI

2.3 log10(10 x)

Fig. 3. Weak scaling of PoisFFT with 2563 points per core.

5.1 Model CLMM 11

where u is the velocity vector, p is the pressure and νsgs is the eddy viscositymodeled according to Nicoud et al. [17]. Eqs. (23)–(24) do not contain an evo-lution equation for pressure. The model CLMM uses the projection method[1]and the 3 stage Runge-Kutta method[18]

u(1) = un, p(1) = pn−1/2, (25)

u∗ − u(k)

∆t= −αk∇p(k) + γk

(−∇ · (u(k)u(k)) + ν∇2u(k)

)+

+ ζk

(−∇ · (u(k−1)u(k−1)) + ν∇2u(k−1)

)(26)

∇2φ =∇ · u∗

αk∆t, (27)

u(k+1) − u∗

αk∆t= −∇φ, (28)

p(k+1) = p(k) + φ, (29)for k = 1..3, and

un+1 = u(4), pn+1/2 = p(4), (30)

with the set of coefficients

α1 =8

15, α2 =

2

15, α3 =

1

3,

γ1 =8

15, γ2 =

5

12, γ3 =

3

4,

ζ1 = 0, ζ2 = −17

60, ζ3 = − 5

12. (31)

The spatial derivatives in (25)–(30) are discretized using the second ordercentral finite differences on a staggered grid[19] with the exception of the ad-vective terms −∇ · (u⊗ u) for which fourth order central differences are used.

In every Runge-Kutta stage first a predictor velocity field u∗ is computedand afterward it is corrected to be divergence free. The discrete Poisson prob-lem (27) must be solved in the corrector step, with the staggered Neumannboundary conditions on all outside boundaries except the boundaries where theperiodic boundary condition for the velocity are imposed and where the periodiccondition applies to the discrete Poisson equation as well. It is the task of thePoisFFT library to solve Eq. (27).

It is possible to use this model on the uniform grid to simulate certain typesof complex geometries thanks to the immersed boundary method (IBM)[20],which introduces additional sources of momentum and mass inside or near thesolid obstacles. In CLMM the IBM interpolations of Peller et al. [21] and themass sources according to Kim et al. [19] are implemented.

5.2 Performance test 12

100

1000

1 2 4 8 16 32 64 128

t[s]

cores

total - OpenMPtotal - MPI

PoisFFT - OpenMPPoisFFT - MPI

1

10

100

1 2 4 8 16 32 64 128

speed

up

cores

total - OpenMPtotal - MPI

Fig. 4. Strong scaling of CLMM and PoisFFT.

5.2. Performance test

To evaluate the effect of CLMM we chose the test case defined by the COSTAction ES1006 “Evaluation, improvement and guidance for the use of local-scale emergency prediction and response tools for airborne hazards in builtenvironments” with a virtual city called Michelstadt which serves for validationof models for flow and dispersion in urban environments[22]. The description ofthe geometry and the flow validation data are available at the CEDVAL-LESwebsite1. In this paper only the velocity field is simulated for a direct comparisonof the relative performance of the Poisson solver PoisFFT and other parts ofthe model.

The computational domain of the Michelstadt city had dimensions 1326 m× 836 m × 399 m and was discretized using a constant resolution of 3m in alldirections leading to the total number of 16.3 million cells. The performance wasmeasured for the first 100 time-steps for different parallel setups – OpenMP andMPI – with different numbers of CPU cores. The fastest configuration was usedto compute the flow for 10000 s of the real scale time and the average flow andturbulence was compared to the experimental values. The performance wastested on the cluster luna.fzu.cz in the Czech national grid MetaCentrum.The computational nodes have 2 sockets with the 8-core Intel Xeon E5-2650 2.6GHz CPU, 96 GB of RAM and are connected using an InfiniBand network.

The scaling of the wall-clock time for the solution of first 100 time-steps is inFig. 4. Although the MPI version of PoisFFT is slower than the OpenMP ver-sion for the same number of CPU cores, the whole model CLMM is as fast withMPI as with OpenMP on the used nodes up to 8 threads. The OpenMP ver-sion with 16 threads is slower than the MPI version with 16 processes. CLMMappears to benefit from the improved data locality in the distributed MPI ver-sion. The relative contribution of PoisFFT to the total CLMM solution timeis between 22 % – 36 % when using MPI. When using OpenMP the PoisFFTportion starts at 20 % for one and two threads and decreases for higher numbersof threads as the parallel efficiency of CLMM in OpenMP decreases. From these

1http://www.mi.uni-hamburg.de/CEDVAL-LES-V.6332.0.html, accessed on September28th, 2014.

13

2 0 2 4 6 8

U [m.s−1 ] wind tunnel

2

0

2

4

6

8

U[m.s−

1]

sim

ula

tion

3 2 1 0 1 2 3

V [m.s−1 ] wind tunnel

3

2

1

0

1

2

3

V[m.s−

1]

sim

ula

tion

Fig. 5. Scatter plots for the comparison of simulated and measured meanstream-wise (U) and lateral (V ) velocity components.

ratios it is clear that the efficient Poisson solver is a necessity for an efficientcomputation and that PoisFFT scales well enough so that it doesn’t become abottleneck at the moderate numbers of cores used for the test.

The correctness of the computation was evaluated by comparison of thetime-averaged simulated wind field with the measurements. Fig. 5 shows thecomparison of the values of the velocity components in the reference points ofCEDVAL-LES. An illustration of the simulated wind field at height 7.5 m abovethe ground is in Fig. 6.

6. Conclusions

The fast Poisson solver PoisFFT is able to compute a discrete approximatesolution to the Poisson equation in the pseudo-spectral or second order finitedifference approximation. It is parallel using OpenMP and MPI and the parallelscaling was demonstrated to be usable for moderate numbers of CPU cores.The scaling for the number of cores in the order of thousands remain untesteddue to the lack of suitable hardware. The experiments of Pippig [15] with theunderlying PFFT library suggest that the solver could scale even for thousandsor tens of thousands of cores.

It was demonstrated that PoisFFT can be employed for the solution of thediscrete Poisson equation in the projection step in an incompressible flow solver.It does not become a bottleneck at 128 cores and it is expected to scale welleven further.

The library can be used from Fortran, C and C++. Interfaces to other com-puter languages (e.g., Python) are planned.

Acknowledgment

Computational resources were provided by the MetaCentrum under the pro-gram LM2010005 and the CERIT-SC under the program Centre CERIT Sci-

REFERENCES 14

0

2

4

6

u [m/s]

-400

400

600

-600

0

0x[m]

y[m]

Fig. 6. The Michelstadt geometry with the instantaneous field of the windvelocity magnitude at height 7.5 m above ground. The blue, green and redbuildings are 15, 18 and 24 m high, respectively.

entific Cloud, part of the Operational Program Research and Development forInnovations, Reg. no. CZ.1.05/3.2.00/08.0144. The Michelstadt flow simula-tion was done in the framework of the COST Action ES1006. The work wassupported by the University Research Centre UNCE 204020/2012.

References

[1] D. L. Brown, R. Cortez, M. L. Minion, Accurate Projection Methodsfor the Incompressible Navier–Stokes Equations , Journal of Computa-tional Physics 168 (2) (2001) 464–499, ISSN 0021-9991, URL http://www.sciencedirect.com/science/article/pii/S0021999101967154.

[2] R. W. Hockney, A fast direct solution of Poisson’s equation using Fourieranalysis, Journal of the ACM (JACM) 12 (1) (1965) 95–113.

[3] B. L. Buzbee, G. H. Golub, C. W. Nielson, On direct methods for solvingPoisson’s equations, SIAM Journal on Numerical analysis 7 (4) (1970) 627–656.

[4] O. Buneman, Analytic inversion of the five-point Poisson operator, Journal of Computational Physics 8 (3) (1971) 500–505, ISSN0021-9991, URL http://www.sciencedirect.com/science/article/pii/0021999171900295.

[5] J. Cooley, P. Lewis, P. Welch, The fast Fourier transform algorithm: Pro-gramming considerations in the calculation of sine, cosine and Laplace

REFERENCES 15

transforms , Journal of Sound and Vibration 12 (3) (1970) 315–337, ISSN 0022-460X, URL http://www.sciencedirect.com/science/article/pii/0022460X70900751.

[6] R. B. Wilhelmson, J. H. Ericksen, Direct solutions for Poisson’s equationin three dimensions , Journal of Computational Physics 25 (4) (1977) 319–331, ISSN 0021-9991, URL http://www.sciencedirect.com/science/article/pii/0021999177900018.

[7] P. Swarztrauber, The Methods of Cyclic Reduction, Fourier Analysis andthe FACR Algorithm for the Discrete Solution of Poisson’s Equation on aRectangle, SIAM Review 19 (3) (1977) 490–501, URL http://dx.//doi.org/10.1137/1019071.

[8] P. N. Swarztrauber, Symmetric FFTs, Mathematics of Com-putation 47 (175) (1986) 323–346, URL http://www.ams.org/journals/mcom/1986-47-175/S0025-5718-1986-0842139-3/S0025-5718-1986-0842139-3.pdf.

[9] U. Schumann, R. A. Sweet, Fast Fourier transforms for direct solution ofPoisson’s equation with staggered boundary conditions , Journal of Com-putational Physics 75 (1) (1988) 123–137, ISSN 0021-9991, URL http://www.sciencedirect.com/science/article/pii/0021999188901027.

[10] M. S. Gockenbach, Understanding and implementing the finite el-ement method., SIAM, ISBN 978-0-89871-614-6, URL http://www.ec-securehost.com/SIAM/OT97.html, 2006.

[11] S. Laizet, N. Li, Incompact3d: A powerful tool to tackle turbulence prob-lems with up to O (105) computational cores, International Journal forNumerical Methods in Fluids 67 (11) (2011) 1735–1757.

[12] P. García-Risueño, J. Alberdi-Rodriguez, M. J. T. Oliveira, X. Andrade,M. Pippig, J. Muguerza, A. Arruabarrena, A. Rubio, A survey of the par-allel performance and accuracy of Poisson solvers for electronic structurecalculations, Journal of Computational Chemistry 35 (6) (2014) 427–444,ISSN 1096-987X, URL http://dx.//doi.org/10.1002/jcc.23487.

[13] P. N. Swarztrauber, R. A. Sweet, Algorithm 541: Efficient Fortran sub-programs for the solution of separable elliptic partial differential equations[D3], ACM Transactions on Mathematical Software (TOMS) 5 (3) (1979)352–364.

[14] M. Frigo, S. Johnson, The Design and Implementation of FFTW3, Pro-ceedings of the IEEE 93 (2) (2005) 216–231, ISSN 0018-9219, URL http://www.bibsonomy.org/bibtex/2744e75c05789f40e27a64bd7d40462df/jpowell.

[15] M. Pippig, PFFT: An extension of FFTW to massively parallel architec-tures, SIAMJSC 35 (03) (2013) C213–C236.

REFERENCES 16

[16] V. Fuka, J. Brechler, Large Eddy Simulation of the Stable Boundary Layer,in: J. Fořt, J. Fürst, J. Halama, R. Herbin, F. Hubert (Eds.), Finite Vol-umes for Complex Applications VI Problems & Perspectives, vol. 4 ofSpringer Proceedings in Mathematics, Springer Berlin Heidelberg, ISBN978-3-642-20671-9, 485–493, 2011.

[17] F. Nicoud, H. B. Toda, O. Cabrit, S. Bose, J. Lee, Using singular valuesto build a subgrid-scale model for large eddy simulations, Physics of Fluids23 (8) (2011) 085106, URL http://dx.//doi.org/10.1063/1.3623274.

[18] P. R. Spalart, R. D. Moser, M. M. Rogers, Spectral methods for the Navier-Stokes equations with one infinite and two periodic directions, J. Comput.Phys. 96 (2) (1991) 297–324, ISSN 0021-9991.

[19] J. Kim, D. Kim, H. Choi, An Immersed-Boundary Finite-Volume Methodfor Simulations of Flow in Complex Geometries , Journal of Compu-tational Physics 171 (1) (2001) 132–150, ISSN 0021-9991, URL http://www.sciencedirect.com/science/article/pii/S0021999101967786.

[20] R. Mittal, G. Iaccarino, Immersed Boundary Methods, Annual Review ofFluid Mechanics 37 (1) (2005) 239–261, URL http://dx.//doi.org/10.1146/annurev.fluid.37.061903.175743.

[21] N. Peller, A. L. Duc, F. Tremblay, M. Manhart, High-order stable interpo-lations for immersed boundary methods, International Journal for Numer-ical Methods in Fluids 52 (11) (2006) 1175–1193, ISSN 1097-0363, URLhttp://dx.//doi.org/10.1002/fld.1227.

[22] R. Fischer, I. Bastigkeit, B. Leitl, M. Schatzmann, Generation of spatio-temporally high resolved datasets for the validation of LES-models sim-ulating flow and dispersion phenomena within the lower atmosphericboundary layer, in: Proceedings of Computational Wind Engineer-ing 2010, URL ftp://ftp.atdd.noaa.gov/pub/cwe2010/Files/Papers/461_Fischer.pdf, 2010.