computational plasma physics

Computational plasma physics

Eric SonnendruckerMax-Planck-Institut fur Plasmaphysik

undZentrum Mathematik, TU Munchen

Lecture notesSommersemester 2015

July 8, 2015

Contents

1 Introduction 3

2 Plasma models 52.1 Plasmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 The N -body model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.3 Kinetic models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.3.1 The Vlasov-Maxwell model . . . . . . . . . . . . . . . . . . . . . . 72.3.2 Reduction of Maxwells equation . . . . . . . . . . . . . . . . . . . 92.3.3 Collisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.4 Fluid models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3 Steady-state problems 153.1 The Finite difference method . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.1.1 The 1D Poisson equation and boundary conditions . . . . . . . . . 153.1.2 Obtaining a Finite Difference scheme . . . . . . . . . . . . . . . . . 163.1.3 Higher order finite differences . . . . . . . . . . . . . . . . . . . . . 18

3.2 Convergence of finite difference schemes . . . . . . . . . . . . . . . . . . . 183.2.1 Homogeneous Dirichlet boundary conditions . . . . . . . . . . . . . 203.2.2 Periodic boundary conditions . . . . . . . . . . . . . . . . . . . . . 213.2.3 The method of manufactured solutions . . . . . . . . . . . . . . . . 25

3.3 Finite difference methods in 2D . . . . . . . . . . . . . . . . . . . . . . . . 253.4 The Finite element method . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.4.1 Principle of the method . . . . . . . . . . . . . . . . . . . . . . . . 273.4.2 The variational (or weak) form of a boundary value problem . . . 313.4.3 Lagrange Finite Elements . . . . . . . . . . . . . . . . . . . . . . . 333.4.4 Formal definition of a Finite Element . . . . . . . . . . . . . . . . 363.4.5 Convergence of the Finite Element method . . . . . . . . . . . . . 37

3.5 The spectral method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4 Fluid models 434.1 An isothermal Euler-Poisson model . . . . . . . . . . . . . . . . . . . . . . 43

4.1.1 The model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434.1.2 Study of the linearised equations . . . . . . . . . . . . . . . . . . . 444.1.3 Hyperbolicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.2 A model problem - 1D advection . . . . . . . . . . . . . . . . . . . . . . . 474.2.1 Obtaining a Finite Difference scheme . . . . . . . . . . . . . . . . . 474.2.2 The first order explicit upwind scheme . . . . . . . . . . . . . . . . 474.2.3 The first order upwind implicit scheme . . . . . . . . . . . . . . . . 48

1

4.2.4 The explicit downwind and centred schemes . . . . . . . . . . . . . 494.2.5 Stability and convergence . . . . . . . . . . . . . . . . . . . . . . . 49

4.3 The Finite Volume method . . . . . . . . . . . . . . . . . . . . . . . . . . 504.3.1 The first order Finite Volume schemes . . . . . . . . . . . . . . . . 504.3.2 Higher order schemes . . . . . . . . . . . . . . . . . . . . . . . . . 524.3.3 The method of lines . . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.4 Systems of conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . 564.4.1 Linear systems - The Riemann problem . . . . . . . . . . . . . . . 56

4.5 Nonlinear systems of conservation laws . . . . . . . . . . . . . . . . . . . . 584.5.1 The Rusanov flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594.5.2 The Roe flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

2

Chapter 1

Introduction

Understanding an experiment in physics relies on a model which is generally a differentialequation or a partial differential equation or a system involving many of these. Insufficiently simple cases analytical solutions of these exist and then this can be used topredict the behaviour of a similar experiment. However in many cases, especially whenthe model is based on first principles, it is so complex that there is no analytical solutionavailable. Then there are two options: the first is to simplify the model until it can beanalytically solved, the second is to compute an approximate solution using a computer.In practice both are usually done, the simplified models being used to verify that thecode is working properly. Due to the enormous development of computer resources inthe last 50 years, quite realistic simulations of physical problems become now possible. Alot of theoretical work in physics and related disciplines, in particular in plasma physicsnow rely quite heavily on numerical simulation.

Computational sciences have emerged next to theory and experiments as a third pillarin physics and engineering. Designing efficient, robust and accurate simulation codes is achallenging task that is at the interface of the application domain, plasma physics in ourcase, applied mathematics and computer science. The main difficulties are to make surethat the implemented algorithms provide a good approximation of the physics model andalso that the algorithms use efficiently the available computer resources which are costly.Even though many basic simulations can be performed nowadays on a laptop, state ofthe art computations require huge super-computers that consists of many computingunits and nowadays often heterogeneous computing elements (CPU, GPU, MIC, ...).These require parallel algorithms and often to achieve optimal efficiency a very goodknowledge of the computer architecture.

This lecture will provide an introduction to scientific computing. Its aim is to in-troduce the process of developing a simulation code and some standard methods andnumerical issues. The models we will consider come from plasma physics, but the modelsand even more so the techniques and ideas will be useful for many other applications.Specific skills and methodologies for high performance computing that are also very im-portant in computational physics are beyond the scope of this lecture. We refer to [5]for an introduction.

The first step is to find an appropriate model, which is often a set of coupled differ-ential or partial differential equations. If the model is continuous the solution lives in aninfinite dimensional space which cannot be represented on a computer. The second stepthen will be to discretise it, i.e. represent the unknowns by a large but finite number ofvalues, typically its values on a finite grid or the coefficients of its expression on the basis

3

of a a finite dimensional linear space. Then from the equations of the starting modelsrelations between the values representing the discrete unknowns should be found. Thisyields a finite number of linear or non linear equations that can be solved on a com-puter. This will require methods of numerical linear algebra, which are introduced in[16] or iterative methods for linear or nonlinear equations, a good introduction of whichis available in [7]. We wont focus on these either. They are generally taught in NumericsBachelor classes. They are generally available in numerical software tools like Matlabor Numpy and those programming in a low level language like Fortran, C or C++ canuse efficient libraries, like LAPACK, ScaLAPACK, PETSc, Trilinos, to name a few, arefreely available.

There are many possible ways to discretise a differential or partial differential equa-tion. They are not all equal and many things need to be considered, when choosingan appropriate one. The most important is naturally that the discrete model convergestowards the initial model when the number of discrete values goes to infinity. Becausecomputer arithmetics is not exact, as real numbers cannot be represented exactly on acomputer, sensitivity to round-off errors is very important. Then some algorithms needmore operations than other. Optimally an algorithm dealing with an unknown vector ofN points can use O(N) operations, but some can use O(N logN) or O(Nd) or even more,which will make a huge difference for large values of N . Then again some algorithmswill be easier to parallelise than others, which is an important issue when developing asimulation code on a massively parallel computer.

Also, before choosing a discretisation, it is important to understand the structureof the equations that are being discretised. Analytical theory plays an essential role inthis aspect. What are the conserved quantities? Are there analytical solution in somespecial cases? What is the evolution of the solution or some quantities depending onthe solution? And so on. The more information is available from analytical theory,the easiest it will be to check whether the code is correctly approximating the analyticalmodel. The process of verification of a computer code, consists precisely in checking thatthe code can reproduce as expected information available from the theory. Verification ofa computer code is essential to gain confidence in its correctness. Only once the computercode has been appropriately verified, one can proceed with the validation process, whichconsists in comparing the code to actual experiments and checking that those can bereproduced within the available error bars. If this is not the case, one needs to checkthe initial model, including initial and boundary conditions and all external parametersthat could have an influence on the results. Possibly one also needs to develop moreverification tests to check that there is no error in the implementation. This process ofVerification and validation (V & V) is essential for developing a simulation code withreliable predictive capabilities.

In this lecture, starting from a few classical models from plasma physics, we willlearn how to write a simulation code for solving them. This includes finding a gooddiscrete model, implementing it and verifying it. This is the scope of applied numericalmathematics. The physics exploitation of the code can start after those steps. We willcover most of the classical discretisation methods, finite differences, finite elements, finitevolumes and also spectral methods.

4

Chapter 2

Plasma models

2.1 Plasmas

When a gas is brought to a very high temperature (104K or more) electrons leave theirorbit around the nuclei of the atom to which they are attached. This gives an overallneutral mixture of charged particles, ions and electrons, which is called plasma. Plasmasare considered beside solids, liquids and gases, as the fourth state of matter.

You can also get what is called a non-neutral plasma, or a beam of charged particles,by imposing a very high potential difference so as to extract either electrons or ionsof a metal chosen well. Such a device is usually located in the injector of a particleaccelerator.

The use of plasmas in everyday life has become common. This includes, for example,neon tubes and plasma displays. There are also a number industrial applications: am-plifiers in telecommunication satellites, plasma etching in microelectronics, productionof X-rays. Some examples are given in Figure 2.1.

Figure 2.1: Examples of plasmas at different densities and temperatures

5

We should also mention that while it is almost absent in the natural state on Earth,except the Northern Lights at the poles, the plasma is 99% of the mass of the visibleuniverse. Including the stars are formed from plasma and the energy they release fromthe process of fusion of light nuclei such as protons. More information on plasmas andtheir applications can be found on the web site http://www.plasmas.org.

2.2 The N-body model

At the microscopic level, a plasma or a particle beam is composed of a number ofparticles that evolve following the laws of classical or relativistic dynamics. So eachparticle, characterised by its position x, velocity x, as well as its mass m and charge q,obeys Newtons law

dmv

dt=

Fext,

where m is the mass of the particle, v its velocity = (1 |v|2c2

)12 is the Lorentz factor (c

being the speed of light). The right hand side Fext is composed of all the forces appliedto the particle, which in our case reduce to the Lorentz force induced by the externaland self-consistent electromagnetic fields. Other forces, as the weight of the particles,are in general negligible. Whence we have, labelling the different particles in the plasma,

dimividt

=j

qi(Ej + vi Bj) + qi(Eext + vi Bext).

The sum on the right hand side is over all the particles in the plasma and Ej ,Bj denotethe electric and magnetic fields fields generated by particle j and Eext,Bext denote theexternal electric and magnetic fields fields, i.e. those that are not generate by particlesof the plasma itself. The latter could be for example coils in an accelerator or in atokamak. On the other hand the velocity of a particle vi is linked to its position xi by

dxidt

= vi.

Thus, if the initial positions and velocities of the particles are known as well as theexternal fields, the evolution of the particles is completely determined by the equations

dxidt

= vi, (2.1)

dimvidt

=j

q(Ej + v Bj), (2.2)

where the sum contains the electric and magnetic field generated by each of the otherparticles as well as the external fields.

In general a plasma consists of a large number of particles, 1010 and more. Themicroscopic model describing the interactions of particles with each other is not used ina simulation because it would be far too expensive. We must therefore find approximatemodels which, while remaining accurate enough can reach a reasonable computationalcost. There is actually a hierarchy of models describing the evolution of a plasma. Thebase model of the hierarchy and the most accurate model is the N -body model wehave described, then there are intermediate models called kinetic and which are based

6

on a statistical description of the particle distribution in phase space and finally themacroscopic or fluid models that identify each species of particles of a plasma with afluid characterized by its density, its velocity and energy. Fluid models are becoming agood approximation when the particles are close to thermodynamic equilibrium, to whichthey return in long time do to the effects of collisions and for which the distribution ofparticle velocities is a Gaussian.

When choosing a model for a simulation code, one should try to take into accountaccuracy and computational cost and take the model that will allow us to find a solutionthat is accurate enough for the problem we are considering in the shortest possible time.In particular because of the very large number of particles in a plasma, kinetic modelsobtained by statistical arguments are almost always accurate enough. The question willthen be if a further model reduction, which could diminish cost, can be performed atleast for part of the plasma.

2.3 Kinetic models

In a kinetic model, each particle species s in the plasma is characterized by a distributionfunction fs(x,v, t) which corresponds to a statistical mean of the repartition of particlesin phase space for a large number of realisations of the considered physical system. Notethat phase space consists of the subspace of R6 containing all possible positions andvelocities of the particles. The product fs dx dv is the average number of particles ofspecies s, whose position and velocity are in the box of volume dx dv centred at (x,v).Normalising fs to one, fs becomes the probability density related to the probability ofa particle of species s if being at point (x,v) in phase space.

The distribution function contains much more information than a fluid descriptionas it includes information on the distributions of particle velocities at each position. Akinetic description of a plasma is essential when the distribution function is far awayfrom the Maxwell-Boltzmann distribution (also called Maxwellian) that corresponds tothe thermodynamic equilibrium of plasma. Otherwise a fluid description is sufficient.

2.3.1 The Vlasov-Maxwell model

In the limit where the collective effects are dominant on binary collisions between par-ticles, the kinetic equation that is derived, by methods of statistical physics from theN -body model is the Vlasov equation which reads

fst

+ v xfs + qsms

(E + v B) vfs = 0, (2.3)

in the non relativistic case. In the relativistic case it becomes

fst

+ v(p) xfs + qs(E + v(p)B) pfs = 0. (2.4)

We denote by xfs, vfs and pfs, the respective gradients of fs with respect to thethree position, velocity and momentum variables. The constants qs and ms denote thecharge and mass of the particle species. The velocity is linked to the momentum by therelation v(p) = pmss , where is the Lorentz factor which can be expressed from the

momentum by s =

1 + |p|2/(m2sc2).

7

This equation expresses that the distribution function fs is conserved along the tra-jectories of the particles which are determined by the mean electric field. We denoteby fs,0(x,v) the initial value of the distribution function. The Vlasov equation, whenit takes into account the self-consistent electromagnetic field generated by the parti-cles, is coupled to the Maxwell equations which enable to computed this self-consistentelectromagnetic field from the particle distribution:

1c2E

t+ B = 0 J, (Ampe`re)

B

t+ E = 0, (Faraday) E =

0, (Gauss)

B = 0. (magnetic Gauss)The source terms for Maxwells equation, the charge density (x, t) and the currentdensity J(x, t) can be expressed from the distribution functions of the different speciesof particles fs(x,v, t) using the relations

(x, t) =s

qs

fs(x,v, t) dv,

J(x, t) =s

qs

fs(x,v, t)v dv.

Note that in the relativistic case the distribution function becomes a function of positionand momentum (instead of velocity): fs fs(x,p, t) and charge and current densitiesverify

(x, t) =s

qs

fs(x,p, t) dp, J(x, t) =

s

qs

fs(x,p, t)v(p) dp.

The Maxwell equations need to be supplemented by boundary and initial conditionsso that they admit a unique solution. A classical boundary condition is the perfectconductor boundary condition E n = 0, where n denotes the unit outgoing normal.No additional condition on B is needed in that case.

The macroscopic quantities, associated to each particle species are defined as follows:

The particle density, in physical space, for species s, is defined by

ns(x, t) =

fs(x,v, t) dv,

The mean velocity us(x, t) verifies

ns(x, t)us(x, t) =

fs(x,v, t)v dv,

The kinetic energy is defined by

ns(x, t)Es(x, t) = m2

fs(x,v, t)|v|2 dv,

The temperature Ts(x, t) is related to the kinetic energy, mean velocity and densityby

Ts(x, t) = Es(x, t) u2s(x, t).

8

2.3.2 Reduction of Maxwells equation

In some cases the frequency of the phenomena of interest is sufficiently small that theelectric and magnetic fields can be considered quasi-static. This means that the timederivatives can be neglected in Maxwells equations. We then get two decoupled set ofequations. The electric field is then determined by

E = 0, E = 0,

and appropriate boundary conditions.Moreover, provided the computational domain is simply connected, E = 0

implies that there exists a scalar potential such that E = and is determinedby = /0, which becomes that standard Poisson equation:

= 0.

The perfectly conducting boundary condition E n = 0 implies that is constant onthe boundary, and as E and not is the physically meaningful field, is determined upto a constant, and so is generally set to 0 on the boundary for perfect conductors.

Note that in many low frequency plasma, E is by far the dominating term in theLorentz force, and hence a good model consists just of the Vlasov-Poisson equations(where the magnetic field is set to 0 or is a known external field).

When still the magnetic field needs to be considered it is solution of

B = 0J, B = 0.

Because of the vanishing divergence, provided again some geometric assumptions on thedomain, B derives from a vector potential: B = A and A is a solution of

A = 0J,

along with appropriate boundary conditions.

2.3.3 Collisions

The Vlasov equation describes a smoothed interaction of the plasma particles, for whichthe electromagnetic field is averaged over all the plasma particles. In practice, however,when two particles of the same charge get close together there is a strong repulsionbetween the two particles and this force dominates the force generated by all the otherparticles. This is called a binary collision. In this case the interaction is modelled by abinary collision operator, like the Boltzmann operator that is used in neutral gases.

When binary collisions between particles are dominant with respect to mean fieldeffects, the distribution function satisfies the Boltzmann equation

f

t+ v f

x=s

Q(f, fs),

where Q is the non linear Boltzmann operator. This operator is sometimes replaced bysimpler models. A sum on the collisions with all the species of particles represented byfs, including the particles of the same species, is considered. In many cases not all the

9

collisions might be considered. In most plasmas either binary collisions are completelyneglected or only represent a complement to the averaged interaction. In this casethe collision operator appears on the right-hand side of the full Vlasov equation. Thecollision operator considered appropriate in most cases then is the Fokker-Planck-Landauoperator.

The Fokker-Planck-Landau operator, which is the most commonly used in plasmaphysics, is the limit of the Boltzmann operator for grazing collisions [8]. It reads forcollisions with particles of the same species

QL(f, f)(v) = v A(v v)[f(v)vf(v) f(v)vf(v)] dv] (2.5)

where, for the case of Coulomb collisions, being a positive constant,

A(v v) = |v v|(I3 (v v) (v v)|v v|2

).

As for the Boltzmann operator, the first three velocity moments of the Landau opera-tor vanish, which implies conservation of mass, momentum and kinetic energy. Moreoverthe H-theorem is satisfied, so that the equilibrium states are also the Maxwellians.

Sometimes it is more convenient to express the Landau collision operator using theRosenbluth potentials [11], which reads

QL,R(f, f)(v) = v [v (fv vG(f)) 4fv H(f)]where the Rosenbluth potentials are defined by

G(f)(v) =|vv|f(v) dv, pace10mmH(f)(v) =

1

|v v|f(v) dv. (2.6)

When the distribution function is close enough to a Maxwellian, the Fokker-Planck-Landau operator can be linearised around a Maxwellian, the collision operator takes amuch simpler for as can be seen from the Rosenbluth potential for by taking f in theexpression of the potentials G and H to be a given Maxwellian. Then we get the linearFokker-Planck operator, which takes the form

QFP (f)(v) = v (fv + D2

2vf). (2.7)

This operator is also known as the Lenard-Bernstein operator in the plasma physicscommunity [9]. For given constants , and D, the Lenard-Bernstein operator conservesmass, but not momentum and energy and its equilibrium function is a Maxwellian of

the form ev2

D2 , where the constant is determined by the total mass (or number ofparticles).

The linear Fokker-Planck operator can be made to conserve also total momentumand kinetic energy by using the mean velocity u and the temperature T associated tof . Then the operator reads

QFPC(f)(v) = v (f v uT

+vf).

A simplified collision operator that has been build to conserve mass, momentum andkinetic energy and have the Maxwellian as equilibrium states, as the Boltzmann and

10

Fokker-Planck-Landau operators, has been derived by Bhatnagar, Gross and Krook [2].In the mathematics community this is known as the BGK operator and in the physicscommunity it is called the Krook operator. It simply reads

QK(f)(v) = (fM [f ] f),where fM [f ] is the Maxwellian, which has the same mass, mean velocity and temperatureas f . It reads

fM (t,x,v) =n(t,x)

(2piT (t,x)/m)32

e |vu(x,t)|2

2T (x,t)/m .

2.4 Fluid models

Due to collisions, the particles relax in long time to a Maxwellian, which is a ther-modynamical equilibrium. When this state is approximately attained particles can bedescribed by a fluid like model, where each particle species is modelled as a chargedfluid.

This fluid model for each particle species can be derived from the correspondingVlasov equation. The fluid models then replace the Vlasov equations and are stillcoupled to Maxwells equation, or some reduced model, for the determination of theself-consistent electromagnetic field.

We start from the Vlasov-Boltzmann equation:

f

t+ v f

x+

q

m(E + v B) f

v= Q(f, f). (2.8)

Remark 1 The Boltzmann collision operator Q(f, f) on the right hand side is necessaryto provide the relaxation to thermodynamic equilibrium. However it will have no directinfluence on our derivation, as we will consider only the first three velocity momentswhich vanish for the Boltzmann operator.

The macroscopic quantities on which the fluid equations will be established are de-fined using the first three velocity moments of the distribution function f(x,v, t)

The particle density is defined by

n(x, t) =

f(x,v, t) dv,

The mean velocity u(x, t) verifies

n(x, t)u(x, t) =

f(x,v, t)v dv,

The pressure tensor P(x, t) is defined by

P(x, t) = mf(x,v, t)(v u(x, t)) (v u(x, t)) dv.

The scalar pressure is one third of the trace of the pressure tensor

p(x, t) =m

3

f(x,v, t)|v u(x, t)|2 dv,

11

The temperature T (x, t) is related to the pressure and the density by

T (x, t) =p(x, t)

n(x, t).

The energy flux is a vector defined by

Q(x, t) =m

2

f(x,v, t)|v|2v(x, t)) dv.

where we denote by |v| = v v and for two vectors a = (a1, a2, a3)T and b =(b1, b2, b3)

T , their tensor product a b is the 3 3 matrix whose components are(aibj)1i,j3.

We obtain equations relating these macroscopic quantities by taking the first velocitymoments of the Vlasov equation. In the actual computations we shall make use thatf vanishes at infinity and that the plasma is periodic in space. This takes care of allboundary condition problems.

Let us first notice that as v is a variable independent of x, we have vxf = x (fv).Moreover, as E(x, t) does not depend on v and that the ith component of

v B(x, t) =v2B3(x, t) v3B2(x, t)v3B1(x, t) v1B3(x, t)v1B2(x, t) v2B1(x, t)

is independent of vi, we also have

(E(x, t) + v B(x, t)) vf = v (f(E(x, t) + v B(x, t))).

Integrating the Vlasov equation (2.8) with respect to velocity v we obtain

t

f(x,v, t) dv +x

f(x,v, t)v dv + 0 = 0.

Whence, as n(x, t)u(x, t) =f(x,v, t)v dv, we get

n

t+x (nu) = 0. (2.9)

Multiplying the Vlasov by mv and integrating with respect to v, we get

m

t

f(x,v, t)v dv +mx

(v v)f(x,v, t) dv

q(E(x, t)f(x,v, t) dv +

f(x,v, t)v dv B(x, t) = 0.

Moreover, v vf(x,v, t) dv =

(v u) (v u)f(x,v, t) dv + nu u.

Whence

m

t(nu) +m (nu u) + P = qn(E + uB). (2.10)

12

Finally multiplying the Vlasov equation by 12m|v|2 = 12mv v and integrating withrespect to v, we obtain

1

2m

t

f(x,v, t)|v|2 dv + 1

2mx

(|v|2v)f(x,v, t) dv

+1

2q

|v|2v [(E(x, t) + v B(x, t))f(x,v, t)] dv = 0.

An integration by parts then yields|v|2v (E(x, t) + v B(x, t))f(x,v, t) dv

= 2

v [(E(x, t) + v B(x, t))f(x,v, t)] dv.

Then, developingf |v u|2 dv we get

f |v u|2 dv =f |v|2 dv 2u

vf dv + |u|2

f dv =

f |v|2 dv n|u|2,

whence

t(3

2p+

1

2mn|u|2) + Q = E (qnu). (2.11)

We could continue to calculate moments of f , but we see that each new expressionreveals a moment of higher order. So we need additional information to have as manyunknowns as equations to solve these equations. This additional information is called aclosure relation.

In our case, we will use as a closure relation the physical property that at ther-modynamic equilibrium the distribution function approaches a Maxwellian distributionfunction that we will note fM (x,v, t) and that can be expressed as a function of themacroscopic quantities n(x, t), u(x, t) and T (x, t) which are the density, mean velocityand temperature of the charged fluid:

fM (x,v, t) =n(x, t)

(2piT (x, t)/m)3/2e |vu(x,t)|2

2T (x,t)/m .

We also introduce a classical quantity in plasma physics which is the thermal velocity ofthe particle species considered

vth =

T

m.

It is easy to verify that the first three moments of the distribution function fM areconsistent with the definition of the macroscopic quantities n, u and T defined for anarbitrary distribution function. We have indeed performing each time the change ofvariable w = vuvth

fM (x,v, t) dv = n(x, t),fM (x,v, t)v dv = n(x, t)u(x, t),

fM (x,v, t)|v u|2 dv = 3n(x, t)T (x, t)/m.

13

On the other hand, replacing f by fM in the definitions of the pressure tensor P andthe energy flux Q, we can express these terms also in function of n, u and T whichenables us to obtain a closed system in these three unknowns as opposed to the case ofan arbitrary distribution function f . Indeed, we first notice that, denoting by wi the i

th

component of w, wiwje

|w|22 dw =

0 if i 6= j, e |w|22 dw if i = j.It follows that the pressure tensor associated to the Maxwellian is

P = mn

(2piT/m)3/2

e |vu|2

2T/m (v u) (v u) dv,

and so, thanks to our previous computation, the off diagonal terms of P vanish, and bythe change of variable w = vuvth , we get for the diagonal terms

Pii = mn

(2pi)3/2T

m

e

w2

2 w2i dv = nT.

It follows that P = nT I = pI where I is the 3 3 identity matrix. It now remains tocompute in the same way Q as a function of n, u and T for the Maxwellian with thesame change of variables, which yields

Q =m

2

n

(2piT/m)3/2

e |vu|2

2T/m |v|2v(x, t)) dv,

=m

2

n

(2pi)3/2

e

w2

2 (vthw + u)2(vthw + u) dw,

=m

2

n

(2pi)3/2

e

w2

2 (v2thw2u + 2v2thu w w + |u|2 u) dw,

=m

2n(3

T

mu + 2

T

mu + |u|2u),

as the odd moments in w vanish. We finally get

Q =5

2nTu +

m

2n|u|2u = 5

2pu +

m

2n|u|2u.

Then, plugging the expressions of P and of Q in (2.9)-(2.10)-(2.11) we obtain the fluidequations for one species of particles of a plasma:

n

t+x (nu) = 0 (2.12)

m

t(nu) +m (nu u) +p = qn(E + uB) (2.13)

t(3

2p+

1

2mn|u|2) + (5

2pu +

m

2n|u|2u) = E (qnu), (2.14)

which corresponds in three dimensions to a system of 5 scalar equation with 5 scalarunknowns which are the density n, the three components of the mean velocity u and thescalar pressure p. These equations need of course to be coupled to Maxwells equationsfor the computation of the self-consistent electromagnetic field with, in the case of onlyone particle species = q n and J = q nu. Let us also point out that an approximationoften used in plasma physics is that of a cold plasma, for which T = 0 and thus p = 0.Only the first two equations are needed in this case.

14

Chapter 3

Steady-state problems

3.1 The Finite difference method

3.1.1 The 1D Poisson equation and boundary conditions

We consider the Poisson equation in an interval [a, b]. For the problem to be well posed,boundary conditions are need for x = a and x = b. We will consider here three classicaltypes of boundary conditions.

1. Dirichlet boundary conditions: is given at x = a and x = b

(x) = in [a, b] (3.1)(a) = , (3.2)

(b) = . (3.3)

2. Neumann boundary conditions: is given at boundary. Note that we can dothis only at one end of the interval, else there is still an undetermined constant.Moreover as the potential is determined up to a constant, we can always set itto zero at one point for example at x = a. In this case the problem becomes

(x) = in [a, b] (3.4)(a) = 0, (3.5)

(b) = . (3.6)

3. Periodic boundary conditions. In this case all the functions are assume to beperiodic of period L and we can restrict the interval of computation to [0, L].Then, there are mathematically no boundaries and no boundary conditions areneeded. The terminology periodic boundary conditions is somewhat misleading.

(x) = (3.7)Note however in this case that is only determined up to a constant, which needsto be set for a numerical computation. Moreover, integrating (3.7) on a period,e.g. [0, L] yields

(L) (0) = L

0(x) dx = 0,

as (L) = (0) because is L-periodic. So a necessary condition for a solutionto exist is

L0 (x) dx = 0.

15

3.1.2 Obtaining a Finite Difference scheme

We first consider a uniform mesh of the 1D computational domain, i.e. of the interval[0, L] where we want to compute the solution, see Figure 4.1. The cell size or space step

. .

0 L

x1x0 x2 xNxN1

Figure 3.1: Uniform mesh of [0, L]

is defined by h = LN where N is the number of cells in the mesh. The coordinates ofthe grid points are then defined by xj = x0 + jh = jh as x0 = 0. The solution will bedefined by its values at xj for 0 j N .

The principle of Finite Differences is to replace derivatives by finite differences in-volving neighbouring points approximating those derivatives. The simplest way to dothis is to use Taylor expansions around the considered point. We do this for all pointson the grid. The Taylor expansion will also enable us to see the order of approximationof the derivative.

(xj+1) = (xj) + h(xj) +

h2

2(xj) +

h3

6(3)(xj) +

h4

24(4)(xj +

+j h), (3.8)

(xj1) = (xj) h(xj) + h2

2(xj) h

3

6(3)(xj) +

h4

24(4)(xj j h). (3.9)

We deduce

(xj+1) 2(xj) + (xj1) = h2(xj) + h4

24((4)(xj +

+j h) +

(4)(xj j h)). (3.10)

So that

(xj) =(xj+1) 2(xj) + (xj1)

h2 h

2

24((4)(xj +

+j h) +

(4)(xj j h)).

Plugging this into the equation (xj) = (xj) we get

(xj+1) 2(xj) + (xj1)h2

= (xj) +h2

24((4)(xj +

+j h) +

(4)(xj j h)). (3.11)

Let us now define j such that for (1 j N 1), we havej+1 + 2j j1

h2= (xj),

and we use the boundary conditions to determine the additional unknowns. Then jwill give an approximation of (xj) for all the points on the grid 0 j N .

1. Dirichlet: 0 = (x0) = , N = (xN ) = . So there remain N 1 unknowns1, . . . , N1 determined by the N 1 equations

j+1 + 2j j1h2

= (xj) 1 j N 1.

16

This can be written as a linear system Ahh = Rh with

Ah =1

h2

2 1 01 2 1 00

. . .. . .

. . .. . .

. . .. . . 1

0 1 2

, h =

12...

N1

, Rh =

(x1) +h2

(x2)...

(xN1) + h2

.

2. Neumann. Because we need to set the constant for the potential at one point, weconsider now the boundary conditions (0) = 0 and (L) = . In this case theunknown are 1, . . . , N . So there are N unknown. We can still use like before thefinite difference approximations of (xj) = (xj) at the N 1 interior points.Then the missing equation needs to be obtained from the Neumann boundarycondition (L) = . This needs to be expressed from the point values. For thiswe can simply set

N N1h

= .

This is only a first order approximation of the derivative, but this is enough tokeep the second order approximation on at the end.

In this case we get the linear system Ahh = Rh with

Ah =1

h2

2 1 01 2 1 00

. . .. . .

. . .. . .

. . .. . . 1

0 1 2 10 1 1

, h =

12...N

, Rh =

(x1)(x2)

...(xN1)

h

.

3. Periodic. This case is the simplest as there is no boundary. Here all points areinterior points and can be used to express (xj) = (xj). Because of the pe-riodicity (xj+N ) = (xj). Hence only the values of j 0 N 1 need to becomputed, the others being deduced by periodicity. So there will be N unknownsin our system that are determined by the N approximations to (xj) = (xj),0 j N 1 which are expressed by

j+1 + 2j j1h2

= (xj) 2 j N 2.

Moreover for j = 0 we have j1 = 1 = N1 so that the equation reads

1 + 20 N1h2

= (x0)

and for j = N 1 we have j+1 = N = 0 so thatN2 + 2N1 0

h2= (xN1).

17

So that in this case the system in Matrix form reads Ahh = Rh with

Ah =1

h2

2 1 0 0 11 2 1 0 . . . 00

. . .. . .

. . . 0...

. . .. . .

. . . 1 00 1 2 1

1 0 1 2

, h =

01...

N1

, Rh =

(x0)(x1)

...(xN1)

.

We notice that each diagonal of Ah has everywhere the same term. We see alsothat the vector (1, . . . , 1)> is in the kernel of Ah as the sum of all the terms of eachline vanishes. This means that the matrix is not invertible. Its kernel has rank oneand invertibility can be recovered by assuming zero average, which in the discretecase reads

0 + + N1 = 0.

3.1.3 Higher order finite differences

A fourth order formula for the second derivative can be obtained by adding Taylor ex-pansions expressing in addition (xj+2) and (xj2) with respect to and its derivativesat the point xj . Taking linear combinations of the four Taylor expansions such that allterms up to h5, except of course the function values and the second derivative vanish.We then get the formula

(xj) (xj+2) + 16(xj+1) 30(xj) + 16(xj1) (xj2)12h2

.

This can be used everywhere for periodic boundary conditions. For other types ofboundary conditions a non centred formula of the same order needs to be applied atx1 and xN1.

3.2 Convergence of finite difference schemes

Some theory on the convergence of finite difference schemes will help us understandwhat is needed for a good scheme and also provide verification tests for checking thatthe code implementing the scheme behaves correctly. In particular, checking the orderof convergence of a scheme is a very good verification test that should be used wheneversome theoretical order exists.

In this lecture, we will use mostly the decomposition in eigenmodes for our proofs.Sometimes easier and more general proofs exist, but understanding the behaviour ofthe eigenmodes is in general very useful to understand a scheme. Some continuous anddiscrete norms are needed to define rigorously the convergence. We will use mostly theL1, L2 and L (or max) norms defined as follows. In the continuous setting

f1 = ba|f(x)|dx, f2 =

ba|f(x)|2 dx, f = max

axb|f(x)|.

In the discrete setting

v1 =Nj=1

|vi|, v2 = (Nj=1

|vi|2)1/2, v = maxi

(|vi|).

18

A simple calculation yields the comparison inequalities between the 2-norm and themax-norm

v v2 Nv, v RN . (3.12)

After discretisation with Finite Differences we obtained in each case a linear systemof the form Ahh = Rh. If the matrix Ah is invertible, this enables to compute h.For this to provide a good approximation of the corresponding solution of the Poissonequation, we need to verify that for some norm we have h Chp for someinteger p 1 and a constant C independent of h (we then say that h = O(hp)).In the case of Dirichlet boundary conditions = ((x1), . . . , (xN1))> is the vectorcontaining the exact solution of Poissons equation at the grid points.

Because of (3.11), we have that Ah = Rh+h2Sh, where Sh is the vector containing

the rest term of the Taylor expansions. Then as Ahh = Rh, it follows that Ah(h) =h2Sh and also that

Ah( h) h2Sh.Assuming that the fourth derivative of is bounded, it follows that

Sh C = ( maxx[0,L]

|(4)(x)|)/12,

Ah( h) Ch2,where C is independent of h. We then say that the numerical scheme defined by thematrix Ah is consistent of order 2 for the max-norm. For the 2-norm, we can use thenorm comparison (3.12) and h = L/N to get that

Ah( h)2 C1h3/2, or equivalently 1NAh( h)2 C2h2,

where C1 and C2 are constants independent of h.Consistency gives us convergence of Ah( h) to zero. This is not yet enough

to prove that h converges to . This requires another property of the discretisationscheme, which is called stability, namely that the norm of the inverse of Ah, A1h , isbounded independently of h.

Definition 1 For a given vector norm , we define the induced matrix norm by

A = supx 6=0Axx .

The consistency of a Finite Difference scheme comes directly from its derivation usingTaylor expansions. Its stability is often more difficult to verify. One possible way to doit is to check its eigenvalues. This relies on the following proposition:

Proposition 1 Let A be diagonalisable in basis of orthonormal eigenvectors and 1, . . . , Ndenote the eigenvalues of A. Then we have

A2 max |i|, A max |i|.

19

Proof. Any vector x 6= 0 in RN can be expressed in the basis of orthogonal eigenvectorsof A, denote by e1, . . . , eN

x = x1e1 + + xNeN .Then assuming that ei is an eigenvalue of A associated to the eigenvalue i, we have

Ax = 1x1e1 + + NxNeN . (3.13)Hence for the 2-norm, using the orthonormality of the ei

Ax22 = 21x21 + + + 2Nx2N max(2i )x22.

From which it follows that A2

max(2i ) = max |i|.For the max-norm, we get from (3.13) that

Ax = max |ixi| max |i|x,from which the result follows.

Hence if Ah is an invertible symmetric matrix, it is diagonalisable in a basis oforthogonal eigenvectors and its eigenvalues are real and different from zero. Denotingby P the matrix whose columns are the eigenvectors of Ah, we have Ah = PP

>, where is the diagonal matrix containing the eigenvalues. Then A1h = P

1P>, where 1

contains the inverse of the eigenvalues.It follows that for the 2-norm and max-norm that we are interested in, we have

A1h 1

min |i| .

This leads us to the sufficient condition for stability that for all the eigenvalues i of Ahwe have

|i| C, for some constant C independent of h.Let us now check this for Dirichlet and periodic boundary conditions.

3.2.1 Homogeneous Dirichlet boundary conditions

In the continuous case for homogeneous Dirichlet boundary conditions the eigenvectorsand eigenvalues of the Laplace operator d2

dx2in 1D verify

d2kdx2

= kk, (0) = (L) = 0.

They can be computed by a sine transform and read:

k =k2pi2

L2, k(x) =

2

Lsin

kpix

L.

In the case of second order finite differences, the corresponding discrete eigenvalueproblem reads Ahk = kk, with h = L/N and the (N 1) (N 1) matrix

Ah =1

h2

2 1 01 2 1 00

. . .. . .

. . .. . .

. . .. . . 1

0 1 2

.

20

We can check that the components of the discrete eigenvectors (or eigenmodes), up tonormalisation, are the values of the continuous eigenmodes at the grid points

(k)j =

2

Nsin

kjpi

N,

and the corresponding eigenvalues are

k =4

h2sin2

kpi

2N=

4

h2sin2

khpi

2L, 1 k N 1.

As 0 < k < N , we have 0 kpi/(2N) pi/2, so that the eigenvalues are positive and inincreasing order as sinus is increasing on this interval. It follows that 1 is the smallesteigenvalue. Moreover, as h 0, we have

1 =4

h2sin2

hpi

2L 4h2h2pi2

4L2 pi

2

L2.

This corresponds to the continuous eigenvalue. Then because of the convergence prop-erty there exists h0 = L/N0 such that for h h0 we have for all N N0 that1 =

N1 (1/2)pi2/L2, where we use the exponent N to show the dependency of

N1 on N (or equivalently on h). Thus for any N 1, we have

N1 C = min(11, . . . ,

N011 ,

1

2

pi2

L2

),

where C is a constant independent on N which proves the stability.

3.2.2 Periodic boundary conditions

The Dirichlet theorem tells us that any periodic C1 function u can be expanded in aconverging Fourier series:

u(x) =

+k=

uke2ipikxL , with uk =

1

L

L0u(x)e

2ipikxL .

As the Fourier coefficient of a derivative is simply obtained by multiplying the Fouriercoefficient of the function, the Fourier transform diagonalises linear partial differentialequations with constant coefficients and provides a very simple way to solve it. Histor-ically Fourier series where used by Joseph Fourier at the beginning of the 19th centuryas a general tool to solve the heat equation.

Indeed, consider the following partial differential equation, with a, b, c R constants,in a periodic domain

ad2u

dx2+ b

du

dx+ cu = f.

Plugging in the expression of the Fourier series of u this becomes

+k=

uk

(a

4pi2k2

L2+ b

2ipik

L+ c

)e2ipikxL =

+k=

fke2ipikxL .

Identifying the Fourier coefficients gives a closed expression of uk as a function of fk

uk =fk

a4pi2k2

L2+ b2ipikL + c

21

and provides a solution of the PDE in form of a Fourier series.There exists a corresponding tool for solving linear constant coefficient finite differ-

ence equations on a uniform grid, namely the discrete Fourier transform (DFT), whichis extremely useful for analysing Finite Difference equations and in particular their sta-bility.

Discrete Fourier transform

Let P be the symmetric matrix formed with the powers of the N th roots of unity the

coefficients of which are given by Pjk =1Ne2ipijkN . Denoting by N = e

2ipiN , we have

Pjk =1NjkN . The adjoint, or conjugate transpose of P is the matrix P

with coefficients

P jk =1NjkN .

Notice that the columns of P , denoted by Pi, 0 i N 1 are the vectors Xknormalised so that P kPl = k,l, where the vector Xk corresponds to a discretisation ofthe function x 7 e2ipikx at the grid points xj = j/N of the interval [0, 1]. So theexpression of a periodic function in the base of the vectors Xk is naturally associated tothe Fourier series of a periodic function.

Definition 2 Discrete Fourier Transform.

The dicrete Fourier transform of a vector x CN is the vector y = P x. The inverse discrete Fourier transform of a vector y CN is the vectorx = P 1x = Px.

Lemma 1 The matrix P is unitary and symmetric, i.e. P1 = P = P .

Proof. We clearly have P T = P , so P = P . There remains to prove that PP = I.But we have

(PP )jk =1

N

N1l=0

jllk =1

N

N1l=0

e2ipiNl(jk) =

1

N

1 e 2ipiN N(jk)1 e 2ipiN (jk)

,

and so (PP )jk = 0 if j 6= k and (PP )jk = 1 if j = k.

Corollary 1 Let F,G CN and denote by F = P F and G = P G, their discreteFourier transforms. Then we have

the discrete Parseval identity:

(F,G) = F T G = F TG = (F , G), (3.14)

The discrete Plancherel identity:

F = F, (3.15)

where (., .) and . denote the usual euclidian dot product and norm in CN .

22

Proof. The dot product in CN of F = (f1, . . . , gN )T and G = (g1, . . . , gN )T is definedby

(F,G) =Ni=1

figi = FT G.

Then using the definition of the inverse discrete Fourier transform, we have F = PF ,G = PG, we get

F T G = (PF )TPG = F TP T PG = F T

G,

as P T = P and P = P1. The Plancherel identity follows from the Parseval identity bytaking G = F .

Remark 2 The discrete Fourier transform is defined as a matrix-vector multiplication.Its computation hence requires a priori N2 multiplications and additions. But becauseof the specific structure of the matrix there exists a very fast algorithm, called FastFourier Transform (FFT) for performing it in O(N log2N) operations. This makes itparticularly interesting for many applications, and many fast PDE solvers make use ofit.

Circulant matrices

Note that on a uniform grid if the PDE coefficients do not explicitly depend on x aFinite Diffference scheme is identical at all the grid points. This implies that a matrixAh defined with such a scheme has the same coefficients on any diagonal including theperiodicity. Such matrices, which are of the form

C =

c0 c1 c2 . . . cN1cN1 c0 c1 cN2cN2 cN1 c0 cN3

.... . .

...c1 c2 c3 . . . c0

with c0, c1, . . . , cN1 R are called circulant.

Proposition 2 The eigenvalues of the circulant matrix C are given by

k =N1j=0

cjjk, (3.16)

where = e2ipi/N .

Proof. Let J be the circulant matrix obtained from C by taking c1 = 1 and cj = 0 forj 6= 1. We notice that C can be written as a polynomial in J

C =N1j=0

cjJj .

23

As JN = I, the eigenvalues of J are the N-th roots of unity that are given by k =e2ikpi/N . Looking for Xk such that JXk =

kXk we find that an eigenvector associatedto the eigenvalue k is

Xk =

1k

2k

...

(N1)k

.We then have that

CXk =N1j=0

cjJjXk =

N1j=0

cjjkXk,

and so the eigenvalues of C associated to the eigenvectors Xk are

k =

N1j=0

cjjk.

Proposition 3 Any circulant matrix C can be written in the form C = PP whereP is the matrix of the discrete Fourier transform and is the diagonal matrix of theeigenvalues of C. In particular all circulant matrices have the same eigenvectors (whichare the columns of P ), and any matrix of the form PP is circulant.

Corollary 2 We have the following properties:

The product of two circulant matrix is circulant matrix. A circulant matrix whose eigenvalues are all non vanishing is invertible and its

inverse is circulant.

Proof. The key point is that all circulant matrices can be diagonalized in the samebasis of eigenvectors. If C1 and C2 are two circulant matrices, we have C1 = P1P

andC2 = P2P

so C1C2 = P12P .If all eigenvalues of C = PP are non vanishing, 1 is well defined and

PP P1P = I.

So the inverse of C is the circulant matrix P1P .

Stability of the discrete Laplacian with periodic boundary conditions

For periodic boundary conditions the matrix of the discrete Laplacian is circulant, withc0 = 2/h

2, c1 = 1/h2, cN1 = 1/h2 and all the other terms vanish. Hence the formulafor computing its eigenvalues can be used to verify its stability. It yields

k =1

h2(2 e2ikpi/N e2ikpi(N1)/N ) = 2

h2(1 cos 2kpi

N) =

4

h2sin2

kpi

N, 0 k N 1.

24

In order to fix the constant, we assume that 0 + + N1 = 0, this sets theconstant Fourier mode 0 to 0 and discards the eigenvalue 0 = 0, so that the rest ofthe matrix is invertible and the smallest eigenvalue corresponds to

1 = N1 =4

h2sin2

pi

N=

4

h2sin2

pih

L.

We can now proceed like in the case of homogeneous Dirichlet boundary conditions. Assinx x for x close to 0, we find

limh0

1 =4pi2

L2,

which is strictly larger than 0, so that all eigenvalues are bounded from below by thehalf of that number, after some small enough h0, and the others are a finite number ofstrictly positive values. This proves that for all N all eigenvalues of Ah are positive andbounded from below by a constant independent of h which proves stability.

3.2.3 The method of manufactured solutions

A simple and standard way of checking the correctness of the code implementing anumerical method is to use a known analytical solution. This can be done by usinga known solution in a specific case or also by picking a solution and constructing theproblem around it. This is called the method of manufactured solutions.

For example, for periodic boundary conditions one can pick any periodic function uand apply the operator to it, in our case the Laplacian, to find the corresponding righthand side , and then solve the problem with this for different mesh sizes and checkthe convergence order in some given norm.

For non periodic boundary conditions, one can pick a function satisfying homoge-neous Dirichlet or Neumann boundary conditions or on can pick any smooth functionand determine the boundary conditions according to the function we chose. In any caseit is important not to forget the boundary conditions when defining the problem.

3.3 Finite difference methods in 2D

Let us now extend the finite difference method to a cartesian 2D grid, which is a tensorproduct of 1D grids. We shall assume that the cell size is uniform and equal in the twodirections for simplicity, but this is not required. We will see that the notion of tensorproduct enables to construct the linear system directly from the 1D system, enablingeasy implementation and also extension of the analysis from the 1D case. For simplicitywe will only consider homogeneous Dirichlet boundary conditions. Other boundaryconditions can be adapted from the 1D case in the same manner.

The 2D Laplace operator is defined by

=2

x2+2

y2.

The second order finite difference approximation of the second derivative in x and y are

25

obtained from Taylor expansions

(xi, yj) = (xi+1, yj) + 2(xi, yj) (xi1, yj)h2

+(xi, yj+1) + 2(xi, yj) (xi, yj1)

h2+O(h2). (3.17)

Let us consider the natural numbering of the grid values of the approximate solutionh which is now a matrix with entries i,j (xi, yj) for all the grid points 0 i, j N . Considering the Poisson problem with homogenous Dirichlet boundary conditions: = in the domain and = 0 on the boundary, there are (N 1)2 unknownssatisfying the (N 1)2 equations

1

h2(i+1,j i1,j + 4i,j i,j+1 i,j1) = (xi, yj), 1 i, j N 1. (3.18)

Introducing the right hand side matrixRh = ((xi, yj))1i,jN1 and the 1D Dirichletsecond order discrete Dirichlet matrix

Ah =1

h2

2 1 01 2 1 . . .0

. . .. . .

. . .. . .

. . .. . .

. . . 1 0. . . 1 2 1

0 1 1

we notice that the matrix multiplication Ahh applies the 1D Finite Difference stencil tothe columns of h which corresponds to the differentiation in y and the left multiplicationof h by Ah, hAh, applies the 1D Finite Difference stencil to the lines of whichcorresponds to the differentiation in x.

Ahh =1

h2

21,1 2,1 21,2 2,2 . . .

1,1 + 22,1 3,1 1,2 + 22,2 3,2 . . .2,1 + 23,1 4,1 2,2 + 23,2 4,2 . . .

......

,

hAh =1

h2

21,1 1,2 1,1 + 21,2 1,3 . . .22,1 2,2 2,1 + 22,2 2,3 . . .23,1 3,2 3,1 + 23,2 3,3 . . .

......

,Then adding the two, yields at each matrix entry the 2D Laplacian stencil, so that theequations (3.18) can be written in matrix form

hAh +Ahh = Rh.

Denoting by Ih the identity matrix of the same size as Ah and h this reads equivalently

IhhAh +AhhIh = Rh. (3.19)

26

In order to solve such a matrix system it needs to be brought into the standard matrix-vector multiplication form. The Kronecker product formalism does that for us. Adetailed presentation can be found in the textbooks by Steeb [13, 14]. A nice reviewarticle on the properties and applications of the Kronecker product was written by VanLoan [17].

For our application, we first need to replace the matrix unknown h by a columnvector, which is called vec(h) in the Kronecker product formalism and is obtained bystacking the columns of h or equivalently numbering the grid points line by line:

vec(h) = (1,1, 2,1, . . . , N1,1, 1,2, 2,2, . . . , N1,2, 3,1, . . . N1,N1)>.

We then have for any two matrices B and C of appropriate dimensions and their Kro-necker product B C

CXB> = (B C)vec(X).This is all we need to rewrite our 2D discrete Poisson equation using Kronecker products.As Ah is symmetric, (3.19) is equivalent to

(Ah Ih + Ih Ah)vec(h) = vec(Rh).As the Kronecker product is available in numerical computing languages like Matlab ornumpy, this can be used directly to assemble the linear system in 2D, which means thatonly the 1D Finite Difference matrices need to be assembled explicitly.

As the eigenvalues of the Kronecker product of two square matrices is the productof the eigenvalues of each matrix, the stability of the 2D problem can also be studiedusing the eigenvalues of the 1D problems.

The tensor product ideas generalises to arbitrary dimensions and has the propertyof separating a nD problem into a sequence of 1D problem enabling to obtain some veryfast algorithms.

3.4 The Finite element method

3.4.1 Principle of the method

For solving a problem on a computer that can only store a finite amount of informationa discrete form of the problem is needed. In the Finite Difference method one simplycomputes an approximation of the solution at a finite number of grid points. In theFinite Element method, which is mathematically more involved, the idea is to look forthe solution in a finite dimensional vector space, i.e. for some well chosen vector spaceVh, with basis (i)0iN1, the approximate solution has the form

uh(x) =

N1i=0

uii(x).

The basis being given, the approximate solution uh is fully determined by its coefficientsui in this basis, which need not be values of uh at some points in the computationaldomain, but can be in some cases.

The question now becomes how to choose Vh and determine the coefficients ui suchthat uh is a good approximation of the solution u of the original problem, that we takeas a start as the Poisson problem with homogeneous Dirichlet boundary conditions:

u = f, in , u = 0, on . (3.20)

27

The first idea, introduced by Ritz in his thesis in Gottingen in 1902, was to transform theboundary problem into an equivalent minimisation problem. Indeed, via the Dirichletprinciple (3.20) is equivalent to the minimisation problem

minuH10 ()

(1

2

|u(x)|2 dx

f(x)u(x) dx

). (3.21)

We shall need the following Hilbert spaces, defined for a domain Rd

H1() = {u L2(), u (L2())d}, H10 () = {u H1(), u = 0 on }.The scalar product associated to these Hilbert spaces is

(u, v)H1 =

u(x) v(x) dx+

u(x)v(x) dx.

Then, the original problem being transformed into a minimisation problem in becomesquite natural to look for an approximation in a finite dimensional subspace of the functionspace in which the minimisation problem is posed (in our case H10 ()), which meansthat the minimisation is performed by considering only minima in a finite dimensionalsubspace. Then if the form of finite dimensional space is chosen such that any functionof the original space can be approximated to any given tolerance, by a function of theapproximation space, we should be able to get a good approximation. Ritz who wasactually looking at solutions for the bilaplacian equation, chose as basis functions for Vha finite number of eigenfunctions of his operator.

The standard method to solve a minimisation problem with a cost functional Jdefined on a Hilbert space V , of the form

minuV

J [u],

is to solve the associated Euler equation J [u] = 0 obtained by computing the Frechetderivative of the functional that we want to minimise. Note that the Frechet derivativegives a rigorous definition of the functional derivative used in physics for functions thatare in a Banach (including Hilbert) space. Consider a functional J from a Hilbert spaceV into R. Its Frechet derivative J , assuming it exists, is a linear form on V , whichmeans that it maps any function from V to a scalar. It can be computed using theGateaux formula:

J [u](v) = lim0

J [u+ v] J [u]

. (3.22)

Let us apply this formula to our problem for which

J [u] =1

2

|u(x)|2 dx

f(x)u(x) dx.

We have for any v V = H10 ()

J [u+ v] =1

2

|u(x) + v(x)|2 dx

f(x)(u(x) + v(x)) dx

=1

2

(|u(x)|2 dx+ 2

u(x) v(x) dx+ 2

|u(x)|2 dx

)

f(x)u(x) dx

f(x)v(x) dx.

28

From which we deduce, using the Gateaux formula (3.22) that

J [u](v) =

u(x) v(x) dx

f(x)v(x) dx.

Note that J [u] being a linear form on V is defined by applying it to some vector v V . Finally the solution of our minimisation problem (3.21), is a solution of the Eulerequation J [u] = 0 or equivalently J [u](v) = 0 for all v V , which reads

u(x) v(x) dx =

f(x)v(x) dx v H10 (). (3.23)

This is also what is called the variational formulation, or the weak formulation of theoriginal boundary value problem (3.20). Note that this variational formulation expressesin some sense the orthogonality of the residual to the space in which the solution issought. This is more general than Euler equations of minimisation problems as noticedby Galerkin and has a wide range of applications. One can even extend this concept bymaking the residual orthogonal to a different function space, than the one in which thesolution lives. Such methods are called Petrov-Galerkin methods and are beyond thescope of this lecture.

So the principle of the Galerkin Finite Element method is to look for a solution in afinite dimensional subspace Vh V of the original space and to use the same variationalformulation (3.23) as the one defining the exact solution, with test functions also in Vhto characterise the solution. What remains to be done now is to choose Vh with goodapproximation properties. As we will see later, the stability of the Galerkin methodfollows directly from the well-posedness of the variational problem (3.23).

The finite dimensional space Vh is in general defined by its basis functions. Forthose, Ritz used eigenfunctions of the problem. But those are in general cumbersometo compute. Galerkin proposed to use general classes of simple functions, trigonomet-ric functions or polynomials, that are know to be able to approximate any continuousfunction with a finite number of basis functions. Trigonometric polynomials which arelinked to Fourier series are very good in periodic domains, with a few simple extensions.Polynomials enjoy more widespread applications, however to get a good conditioning ofthe linear system that is obtained at the end, care needs to be taken in the choice ofthe basis functions. The monomial basis (1, x, x2, . . . ) has very bad properties. Bestapproximations are provided by the orthogonal Legendre polynomials or by the Cheby-shev polynomials which are used in practice. Note that all the basis functions we havementioned up to now have a global support in the computational domain and thus leadto full matrices in the linear system, which can be computationally expensive. Meth-ods using such bases are actually not known as Finite Element methods but rather asspectral methods. We will come back to those later.

Another ingredient is needed to define what is known as Finite Element methods.This was introduced by Courant in 1943 and consists in using basis functions witha small support in the computational domain, so that its product with other basisfunctions vanishes for most of the other basis functions leading to a very sparse matrixin the linear system, which can be solved very efficiently on a computer. For this thecomputational domain is decomposed into small elements, in general triangles or quadsin 2D and the basis functions are chosen to be relatively low order polynomials, on eachof these elements. Convergence being achieved by taking smaller elements like the cells inthe Finite Difference method. In 1D a finite element mesh will look like a finite difference

29

mesh. An example of an unstructured Finite Element mesh in 2D is displayed in Figure3.2, which shows the great flexibility in particular to handle complicated boundarieswith finite elements, which finite differences do not provide. This is a key to its verywide usage.

0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35

0.0

0.1

0.2

0.3

0.4

Figure 3.2: Example of a 2D finite element mesh consisting of triangles.

The article by Gander and Wanner [6] provides a clear and well documented overviewof the historical developments of the Finite Element method. For more technical his-torical developments of the Finite Difference and Finite Element methods on can alsoconsult [15].

In summary, the finite element method consists in looking for a solution of a vari-ational problem like (3.23), in a finite dimensional subspace Vh of the space V wherethe exact solution is defined. The space Vh is characterised by a basis (1, . . . , N ) sothat finding the solution of the variational problem amounts to solving a linear system.Indeed, express the trial function uh and the test function vh on this basis:

uh(x) =

Nj=1

ujj(x), vh(x) =

Ni=1

vii(x),

and plug these expressions in the variational problem (3.23). This yields

Ni=1

Nj=1

ujvi

i(x) j(x) dx =

i=1

Nvi

f(x)i(x) dx.

This can be expressed in matrix form, U>h AhUh = U>h bh, which is equivalent to the

linear system AhUh = bh as the previous equality is true for all Uh, where

Uh = (u1, . . . , uN )>, Uh = (v1, . . . , vN )>, bh = (

f(x)1(x) dx, . . . ,

f(x)N (x) dx)

>

30

and the matrix Ah whose entries are

(

i(x) j(x) dx)1i,jN .

3.4.2 The variational (or weak) form of a boundary value problem

The variational form of a boundary value problem contains all its elements, which arethe partial differential equation in the interior of the domain and the boundary condi-tions. There are two very distinct ways to handle the boundary conditions depending onhow they appear when deriving the variational formulation. If they appear on the testfunction they are called essential boundary conditions and need to be included in thespace where the solution is looked for. If they appear on the trial function, which willbe the approximate solution, they can be handled in a natural way in the variationalformulation. Such boundary conditions are called natural boundary conditions. We willsee on the examples of Dirichlet and Neumann boundary conditions how this works inpractice.

In order to define the variational formulation, we will need the following Greenformula: For u H2() and v H1()

u v dx =

u v dx

u

nv d. (3.24)

Here H2() denotes the Hilbert space of the functions whose partial derivatives up tosecond order are in L2() and un = u n, where n is the outbound normal at anypoint of the boundary.

Case of homogeneous Dirichlet boundary conditions

Let f L2(). Consider the boundary value problemu = f in , (3.25)

u = 0 on . (3.26)

Assume that u H2(), multiply (3.25) by v H1() and integrate using the Greenformula (3.24), which yields

u v dx

u

nv d =

fv dx.

Here u does not appear in the boundary integral, so we cannot apply the boundarycondition directly. But in the end u will be in the same function space as the testfunction v, which appears directly in the boundary integral. This is the case of anessential boundary condition. So we take test functions v vanishing on the boundary.We then get the following variational formulation:Find u H10 () such that

u v dx =

fv dx v H10 (). (3.27)

The solutions of this variational formulation are called weak solutions of the originalboundary value problem. The solutions which are also in H2() are called strong solu-tions. Indeed we can prove that such a solution is also a solution of the initial boundary

31

value problem (3.25)-(3.26). If u H2(), the Green formula (3.24) can be used, andas vanishes on the boundary it yields

udx =

fdx v H10 ().

This implies, as H10 () is dense in L2(), that u = f in L2() and so almost

everywhere. On the other hand as u H10 (), u = 0 on . So u is a strong solution of(3.25)-(3.26).

Case of non homogeneous Dirichlet boundary conditions

Let f L2() and u0 H1(). We consider the problem

u = f in , (3.28)u = u0 on . (3.29)

As the value of u on the boundary cannot be directly put in the function space if it isnot zero, as else the function space would not be stable by linear combinations, we needto bring the problem back to the homogeneous case. To this aim let u = u u0. Wethen show as previously that u is a solution of the variational problemFind u H10 () such that

u v dx =

fv

u0 v dx v H10 (). (3.30)

This is the variational problem that needs to be solved for non homogeneous Dirichletboundary conditions. As u0 will only have non zero entries on the boundary for standardFinite Elements, the problem can be simplified in different manners in practice.

Case of Neumann boundary conditions

Let f L2() and g H1(). We consider the problem

u+ u = f in , (3.31)u

n= g on . (3.32)

Assuming that u H2(), we multiply by a test function v H1() and integrate usingthe Green formula (3.24), which yields

u v dx

u

nv d +

uv dx =

fv dx.

Replacing un by its value g on the boundary, we obtain the variational formulationFind u H1() such that

u v dx+

uv dx =

fv dx+

gv d v H1(). (3.33)

Let us now show that u is a strong solution of the boundary value problem, providedit is in H2(). As H10 () H1() one can first take only test functions in H10 (). Then

32

as in the case of homogeneous Dirichlet conditions it follows from the Green formula(3.24) that

(u+ u)dx =

fdx H10 ().

This implies, as H10 () is dense in L2(), that u + u = f in L2() and so almost

everywhere.It now remains to verify that we have the boundary condition

u

n= g on .

For that we start from (3.33) and apply the Green formula (3.24), which yields

u v dx+

u

nv d +

uv dx =

fv dx+

gv d v H1(),

and as u+ u = f , it remains

u

nv d =

gv d v H1(),

which yields that un = g on .

3.4.3 Lagrange Finite Elements

Finite Elements are used to construct a finite dimensional space Vh with basis functionsthat have a small support so that the resulting matrix is sparse, i.e. most of its entriesvanish. The simplest Finite Elements are determined by points values and related toLagrange interpolation, whence their name. In order to construct a basis of Vh, onestarts by decomposing the computational domain into non overlapping intervals in 1D(like the Finite Difference mesh), triangles or quads in 2D, tetrahedra or hexahedra in3D. This defines a mesh of the computational domain. Then the basis functions aredefined locally on each cell (or element).

Let us start with a non uniform 1D mesh of the domain [a, b] defined by the gridpoints a = x0 < x1 < < xN = b. The elements of the mesh are here the intervals[x , x+1]. The restriction of Vh to each element is defined to be the space of polynomialsof degree k, denote by Pk([x , x+1]). The basis restricted to each element is definedvia a reference element, which is conveniently chosen, for later Gauss integration, as theinterval [1, 1] and an affine map

F : [1, 1] [x , x+1]

x 7 x + x+12

+

(x+1 x

2

)x.

An important aspect when choosing the local basis of a finite element is the globalcontinuity requirement coming from the fact that Vh V . Indeed a function withdiscontinuities is not in H1, this is why we need to choose Vh as a subset of C

0([a, b]). Inorder to make this requirement easy to implement in practice it is convenient to definethe basis of Pk([x , x+1]) as being defined by its value at k + 1 points in [x , x+1],including the endpoints of the interval x and x+1. Such a basis is called a nodal basisand the naturally associated basis functions are the Lagrange basis functions.

33

So the subspace Vh on the mesh x0 < x1 < < xN is defined by

Vh ={vh C0([a, b]) | vh|[x ,x+1] Pk([x , x+1])

}.

As an affine mapping maps polynomials of degree k to polynomials of degree k, thebasis can be defined on the reference element [1, 1]. Given k + 1 interpolation points1 = y0 < y1 < < yk = 1 the Lagrange basis functions of degree k denoted by lk,i,0 i k, are the unique polynomials of degree k verifying lk,i(yj) = i,j . Because of thisproperty, any polynomial p(x) Pk([1, 1]) can be expressed as p(x) =

kj=0 p(yj)lj,k(x)

and conversely any polynomial p(x) Pk([1, 1]) is uniquely determined by its valuesat the interpolation points yj , 0 j k. Hence in order to ensure the continuityof the piecewise polynomial at the cell interface x it is enough that the values of thepolynomials on both sides of x have the same value at x . This constraint removesone degree of freedom in each cell, moreover the two end points are known for Dirichletboundary conditions, which removes two other degrees of freedom so that the totaldimension of Vh is Nk 1 and the functions of Vh are uniquely defined in each cellby their value at the degrees of freedom (which are the interpolation points) in all thecells. The basis functions denoted of Vh denoted by (i)0jNk1 are such that theirrestriction on each cell is a Lagrange basis function.

Note that for k = 1, corresponding to P1 finite elements, the degrees of freedom arejust the grid points. For higher order finite elements internal degrees of freedom areneeded. For stability and conveniency issues this are most commonly taken to be theGauss-Lobatto points on each cell. We are now ready to assemble the linear system

The discrete variational problem in 1D reads: Find uh Vh such that ba

duhdx

dvhdx

dx =

baf(x)vh(x) dx vh Vh.

Now expressing uh (and vh) in the basis of Vh as uh(x) =Nk1

j=1 uj(t)j(x), vh(x) =Nk1j=1 vjj(x) and plugging these expression in the variational formulation, denoting

by U = (u1, , uNk1)> and similarly for V yields: Find U RNk1 such that

i,j

ujvi

ba

i(x)

x

j(x)

xdx =

i,j

vi

baf(x)i(x) dx V RNk1,

which can be expressed in matrix form

V >AU = V >b V Rnk,

which is equivalent toAU = b

where the square (Nk 1) (Nk 1) matrix A and right hand side b are defined by

A = (

L0

di(x)

dx

dj(x)

dxdx)i,j , b = (

L0f(x)i(x) dx)i.

Another option for computing the right-hand side and which yields the same orderof approximation is to project first the unknown f onto fh in the space Vh, then fh(x) =

34

i fii(x), and the right hand side can be approximated with b = MF , with F the

vector of components fi and the mass matrix

M = (

L0i(x)j(x) dx)i,j .

Note that these matrices can be computed exactly as they involve integration ofpolynomials on each cell. Moreover because the Gauss-Lobatto quadrature rule is exactfor polynomials of degree up to 2k1, A can be computed exactly with the Gauss-Lobattoquadrature rule. Moreover, approximating the mass matrix M with the Gauss-Lobattorule introduces an error which does not decrease the order of accuracy of the scheme [4]and has the big advantage of yielding a diagonal matrix. This is what is mostly done inpractice.

Usually for Finite Elements the matricesM and A are computed from the correspond-ing elementary matrices which are obtained by change of variables onto the referenceelement [1, 1] for each cell. So L

0i(x)j(x) dx =

n1=0

x+1x

i(x)j(x) dx,

and doing the change of variable x = x+1x2 x+x+1+x

2 , we get x+1x

i(x)j(x) dx =x+1 x

2

11(x)(x) dx,

where (x) = i(x+1x

2 x+x+1+x

2 ). The local indices on the reference element gofrom 0 to k and the global numbers of the basis functions not vanishing on element are j = k + . The are the Lagrange polynomials at the Gauss-Lobatto points inthe interval [1, 1].

The mass matrix in Vh can be approximated with no loss of order of the finiteelement approximation using the Gauss-Lobatto quadrature rule. Then because theproducts (x)(x) vanish for 6= at the Gauss-Lobatto points by definition of the which are the Lagrange basis functions at these points, the elementary matrix M isdiagonal and we have 1

1(x)

2 dx k

=0

wGL (x)2 = wGL

using the quadrature rule, where wGL is the Gauss-Lobatto weight at Gauss-Lobattopoint (x) [1, 1]. So that finally M = diag(wGL0 , . . . wGLk ) is the matrix with k + 1lines and columns with the Gauss-Lobatto weights on the diagonal.

Let us now compute the elements of A. As previously we go back to the interval[1, 1] with the change of variables x = x+1x2 x + x+1+x2 and we define (x) =i(

x+1x2 x+

x+1+x2 ). Note that a global basis function i associated to a grid point

has a support which overlaps two cells and is associated to two local basis functions.Thus one needs to be careful to add the two contributions as needed in the final matrix.

35

We get (x) =x+1x

2 i(x+1x

2 (x+ 1) + x). It follows that x+1x

j(x)i(x) dx =

11

(2

x+1 x

)2(x)

(x)

x+1 x2

dx

=2

x+1 x

11(x)

(x) dx =

2

x+1 xk

m=0

wGLm (xm)

(xm).

As the polynomial being integrated is of degree 2(k 1) = 2k 2 the Gauss-Lobattoquadrature rule with k+1 points is exact for the product which is of order 2k1. Usingthis rule 1

1(x)(x) dx =

km=0

wGLm (xm)(xm) = w

GL

(x),

As before, because are the Lagrange polynomials at the Gauss-Lobatto points, onlythe value at x in the sum is one and the others are 0. On the other hand evaluatingthe derivatives of the Lagrange polynomial at the Gauss-Lobatto points at these Gauss-Lobatto points can be done using the formula

(x) =p/px x for 6= and

(x) =

6=

(x),

where p = 6=(x x). This formula is obtained straightforwardly by taking the

derivative of the explicit formula for the Lagrange polynomial

(x) =

6=

(x x) 6=

(x x)

and using this expression at the Gauss-Lobatto point x 6= x. We refer to [1] for adetailed description.

This can be extended via Kronecker product to tensor product meshes in 2D or 3Dor more.

3.4.4 Formal definition of a Finite Element

Let (K,P,) be a triple(i) K is a closed subset of Rn of non empty interior,(ii) P is a finite dimensional vector space of function defined on K,(iii) is a set of linear forms on P of finite cardinal N , = {1, . . . , N}.

Definition 3 is said to be P -unisolvent if for any N -tuple (1, . . . , N ), there existsa unique element p P such that i(p) = i pour i = 1, . . . , N .

Definition 4 The triple (K,P,) of Rn is called Finite Element of Rn if it satisfies(i), (ii) and (iii) and if is P -unisolvent.

36

Example 1: Pk in 1D. Let a, b R, a < b. Let K = [a, b], P = Pk the set ofpolynomials of degree k on [a, b], = {0, . . . , k}, where a = x0 < x1 < < xk = bare distinct points and

k : P R,p 7 p(xi).

Moreover is P -unisolvant as for k distinct values (0, . . . , k) there exists a uniquepolynomial p P such that p(xi) = i, 0 i k. This corresponds to the Lagrangeinterpolation problem, which has a unique solution.

Example 2: Pk in 2D. For 3 non aligned points a, b, and c in R2 let K be the triangledefined by a, b and c. Let P = Pk be the vector space of polynomials of degree k in 2D

Pk = {Span(xy), (, ) N2, 0 + k}.

is defined by the point values at (k+1)(k+2)2 points such that the Lagrange interpolationproblem at these points is well defined.

Example 3: Qk in 2D. For 4 points a, b, c and d in R2 such that not 3 of them arealign let K be the quadrangle defined by a, b, c and d. Let P = Pk be the vector spaceof polynomials of degree k in 2D

Qk = {Span(xy), (, ) N2, 0 , k}.

is defined by the point values at (k + 1)2 points on a lattice.

We also end up with a total of 6 basis functions, with each basis having a value 1 at its assigned node and avalue of zero at all the other mesh nodes in the element.

Again this has the form

so we can easily determine the coefficients of all 6 basis functions in the quadratic case by inverting the matrixM. Each column of the inverted matrix is a set of coefficients for a basis function.

P3 Cubic ElementsP3 Cubic Elements

In the case of cubic finite elements, the basis functions are cubic polynomials of degree 3 with the designationP3

We now have 10 unknowns so we need a total of 10 node points on the triangle element to determine thecoefficients of the basis functions in the same manner as the P2 case.

Experiment 1Experiment 1In this experiment we will solve Poissons equation over a square region using Finite Elements with linear,quadratic and cubic elements and compare the results. We will solve

Similarly, here are the interpolation functions of the nine-node Lagrange element. The nodes in the x-direction are placed at x = , x = 0, and x= while the the nodes in the y-direction are at y = , y = 0, and y = .

In[28]:=Transpose[HermiteElement1D[{-L1,0,L1},0,x]].HermiteElement1D[{-L2,0,L2},0,y]

Out[28]=

However, the situation becomes more complicated when you consider the elements whose nodes are not equally spaced in a grid. For example,you cannot generate the shape functions for triangular elements or the rectangular elements with irregular node locations by using the abovetensor product approach. Here are two examples in which the interpolation functions cannot be generated.

In[29]:=p1=ElementPlot[{{{1,-1},{-1,-1},{0,3/2}},{0,0}}, AspectRatio->1, Axes->True, PlotRange->{{-2,2},{-2,2}}, NodeColor->RGBColor[0,0,1], DisplayFunction->Identity];

In[30]:=p2=ElementPlot[{{{1,1},{1,-1},{-1,-1},{-1,1}},{0,0}}, AspectRatio->1, Axes->True, PlotRange->{{-2,2},{-2,2}}, NodeColor->RGBColor[0,0,1], DisplayFunction->Identity];

In[31]:=

5.3.2 Triangular ElementsUse the functions TriangularCompleteness, RectangularCompleteness, and ShapeFunction2D to generate interpolationfunctions for triangular and general rectangular elements.

First, consider the triangular element with four nodal points and obtain a set of Lagrange-type shape functions, namely c = 0. Do not specify theminimum number of symmetric terms to be dropped from the full polynomial.

Figure 3.3: (Left) P3 Finite Element, (Right) Q2 Finite Element

3.4.5 Convergence of the Finite Element method

The variational problems we consider can be written in the following abstract formFind u V such that

a(u, v) = l(v) v V, (3.34)where V is a Hilbert space, a is a symmetric continuous and coercive bilinear form andl a continuous linear form.

The most convenient tool for provient existence and uniqueness of the solution of avariational problem is the Lax-Milgram theorem that we recall here:

37

Theorem 1 (Lax-Milgram) Let V a Hilbert space with the norm .V . Let a(., .)continuous, symmetric and coercive bilinear form on V V , i.e.

1. (Continuity): there exists C such that for all u, v V|a(u, v)| CuV vV .

2. (Coercivity): there exists a constant > 0 such that for all u Va(u, u) > u2V .

Let l(.) a continuous linear form on V , i.e. there exists C such that for all v V|l(v)| CvV .

Then there exists a unique u V such thata(u, v) = l(v) v V.

The Ritz-Galerkin method consists in finding an approximate solution uh in a finitedimensional subspace of V . For convergence studies one needs to consider a sequence ofsubspaces of V of larger and larger dimension so that they get closer to V . One thendefines a sequence of problems parametrised by h that read :Find uh Vh such that

a(uh, vh) = l(vh) vh Vh, (3.35)where Vh V is a vector space of dimension N . Let (1, . . . , N ) a basis of Vh. Anelement uh Vh can then be expanded as uh(x) =

Nj=1 ujj(x). Taking vh = i the

equation (3.35) becomes using the linearity

Nj=1

uja(j , i) = l(i).

Then using the symmetry of a, we notice that the discrete variational formulation(3.35)is equivalent to the linear system

AUh = L, (3.36)

where A = (a(i, j))1i,jN , L is the column vector with components l(i) and Uis the column vector with the unknowns ui that are the coefficients of uh in the basis(1, . . . , N ).

Theorem 2 Assume that a is a symmetric continuous and coercive bilinear form ona Hilbert space V and l a continuous linear form on V . Then the system (3.36) isequivalent to the discrete variational form (3.35) and admits a unique solution

Proof. For vh Vh, we denote by V the vector of its components in the basis(1, . . . , N ).

Thanks to the bilinearity of a and the linearity of l the relation (3.35) can bewritten equivalently

tV AUh =tV L V RN , (3.37)

which means that the vector AUhL RN is orthogonal to all the vectors of RN ,and so is the zero vector. Conversely it is clear that (3.36) implies (3.37) and so(3.35).

38

Let vh Vh. Then, as a is coercive, there exists > 0 such thattV AV = a(vh, vh) vh2 0,

and tV AV = 0 = a(vh, vh) vh = 0, which implies that vh = 0 and so V = 0.So A is symmetric, positive definite and therefore invertible.

After making sure the approximate solution exists for some given space Vh, oneneeds to make sure the approximation converges towards the exact solution. This resultsfrom two properties: 1) The Galerkin orthogonality, which comes from the conformingGakerkin approximation, 2) The approximability property, which confirms that for anyv V there exist vh in some finite dimensional space of the family which is close enoughto v.

Lemma 2 (Cea) Let u V the solution of (3.34) and uh Vh the solution of (3.35),with Vh V . Then

u uh C infvVhu v.

Proof. We have

a(u, v) = l(v) v V,a(uh, vh) = l(vh) vh Vh,

as Vh V , we can take v = vh in the first equality and take the difference which yields

a(u uh, vh) = 0 vh Vh.

It results that a(u uh, u uh) = a(u uh, u vh + vh uh) = a(u uh, u vh), asvh uh Vh and so a(u uh, vh uh) = 0. Then there exists > 0 and such that

u uh2 a(u uh, u uh) as a is coercive, a(u uh, u vh) vh Vh, u uhu vh as a is continuous.

Whence u uh u vh for all vh Vh. We get the desired results taking theinfimum in Vh.

For the global error estimates, we make the following hypotheses on the triangulationTh:(H1) We assume that the family of triangulations is regular in the following sense:

(i) There existes a constant such that

K hTh hKK .

(ii) The quantity h = maxKh

hK tend to 0.

39

(H2) All finite elements (K,P,), K hTh are affine equivalent to a unique referenceelement (K, P , ).

(H3) All finite elements (K,P,), K hTh are of class C0.

Theorem 3 We assume the hypotheses (H1), (H2) and (H3) are verified. Moreover weassume that there exists an integer k 1 such that

Pk P H1(K),

Hk+1(K) C0(K) (true if k + 1 > n2

).

Then there exists a constant C independent of h such that for any function v Hk+1()we have

v pihvk+1 Chk|v|k+1,,where pih is the finite element interpolation operator defined by

pihv =

Ni=1

v(xi) pi.

We consider a variational problem posed in V H1().Theorem 4 We assume that (H1), (H2) and (H3) are verified. Moreover we assumethat there exists an integer k 1 such that k + 1 > n2 with Pk(K) P H1(K) andthat the exact solution of the variational problem is in Hk+1(), then

u uh1, Chk|u|k+1,,

wh

computational plasma physics

Documents