ENERGY CONSERVING NUMERICAL METHODS

FOR THE COMPUTATION OF

COMPLEX VORTICAL FLOWS

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF

AERONAUTICS AND ASTRONAUTICS

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Yves Allaneau

December 2011

© Copyright by Yves Allaneau 2012

All Rights Reserved

I certify that I have read this dissertation and that, in my opinion, it is fully

adequate in scope and quality as a dissertation for the degree of Doctor of

Philosophy.

(Antony Jameson) Principal Adviser



(Robert W. MacCormack)



(Gianluca Iaccarino)

Approved for the University Committee on Graduate Studies

Abstract

One of the original goals of this thesis was to develop numerical tools to help with the

design of micro air vehicles. Micro Air Vehicles (MAVs) are small flying devices of only

a few inches in wing span. Some consider that, as their size becomes smaller and

smaller, it becomes increasingly difficult to keep all the classical control surfaces

such as the rudders, the ailerons and the usual propellers. Over the years, scientists took

inspiration from nature. Birds, by flapping and deforming their wings, are capable of

accurate attitude control and are able to generate propulsion. However, the biomimicry

design has its own limitations and it is difficult to place a hummingbird in a wind tunnel

to study precisely the motion of its wings. Our approach was to use numerical methods to

tackle this challenging problem. In order to precisely evaluate the lift and drag generated

by the wings, one needs to be able to capture with high fidelity the extremely complex

vortical flow produced in the wake. This requires a numerical method that is stable yet

not too dissipative, so that the vortices do not get diffused in an unphysical way. We

solved this problem by developing a new Discontinuous Galerkin scheme that, in addition to

conserving mass, momentum and total energy locally, also preserves kinetic energy globally.

This property greatly improves the stability of the simulations, especially in the special

case p = 0 when the approximation polynomials are taken to be piecewise constant (we

recover a finite volume scheme). In addition to needing an adequate numerical scheme, a

high fidelity solution requires many degrees of freedom in the computations to represent

the flow field. The size of the smallest eddies in the flow is given by the Kolmogorov scale.

Capturing these eddies requires a mesh on the order of $Re^3$ cells, where Re is the

Reynolds number of the flow. We show that under-resolving the system, to a certain extent,

is acceptable. However, our simulations still required meshes containing tens of millions of

degrees of freedom. Such computations can only be done in reasonable amounts of time

by spreading the work across multiple CPUs via domain decomposition. Further speed-up

efforts were made by implementing a version of the code for GPUs using Nvidia’s CUDA

programming model. Finally, we searched for optimal wing motions by coupling our

computational fluid dynamics code with the optimization package SNOPT. The wing motion

was parameterized by a few angles describing the local curvature and the twisting of the

wing. These were expressed in terms of truncated Fourier series, the Fourier coefficients

being our optimization parameters. With this approach we were able to obtain propulsive

efficiencies of around 50% (thrust power/power input).

Acknowledgments

It is a pleasure to thank the many people who made this thesis possible.

I would like to express my gratitude to my adviser, Professor Antony Jameson, for

inspiring and encouraging me to pursue my research interests, and for supporting my time as

a graduate student. I always have a great time learning from Prof. Jameson. His availability

and his willingness to share his knowledge with his students make him a wonderful adviser.

I also want to thank the members of my writing and oral committee, Prof. MacCormack,

Prof. Iaccarino, Prof. Lele and Prof. Saunders for taking the time to go through my work

and reviewing it. Also, I truly enjoyed taking the classes you teach at Stanford.

During the course of my PhD, some friends have been particularly important for me for

their continuous support and help. I would like in particular to thank Wen Qi, Sebastien,

Lala and David for being next to me when I needed it the most. I was also extremely lucky

to have such incredible lab mates who are brilliant academically but also nice to hang out

with outside of the lab. I had a great time riding my bicycle with Patrice on the hills behind

campus or bar hopping and drifting U-Haul trucks with Matt. I also want to thank Tony

Horton and Shaun Thomson for helping me stay fit during the course of my studies (while

eating so much!).

I was lucky enough to obtain a Hugh H. Skilling Graduate Fellowship to support and

fund my PhD research. I am indebted to Mr. and Prof. Lynch for their generosity in

establishing this fellowship. Funding for my research was also provided by the AFOSR and

by the NSF.

Lastly, but most importantly, I would like to thank my parents, Myriam and Jean-

Bernard and my sister, Fanny. Their faith in me taught me to have faith in myself.

Contents

Abstract
Acknowledgments
Introduction
    Motivations
    Outline

I Theoretical Results

1 Connections between Filtered DG and FR
    1.1 Introduction
    1.2 Filtered DG for 1D linear advection
        Discontinuous Galerkin method for linear advection
        Stability of the method
        Stability of the filtered DG method
    1.3 ESFR as a filtered DG method
        The Flux Reconstruction method
        Energy Stable Flux Reconstruction as a Filtered DG method
    1.4 Further analysis of the schemes
    1.5 Conclusions

2 DG-KEP for Euler’s equations
    2.1 DG for 1D conservation equations
    2.2 Energy stability for linear advection
    2.3 DG-KEP for Euler’s equations
        Derivation in 1 dimension
        Discussion
        The special case of finite volumes
        Numerical example: viscous Sod shocktube
        Influence and effect of α
    2.4 Discussion and conclusions

3 Connections between ESFR and DG-KEP
    3.1 VCJ schemes for Vectorial Conservation Laws
    3.2 Difficulties associated with Kinetic Energy
    3.3 Conclusion

II Implementation

4 Parallel implementations
    4.1 Distributed computing and domain decomposition
    4.2 An implementation using GPUs and CUDA-FORTRAN
        GPUs and CUDA
        Implementation in the finite volume code
    4.3 Conclusion

III Numerical applications - Finite Volume Method

5 Shocktube and shock-vortex interaction
    5.1 Viscous FV-KEP in multidimensions
        Continuous Model
        Semi-discrete approach
    5.2 DNS of a 2D shocktube
        Simple shocktube and comparison with Roe Scheme
        Study of nonclassical effects in the pseudosteady flow area
    5.3 Shock vortex interaction
        Problem setup
        Numerical simulation
    5.4 Conclusions

6 DNS of plunging airfoils
    6.1 Numerical Methodology
    6.2 Results
    6.3 Conclusion

7 3D Flapping wings
    7.1 Numerical Methodology
        Time Integration
        Artificial Dissipation
    7.2 Parameters and Wing Deformations
        Motion Parameters
        Reconstructing the wing - Stretching the mesh
    7.3 Numerical Experiments
        Examples of complex wing motions
        An example of high resolution simulation
        The flow solver as part of an optimization process

Conclusion
    Future work

A Odd/Even decoupling phenomenon in DG

B Order of finite volume KEP scheme
    B.1 Mean functions
    B.2 Use of mean operators in finite volumes approximation
    B.3 Application to the FV-KEP Scheme

Bibliography

List of Tables

4.1 CPU times coarse mesh
4.2 Single GPU times coarse mesh
4.3 CPU and GPU times on fine mesh
7.1 Assumed form of an optimal solution
7.2 Results of the optimization using SNOPT
B.1 $\ell_2$-error for various mesh sizes

List of Figures

1 Estimation of computational requirements by Antony Jameson, 1989
1.1 Reference Solution - DG c = 0
1.2 Plot of the solution at t = 20 for various values of c for an upwind flux
1.3 Plot of the solution at t = 20 for various values of c for a central flux
2.1 Mesh nomenclature in 1D
2.2 Viscous shocktube, coarse mesh (N = 100) and low order (p = 0)
2.3 Viscous shocktube, Evolution of Kinetic Energy, coarse mesh (N = 100) and low order (p = 0)
2.4 Viscous shocktube, coarse mesh (N = 100) and high order (p = 5), KEP flux
2.5 Details of the Pressure distribution, coarse mesh (N = 100) and high order (p = 5)
2.6 Viscous shocktube, fine mesh (N = 4000) and low order (p = 0)
2.7 Details of the Pressure distribution, fine mesh (N = 4000) and low order (p = 0)
2.8 Viscous shocktube, fine mesh (N = 800) and high order (p = 5), central flux
2.9 Sod shocktube, Finite volumes (N = 500) for various values of α
2.10 Sod shocktube, Finite volumes (N = 2000) for various values of α
2.11 Viscous Sod shocktube, DG (N = 100, p = 5) for various values of α
4.1 Decomposition of the computational domain around an airfoil
4.2 Computational domain for 1 CPU and its halo
4.3 Computational domains for multiple CPUs and their overlapping halos
4.4 nVidia Fermi architecture. Each small green square is a CUDA core.
4.5 Kernels hierarchy. Kernels are in capital letters
5.1 Variation of state variables along the centerline. Re = 25000, α = 0.6
5.2 Comparison of pressures on the centerline for the KEP scheme and the Roe scheme at two locations.
5.3 Distribution of nondimensional x-velocity in the shocktube
5.4 Distributions of nondimensional velocities and pressure in the pseudosteady area (contact discontinuity area) in the case Re = 25000, α = 0.6, t = 0.2136L/Vl
5.5 Pressure pattern observed in the pseudosteady area of the flow. A (+) is a surpressure compared to the inviscid case while a (−) corresponds to a depression. Re = 25000, α = 0.6.
5.6 Pressure distribution along the walls of the shocktube. α = 0.2, Re = 25000
5.7 Pressure Waves pattern in the pseudosteady flow area for α = 0.2, α = 0.3, α = 0.6. Re = 25000.
5.8 Shock Vortex interaction problem setup - © J. Furst
5.9 Pressure distribution
5.10 Numerical Schlieren (density gradient magnitude) at various times
6.1 Computational domain
6.2 Artificial dissipation is added only in the darker area
6.3 Density field in the airfoil’s wake - Sr = 0.29
6.4 Vorticity distribution in the airfoil’s wake - Sr = 0.29
6.5 Streak lines, experimental data by Jones and Platzer - Sr = 0.29
6.6 Lift and Drag history - Sr = 0.29
6.7 Density field in the airfoil’s wake - Sr = .60, h = .1, k = 6.
6.8 Vorticity distribution in the airfoil’s wake - Sr = .60, h = .1, k = 6.
6.9 Streak lines, experimental data by Jones and Platzer - Sr = 0.60, h = .2, k = 3.
6.10 Lift and Drag history - Sr = 0.60, h = .1, k = 6.
6.11 Vorticity field in the airfoil’s wake - Sr = .60, h = .2, k = 3.
6.12 Density field in the airfoil’s wake - Sr = 1.5
6.13 Vorticity distribution in the airfoil’s wake - Sr = 1.5
6.14 Streak lines, experimental data by Jones and Platzer - Sr = 1.5
6.15 Lift and Drag history - Sr = 1.5
7.1 Wing skeleton
7.2 Flapping motion reconstruction
7.3 Twisting motion reconstruction
7.4 Dilation of the mesh
7.5 Deformation of the mesh
7.6 Characteristic motions of the wing
7.7 High resolution example - Vorticity isosurfaces colored by pressure distribution
7.8 Lift and drag time history for the optimal case n = 1, N = 1 for various meshes.
7.9 Optimal solution N = 1, n = 2
B.1 Convergence of the error in the $\ell_2$-norm for ρ and p
B.2 Exact Solution of the Isentropic Vortex case at t = 0. and t = 1.
B.3 Numerical solution at t = 1., mesh sizes h = .2

Introduction

During the last sixty years, the exponential growth of computational power allowed the

numerical simulation of physical phenomena of increasing complexity. In particular,

Computational Fluid Dynamics (CFD) greatly benefited from the improvements in computers.

Figure 1: Estimation of computational requirements by Antony Jameson, 1989

Today, it is not uncommon for a company to own a super-computer cluster capable

of more than a hundred trillion floating-point operations per second (100 Tflops). While

such a number is almost impossible for us to appreciate, many numerical simulations would

benefit from even greater flop counts. Furthermore, large computer clusters are rarely

available to a single user and very often, hundreds of users have to share the resources. As

a result, the writer of a high fidelity numerical simulation code must make smart choices

and design algorithms carefully rather than relying solely on the increasing computational

performance.

In the field of Computational Fluid Dynamics (CFD), some problems are still extremely

challenging and too expensive to be solved by simple low order (first order) schemes on

extremely large meshes. For example, capturing with accuracy the turbulent flow over an

obstacle at a Reynolds number Re requires a mesh spacing comparable to the Kolmogorov

scale of the smallest eddies, which is of order $1/Re^{3/4}$. As a result, using a first order scheme requires a mesh of the order

of $Re^3$ cells. Similarly, the problem of computing complex vortical flows is difficult. This

problem arises when evaluating the flow generated by a rotorcraft or in the wake of flapping

air vehicles. Numerical simulations generally require the addition of artificial dissipation

to enhance stability, but this can lead to unphysical dissipation of the flow vorticity. Once

again, one could choose to use a very fine mesh coupled with a stable yet cheap low order

method, or to use a high order method.

In both cases, two approaches can be considered. Using a higher order method decreases

the number of degrees of freedom of the problem. On the other hand, finding an extremely

cheap and stable low order method that could work fast despite the large number of degrees

of freedom would also speed up the computations.

Motivations

The main focus of this thesis is the problem of computing with high fidelity the vortical

flow generated around moving objects and in particular Micro Air Vehicles (MAVs). While

several methods were developed in the past to improve the quality of such simulations and

to preserve the flow’s total vorticity, most of them proved to be complex to implement and

expensive to run. One of the methods developed is the vorticity confinement approach.

The idea was first proposed by Steinhoff and Underhill [ref nawee] as a method to capture

vortices by injection of vorticity back into the vortex cores. The method has been used

to compute the vortical flow around simple geometries and rotorcraft with mixed results.

Another approach to solving the problem is to use high order methods while refining the

mesh. Although effective, this approach is usually very expensive and classical high order

methods for structured meshes such as WENO are difficult to implement in a context of

complex geometry. Furthermore, the use of high order methods in the industry is still

extremely limited and most of the current Navier-Stokes commercial codes use a Finite

Volume approach to solving the problem.

The basic idea in this thesis is that simple schemes that conserve a particular energy can

be used to solve the vortex flow problem efficiently. The additional conservation of an energy

(other than the total energy) by the scheme tends to improve its stability and removes

some of the need to add artificial dissipation or to devise complex numerical tricks. We

start our study by considering high order methods for unstructured meshes (Discontinuous

Galerkin, Flux Reconstruction). By doing so, we highlight fundamental connections that

exist between these methods as well as deep intrinsic differences. Also we derive a Kinetic

Energy Conserving DG scheme (DG-KEP), as an extension of Jameson’s previous work

(FV-KEP). However, this work made us quickly realize that some of the best results in

terms of performance and simplicity were obtained for the special case of finite volumes

(zero order polynomial approximation). The FV-KEP is extremely easy to implement in

existing finite volume codes and flux computations are so cheap that it can be run on huge

meshes to compensate for its low order properties. In the remainder of this work, we focus

on using the FV-KEP scheme to solve various vortex dominated flow problems and in

particular to compute the flow around flapping Micro Air Vehicles.

Outline

This thesis is divided into three parts. Each part corresponds to one of the three main topics

in Computational Fluid Dynamics: the development of numerical methods and schemes to

solve the flow equations; the implementation of fast, efficient and robust computer codes;

and the application of these codes to physical problems.

In the first part, attention is focused on the study of high order methods for unstructured

meshes. In chapter 1, we describe the Discontinuous Galerkin (DG) method and the Energy

Stable Flux Reconstruction (ESFR) method introduced by Huynh [15] and Vincent et al.

[38]. We then show how these two methods are connected and how the ESFR method

is equivalent to a DG method with a linear filtering operator applied to the residual. In

chapter 2, we derive an extension of Jameson’s Kinetic Energy Preserving scheme FV-KEP

[19, 20] to the DG method (DG-KEP). We also show how a simple modification of the flux

leads to a Kinetic Energy Decreasing scheme. In addition, it is observed that the special

case of p = 0 leads to recovery of the particularly good FV-KEP method. In chapter 3,

we study the possibility of using the numerical flux of the DG-KEP method in the ESFR

framework to obtain a kinetic energy preserving flux reconstruction method. Unfortunately,

we show that the two approaches are fundamentally different and that it is not possible to

couple the two ideas without inducing new complications.

In the second part, we describe the work done to implement an efficient and fast finite

volume code based on the FV-KEP scheme. To speed up simulations, the code is parallelized

across multiple nodes using MPI (Message Passing Interface) and also modified to run on

GPGPUs (General Purpose Graphics Processing Units) using CUDA, nVidia’s parallel computing

architecture.

The aim of the third and last part of this thesis is validation of the method for various

cases, and its application to the design of flapping wing vehicles. In chapter 5, we examine

the flow in a 2D viscous shocktube and the viscous interaction of a moving vortex with a

shockwave. This allows us to check that the FV-KEP scheme is able to capture complex

flow features and can be used for more complex problems. In chapter 6, we study the

flow around 2D plunging airfoils. These cases are particularly interesting because there

exists extensive literature that describes the phenomenon [32, 33]. Furthermore, many

other groups performed the same simulation using various CFD codes [26, 25], making it

an excellent validation case. In chapter 7, we compute the 3D flow around flapping and

deforming pairs of wings. We explain how the code can be used to find the optimal motion

of the wing that would lead to best propulsive efficiency [7] and therefore help with the

design of Micro Air Vehicles, our original goal.

A special effort has been made to keep the various chapters of the text as independent as

possible for convenience of the reader. Only chapter 3 requires the reading of chapters 1 and

2 as it tries to establish connections between the results obtained in both these chapters.

If the reader is unfamiliar with Jameson’s FV-KEP scheme, it might help to read a short

description of the method in section 5.1 before attempting to read chapters 6 and 7.

Part I

Theoretical Results

Chapter 1

Connections between Filtered DG

and FR

The purpose of this chapter is to provide new insights on the connections that exist between

the discontinuous Galerkin method (DG), the flux reconstruction method (FR), and

the recently identified energy stable flux reconstruction method (ESFR) for solving time-dependent

conservation laws. All these schemes appear to be quite similar and it is important

to understand how they are related. In this chapter, we review results on the stability

of the discontinuous Galerkin method and extend it to the filtered discontinuous Galerkin

method. We then consider the Flux Reconstruction approach and show its connections with

DG. In particular, we show how the Energy Stable Flux Reconstruction method introduced

by Vincent, Castonguay, and Jameson is equivalent to a Filtered DG method, hence giving

a new proof of its stability. Also, it allows use of the method without having to know the

special form of the flux correction polynomials. Finally, we underline some fundamental

differences that exist between FR and DG.

1.1 Introduction

High order numerical methods for unstructured grids have seen many developments over the

last few decades. The pioneering work of Reed and Hill [34] in the 1970s led to the original

Discontinuous Galerkin method (DG) based on a variational form of the equations. In a

series of papers, Cockburn and Shu formulated and developed the discontinuous Galerkin

method for conservation laws [2, 3, 4, 5, 6]. They also provided extensive theoretical results.

However, the computational cost of the original Discontinuous Galerkin approach

forced researchers to look at somewhat cheaper or simpler alternatives. In their book [14],

Hesthaven and Warburton give a thorough exposition of a nodal variant of the Discontinuous

Galerkin method. Kolias and Kopriva [24] introduced the staggered grid method, based on

the differential form of the equation, later renamed spectral difference (SD) and thoroughly

studied by Wang et al. and Jameson et al. [39, 40]. Other methods include the popular

spectral volume method due to Wang [41].

Recently, Huynh introduced a Flux Reconstruction (FR) framework [15, 16] with which

he was able to recover some existing schemes and formulate some new variations. Jameson

used this framework to recast the Fourier stable Spectral Difference method and to show its

energy stability in a Sobolev type norm [18] for all orders of accuracy. Vincent, Castonguay

and Jameson later extended this work to identify a class of FR schemes [38] among Huynh’s

family of schemes, which are energy stable for all orders of accuracy.

All these numerical methods may appear to be quite similar in both their formulation

and the results they provide. It therefore seems legitimate to ask what are the connections

among all the various schemes. Huynh started to answer this question by showing that the

family of FR schemes contains both the nodal DG and the SD methods. This chapter goes

further and shows how the entire class of energy stable flux reconstruction schemes identified

by Vincent et al. can be recast as a discontinuous Galerkin method for which a linear

filtering operator is applied on the residual. However, this chapter also shows that some

differences exist between the schemes and that some Flux Reconstruction methods cannot

be described as a filtered discontinuous Galerkin method. Conversely, there exist linearly

filtered discontinuous Galerkin methods that cannot be expressed in the flux reconstruction

framework.

In section 1.2, we describe the classical discontinuous Galerkin method for linear advection

and give an energy based proof of stability. We also show how appropriate filters

applied to the residual preserve energy stability. In section 1.3, we introduce a simple formulation

of the flux reconstruction method and show how one can recover a DG scheme by

using Radau polynomials for the flux correction function. We then consider the special case

of the Energy Stable Flux Reconstruction and show how it can be formulated in terms of

a filtered Discontinuous Galerkin method, hence giving a new proof of its stability. Section

1.4 highlights some fundamental differences that exist between discontinuous Galerkin and


Flux reconstruction approaches.

1.2 Filtered Discontinuous Galerkin Method for 1D linear advection

In this part, we describe a Discontinuous Galerkin (DG) method to solve the following one

dimensional linear advection equation on the domain Ω = [L,R]:

$$\frac{\partial u}{\partial t} + a\,\frac{\partial u}{\partial x} = 0, \qquad a \text{ is a constant.} \tag{1.1}$$

We then consider the effect of filtering on the stability of the method.

Discontinuous Galerkin method for linear advection

The DG method focuses on finding an approximate weak solution to equation (1.1). To do

so, the domain is decomposed into N elements

$$\Omega = \bigcup_{k=0}^{N-1} [x_k, x_{k+1}] = \bigcup_{k=0}^{N-1} \Omega_k, \qquad L = x_0 < x_1 < \dots < x_N = R,$$

on which the solution is approximated by polynomials of degree p:

$$u^h_k = \sum_{i=1}^{p+1} u^i_k\,\phi_i,$$

where $(\phi_i)$ is a basis of $\mathbb{R}_p[X]$, the space of polynomials of degree at most p with real coefficients. We

define $\mathbf{u}_k = [u^1_k \cdots u^{p+1}_k]^T$.

The DG method results from the discretization of the weak formulation of the equation.

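We require the residual $R^h = \frac{\partial u^h}{\partial t} + a\frac{\partial u^h}{\partial x}$ to be orthogonal to a set of smooth test functions.

In the case of our discretization, this set is the space of polynomials of degree at most p.

This leads to the following equations on each cell k:

$$\forall j, \quad \int_{\Omega_k} R^h\,\phi_j\,dx = 0.$$

Replacing the residual by its expression, one obtains

$$\forall j, \quad \int_{\Omega_k} \left(\frac{\partial u^h_k}{\partial t} + a\,\frac{\partial u^h_k}{\partial x}\right)\phi_j\,dx = 0.$$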

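The spatial derivative term can be integrated by parts to extract the cell boundary contributions:

$$\forall j, \quad \int_{\Omega_k} \left(\frac{\partial u^h_k}{\partial t}\,\phi_j - a\,u^h_k\,\frac{\partial \phi_j}{\partial x}\right)dx = -\Big[(a u^h_k)\,\phi_j\Big]_{x_k}^{x_{k+1}}.$$

The DG method is defined by replacing these boundary terms by interface fluxes $(au)^\star$ so

that all the cells are connected by the method:

$$\forall j, \quad \int_{\Omega_k} \left(\frac{\partial u^h_k}{\partial t}\,\phi_j - a\,u^h_k\,\frac{\partial \phi_j}{\partial x}\right)dx = -\Big[(au)^\star\,\phi_j\Big]_{x_k}^{x_{k+1}}.$$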

Note that the flux between cells k and k+1 is uniquely defined so that the method remains

conservative. This form of the equations is called “DG weak form”. One can integrate this

relation by parts one more time to obtain the “DG strong form” of the equations. This form

is still a discretization of the weak formulation of the equations; the term strong comes from

the fact that the original linear advection equation appears directly in the equation:

$$\forall j, \quad \int_{\Omega_k} \phi_j \left(\frac{\partial u^h_k}{\partial t} + a\,\frac{\partial u^h_k}{\partial x}\right)dx = \Big[\big((a u^h_k) - (au)^\star\big)\,\phi_j\Big]_{x_k}^{x_{k+1}}. \tag{1.2}$$

As defined above, $(au)^*$ is the numerical flux at cell interfaces. More precisely, $(au)^*(x_{k+1}) = (au)^*_{k,k+1}$ is the flux between cell k and k + 1.

The set of p + 1 equations given by (1.2) can be recast as a matrix system:

$$M_k \frac{d}{dt} \mathbf{u}_k + a S_k \mathbf{u}_k = \left[ \left( (a u^h_k) - (au)^* \right) \Phi \right]_{x_k}^{x_{k+1}}, \tag{1.3}$$


where Mk and Sk are the local mass matrix and stiffness matrix

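$$M^k_{ij} = \int_{\Omega_k} \phi_i \phi_j \, dx, \qquad S^k_{ij} = \int_{\Omega_k} \phi_i \frac{d\phi_j}{dx} \, dx,$$

and $\Phi \in \mathbb{R}^{p+1}$ is defined by $\Phi(x) = [\phi_1(x) \dots \phi_{p+1}(x)]^T$. With this notation, $u^h_k(x) = \mathbf{u}_k^T \Phi(x)$.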
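The definitions of the mass and stiffness matrices can be checked on a small case. The sketch below is only an illustration: it assumes the monomial basis $\phi_i = x^{i-1}$ on the reference element $[-1, 1]$ (the text leaves the basis general), and the helper names are hypothetical. It builds $M$ and $S$ with exact rational arithmetic and checks that $\mathbf{u}^T S \mathbf{u}$ equals the boundary difference of $(u^h)^2/2$, a property used in the stability argument.

```python
from fractions import Fraction


def monomial_integral(n):
    """Exact integral of x**n over [-1, 1] (zero for odd n)."""
    return Fraction(0) if n % 2 else Fraction(2, n + 1)


def mass_matrix(p):
    """M_ij = integral of phi_i * phi_j with phi_i = x**i, i = 0..p."""
    return [[monomial_integral(i + j) for j in range(p + 1)]
            for i in range(p + 1)]


def stiffness_matrix(p):
    """S_ij = integral of phi_i * d(phi_j)/dx = j * integral of x**(i + j - 1)."""
    return [[j * monomial_integral(i + j - 1) if j > 0 else Fraction(0)
             for j in range(p + 1)]
            for i in range(p + 1)]


def quadratic_form(u, matrix):
    """Return u^T * matrix * u."""
    n = len(u)
    return sum(u[i] * matrix[i][j] * u[j] for i in range(n) for j in range(n))


def evaluate(u, x):
    """Evaluate the polynomial with monomial coefficients u at x."""
    return sum(c * x ** i for i, c in enumerate(u))


p = 3
M = mass_matrix(p)
S = stiffness_matrix(p)
# Arbitrary coefficients chosen for illustration only.
u = [Fraction(1), Fraction(-2), Fraction(3, 2), Fraction(1, 3)]

lhs = quadratic_form(u, S)
rhs = (evaluate(u, 1) ** 2 - evaluate(u, -1) ** 2) / 2
print(lhs == rhs)
```

Exact fractions are used so that the identity holds without round-off; a production code would typically use floating point and an orthogonal basis instead.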

Stability of the method

Consider again the linear advection equation on the domain [L,R]. Multiplying (1.1) by u

and integrating over x gives

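$$\int_L^R u \frac{\partial u}{\partial t} \, dx = -a \int_L^R u \frac{\partial u}{\partial x} \, dx$$

and therefore,

$$\frac{d}{dt} \int_L^R \frac{u^2}{2} \, dx = \frac{1}{2} a \left( u_L^2 - u_R^2 \right), \qquad \text{with } u_L = u(L) \text{ and } u_R = u(R).$$

This energy estimate tells us that the $L^2$ norm of the exact solution u remains bounded for finite boundary values. If one assumes periodic boundary conditions, $u_L = u_R$ and $\frac{d}{dt} \int_L^R \frac{u^2}{2} \, dx = 0$: the total energy remains constant in the domain.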

We now focus on the stability of the DG method and show how it satisfies a similar

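criterion. Multiplying (1.3) on the left by $\mathbf{u}_k^T$, one obtains

$$\mathbf{u}_k^T M_k \frac{d}{dt} \mathbf{u}_k + a \, \mathbf{u}_k^T S_k \mathbf{u}_k = \left[ \left( (a u^h_k) - (au)^* \right) \mathbf{u}_k^T \Phi \right]_{x_k}^{x_{k+1}} = \left[ \left( (a u^h_k) - (au)^* \right) u^h_k \right]_{x_k}^{x_{k+1}}.$$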


Now, using the fact that

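$$\mathbf{u}^T S \mathbf{u} = \int_{x_l}^{x_r} u^h \frac{\partial u^h}{\partial x} \, dx = \left[ \frac{(u^h)^2}{2} \right]_{x_l}^{x_r}$$

and that

$$\mathbf{u}^T M \frac{d}{dt} \mathbf{u} = \frac{1}{2} \frac{d}{dt} \left( \mathbf{u}^T M \mathbf{u} \right) = \frac{1}{2} \frac{d}{dt} \|\mathbf{u}\|_M^2$$

we obtain

$$\frac{1}{2} \frac{d}{dt} \|\mathbf{u}_k\|_M^2 = \left[ a \frac{(u^h_k)^2}{2} - (au)^* u^h_k \right]_{x_k}^{x_{k+1}}, \tag{1.4}$$

where $\| \cdot \|_M$ is the norm associated with the inner product defined by $(\mathbf{u}, \mathbf{v}) \mapsto \mathbf{u}^T M \mathbf{v}$. To make things clearer, we introduce the notation

$$u_k^- = u^h_{k-1}(x_k), \qquad u_k^+ = u^h_k(x_k).$$

Equation (1.4) becomes

$$\frac{1}{2} \frac{d}{dt} \|\mathbf{u}_k\|_M^2 = \left[ \frac{1}{2} a (u_{k+1}^-)^2 - (au)^*_{k,k+1} u_{k+1}^- \right] - \left[ \frac{1}{2} a (u_k^+)^2 - (au)^*_{k-1,k} u_k^+ \right]. \tag{1.5}$$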

Now suppose the numerical flux is taken to be

$$(au)^*_{k-1,k} = \frac{1}{2} a \left( u_k^+ + u_k^- \right) - \frac{1}{2} \alpha |a| \left( u_k^+ - u_k^- \right), \qquad \alpha \in [0, 1]. \tag{1.6}$$

Then for α = 0 we recover a central flux, and for α = 1 we recover an upwind flux.
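As a small illustration of the flux family (1.6), the following sketch evaluates the interface flux and the energy contribution of a single interface, $(u^+ - u^-)\left((au)^* - \tfrac{1}{2}a(u^+ + u^-)\right)$, which should equal $-\tfrac{1}{2}\alpha|a|(u^+ - u^-)^2$. The function names and the sample values are hypothetical and chosen only for this example.

```python
def interface_flux(a, u_minus, u_plus, alpha):
    """Numerical flux (1.6): central part minus an upwind-type dissipation."""
    central = 0.5 * a * (u_plus + u_minus)
    dissipation = 0.5 * alpha * abs(a) * (u_plus - u_minus)
    return central - dissipation


def interface_energy_rate(a, u_minus, u_plus, alpha):
    """Energy contribution of one interface for the flux above."""
    jump = u_plus - u_minus
    flux = interface_flux(a, u_minus, u_plus, alpha)
    return jump * (flux - 0.5 * a * (u_plus + u_minus))


# alpha = 1 with a > 0 picks the left (upwind) state: flux = a * u_minus.
upwind_right_moving = interface_flux(2.0, 1.0, 3.0, 1.0)
# alpha = 1 with a < 0 picks the right state: flux = a * u_plus.
upwind_left_moving = interface_flux(-2.0, 1.0, 3.0, 1.0)
# alpha = 0 gives the arithmetic mean of the two states times a.
central = interface_flux(2.0, 1.0, 3.0, 0.0)

print(upwind_right_moving, upwind_left_moving, central)
print(interface_energy_rate(2.0, 1.0, 3.0, 1.0))
```

The sample values are exactly representable in floating point, so the comparisons in this sketch are exact; in general a tolerance would be needed.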

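Substituting this expression of the flux into (1.5) and summing over all the elements, we get

$$\sum_{k=0}^{N-1} \frac{1}{2} \frac{d}{dt} \|\mathbf{u}_k\|_M^2 = -\frac{1}{2} \alpha |a| \sum_{k=1}^{N-1} \left( u_k^+ - u_k^- \right)^2 - \frac{1}{2} \alpha |a| \left( u_0^+ - u_N^- \right)^2,$$

where, for simplicity, we assume periodic boundary conditions. The terms on the right-hand side are nonpositive for α ≥ 0. Therefore

$$\sum_{k=0}^{N-1} \frac{1}{2} \frac{d}{dt} \|\mathbf{u}_k\|_M^2 \leq 0. \tag{1.7}$$

Since $\sum_k \|\mathbf{u}_k\|_M^2$ is a nonnegative quantity that cannot increase in time, it remains bounded. This concludes the stability proof of the method. Here,

$$\|\mathbf{u}_k\|_M^2 = \int_{x_k}^{x_{k+1}} (u^h_k)^2 \, dx$$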

and therefore, the meaning of equation (1.7) is that the energy of the numerical solution

can only decrease in time. We can summarize these results in a more concise manner. The

DG method described by equation (1.3) can be written

$$M \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG}, \tag{1.8}$$

where RHSDG depends only on the choice of the numerical flux. The index k was dropped

for convenience. We showed that the method was stable if M was any symmetric positive

definite matrix (hence defining an inner product on Rn and an associated weighted norm)

and if the numerical fluxes were chosen according to equation (1.6). In the rest of this

document, we always assume the latter to be satisfied.

Stability of the filtered DG method

In an actual implementation of the method, the DG semi-discrete equations (1.8) are written

$$\frac{d\mathbf{u}}{dt} = M^{-1} \left( -a S \mathbf{u} + \mathrm{RHS}_{DG} \right) = R_{DG}(\mathbf{u}), \quad \text{the DG residual,}$$

and are then marched in time. Various explicit and implicit techniques can then be used.

Often, although the method is shown to be stable for linear equations, wiggles tend to


appear when we solve nonlinear systems of equations, such as the Burgers equation or the

Euler equations. In particular, when the solution contains discontinuities, large spurious

oscillations (Gibbs phenomenon) can be observed. One classical method to remedy this

problem is to introduce filters applied to the residual. Their goal is to damp the highest

modes and limit the Gibbs phenomenon (at this time, the notion of modes remains unde-

fined). From the previous section, it is extremely easy to show how a large class of linear

filters preserve stability (or enhance it in some sense) in the case of linear equations. A

linear filter F applied to the DG residual will lead to the equation

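$$\frac{d\mathbf{u}}{dt} = F \cdot R_{DG}(\mathbf{u}) = F \cdot M^{-1} \left( -a S \mathbf{u} + \mathrm{RHS}_{DG} \right),$$

which is equivalent to solving

$$M \cdot F^{-1} \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG} \quad \Leftrightarrow \quad \tilde{M} \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG},$$

where $\tilde{M} = M \cdot F^{-1}$ is a modified mass matrix. If $\tilde{M}$ is symmetric positive definite, then we showed in section 1.2 that the resulting scheme would be stable in the norm associated with $\tilde{M}$ for linear advection. (The proof is the same with $M \leftarrow \tilde{M}$.)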

Without loss of generality, the element Ωk = [xk, xk+1] can be mapped to a reference

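element [−1, 1]. In this reference element, the mass matrix is a representation of the bilinear form $(u, v) \mapsto \int_{-1}^{1} uv \, dx$ on a basis $\mathcal{B} = \{\phi_1, \phi_2, \dots, \phi_{p+1}\}$ of $\mathbb{R}_p[X]$:

$$M_{ij} = \int_{-1}^{1} \phi_i \phi_j \, dx.$$

In particular, $M = I$, the identity matrix, if $\mathcal{B} = \mathcal{P} = \{P^1, P^2, \dots, P^{p+1}\}$, the normalized Legendre polynomial basis. Denote by $V_{\mathcal{B},\mathcal{P}} = V$ the transformation matrix from a general and unspecified basis $\mathcal{B}$ to the basis $\mathcal{P}$. Evidently, $M_{\mathcal{B}} = V^T \cdot I \cdot V = V^T V$. Also, $F_{\mathcal{B}} = V^{-1} \cdot F_{\mathcal{P}} \cdot V$, where $F_{\mathcal{P}}$ is the expression of the filter in the normalized modal basis $\mathcal{P}$. It follows that the modified mass matrix $\tilde{M}_{\mathcal{B}}$ takes the form

$$\tilde{M}_{\mathcal{B}} = V^T V V^{-1} F_{\mathcal{P}}^{-1} V = V^T F_{\mathcal{P}}^{-1} V.$$

$\tilde{M}_{\mathcal{B}}$ is symmetric positive definite if and only if $F_{\mathcal{P}}$ is symmetric positive definite as well, leading to a scheme stable for linear advection.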

There are many filters satisfying this property. For example, a classical choice is the

exponential filter [14] defined by

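$$F_{\mathcal{P}} = \begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_{p+1} \end{pmatrix}, \qquad \sigma_i = \exp\left( -\alpha \left( \frac{i-1}{p} \right)^s \right),$$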

where α and s are free parameters. The idea is to force the coefficients of the residual to decay in a way similar to what is observed for expansions of smooth functions (high modes have smaller coefficients). Here, all the $\sigma_i$ lie in (0, 1] for α ≥ 0 and the energy proof of stability is intuitive. Things can be less intuitive when considering a general positive definite filter $F_{\mathcal{P}}$. Also, in that case, the concept of filtering is not so clear, as various modes can be coupled.
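The exponential filter can be sketched in a few lines. The code below is a minimal illustration, not the implementation used for the results in this chapter: the function names are hypothetical, and the parameter values α = 36 and s = 8 are assumptions picked only to produce a visible decay.

```python
import math


def exponential_filter(p, alpha, s):
    """Diagonal entries sigma_1..sigma_{p+1} of the modal exponential filter."""
    return [math.exp(-alpha * ((i - 1) / p) ** s) for i in range(1, p + 2)]


def apply_modal_filter(sigma, modal_residual):
    """Scale each modal coefficient of the residual by its filter entry."""
    return [s_i * r_i for s_i, r_i in zip(sigma, modal_residual)]


p = 4
sigma = exponential_filter(p, alpha=36.0, s=8)
# Hypothetical modal residual coefficients, lowest mode first.
residual = [1.0, 0.5, 0.25, 0.125, 0.0625]
filtered = apply_modal_filter(sigma, residual)

for mode, (s_i, r_i) in enumerate(zip(sigma, filtered)):
    print(mode, s_i, r_i)
```

Because the filter is diagonal with positive entries in the normalized modal basis, it is symmetric positive definite there, which is the condition required by the stability argument above.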

1.3 Energy Stable Flux Reconstruction scheme as a filtered

DG method

The Flux Reconstruction method

The formulation given here of the Flux Reconstruction method closely follows the one given

by Huynh [15]. For the linear advection equation $\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0$, the FR method can be described as follows. We consider an element $\Omega_k$ mapped to [−1, 1]. The solution $u^h_k$ can once again be expanded in a polynomial basis. The flux is taken to be $f^h_k = f^D_k + f^C_k$, where

$$f^D_k(x) = a u^h_k(x),$$

$$f^C_k(x) = \left[ f^*_{k-1,k} - f^D_k(-1) \right] g_L(x) + \left[ f^*_{k,k+1} - f^D_k(1) \right] g_R(x) = f^{CL} \cdot g_L(x) + f^{CR} \cdot g_R(x).$$

“D” stands for discontinuous, “C” stands for correction, and $g_L$ and $g_R$ are flux correction functions. They are chosen to approximate zero in some sense and satisfy

$$g_L(-1) = 1, \quad g_L(1) = 0, \qquad g_R(-1) = 0, \quad g_R(1) = 1.$$

It follows that $f^h_k$ is continuous on Ω and for all k

$$f^h_k(x_k) = f^h_{k-1}(x_k) = f^*_{k-1,k}.$$

Its derivative with respect to x on $\Omega_k$ is

$$\frac{d f^h_k}{dx} = a \frac{d u^h_k}{dx} + f^{CL} \frac{d g_L}{dx} + f^{CR} \frac{d g_R}{dx}.$$

We now specify $g_L$ and $g_R$ more precisely by assuming they are in $\mathbb{R}_{p+1}[X]$, the space of polynomials with real coefficients of degree at most p + 1. As a consequence, $\frac{d g_L}{dx}$ is a polynomial of degree at most p and it can be represented in the same basis as the solution $u^h$ by the vector $\mathbf{g}'_L$. The same can be said about $\frac{d g_R}{dx}$.

We are now in a position to give an explicit vectorial formulation of the FR method:

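$$\frac{d\mathbf{u}_k}{dt} + a D_k \mathbf{u}_k + f^{CL} \cdot \mathbf{g}'_L + f^{CR} \cdot \mathbf{g}'_R = 0, \tag{1.9}$$

where D is the differentiation matrix defined by $\frac{du}{dx} = D\mathbf{u}$ (using very informal notations). Now, multiplying by the mass matrix introduced earlier one obtains

$$M_k \frac{d\mathbf{u}_k}{dt} + a S_k \mathbf{u}_k = -f^{CL} \cdot M_k \mathbf{g}'_L - f^{CR} \cdot M_k \mathbf{g}'_R. \tag{1.10}$$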


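If g is a polynomial of degree at most p + 1, and $\mathbf{g}'$ is the vector representation of its derivative $g'$ with respect to x in the basis $\phi_1, \phi_2, \dots, \phi_{p+1}$, then

$$M \cdot \mathbf{g}' = \int_{-1}^{1} g' \, \Phi \, dx = \left[ g \Phi \right]_{-1}^{1} - \int_{-1}^{1} g \, \Phi' \, dx.$$

Again $\Phi = [\phi_1 \ \phi_2 \ \cdots \ \phi_{p+1}]^T$ and $\Phi' = \left[ \frac{d\phi_1}{dx} \ \frac{d\phi_2}{dx} \ \cdots \ \frac{d\phi_{p+1}}{dx} \right]^T$. Equation (1.10) becomes

$$M_k \frac{d\mathbf{u}_k}{dt} + a S_k \mathbf{u}_k = f^{CL} \cdot \Phi(-1) - f^{CR} \cdot \Phi(1) + \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx$$

$$= \left[ \left( a u^h_k - (au)^* \right) \Phi \right]_{-1}^{1} + \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx$$

and therefore

$$M \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG} + \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx. \tag{1.11}$$

This equation is extremely interesting and should be compared with equation (1.8). It tells us that when considering the flux reconstruction method with polynomial correction functions of degree at most p + 1, one recovers the DG method plus an extra term $\int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx$. As pointed out by Huynh [15], we recover exactly the DG method if we define $g_R$ and $g_L$ using Radau polynomials, so that the extra term vanishes.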

Energy Stable Flux Reconstruction as a Filtered DG method

We now consider the Energy Stable Flux Reconstruction approach introduced by Vincent, Castonguay and Jameson [38]. In this section we derive a new formulation of the method based on Jameson’s proof of stability of the Spectral Difference method [18] and show how it can be interpreted as a filtered DG scheme. Suppose there exists a symmetric matrix K such that $K \cdot D = 0$ (we will show later how such a matrix can be easily found). By multiplying (1.9) by K one obtains

$$K \frac{d\mathbf{u}}{dt} = -f^{CL} \cdot K \mathbf{g}'_L - f^{CR} \cdot K \mathbf{g}'_R. \tag{1.12}$$

Adding this new relation to equation (1.11) yields

$$(M + K) \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG} + f^{CL} \cdot \left[ \int_{-1}^{1} g_L \Phi' \, dx - K \mathbf{g}'_L \right] + f^{CR} \cdot \left[ \int_{-1}^{1} g_R \Phi' \, dx - K \mathbf{g}'_R \right].$$

The FR method proposed by Vincent et al. aims to find $g_L$ and $g_R$ such that the two last terms in square brackets vanish. For this particular choice of $g_L$ and $g_R$, the flux reconstruction method is therefore completely equivalent to solving

$$(M + K) \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG} \tag{1.13}$$

as long as (M + K) is invertible. The first observation here is that if one chooses to solve (1.13) instead of (1.9), the explicit forms of $g_L$ and $g_R$ need not be given. The second observation is that their Flux Reconstruction scheme takes the exact form of a Discontinuous Galerkin method with modified mass matrix $\tilde{M} = M + K$:

$$\tilde{M} \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG}$$

$$\Leftrightarrow \quad M \cdot (I + M^{-1} K) \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG}$$

$$\Leftrightarrow \quad M \cdot F^{-1} \frac{d\mathbf{u}}{dt} + a S \mathbf{u} = \mathrm{RHS}_{DG}$$

or equivalently

$$\frac{d\mathbf{u}}{dt} = F \cdot R_{DG}(\mathbf{u}),$$

where $F^{-1} = (I + M^{-1} K)$. Once again, F can be interpreted as a linear filtering operator applied on the DG residual, hence proving the stability of the method.

Let us now be more specific about the method introduced by Vincent et al. It is evident that if D is the differentiation operator for polynomials of degree at most p, then $D^k$ is the kth derivative operator for these polynomials. In particular, we know that D is nilpotent and $D^{p+1} = 0$. Since we want K to be symmetric and such that $KD = 0$, the choice $K \equiv c \, (D^p)^T D^p$ appears immediately, where c is a real scaling coefficient. Their work was then to find $g_R$ such that

$$\int_{-1}^{1} g_R \Phi' \, dx = c \, (D^p)^T D^p \, \mathbf{g}'_R$$

and to define $g_L$ by symmetry. Various choices of c lead to many known schemes (DG, Spectral Differences, Huynh’s $g_2$ flux reconstruction, ...). As mentioned above, these schemes can be recast in the DG framework as filtering operators applied to the residual without obtaining an explicit expression for $g_L$ and $g_R$. Here, the filter takes the form

$$F = \left( I + c M^{-1} (D^p)^T D^p \right)^{-1}.$$

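It is now possible to derive an explicit expression of F in the classical Legendre polynomial basis $\mathcal{P} = \{P_0, P_1, \cdots, P_p\}$ (actually we could derive the expression of the filter in any basis, but the values of c would then have to be rescaled to match the ones computed by Vincent). The leading coefficient of the pth Legendre polynomial is given by:

$$P_p(x) = c_p x^p + c_{p-1} x^{p-1} + \dots + c_0 = \frac{1}{2^p} \frac{(2p)!}{(p!)^2} x^p + \dots$$

Therefore,

$$D^p = \begin{pmatrix} 0 & \cdots & p! \, c_p \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} \quad \text{and} \quad c \, (D^p)^T \cdot D^p = \begin{pmatrix} 0 & & \\ & \ddots & \\ & & c \, (p! \, c_p)^2 \end{pmatrix}.$$

Also,

$$\int_{-1}^{1} P_i^2 \, dx = \frac{2}{2i + 1} \quad \text{leading to} \quad M^{-1} = \begin{pmatrix} \frac{1}{2} & & & \\ & \frac{3}{2} & & \\ & & \ddots & \\ & & & \frac{2p+1}{2} \end{pmatrix}.$$

Hence,

$$I + c M^{-1} (D^p)^T D^p = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & 1 + c \, \frac{2p+1}{2} (p! \, c_p)^2 \end{pmatrix}$$

and finally

$$F = \left( I + c M^{-1} (D^p)^T D^p \right)^{-1} = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & \dfrac{1}{1 + c \, \frac{2p+1}{2} (p! \, c_p)^2} \end{pmatrix}.$$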

The filter can then be transformed to the computational basis $F_{\mathcal{B}} = V_{\mathcal{B},\mathcal{P}}^{-1} \cdot F \cdot V_{\mathcal{B},\mathcal{P}}$. As pointed out in section 1.2, the resulting scheme is stable provided that F is symmetric positive definite. This is the case if

$$1 + c \, \frac{2p + 1}{2} (p! \, c_p)^2 > 0 \quad \Leftrightarrow \quad c > c_- = \frac{-2}{(2p + 1) \cdot (p! \, c_p)^2}.$$

If $c_- < c < 0$ the effect of the filter is to amplify the highest mode of the residual. If c = 0 the filter reduces to the identity matrix and we recover the unfiltered DG method. Finally, if c > 0 the action of the filter is to damp the highest mode of the residual. Vincent et al. identified a few values of c that recover some interesting schemes.

Discontinuous Galerkin - $c_{DG} = 0$

In this case, the filter reduces to the identity matrix and has no action on the residual. Therefore, the DG method remains unchanged.

Spectral Difference - $c_{SD} = \dfrac{2p}{(2p+1)(p+1)(p! \, c_p)^2}$

Here we recover the stable SD scheme identified by Huynh [15]. In the non-normalized Legendre basis, and for this particular value of c, the filter takes the form

$$F = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & \frac{p+1}{2p+1} \end{pmatrix}.$$

Huynh’s $g_2$ Scheme - $c_{HU} = \dfrac{2(p+1)}{(2p+1) \, p \, (p! \, c_p)^2}$

This time, we recover the $g_2$ scheme introduced by Huynh in his original paper on the Flux Reconstruction method and found to be particularly stable [15]. Again we can give the explicit form of the filter in the non-normalized Legendre basis

$$F = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & \frac{p}{2p+1} \end{pmatrix}.$$

Special case - $c_\infty \to \infty$

This time, the largest mode of the residual is completely annihilated by the filter. It takes the form

$$F = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & 0 \end{pmatrix}.$$

Large losses in accuracy are expected for this particular scheme.

Note that the last term of the filter decreases as c increases. Therefore, the larger c, the more dissipative the scheme.
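The closed-form filter entries quoted for the SD and $g_2$ schemes can be checked directly from the expressions for $c_-$, $c_{SD}$ and $c_{HU}$. The sketch below uses exact rational arithmetic; the function names are hypothetical and only serve this illustration.

```python
from fractions import Fraction
from math import factorial


def legendre_leading_coefficient(p):
    """Leading coefficient c_p = (2p)! / (2^p (p!)^2) of the Legendre polynomial P_p."""
    return Fraction(factorial(2 * p), 2 ** p * factorial(p) ** 2)


def scale(p):
    """The recurring factor (2p + 1) (p! c_p)^2."""
    return (2 * p + 1) * (factorial(p) * legendre_leading_coefficient(p)) ** 2


def c_minus(p):
    return Fraction(-2) / scale(p)


def c_sd(p):
    return Fraction(2 * p) / (scale(p) * (p + 1))


def c_hu(p):
    return Fraction(2 * (p + 1)) / (scale(p) * p)


def filter_last_entry(p, c):
    """Last diagonal entry of F = (I + c M^{-1} (D^p)^T D^p)^{-1} in the Legendre basis."""
    return 1 / (1 + c * scale(p) / 2)


for p in range(1, 6):
    print(p,
          filter_last_entry(p, 0),
          filter_last_entry(p, c_sd(p)),
          filter_last_entry(p, c_hu(p)),
          filter_last_entry(p, c_minus(p) / 2))
```

For each p the last three columns should read (p+1)/(2p+1), p/(2p+1) and 2, the latter showing that $c = c_-/2$ amplifies the highest mode by a factor of two.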

Numerical Examples

We now consider the linear advection of a Gaussian bump and verify we recover Vincent’s

results. The domain of interest Ω = [−1, 1] is decomposed into 10 elements of equal length.

The advection speed is a = 1. The initial condition is given by $u(x, 0) = e^{-20x^2}$. Periodic

boundary conditions are applied at both ends of the domain. For the DG implementation,

we consider the case p = 3 and the collocation points are taken to be the Gauss-Lobatto

points. Time integration is done explicitly via a third order Runge Kutta scheme[13].

Results are presented at t = 20.


Upwind Flux

Here, the numerical flux defined in (1.6) is considered for α = 1. We therefore recover a

fully upwind flux

$$(au)^*_{k-1,k} = \frac{1}{2} a \left( u_k^+ + u_k^- \right) - \frac{1}{2} |a| \left( u_k^+ - u_k^- \right).$$

Figure 1.1: Reference Solution - DG c = 0 (axes: t, u)

Figure 1.1 is a plot of the solution at t = 20 for c = 0 (unfiltered Discontinuous Galerkin).

Figure 1.2 shows the solution for 4 interesting values of c. $c = c_-/2$ is a value close to the stability limit found above. $c = c_{HU}$ and $c = c_{SD}$ lead respectively to the recovery of Huynh’s $g_2$ flux reconstruction scheme and to the stable Spectral Difference scheme. Finally, $c \to \infty$ is a particular case where the last mode of the residual is completely canceled.

All the results obtained by filtering the DG residual match exactly the ones obtained by

Vincent using the Flux Reconstruction approach, hence confirming the preceding theoretical

results.

Central Flux

Now, the flux defined in (1.6) is taken with α = 0, leading to a central flux

$$(au)^*_{k-1,k} = \frac{1}{2} a \left( u_k^+ + u_k^- \right).$$

Results are presented in Figure 1.3. Once again, our plots match exactly those obtained by

Vincent.


Figure 1.2: Plot of the solution at t = 20 for various values of c for an upwind flux. Panels: (a) $c = c_-/2$, (b) $c \to \infty$, (c) $c = c_{SD}$, (d) $c = c_{HU}$ (axes: t, u)

1.4 Further analysis of the schemes

The stable method proposed by Vincent, Castonguay and Jameson recovers many of the flux

reconstruction schemes introduced by Huynh. We just showed how it is included in a larger

class of stable schemes: the Filtered DG schemes. Thus, two questions arise naturally:

Can all the flux reconstruction schemes be expressed in the form of a filtered DG?

Can any linearly filtered DG scheme be transformed into flux reconstruction form

(i.e., for a given filter, can we always find gL and gR such that the flux reconstruction

method and the filtered DG are equivalent)?

The goal of this section is to give a formal answer to these questions.

Proposition 1. There exist FR schemes that cannot be expressed as filtered DG schemes

(linearly or nonlinearly).


Figure 1.3: Plot of the solution at t = 20 for various values of c for a central flux. Panels: (a) $c = c_-/2$, (b) $c = 0$, (c) $c = c_{SD}$, (d) $c = c_{HU}$ (axes: t, u)

Proof.

The DG method can be expressed as

$$\frac{d\mathbf{u}}{dt} = R_{DG}(\mathbf{u}) \tag{1.14}$$

while the FR method can be written in compact form:

$$\frac{d\mathbf{u}}{dt} = R_{FR}(\mathbf{u}). \tag{1.15}$$

We say the FR method is a filtered DG method if there exists a linear or nonlinear operator

F (a filter) such that

i- F(0) = 0


ii- F is independent of u

iii- For any u, RFR(u) = F (RDG(u))

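We show in Appendix A that the DG method admits spurious non-constant steady solutions $u^h$ to the linear advection equation such that $f^{CR} = (-1)^p f^{CL} \neq 0$. For this particular non-constant solution, the DG residual is zero ($R_{DG}(\mathbf{u}) = 0$) although this is a non-physical result (resulting from the odd/even decoupling phenomenon when using a central flux). If a FR method is obtained by filtering the DG residual, then for this particular solution, we should have $R_{FR}(\mathbf{u}) = 0$. We know that

$$R_{FR}(\mathbf{u}) = R_{DG}(\mathbf{u}) + M^{-1} \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx = 0 + M^{-1} \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx.$$

Therefore,

$$R_{FR}(\mathbf{u}) = 0 \quad \Leftrightarrow \quad M^{-1} \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx = 0$$

$$\Leftrightarrow \quad \forall i, \ \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \phi'_i \, dx = 0$$

$$\Leftrightarrow \quad \forall i, \ \cancel{f^{CL}} \int_{-1}^{1} \left( g_L + (-1)^p g_R \right) \phi'_i \, dx = 0 \qquad (f^{CL} \neq 0).$$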

Consider the case p = 2 where $\phi_i = P_i$ is the Legendre polynomial basis. Now suppose

$$g_L = \frac{1}{8} (1 - x)^3, \qquad g_R = \frac{1}{8} (1 + x)^3.$$

These correction functions would be the $g_3$ functions for K = 3 in Huynh’s paper [15]. They satisfy

$$g_L(-1) = 1, \quad g_L(1) = 0, \qquad g_R(-1) = 0, \quad g_R(1) = 1$$

but for i = 1, $P'_1 = 1$ and

$$\int_{-1}^{1} (g_L + g_R) \, dx = 1 \neq 0 \quad \Rightarrow \quad R_{FR}(\mathbf{u}) \neq 0.$$

We found a Flux Reconstruction method such that there exists a solution u for which $R_{DG}(\mathbf{u}) = 0$ but $R_{FR}(\mathbf{u}) \neq 0$. We have therefore exhibited a particular FR method that cannot be expressed as a filtered DG method. □
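The counterexample can be verified by direct integration. The sketch below is illustrative only (helper names are hypothetical): it stores polynomials as lists of monomial coefficients, forms $g_L + (-1)^p g_R$ for p = 2, and integrates it against the derivatives of the Legendre polynomials $P_0 = 1$, $P_1 = x$, $P_2 = (3x^2 - 1)/2$ over [−1, 1].

```python
from fractions import Fraction


def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def integrate(a):
    """Exact integral over [-1, 1]; odd powers integrate to zero."""
    return sum(c * Fraction(2, n + 1) for n, c in enumerate(a) if n % 2 == 0)


def evaluate(a, x):
    return sum(c * x ** n for n, c in enumerate(a))


p = 2
# g_L = (1 - x)^3 / 8 and g_R = (1 + x)^3 / 8, expanded in monomials.
g_left = [Fraction(1, 8), Fraction(-3, 8), Fraction(3, 8), Fraction(-1, 8)]
g_right = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
# Derivatives of P_0, P_1, P_2: 0, 1, 3x.
legendre_derivatives = [[Fraction(0)], [Fraction(1)], [Fraction(0), Fraction(3)]]

combined = poly_add(g_left, [(-1) ** p * c for c in g_right])
extra_term = [integrate(poly_mul(combined, d)) for d in legendre_derivatives]
print(extra_term)
```

The middle entry, associated with $P'_1 = 1$, is the nonzero integral quoted in the proof; the other two vanish because $P'_0 = 0$ and because $(g_L + g_R) \cdot 3x$ is odd.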

Proposition 2. There exist filtered DG schemes that cannot be recovered by a flux recon-

struction approach. Also, while all FR schemes are conservative, some filtered DG schemes

are not.

Proof.

The flux reconstruction and discontinuous Galerkin residuals are related by

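$$R_{FR}(\mathbf{u}) = R_{DG}(\mathbf{u}) + M^{-1} \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx.$$

Therefore, FR modifies the DG residual by the addition of an extra term. However, this extra term cannot affect the lowest mode of the residual. Indeed, consider again $\mathcal{P} = \{P_i\}$, the Legendre polynomial basis:

$$P_0 = 1, \qquad P'_0 = 0.$$

It immediately follows that

$$\int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) P'_0 \, dx = 0.$$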


The conclusion of this proof is then straightforward. Let F be a linear filter. F can be

decomposed as F = I + G, where I is the identity matrix. Then,

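$$F \cdot R_{DG}(\mathbf{u}) = R_{FR}(\mathbf{u}) \quad \Leftrightarrow \quad G \cdot R_{DG}(\mathbf{u}) = M^{-1} \int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx.$$

Take G diagonal with $G_{11} \neq 0$. Then

$$M \cdot G \cdot R_{DG}(\mathbf{u}) = \begin{pmatrix} \times \\ \times \\ \vdots \\ \times \end{pmatrix}, \qquad \times \text{ can be a nonzero entry,}$$

but

$$\int_{-1}^{1} \left( f^{CL} \cdot g_L + f^{CR} \cdot g_R \right) \Phi' \, dx = \begin{pmatrix} 0 \\ \times \\ \vdots \\ \times \end{pmatrix},$$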

which shows that the above equality cannot be satisfied all the time by any flux recon-

struction method. We should mention that it is extremely easy to find u such that the

×’s are strictly nonzero. The fact that FR cannot alter the lowest mode means that the

average value in a cell is always the one obtained by a regular DG scheme, hence proving

conservation. The above example would imply that conservation is no longer respected for

that particular filtered DG. Of course, choosing such a DG filter that modifies the lowest

mode would be a bad choice and it would not be done in practice. A more fundamental

question that has yet to be answered is “can any conservative linearly filtered DG scheme be transformed into flux reconstruction form?” □

We therefore answered the two questions posed at the beginning of this section. Not all

filtered DG methods can be expressed in a flux reconstruction framework, and reciprocally,

not all flux reconstruction schemes can be cast as a filtered DG method.


1.5 Conclusions

In this chapter, connections between the filtered discontinuous Galerkin method, the Flux

Reconstruction method, and the Energy Stable Flux Reconstruction method have been

established and help understand the working mechanisms of the various methods. We

showed how a large class of filtered DG methods are energy stable. We also gave a new

derivation of ESFR that led to its formulation in terms of a filtered DG method, giving a new

and elegant proof of its energy stability property. Finally, we highlighted differences between

the flux reconstruction and the filtered DG methods. In particular, we demonstrated that

neither method is a subset of the other. However, we showed that their intersection is not

empty because the ESFR scheme is both a filtered DG method and a FR method. This

study can easily be extended to simplex elements and we refer the reader to the work of

Castonguay et al. for a detailed discussion on the ESFR method on triangles [1]. Wang and

Huynh also gave interesting extensions of the FR method to triangles [11, 17].

Diagram: the set of linearly filtered DG schemes and the set of FR schemes, with ESFR in their intersection.

Chapter 2

Kinetic Energy Conserving Scheme

for Discontinuous Galerkin method

for the Euler’s Equations

The purpose of this chapter is to present a proof that generalizes Jameson’s Kinetic Energy Preserving Finite Volume scheme (FV-KEP) to higher order Discontinuous Galerkin methods.

The successful simulation of complex unsteady vortical flows requires the use of numer-

ical methods that are stable yet not too dissipative. The kinetic energy conserving scheme

introduced by Jameson [18, 19] in 2007 for the finite volume method seems extremely

promising at solving these kinds of problems. In this chapter we present an extension of

this work to high order discontinuous Galerkin methods and show how an identical flux

leads to a method that is almost kinetic energy conserving. We also give the derivation of

the KEPα scheme where the addition of an extra term to the flux leads to a ‘Kinetic Energy

Decreasing’ scheme controlled by a single parameter α.

In the first part, the discontinuous Galerkin method for 1D conservation equations is briefly described. In the second part, we rederive an energy-based proof of stability for the linear advection equation (see chapter 1) and show how stability and conservation of kinetic energy are directly linked to the choice of the numerical flux. In the third part, the kinetic energy conserving flux is derived for the Euler equations and numerical experiments are conducted to test and better understand the method. We also discuss the special case of finite volumes.


2.1 Discontinuous Galerkin Method for 1D conservation equations

We consider the Discontinuous Galerkin (DG) method to solve the equation $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = 0$ on the domain Ω = [L,R]. Ω is decomposed into N elements

$$\Omega = \bigcup_{k=0}^{N-1} [x_k, x_{k+1}] = \bigcup_{k=0}^{N-1} \Omega_k, \qquad \Omega_k = \, ]x_k, x_{k+1}[, \qquad L = x_0 < x_1 < \dots < x_N = R.$$

The function space considered is the set of piecewise $H^1$ functions:

$$V^h(\Omega) = \left\{ \phi : [L,R] \to \mathbb{R} \ / \ \forall k, \ \phi|_{\Omega_k} \in V^h_k = H^1(\Omega_k) \right\}.$$

The functions in $V^h(\Omega)$ are defined almost everywhere and for convenience, we introduce $\phi_k^- = \lim_{x \to x_k^-} \phi(x)$ and $\phi_k^+ = \lim_{x \to x_k^+} \phi(x)$. Also, if $\phi^h \in V^h$, we denote by $\phi^h_k$ the restriction/extension of $\phi^h$ to $\Omega_k$. Therefore, $\phi^h_k$ is in $H^1(\Omega_k)$ and we take $\phi^h_k(x_k) = \phi_k^+$ and $\phi^h_k(x_{k+1}) = \phi_{k+1}^-$. In what follows and to lighten the notations, we will drop the superscript h to denote the elements of $V^h$.

DG formulation

The conservation problem we are trying to solve, associated with a set of appropriate initial

and boundary conditions can be expressed in DG weak form as

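Find $u \in V^h$ s.t. $\forall \phi \in V^h$

$$\forall k, \quad \int_{\Omega_k} \left( \frac{\partial u_k}{\partial t} \phi_k - f(u_k) \frac{\partial \phi_k}{\partial x} \right) dx = -f^*_{k,k+1} \cdot \phi_k(x_{k+1}) + f^*_{k-1,k} \cdot \phi_k(x_k),$$

where $f^*_{k,k+1}$ is the numerical flux between cell k and cell k + 1. Usually an upwind type flux is used and leads to a stable method. One can integrate this equation by parts one more time to obtain the DG strong form

Find $u \in V^h$ s.t. $\forall \phi \in V^h$

$$\forall k, \quad \int_{\Omega_k} \left( \frac{\partial u_k}{\partial t} \phi_k + \frac{\partial f(u_k)}{\partial x} \phi_k \right) dx = \left[ f(u_k(x_{k+1})) - f^*_{k,k+1} \right] \cdot \phi_k(x_{k+1}) - \left[ f(u_k(x_k)) - f^*_{k-1,k} \right] \cdot \phi_k(x_k).$$

The weak and strong forms are mathematically equivalent but not numerically equivalent, because the integrations are not necessarily performed exactly, as described in the following.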

Polynomial Approximation and Aliasing Errors

Of course, the above formulation is not really helpful to numerically solve the equation and we need to introduce a finite-dimensional approximation of the space $V^h$. To do so, we consider the space of piecewise polynomials of degree p, namely $V^h_p$:

$$V^h_p(\Omega) = \left\{ \phi : [L,R] \to \mathbb{R} \ / \ \forall k, \ \phi|_{\Omega_k} \in V^h_k = \mathbb{R}_p[X] \right\}.$$

If $u \in V^h_p(\Omega)$ is the approximation of the solution on Ω, $u_k$ is the restriction of the approximation on $\Omega_k$. Therefore, $u_k$ is a polynomial of degree p and can be represented in a Lagrange polynomial basis $\{\ell^i_k\}_{i \in [0,p]}$ defined by the points $\{x^i_k\}_{i \in [0,p]} \in (\Omega_k)^{p+1}$:

$$\forall x \in \Omega_k, \quad u_k(x) = \sum_{i=0}^{p} u_k(x^i_k) \, \ell^i_k(x).$$

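Figure 2.1: Mesh nomenclature in 1D (elements $\Omega_{k-1}$ and $\Omega_k$ delimited by $x_{k-1}$, $x_k$, $x_{k+1}$; element width h; interior points $x^0_k, x^1_k, \dots$)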

Following the notations introduced by Hesthaven and Warburton [14], we will identify $u_k : x \mapsto u_k(x)$ and $\mathbf{u}_k = [u^0_k \ u^1_k \ \cdots \ u^p_k]^T$. For clarity, we introduced the notation $u^i_k = u_k(x^i_k)$. We can now formulate the approximate problem in $V^h_p$:

Find $u \in V^h_p$ s.t. $\forall i$ and $\forall k$

$$\int_{\Omega_k} \left( \frac{\partial u_k}{\partial t} \ell^i_k + \frac{\partial f(u_k)}{\partial x} \ell^i_k \right) dx = \left[ f(u_k(x_{k+1})) - (f(u))^*_{k,k+1} \right] \cdot \ell^i_k(x_{k+1}) - \left[ f(u_k(x_k)) - (f(u))^*_{k-1,k} \right] \cdot \ell^i_k(x_k).$$

On each element k, we obtain p + 1 equations (one per i) allowing us to solve for the p + 1 unknowns $u^i_k$.

Some attention needs to be given to the way integrations are performed. It is evident that $\int \ell^i_k \ell^j_k \, dx$ and $\int \ell^i_k \frac{\partial \ell^j_k}{\partial x} \, dx$ can be performed exactly. However, the point is not so clear when considering $\int \ell^i_k \frac{\partial f(u_k)}{\partial x} \, dx$, as $f(u_k)$ is not necessarily a polynomial of degree p. One way to overcome this difficulty is to consider $f(u)_k$, the degree p polynomial approximation of $f(u_k)$ defined as

$$f(u)_k = \sum_{i=0}^{p} f(u^i_k) \, \ell^i_k \quad \Leftrightarrow \quad \mathbf{f}(u)_k = [f(u^0_k) \ f(u^1_k) \ \cdots \ f(u^p_k)]^T.$$

Now the integration can be performed exactly on the approximation of f. Another strategy would be to transform the term in the integral, $\frac{\partial f(u)}{\partial x} \ell = \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} \ell$, and use more than (p + 1)/2 quadrature points to get a better approximation of the integral. However, this strategy would be extremely expensive as a lot of new function evaluations would have to be done.

Here is an example illustrating the difference between f(uk) and f(u)k in the case p = 2:

$$\Omega_k = [0, 2], \qquad x^0_k = 0, \quad x^1_k = 1, \quad x^2_k = 2,$$

$$f : x \mapsto x^2, \qquad u_k = x^2, \qquad f(u_k) = x^4, \qquad f(u)_k = 7x^2 - 6x.$$

Figure: $f(u_k)$ (“Exact”) and $f(u)_k$ (“Approximation”) plotted over [0, 2].
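The example above can be reproduced with a short Lagrange interpolation routine. The sketch below is a minimal illustration with hypothetical helper names: it interpolates the nodal values of $f(u_k) = x^4$ at the points 0, 1, 2 and returns the monomial coefficients of the degree 2 interpolant, which should match $7x^2 - 6x$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def lagrange_coefficients(nodes, values):
    """Monomial coefficients of the polynomial interpolating (nodes, values)."""
    n = len(nodes)
    result = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]
        denominator = Fraction(1)
        for j in range(n):
            if j != i:
                # Multiply the running product by (x - x_j).
                basis = poly_mul(basis, [Fraction(-nodes[j]), Fraction(1)])
                denominator *= nodes[i] - nodes[j]
        for degree, coefficient in enumerate(basis):
            result[degree] += values[i] * coefficient / denominator
    return result


def evaluate(a, x):
    return sum(c * x ** n for n, c in enumerate(a))


nodes = [0, 1, 2]
u_nodal = [x ** 2 for x in nodes]       # u_k = x^2 sampled at the nodes
f_nodal = [u ** 2 for u in u_nodal]     # f(u) = u^2 applied pointwise
coefficients = lagrange_coefficients(nodes, f_nodal)
print(coefficients)

# Aliasing error at a point that is not a node.
x_mid = Fraction(1, 2)
print(x_mid ** 4, evaluate(coefficients, x_mid))
```

The interpolant agrees with $x^4$ at the three nodes only; away from them the two differ markedly, which is the aliasing error discussed in this section.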


2.2 Energy Stability of DG for the Linear Advection equation

Similarly to what was done in chapter 1, we derive the energy stability analysis of the DG

method for solving the following linear advection equation on the domain Ω = [L,R]

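$$\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0, \qquad a \text{ is a constant.} \tag{2.1}$$

Multiplying (2.1) by u and integrating over x gives

$$\int_L^R u \frac{\partial u}{\partial t} \, dx = -a \int_L^R u \frac{\partial u}{\partial x} \, dx,$$

and therefore,

$$\frac{d}{dt} \int_L^R \frac{u^2}{2} \, dx = \frac{1}{2} a \left( u_L^2 - u_R^2 \right). \tag{2.2}$$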

This energy estimate tells us that the L2 norm of the exact solution u remains bounded for

finite boundary values. We can now show how the proper choice of the numerical flux in

the DG method leads to a similar conservation of energy. Let us consider the DG method

in its general strong formulation. The problem is to find u ∈ V h (u piecewise H1) such that

for all φ ∈ V h and for all element k,

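$$\int_{\Omega_k} \left( \frac{\partial u_k}{\partial t} \phi_k + a \frac{\partial u_k}{\partial x} \phi_k \right) dx = \left[ a u_k(x_{k+1}) - (au)^*_{k,k+1} \right] \phi_k(x_{k+1}) - \left[ a u_k(x_k) - (au)^*_{k-1,k} \right] \phi_k(x_k).$$

Since $u \in V^h$, we can substitute φ by u to obtain

$$\int_{\Omega_k} \left( \frac{\partial}{\partial t} \left( \frac{u_k^2}{2} \right) + a \frac{\partial}{\partial x} \left( \frac{u_k^2}{2} \right) \right) dx = \left[ a u_k(x_{k+1}) - (au)^*_{k,k+1} \right] u_k(x_{k+1}) - \left[ a u_k(x_k) - (au)^*_{k-1,k} \right] u_k(x_k).$$

To make things clearer, we introduce the notation

$$u_k^- = u_{k-1}(x_k), \qquad u_k^+ = u_k(x_k).$$

We then have

$$\int_{\Omega_k} \left( \frac{\partial}{\partial t} \left( \frac{u_k^2}{2} \right) + a \frac{\partial}{\partial x} \left( \frac{u_k^2}{2} \right) \right) dx = \left[ a u_{k+1}^- - (au)^*_{k,k+1} \right] u_{k+1}^- - \left[ a u_k^+ - (au)^*_{k-1,k} \right] u_k^+$$

$$\Leftrightarrow \quad \int_{\Omega_k} \frac{\partial}{\partial t} \left( \frac{u_k^2}{2} \right) dx + a \frac{(u_{k+1}^-)^2}{2} - a \frac{(u_k^+)^2}{2} = \left[ a (u_{k+1}^-)^2 - (au)^*_{k,k+1} u_{k+1}^- \right] - \left[ a (u_k^+)^2 - (au)^*_{k-1,k} u_k^+ \right]$$

$$\Leftrightarrow \quad \int_{\Omega_k} \frac{\partial}{\partial t} \left( \frac{u_k^2}{2} \right) dx = \left[ a \frac{(u_{k+1}^-)^2}{2} - (au)^*_{k,k+1} u_{k+1}^- \right] - \left[ a \frac{(u_k^+)^2}{2} - (au)^*_{k-1,k} u_k^+ \right].$$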

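Now summing over the entire domain, and treating the interior interfaces $x_1, \dots, x_{N-1}$ separately from the two boundaries, gives

$$\frac{\partial}{\partial t} \int_\Omega \frac{u^2}{2} \, dx = \sum_{k=0}^{N-2} \left[ a \frac{(u_{k+1}^-)^2}{2} - (au)^*_{k,k+1} u_{k+1}^- \right] + a \frac{u_R^2}{2} - (au)^*_R u_R - \sum_{k=1}^{N-1} \left[ a \frac{(u_k^+)^2}{2} - (au)^*_{k-1,k} u_k^+ \right] - a \frac{u_L^2}{2} + (au)^*_L u_L.$$

Let us take

$$(au)^*_L = a u_L, \qquad (au)^*_R = a u_R,$$

and change the bounds of the first summation:

$$\frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx = a \frac{u_L^2}{2} - a \frac{u_R^2}{2} + \sum_{k=1}^{N-1} \left[ (au)^*_{k-1,k} \left( u_k^+ - u_k^- \right) + a \frac{(u_k^-)^2}{2} - a \frac{(u_k^+)^2}{2} \right] \tag{2.3}$$

$$= \frac{1}{2} a \left( u_L^2 - u_R^2 \right) + \sum_{k=1}^{N-1} \left[ \left( u_k^+ - u_k^- \right) \left( (au)^*_{k-1,k} - a \frac{u_k^- + u_k^+}{2} \right) \right]. \tag{2.4}$$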


This last equation should be compared with equation (2.2). The evolution of the energy in time is identical to that of the exact solution, with the addition of a term that depends only on the choice of the numerical flux $(au)^*_{k-1,k}$. Using a regular central scheme $(au)^*_{k-1,k} = a \frac{u_k^- + u_k^+}{2}$ makes the extra term vanish and leads to a Kinetic Energy Conserving or Kinetic Energy Preserving scheme (DG-KEP). In practice, the central scheme is never used since it can lead to the odd/even decoupling phenomenon (see appendix A) and an upwind biased numerical flux is often preferred

$$(au)^*_{k-1,k} = \frac{1}{2} a \left( u_k^+ + u_k^- \right) - \frac{1}{2} \alpha |a| \left( u_k^+ - u_k^- \right), \qquad \alpha \in [0, 1]. \tag{2.5}$$

If α = 0, we recover the central flux, if α = 1 we recover the upwind flux. Substituting this

flux in equation (2.4) leads to

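$$\frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx = \frac{1}{2} a \left( u_L^2 - u_R^2 \right) - \frac{1}{2} \alpha |a| \sum_{k=1}^{N-1} \left( u_k^+ - u_k^- \right)^2.$$

The second term on the right hand side is nonpositive for α ≥ 0. Therefore,

$$\left( \frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx \right)_{DG} \leq \left( \frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx \right)_{exact} < +\infty.$$

This tells us that variations of numerical kinetic energy are bounded by the variations of kinetic energy of the exact solution. Since kinetic energy is a nonnegative quantity, integrating this relation in time leads to the following bounds

$$0 \leq ke|_{DG} \leq ke|_{exact}, \quad \text{at all times.}$$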

More precisely, we have

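$$\left( \frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx \right)_{DG} = \left( \frac{d}{dt} \int_\Omega \frac{u^2}{2} \, dx \right)_{exact} - \frac{1}{2} \alpha |a| \sum_{k=1}^{N-1} \left( u_k^+ - u_k^- \right)^2.$$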

α = 0 leads to a Kinetic Energy Preserving scheme

0 < α ≤ 1 leads to a Kinetic Energy Decreasing scheme
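The energy balance can be checked numerically in the simplest setting. The sketch below makes two simplifying assumptions that are not part of the derivation above: piecewise constant data (p = 0, so the scheme reduces to a finite volume method with $u_k^+ = u_k$ and $u_k^- = u_{k-1}$) and periodic boundaries (so the boundary terms cancel and every interface, including the wrap-around one, contributes a jump). All names and sample values are hypothetical.

```python
from fractions import Fraction


def interface_flux(a, u_minus, u_plus, alpha):
    """Numerical flux (2.5)."""
    return (a * (u_plus + u_minus) - alpha * abs(a) * (u_plus - u_minus)) / 2


def semi_discrete_rhs(u, a, alpha, h):
    """du_k/dt for p = 0 on a periodic mesh of uniform width h."""
    n = len(u)
    # left_flux[k] is the flux at the interface between cells k-1 and k.
    left_flux = [interface_flux(a, u[k - 1], u[k], alpha) for k in range(n)]
    return [-(left_flux[(k + 1) % n] - left_flux[k]) / h for k in range(n)]


def energy_rate(u, a, alpha, h):
    """d/dt of sum_k h * u_k^2 / 2 along the semi-discrete scheme."""
    dudt = semi_discrete_rhs(u, a, alpha, h)
    return sum(h * u_k * du_k for u_k, du_k in zip(u, dudt))


def predicted_rate(u, a, alpha):
    """-(1/2) alpha |a| times the sum of squared interface jumps."""
    jumps = sum((u[k] - u[k - 1]) ** 2 for k in range(len(u)))
    return -alpha * abs(a) * jumps / 2


u = [Fraction(x) for x in (1, 3, -2, 0, 5)]
a = Fraction(-3, 2)
h = Fraction(1, 5)
for alpha in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(alpha, energy_rate(u, a, alpha, h), predicted_rate(u, a, alpha))
```

With α = 0 the rate is exactly zero (energy preserving), and it becomes increasingly negative as α grows (energy decreasing), in line with the two cases listed above.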

This derivation gives a simple yet elegant proof of stability of the DG method for solving

linear advection problems. In the next section we show how this idea can be extended to


the Euler’s equations.

2.3 Kinetic Energy Preserving DG scheme for the Euler’s equations

The 1D Euler equations are given on Ω = [L,R] by

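$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x} f(w) = 0, \tag{2.6}$$

$$\text{where } w = \begin{pmatrix} \rho \\ \rho u \\ \rho E \end{pmatrix}, \quad \text{and} \quad f(w) = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u H \end{pmatrix}.$$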

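By combining the continuity and momentum equations, one can obtain an equation for kinetic energy $k = \frac{1}{2} \rho u^2$:

$$\frac{\partial k}{\partial t} = \frac{\partial k}{\partial w} \cdot \frac{\partial w}{\partial t}, \quad \text{with } \frac{\partial k}{\partial w} = k_{,w} = \left[ -\frac{u^2}{2} \quad u \quad 0 \right]$$

$$= -\frac{\partial k}{\partial w} \cdot \frac{\partial f}{\partial x} = -\frac{\partial}{\partial x} \left[ u \left( p + \rho \frac{u^2}{2} \right) \right] + p \frac{\partial u}{\partial x}.$$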
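The expression for $k_{,w}$ follows from writing the kinetic energy in terms of the conserved variables, $k = (\rho u)^2 / (2\rho)$. The sketch below is a simple numerical check of that gradient against central finite differences; the state vector and the helper names are hypothetical and chosen only for illustration.

```python
def kinetic_energy(w):
    """k = (rho u)^2 / (2 rho) as a function of w = [rho, rho u, rho E]."""
    rho, momentum, _ = w
    return 0.5 * momentum * momentum / rho


def kinetic_energy_gradient(w):
    """Analytic gradient k_w = [-u^2 / 2, u, 0]."""
    rho, momentum, _ = w
    u = momentum / rho
    return [-0.5 * u * u, u, 0.0]


def finite_difference_gradient(func, w, step=1e-6):
    """Central finite-difference approximation of the gradient of func at w."""
    gradient = []
    for i in range(len(w)):
        forward = list(w)
        backward = list(w)
        forward[i] += step
        backward[i] -= step
        gradient.append((func(forward) - func(backward)) / (2 * step))
    return gradient


w = [2.0, 1.5, 5.0]  # arbitrary state: rho = 2, rho u = 1.5, rho E = 5
analytic = kinetic_energy_gradient(w)
numeric = finite_difference_gradient(kinetic_energy, w)
max_error = max(abs(x - y) for x, y in zip(analytic, numeric))
print(analytic, numeric, max_error)
```

The third component is zero because the kinetic energy does not depend on $\rho E$ when $\rho$ and $\rho u$ are held fixed, which is why only the continuity and momentum equations enter the kinetic energy balance.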

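Integrating this relation over the entire domain Ω gives a global conservation equation for kinetic energy (we assume there are no discontinuities in the solution):

$$\frac{\partial}{\partial t} \int_\Omega k \, dx = \left[ u \left( p + \rho \frac{u^2}{2} \right) \right]_a - \left[ u \left( p + \rho \frac{u^2}{2} \right) \right]_b + \int_\Omega p \frac{\partial u}{\partial x} \, dx.$$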

Derivation in 1 dimension

Similarly to what was done in the previous section for the linear advection equation, we can show how a proper choice of the numerical flux leads to a Kinetic Energy Conserving DG scheme for the Euler equations.


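Now, we consider the general DG strong formulation to solve the Euler equations. We seek $w \in V^3$ such that for all $\varphi \in V$ and for all elements $k$,

$$\int_{\Omega_k} \varphi_k\left(\frac{\partial w_k}{\partial t} + \frac{\partial f(w_k)}{\partial x}\right)dx = \varphi_k(x_{k+1})\left[f(w_k(x_{k+1})) - f^\star_{k,k+1}\right] - \varphi_k(x_k)\left[f(w_k(x_k)) - f^\star_{k-1,k}\right],$$

where we assume that $f^\star_{k,k+1}$ takes the form

$$f^\star_{k,k+1} = \begin{pmatrix} (\rho u)^\star_{k,k+1} \\ (\rho u^2 + p)^\star_{k,k+1} \\ (\rho u H)^\star_{k,k+1} \end{pmatrix}.$$

Since $u \in V$ and $u^2/2 \in V$, we have for all $k$

$$\int_{\Omega_k} k_{,w}\left(\frac{\partial w_k}{\partial t} + \frac{\partial f(w_k)}{\partial x}\right)dx = k_{,w}(x_{k+1})\left[f(w_k(x_{k+1})) - f^\star_{k,k+1}\right] - k_{,w}(x_k)\left[f(w_k(x_k)) - f^\star_{k-1,k}\right]. \tag{2.7}$$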

Again, for clarity, we reintroduce the following compact notation:

$$w_k^- = w_{k-1}(x_k), \qquad w_k^+ = w_k(x_k).$$

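We can now expand both sides of equality (2.7):

Left hand side

$$\begin{aligned}
\mathrm{LHS} &= \int_{\Omega_k} k_{,w}\left(\frac{\partial w_k}{\partial t} + \frac{\partial f(w_k)}{\partial x}\right)dx \\
&= \int_{\Omega_k} \left(\frac{\partial k_k}{\partial t} + \frac{\partial}{\partial x}\left[u_k\left(p_k + \rho_k\frac{u_k^2}{2}\right)\right] - p_k\frac{\partial u_k}{\partial x}\right)dx \\
&= \int_{\Omega_k}\frac{\partial k_k}{\partial t}\,dx + \left[u\left(p + \rho\frac{u^2}{2}\right)\right]^-_{k+1} - \left[u\left(p + \rho\frac{u^2}{2}\right)\right]^+_{k} - \int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx
\end{aligned}$$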


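Right hand side

$$\begin{aligned}
\mathrm{RHS} &= k_{,w}(x_{k+1})\left[f(w_k(x_{k+1})) - f^\star_{k,k+1}\right] - k_{,w}(x_k)\left[f(w_k(x_k)) - f^\star_{k-1,k}\right] \\
&= \left[u(\rho u^2 + p) - u\,(\rho u^2 + p)^\star_{k,k+1}\right]^-_{k+1} - \left[\frac{u^2}{2}(\rho u) - \frac{u^2}{2}(\rho u)^\star_{k,k+1}\right]^-_{k+1} \\
&\quad - \left[u(\rho u^2 + p) - u\,(\rho u^2 + p)^\star_{k-1,k}\right]^+_{k} + \left[\frac{u^2}{2}(\rho u) - \frac{u^2}{2}(\rho u)^\star_{k-1,k}\right]^+_{k}
\end{aligned}$$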

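Therefore, equating LHS and RHS gives

$$\begin{aligned}
\int_{\Omega_k}\frac{\partial k_k}{\partial t}\,dx - \int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx
&= -\left[u\left(p + \rho\frac{u^2}{2}\right)\right]^-_{k+1} + \left[u\left(p + \rho\frac{u^2}{2}\right)\right]^+_{k} \\
&\quad + \left[u(\rho u^2 + p) - u\,(\rho u^2 + p)^\star_{k,k+1}\right]^-_{k+1} - \left[\frac{u^2}{2}(\rho u) - \frac{u^2}{2}(\rho u)^\star_{k,k+1}\right]^-_{k+1} \\
&\quad - \left[u(\rho u^2 + p) - u\,(\rho u^2 + p)^\star_{k-1,k}\right]^+_{k} + \left[\frac{u^2}{2}(\rho u) - \frac{u^2}{2}(\rho u)^\star_{k-1,k}\right]^+_{k}.
\end{aligned}$$

The terms that involve only the interior state cancel, since $-u\left(p + \rho\frac{u^2}{2}\right) + u(\rho u^2 + p) - \frac{u^2}{2}(\rho u) = 0$, which leaves

$$\int_{\Omega_k}\frac{\partial k_k}{\partial t}\,dx - \int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx
= \left[-u^-_{k+1}(\rho u^2 + p)^\star_{k,k+1} + \frac{(u^-_{k+1})^2}{2}(\rho u)^\star_{k,k+1}\right]
- \left[-u^+_{k}(\rho u^2 + p)^\star_{k-1,k} + \frac{(u^+_{k})^2}{2}(\rho u)^\star_{k-1,k}\right].$$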

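Now summing over the entire domain gives

$$\begin{aligned}
\int_\Omega\frac{\partial k}{\partial t}\,dx - \sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx
&= \sum_{k=0}^{N-1}\left[-u^-_{k+1}(\rho u^2 + p)^\star_{k,k+1} + \frac{(u^-_{k+1})^2}{2}(\rho u)^\star_{k,k+1}\right]
+ \left[-u_b(\rho u^2 + p)^\star_b + \frac{u_b^2}{2}(\rho u)^\star_b\right] \\
&\quad - \sum_{k=1}^{N}\left[-u^+_{k}(\rho u^2 + p)^\star_{k-1,k} + \frac{(u^+_{k})^2}{2}(\rho u)^\star_{k-1,k}\right]
- \left[-u_a(\rho u^2 + p)^\star_a + \frac{u_a^2}{2}(\rho u)^\star_a\right].
\end{aligned}$$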


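Let us take

$$(\rho u^2 + p)^\star_a = \rho_a u_a^2 + p_a \qquad \text{and} \qquad (\rho u^2 + p)^\star_b = \rho_b u_b^2 + p_b,$$

and change the bounds of the first summation on the right hand side of the relation:

$$\begin{aligned}
\int_\Omega\frac{\partial k}{\partial t}\,dx - \sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx
&= \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_a - \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_b \\
&\quad + \sum_{k=1}^{N}\left[-u^-_{k}(\rho u^2 + p)^\star_{k-1,k} + \frac{(u^-_{k})^2}{2}(\rho u)^\star_{k-1,k} + u^+_{k}(\rho u^2 + p)^\star_{k-1,k} - \frac{(u^+_{k})^2}{2}(\rho u)^\star_{k-1,k}\right].
\end{aligned}$$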

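If we assume that $(\rho u^2 + p)^\star_{k-1,k} = (\rho u^2)^\star_{k-1,k} + p^\star_{k-1,k}$, we obtain

$$\begin{aligned}
\int_\Omega\frac{\partial k}{\partial t}\,dx
&= \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_a - \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_b
+ \left[\sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx + \sum_{k=1}^{N} p^\star_{k-1,k}(u_k^+ - u_k^-)\right] \\
&\quad + \sum_{k=1}^{N}(u_k^+ - u_k^-)\cdot\left[(\rho u^2)^\star_{k-1,k} - (\rho u)^\star_{k-1,k}\left(\frac{u_k^+ + u_k^-}{2}\right)\right].
\end{aligned}$$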

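Hence, it is clear that if the continuity and momentum fluxes are chosen such that $(\rho u^2)^\star_{k-1,k} = (\rho u)^\star_{k-1,k}\left(\frac{u_k^+ + u_k^-}{2}\right)$, the last summation vanishes, leading to

$$\int_\Omega\frac{\partial k}{\partial t}\,dx
= \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_a - \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_b
+ \left[\sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx + \sum_{k=1}^{N} p^\star_{k-1,k}(u_k^+ - u_k^-)\right]. \tag{2.8}$$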

The total DG kinetic energy satisfies a relation analogous to the exact conservation of kinetic energy. Therefore, the scheme is said to be Kinetic Energy Conserving or Kinetic Energy Preserving (DG-KEP). In fact, and similarly to what was done for linear advection, the flux can be modified to lead to a Kinetic Energy Decreasing scheme. To do so, we add a diffusive term in the following fashion

$$(\rho u^2)^\star_{k-1,k} = \tfrac{1}{2}(\rho u)^\star_{k-1,k}(u_k^+ + u_k^-) - \tfrac{1}{2}\alpha\,|(\rho u)^\star_{k-1,k}|\,(u_k^+ - u_k^-), \qquad \alpha \in [0, 1].$$
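The role of this flux can be illustrated at a single interface. The sketch below (hypothetical function names, arbitrary sample values) evaluates the summand $(u^+ - u^-)\,[(\rho u^2)^\star - (\rho u)^\star (u^+ + u^-)/2]$ that appears in the derivation of (2.8): it vanishes for α = 0 and equals $-\tfrac{1}{2}\alpha|(\rho u)^\star|(u^+ - u^-)^2$ otherwise, whatever value is supplied for the mass flux.

```python
def convective_momentum_flux(mass_flux, u_minus, u_plus, alpha=0.0):
    """(rho u^2)* built from a given mass flux (rho u)*, with optional dissipation."""
    u_bar = 0.5 * (u_plus + u_minus)
    return mass_flux * u_bar - 0.5 * alpha * abs(mass_flux) * (u_plus - u_minus)


def interface_energy_production(mass_flux, u_minus, u_plus, alpha=0.0):
    """(u+ - u-) * [ (rho u^2)* - (rho u)* (u+ + u-)/2 ] at one interface."""
    jump = u_plus - u_minus
    rho_u2_star = convective_momentum_flux(mass_flux, u_minus, u_plus, alpha)
    return jump * (rho_u2_star - mass_flux * 0.5 * (u_plus + u_minus))


# Any mass flux works: the cancellation does not depend on how (rho u)* is chosen.
for mass_flux in (0.9, -1.7):
    print(interface_energy_production(mass_flux, 0.4, 1.1, alpha=0.0),
          interface_energy_production(mass_flux, 0.4, 1.1, alpha=0.5))
```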


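When using this flux, the equation of conservation of energy becomes

$$\begin{aligned}
\int_\Omega\frac{\partial k}{\partial t}\,dx
&= \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_a - \left[u\left(p + \rho\frac{u^2}{2}\right)\right]_b
+ \left[\sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx + \sum_{k=1}^{N} p^\star_{k-1,k}(u_k^+ - u_k^-)\right] \\
&\quad - \tfrac{1}{2}\alpha\sum_{k=1}^{N}|(\rho u)^\star_{k-1,k}|\,(u_k^+ - u_k^-)^2.
\end{aligned}$$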

The last term on the right hand side is non-positive for α ≥ 0. Therefore

$$\left(\frac{d}{dt}\int_\Omega k_h\,dx\right)_{DG} \le \left(\frac{d}{dt}\int_\Omega k\,dx\right)_{exact} < +\infty.$$

The variations of DG kinetic energy are bounded by the variations of exact kinetic energy. Again, assuming that the code does not return negative values for density, DG kinetic energy should be positive and bounded from above by the kinetic energy of the exact flow solution

$$0 \le k|_{DG} \le k|_{exact} \quad \text{at all times}.$$

More precisely, we have

$$\left(\frac{d}{dt}\int_\Omega k\,dx\right)_{DG} = \left(\frac{d}{dt}\int_\Omega k\,dx\right)_{exact} - \tfrac{1}{2}\alpha\sum_{k=1}^{N-1}|(\rho u)^\star_{k-1,k}|\,(u_k^+ - u_k^-)^2.$$

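Finally, the DG-KEPα flux can be written

$$f^\star_{k-1,k} = \begin{pmatrix}
(\rho u)^\star_{k-1,k} \\
\tfrac{1}{2}(\rho u)^\star_{k-1,k}(u_k^+ + u_k^-) - \tfrac{1}{2}\alpha\,|(\rho u)^\star_{k-1,k}|\,(u_k^+ - u_k^-) + p^\star_{k-1,k} \\
(\rho u)^\star_{k-1,k}\,H_{k-1,k}
\end{pmatrix},$$

where we expanded the energy component of the flux in a manner consistent with the momentum flux. From there it would seem sensible to choose

$$H_{k-1,k} = \tfrac{1}{2}(H_k^- + H_k^+),$$

although this choice is somewhat arbitrary and does not affect the kinetic energy conserving properties of the scheme. If we define $\bar{a}_k$ as the arithmetic average of $a_k^-$ and $a_k^+$, $\bar{a}_k = \frac{a_k^- + a_k^+}{2}$, then the flux can be written

$$f^\star_{k-1,k} = \begin{pmatrix}
(\rho u)^\star_{k-1,k} \\
(\rho u)^\star_{k-1,k}\,\bar{u}_k - \tfrac{1}{2}\alpha\,|(\rho u)^\star_{k-1,k}|\,(u_k^+ - u_k^-) + p^\star_{k-1,k} \\
(\rho u)^\star_{k-1,k}\,\bar{H}_k
\end{pmatrix}. \tag{2.9}$$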

α = 0 leads to a Kinetic Energy Preserving scheme

0 < α ≤ 1 leads to a Kinetic Energy Decreasing scheme
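A minimal sketch of the flux (2.9) is given below, working from primitive interface states (ρ, u, p). It assumes a perfect gas with γ = 1.4 (a value not stated in this section) and makes the specific choices $(\rho u)^\star = \bar\rho\,\bar u$, $p^\star = \bar p$ and $\bar H$ for the energy flux; all function names are hypothetical.

```python
GAMMA = 1.4  # assumed ratio of specific heats (not stated in the text)


def total_enthalpy(rho, u, p, gamma=GAMMA):
    """H = E + p/rho for a perfect gas, with E = p / ((gamma - 1) rho) + u^2 / 2."""
    return gamma * p / ((gamma - 1.0) * rho) + 0.5 * u * u


def euler_flux(state, gamma=GAMMA):
    """Physical Euler flux f(w) evaluated from a primitive state (rho, u, p)."""
    rho, u, p = state
    return (rho * u, rho * u * u + p, rho * u * total_enthalpy(rho, u, p, gamma))


def kep_alpha_flux(minus, plus, alpha=0.0, gamma=GAMMA):
    """DG-KEP-alpha interface flux (2.9) from the two traces (rho, u, p)."""
    rho_m, u_m, p_m = minus
    rho_p, u_p, p_p = plus
    rho_bar = 0.5 * (rho_m + rho_p)
    u_bar = 0.5 * (u_m + u_p)
    p_bar = 0.5 * (p_m + p_p)
    h_bar = 0.5 * (total_enthalpy(rho_m, u_m, p_m, gamma)
                   + total_enthalpy(rho_p, u_p, p_p, gamma))
    mass = rho_bar * u_bar  # (rho u)* chosen as the product of the averages
    momentum = mass * u_bar - 0.5 * alpha * abs(mass) * (u_p - u_m) + p_bar
    energy = mass * h_bar
    return (mass, momentum, energy)


left = (1.0, 0.3, 1.0)
right = (0.9, 0.5, 0.8)
print(kep_alpha_flux(left, right, alpha=0.0))
print(kep_alpha_flux(left, right, alpha=0.5))
```

When both traces coincide, the jump term vanishes and the function returns the physical flux, which is the consistency property expected of any numerical flux.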

Discussion

The last term between brackets in equation (2.8) “looks like” an approximation of $\int_\Omega p\,\frac{\partial u}{\partial x}\,dx$. However, things are not that simple since p is not a continuous function on the domain Ω. We recall that if $T_f$ is the distribution associated with the $L^1$ function $f$ and if $f$ admits a discontinuity of the first kind at $a$, then

$$T'_f = T_{f'} + \sigma\,\delta(a),$$

where δ is the Dirac distribution and $\sigma = f(a^+) - f(a^-)$ is the amplitude of the jump of the function $f$ at $a$. Therefore, if p were $C^\infty$ on Ω, we would have

$$\langle T'_u, p\rangle = \sum_{k=1}^{N}\int_{\Omega_k} p_k\frac{\partial u_k}{\partial x}\,dx + \sum_{k=1}^{N} p(x_k)(u_k^+ - u_k^-)$$

exactly. However, since p is likely to be discontinuous at $x_k$, this relation is not valid and we only have an approximation of $\int_\Omega p\,\frac{\partial u}{\partial x}\,dx$. As a convention we simply choose $p^\star_{k-1,k} = \tfrac{1}{2}(p_k^- + p_k^+)$. We will see later how this choice is justified by the special case of finite volumes.


It follows that

$$\left.\int_\Omega\frac{\partial k}{\partial t}\,dx\,\right|_{exact} \approx \left.\int_\Omega\frac{\partial k}{\partial t}\,dx\,\right|_{DG\ theoretical},$$

provided there are no shocks in the solution. However, the approximation does not only come from the pressure term. In the above demonstration, we assumed that $k_{,w} \in V^h$. This is actually not true when considering $V^h_p$. Indeed, $u = \frac{\rho u}{\rho}$ is rational, not polynomial. Also, when considering an actual implementation of the method using piecewise polynomials of degree p, nothing guarantees that integrations will be exact (at best, we will have exact integrals of approximate functions, as explained earlier in section 2.1). All of this leads to the following result

$$\left.\int_\Omega\frac{\partial k}{\partial t}\,dx\,\right|_{exact} \approx \left.\int_\Omega\frac{\partial k}{\partial t}\,dx\,\right|_{DG\ theoretical} \approx \left.\int_\Omega\frac{\partial k}{\partial t}\,dx\,\right|_{DG\ implemented}.$$

A DG implementation of the Kinetic Energy Preserving scheme will only lead to an approximate conservation of kinetic energy, provided there are no shocks in the solution.

Another point that should be noted is that the scheme depends entirely on the choice made for $(\rho u)^\star_{k-1,k}$. In that sense, there exists an infinite family of kinetic energy conserving schemes. In Jameson’s original work [19, 20], the choice was made to take $(\rho u)^\star_{k-1,k} = \overline{\rho u}_k$ or $(\rho u)^\star_{k-1,k} = \bar{\rho}_k\bar{u}_k$. However, one should keep in mind that other choices could have been made. For example, taking $(\rho u)^\star_{k-1,k} = \theta\,\overline{\rho u}_k + (1 - \theta)\,\bar{\rho}_k\bar{u}_k$ still leads to a kinetic energy conserving scheme. One can also choose to pick $(\rho u)^\star_{k-1,k}$ from the continuity flux of another scheme (say the Roe scheme) and plug it into the Kinetic Energy Preserving framework. Unless otherwise specified, however, we will always use $(\rho u)^\star_{k-1,k} = \bar{\rho}_k\bar{u}_k$. This choice leads to fast evaluations of the flux (a “special” central flux) and gave good results in the original work by Jameson, especially in the presence of shocks.
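The one-parameter family of mass fluxes mentioned above is easy to write down. The sketch below uses a hypothetical function name and arbitrary sample traces; it also shows that the two end members of the family differ by $\tfrac{1}{4}(\rho^+ - \rho^-)(u^+ - u^-)$, an elementary algebraic identity, so they coincide whenever either density or velocity is continuous across the interface.

```python
def mass_flux_theta(rho_minus, u_minus, rho_plus, u_plus, theta=0.0):
    """(rho u)* = theta * mean(rho u) + (1 - theta) * mean(rho) * mean(u)."""
    mean_of_product = 0.5 * (rho_minus * u_minus + rho_plus * u_plus)
    product_of_means = 0.5 * (rho_minus + rho_plus) * 0.5 * (u_minus + u_plus)
    return theta * mean_of_product + (1.0 - theta) * product_of_means


rho_m, u_m, rho_p, u_p = 1.0, 0.2, 0.6, 0.9
for theta in (0.0, 0.5, 1.0):
    print(theta, mass_flux_theta(rho_m, u_m, rho_p, u_p, theta))
```

With θ = 0 this reproduces the default choice used in the text, the product of the averaged density and the averaged velocity.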

The special case of finite volumes

A well known result of the Discontinuous Galerkin method is that it recovers exactly the Finite Volume Method for p = 0, where the solution is approximated by piecewise constant functions. Therefore, it is not a surprise that taking p = 0 in all the above equations leads to the Kinetic Energy Preserving scheme developed by Jameson in 2007 (FV-KEP). However, some interesting comments can be made.


First, let us take a look at the pressure term in equation (2.8). Since the approximations are piecewise constant, $\frac{\partial u_k}{\partial x} = 0$ and the last term reduces to

$$\sum_{k=1}^{N} p^\star_{k-1,k}(u_k - u_{k-1}) = \sum_{k=1}^{N} p^\star_{k-1,k}\left(\frac{u_k - u_{k-1}}{\Delta x}\right)\cdot\Delta x \approx \int_\Omega p\,\frac{\partial u}{\partial x}\,dx.$$

This term corresponds to the integration of $p\,\frac{\partial u}{\partial x}$ on a grid staggered with respect to the first one. Therefore, a reasonable choice would be to take $p^\star_{k-1,k} = \tfrac{1}{2}(p_{k-1} + p_k)$, an approximation of p on the staggered grid.

Also, this time, no other approximation is being made. Indeed $k_{,w} \in V^h_0$ since $(\rho u)^2/\rho^2$ is still a constant and all integrations can be made exactly.
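The exactness claimed for p = 0 can be verified directly. The sketch below is an illustration under stated assumptions: a periodic grid (so the boundary terms of (2.8) drop out), $(\rho u)^\star = \bar\rho\,\bar u$ and $p^\star = \bar p$, with arbitrary sample data. It compares the semi-discrete rate of change of total kinetic energy with $\sum_k p^\star_{k-1,k}(u_k - u_{k-1}) - \tfrac{1}{2}\alpha\sum_k |(\rho u)^\star_{k-1,k}|(u_k - u_{k-1})^2$.

```python
import math


def interface_fluxes(rho, u, p, alpha):
    """Mass flux, momentum flux and p* at interface k (between cells k-1 and k)."""
    mass, mom, p_star = [], [], []
    for k in range(len(rho)):  # index k-1 wraps around: periodic grid
        rho_bar = 0.5 * (rho[k - 1] + rho[k])
        u_bar = 0.5 * (u[k - 1] + u[k])
        p_bar = 0.5 * (p[k - 1] + p[k])
        m = rho_bar * u_bar
        mass.append(m)
        p_star.append(p_bar)
        mom.append(m * u_bar - 0.5 * alpha * abs(m) * (u[k] - u[k - 1]) + p_bar)
    return mass, mom, p_star


def kinetic_energy_rate(rho, u, p, alpha):
    """sum_k dx * dk_k/dt, using dk/dt = -u^2/2 * d(rho)/dt + u * d(rho u)/dt."""
    n = len(rho)
    mass, mom, _ = interface_fluxes(rho, u, p, alpha)
    total = 0.0
    for k in range(n):
        drho = -(mass[(k + 1) % n] - mass[k])  # dx * d(rho)/dt in cell k
        dmom = -(mom[(k + 1) % n] - mom[k])    # dx * d(rho u)/dt in cell k
        total += -0.5 * u[k] * u[k] * drho + u[k] * dmom
    return total


def predicted_rate(rho, u, p, alpha):
    """Pressure work on the staggered grid minus the alpha dissipation."""
    mass, _, p_star = interface_fluxes(rho, u, p, alpha)
    return sum(p_star[k] * (u[k] - u[k - 1])
               - 0.5 * alpha * abs(mass[k]) * (u[k] - u[k - 1]) ** 2
               for k in range(len(rho)))


n = 20
rho = [1.0 + 0.3 * math.sin(2.0 * math.pi * k / n) for k in range(n)]
u = [0.5 * math.cos(2.0 * math.pi * k / n) + 0.1 * (k % 3) for k in range(n)]
p = [1.0 + 0.2 * math.cos(4.0 * math.pi * k / n) for k in range(n)]
for alpha in (0.0, 0.4):
    print(alpha, kinetic_energy_rate(rho, u, p, alpha), predicted_rate(rho, u, p, alpha))
```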

Another interesting point lies in the interpretation we can give of the dissipative term introduced in the DG-KEPα scheme. The semi-discrete equation for momentum is given on a uniform grid by

$$\begin{aligned}
\frac{d}{dt}(\rho u)_k &= -\frac{1}{\Delta x}\left[(\rho u^2 + p)^\star_{k,k+1} - (\rho u^2 + p)^\star_{k-1,k}\right] \\
&= -\frac{\partial}{\partial x}(\rho u^2 + p)_k + \tfrac{1}{2}\alpha\,\Delta x\,|\rho u|_k\,\frac{\partial^2}{\partial x^2}(u)_k + O(\Delta x^n),
\end{aligned}$$

where n ≤ 2 depends on the choice made for $(\rho u)^\star_{k-1,k}$. Actually, many possible $(\rho u)^\star_{k-1,k}$ will lead to a second order discretization on uniform or smoothly varying meshes. The proof is given in appendix B and considers general symmetric and consistent mean operators.

The first order term introduced in the above equation by the dissipative portion of the flux therefore behaves like a diffusive term. It is very similar to the real viscosity of a Newtonian fluid, which also involves second derivatives of the velocity and provides dissipation of the flow’s kinetic energy.
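The diffusive character of the α term can be seen with a few lines of arithmetic. The sketch below assumes, for illustration only, that $|(\rho u)^\star|$ is frozen to the same value on both interfaces of a cell; under that assumption the divergence of the dissipative flux is exactly $\tfrac{1}{2}\alpha\,\Delta x\,|\rho u|$ times the standard three-point second difference of u. Function names are hypothetical.

```python
def dissipative_flux(abs_mass_flux, u_minus, u_plus, alpha):
    """Dissipative part of the momentum flux: -alpha/2 * |(rho u)*| * (u+ - u-)."""
    return -0.5 * alpha * abs_mass_flux * (u_plus - u_minus)


def dissipation_in_cell(u_prev, u_here, u_next, abs_mass_flux, alpha, dx):
    """-(D_{k,k+1} - D_{k-1,k}) / dx with |(rho u)*| frozen on both interfaces."""
    d_right = dissipative_flux(abs_mass_flux, u_here, u_next, alpha)
    d_left = dissipative_flux(abs_mass_flux, u_prev, u_here, alpha)
    return -(d_right - d_left) / dx


def diffusion_term(u_prev, u_here, u_next, abs_mass_flux, alpha, dx):
    """alpha/2 * dx * |rho u| * (three-point approximation of d2u/dx2)."""
    second_derivative = (u_next - 2.0 * u_here + u_prev) / (dx * dx)
    return 0.5 * alpha * dx * abs_mass_flux * second_derivative


print(dissipation_in_cell(0.2, 0.5, 0.6, 1.3, 0.8, 0.01))
print(diffusion_term(0.2, 0.5, 0.6, 1.3, 0.8, 0.01))
```

The factor Δx in front of the second difference is what makes this an artificial viscosity that vanishes under mesh refinement.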

Numerical example : viscous Sod shocktube

It is interesting to test the Kinetic Energy Preserving (KEP) flux in a situation where a central flux is not expected to perform well. In what follows, we consider a viscous Sod shocktube and compare the behavior of the regular central flux, the KEP flux, the Roe flux and the Lax-Friedrichs-Rusanov flux in our DG code. We call regular central flux the flux average formula defined by $f^\star_{k-1,k} = \tfrac{1}{2}(f_k^- + f_k^+)$. Viscous fluxes are computed using an Upwind/Downwind approach as described in the book by Hesthaven [14]. All the simulations are performed without the addition of filtering or shock capturing operators, to be able to clearly identify the effect of the fluxes. This is also the reason why we consider a (slightly) viscous test case and not an inviscid one. As the mesh is refined and the order of the interpolating polynomials is increased, we expect to fully capture the continuous viscous solution even though no special operator is added.

The behavior of the various schemes applied to the Navier-Stokes equations depends directly on the mesh size (in what follows, a fine mesh is considered to have a characteristic length of the order of the viscous shock thickness) and the order of the polynomials used in the DG approximation (a low order typically referring to the finite volume approximation p = 0). Therefore, we study 4 configurations: coarse mesh-low order, coarse mesh-high order, fine mesh-low order, fine mesh-high order.

In all that follows, the plots will have the same legend that we shall give here to keep

the figures uncluttered

KEP scheme

Central scheme

Upwind scheme (Burgers) / Roe scheme (Euler)

Lax−Friedrichs

Viscous Sod shocktube

The viscous Sod shocktube is defined on Ω = [0, 1] by a left (x ∈ [0, 0.5)) and a right (x ∈ (0.5, 1]) initial state:

Left State        Right State
p_L = 1           p_R = 0.1
ρ_L = 1           ρ_R = 0.125
u_L = 0           u_R = 0
T_L = 300 K       T_R = 300 K

The case was considered at a Reynolds number $Re = \frac{\rho_L V_L}{\mu(T = T_L)} = 25000$.
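For reference, the initial data above can be set up as follows for a finite volume run. This is only a sketch: it assumes a perfect gas with γ = 1.4 (not stated in this section), samples the state at cell centers of a uniform grid, and leaves out the viscous parameters (temperature, Reynolds number). Function names are hypothetical.

```python
GAMMA = 1.4  # assumed ratio of specific heats (not stated in the text)


def sod_initial_state(x):
    """Primitive state (rho, u, p) at position x in [0, 1]; the jump sits at x = 0.5."""
    if x < 0.5:
        return (1.0, 0.0, 1.0)
    return (0.125, 0.0, 0.1)


def to_conservative(rho, u, p, gamma=GAMMA):
    """w = (rho, rho*u, rho*E) with rho*E = p / (gamma - 1) + rho * u^2 / 2."""
    return (rho, rho * u, p / (gamma - 1.0) + 0.5 * rho * u * u)


def cell_centers(n):
    """Centers of n uniform cells covering [0, 1]."""
    dx = 1.0 / n
    return [(k + 0.5) * dx for k in range(n)]


cells = [to_conservative(*sod_initial_state(x)) for x in cell_centers(100)]
print(cells[0], cells[-1])
```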

The following results are compared with a reference solution, obtained using N = 800 elements, p = 5th order polynomial approximation and the Lax-Friedrichs flux (black solid line). For every run, the solution was driven explicitly in time using a TVD-RK3 time stepping scheme [13].
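For readers unfamiliar with it, the three-stage TVD Runge-Kutta scheme is usually written in the Shu-Osher form sketched below. This is the standard textbook form and is assumed, not confirmed, to be the variant described in reference [13]; the function name and the list-based state are illustrative choices.

```python
def tvd_rk3_step(u, dt, rhs):
    """One step of the three-stage TVD Runge-Kutta scheme (Shu-Osher form).

    u   : list of floats (the semi-discrete state)
    rhs : function returning du/dt as a list of the same length
    """
    l0 = rhs(u)
    u1 = [ui + dt * li for ui, li in zip(u, l0)]
    l1 = rhs(u1)
    u2 = [0.75 * ui + 0.25 * (u1i + dt * l1i) for ui, u1i, l1i in zip(u, u1, l1)]
    l2 = rhs(u2)
    return [ui / 3.0 + 2.0 / 3.0 * (u2i + dt * l2i) for ui, u2i, l2i in zip(u, u2, l2)]


# Linear decay du/dt = -u: one step reproduces the Taylor series of exp(-dt) to third order.
decay = lambda state: [-value for value in state]
print(tvd_rk3_step([1.0], 0.1, decay))
```

Each stage is a convex combination of forward Euler steps, which is the property that gives the scheme its TVD character under a suitable time step restriction.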

Coarse mesh - Low order

We first consider the solutions obtained for N = 100 and p = 0 (finite volume approximation) at t = 0.15. Results are presented in figure 2.2. Visually, and as expected for a simulation involving shockwaves, the diffusive fluxes give better results. This comes from the fact that both central schemes (classical central and KEP) are not able to capture the shock and generate large oscillations around it. First order diffusive methods are stable but their solutions are extremely dissipated and features of the flow are largely smeared. We can see in figure 2.3 that the total kinetic energy computed using these methods is well below the reference in black. On the other hand, the KEP scheme, although very oscillatory on such a coarse mesh, is much more stable than the regular central scheme. For any time step, the central flux simulation blew up, while it was possible to drive the KEP solution to the end (oscillatory results, but it did not “explode”). Another very interesting point is that the estimation of total kinetic energy computed using the KEP scheme is much closer to the reference solution in figure 2.3. This is a bit surprising since conservation of kinetic energy is not enforced in the presence of a shockwave and the solution is very oscillatory.

[Figure panels: Density, Pressure]

Figure 2.2: Viscous shocktube, coarse mesh (N = 100) and low order (p = 0)


[Figure axes: Total Kinetic Energy versus t]

Figure 2.3: Viscous shocktube, Evolution of Kinetic Energy, coarse mesh (N = 100) and low order (p = 0)

Coarse mesh - High order

We consider a mesh of N = 100 elements and a polynomial interpolation of degree p = 5.

Again, diffusive fluxes are performing the best (see figures 2.4 and 2.5). Although some

oscillations are created in the vicinity of the shockwave (Gibbs phenomenon), first order

diffusive fluxes provide enough dissipation to prevent the propagation of these wiggles to the

rest of the solution. On the contrary, the KEP central flux and the regular central flux are

unable to stop the propagation of these wiggles to the rest of the flow field. These oscillations

are the exact analog of odd/even decoupling for finite volumes and finite differences. A DG

method with first order diffusive fluxes will eliminate these odd/even modes while a DG

method implemented with a central scheme is blind to them once they are created. However

we still notice that the KEP central scheme behaves in a better fashion than the central

scheme, especially at the beginning of the expansion.

Fine mesh - Low order

Again we consider a finite volume approximation of the Euler equations but this time for N = 4000 cells. With such a large number of cells, we are able to capture the viscous structure of the shock and all schemes are expected to be oscillation free. As can be seen in figures 2.6 and 2.7-3, central fluxes give a sharper shock profile than first order diffusive fluxes, but they lead to a very small overshoot. When looking at the start of the expansion on


[Figure panels: Density, Pressure]

Figure 2.4: Viscous shocktube, coarse mesh (N = 100) and high order (p = 5), KEP flux

[Figure panels: Pressure at the start of the expansion, Pressure at the shockwave]

Figure 2.5: Details of the Pressure distribution, coarse mesh (N = 100) and high order (p = 5)

figure 2.7-1, central fluxes also provide a result closer to the solid black line of the reference

solution. This result is expected since central fluxes lead to 2nd order accurate schemes

on uniform meshes while Lax-Friedrichs-Rusanov and Roe schemes remain 1st order. The

interesting result is in the middle of the expansion, figure 2.7-2. The regular central flux

exhibits a large amount of spurious oscillations that disappear completely when using the

KEP flux.

This situation (fine mesh, finite volumes) is probably where the KEP scheme is the best.

It is much cheaper than a classical Roe scheme yet it leads to non oscillatory solutions that


are better than the ones obtained with a regular central scheme on a fine mesh. It is possible

to obtain a second order solution on a large number of points in the time it would take to

obtain a 1st order solution using a Roe scheme on a smaller number of points.

[Figure panels: Density, Pressure, with regions marked 1, 2 and 3]

Figure 2.6: Viscous shocktube, fine mesh (N = 4000) and low order (p = 0)

[Figure panels (Pressure): 1- Start of the expansion, 2- Expansion, 3- Shockwave]

Figure 2.7: Details of the Pressure distribution, fine mesh (N = 4000) and low order (p = 0)

Fine mesh - High order

Now, simulations are performed with N = 800 cells and p = 5. Results using the regular central scheme can be seen in figure 2.8. As expected, all the schemes behave the same way and all the solutions are superimposed. This comes from the fact that all the features of the flow are fully resolved (about one and a half cells through the shock) and that there is almost no discontinuity between the cells (due to the fact that the interpolating polynomials are of high order). Therefore, since all the fluxes considered are consistent, they all provide the same results. The DG method is “almost continuous everywhere”.

[Figure panels: Density, Pressure]

Figure 2.8: Viscous shocktube, fine mesh (N = 800) and high order (p = 5), central flux

Influence and effect of α

We saw earlier that a dissipative term can be added to the Kinetic Energy Preserving flux

leading to a “Kinetic Energy Decreasing” scheme parametrized by a single scalar α. For

α = 0 we recover the original scheme. For α > 0, total kinetic energy is decreased. We now

study the influence of the parameter α on the solution.

Finite Volumes method

The extra term added to the KEP flux was identified as a diffusion term in the case of finite volumes. Therefore, we expect improved stability compared to the case α = 0. It is interesting to look at how the scheme performs for the inviscid Sod shocktube. We compare several values of α: α = 0 for reference, α = 0.3, α = 0.5 and α = 0.8. Results are reported in figures 2.9 and 2.10 and are in good agreement with what we expected.

A lot of noise has been removed in the expansion area and oscillations around the shock and the contact discontinuity are greatly reduced. As expected, larger values of α lead to smoother results. However, choosing α very large will not allow the shock to be resolved


[Figure panels: Density, Pressure; curves: Exact, α = 0, α = 0.3, α = 0.8]

Figure 2.9: Sod shocktube, Finite volumes (N = 500) for various values of α

and areas of the flow like the expansion might get overly dissipated. We should keep in

mind that the scheme was designed for smooth flows (in the absence of discontinuities) and

that we are only testing it on shocks to have an appreciation of its behavior in extreme

conditions.

[Figure panels: Density, Pressure; curves: Exact, α = 0, α = 0.5]

Figure 2.10: Sod shocktube, Finite volumes (N = 2000) for various values of α

High Order DG method

It would be vain trying to run the inviscid Sod shocktube using a high order discontinuous

Galerkin method without some way of capturing the shock. Therefore, we consider again

the viscous Shocktube at a Reynolds number of 25000. We study the case N = 100 cells


and p = 5. However this time, the results are not as good and we can see that the solutions

[Figure panels: Pressure solution, Detail around the shock; curves: Exact, α = 0, α = 0.8]

Figure 2.11: Viscous Sod shocktube, DG (N = 100, p = 5) for various values of α

for α = 0 and for α = 0.8 are extremely close to each other, with no real improvement

made when α is further increased. The odd/even decoupling modes are not removed and it

appears that using a classical Lax-Friedrichs-Rusanov flux still provides better results for a

similar computational cost.

2.4 Discussion and conclusions

The work presented in this chapter focuses on the derivation and experimentation of the

DG-KEPα scheme for the Euler equations in 1 dimension. We showed how a simple set of

conditions on the numerical flux leads to a kinetic energy conserving scheme. The possible

flux is not unique and there exists an infinite family of fluxes that satisfy the conditions.

We obtained good results using (ρu)?k−1,k = ρkuk and unless otherwise mentioned, we will

always use this flux.

From the presented results, it appears that the Kinetic Energy Conserving scheme gives the best results when used for p = 0, i.e. for the finite volume method. The interest is that on a regular mesh, the KEP flux leads to a 2nd order accurate scheme that is clearly more stable than the regular central scheme while also being more accurate than classical first order Riemann solvers. This difference in accuracy and stability translates into better results in many simulations where the artificial dissipation needed to solve the flow in a stable fashion greatly deteriorates the solution (see the application chapters). High resolution finite volume schemes, however, tend to have the same qualities as our Kinetic Energy Conserving scheme while being more expensive numerically; on the other hand, they are able to capture inviscid shocks. For viscous flows, the idea is that by using the KEP flux in a finite volume code and on a regular mesh, it is possible to obtain a 2nd order accurate non oscillatory solution on a large number of points in the time it would take other high resolution schemes to solve the same problem on a much smaller set of points.

When considering higher order DG schemes, the order of the resulting scheme does not depend on the choice of the flux but only on the order of the approximation polynomials. The Kinetic Energy Conserving scheme is therefore of the same order as when using an approximate Riemann solver, yet is less stable. This is mainly due to the fact that Riemann solvers are able to block spurious oscillations across interfaces, preventing them from spreading over the entire computational domain. Central schemes on the other hand cannot eliminate the odd/even decoupling resulting from the Gibbs phenomenon at discontinuities. Although results are better than when using a regular central scheme, it should be noted that no actual DG implementation uses a central scheme for the numerical flux. The interest of the method for high order discretization is therefore more theoretical and the flux can be used for mathematical purposes.

Finally, the choice made to evaluate $p^\star$ in the momentum flux of the Euler equations is quite arbitrary (based solely on the result for p = 0) and some improvements can perhaps be made.

We also showed how a diffusive term can be added to the momentum equation to obtain

a Kinetic Energy Decreasing scheme, parametrized by a single scalar α. The addition of this

term leads to a tremendous stabilization of the computation in the case of finite volumes

but did not really affect higher order DG simulations.

Starting with the linear advection equation, we extended the results to the inviscid flow equations. However, a first trivial extension is possible for all scalar conservation laws with polynomial flux

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{u^q}{q}\right) = 0$$

and the kinetic energy conserving flux is given by

$$f^\star_{k-1,k} = \frac{1}{q(q+1)}\left(\frac{(u_k^+)^{q+1} - (u_k^-)^{q+1}}{u_k^+ - u_k^-}\right).$$
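Evaluated literally, the quotient above is 0/0 when the two traces coincide. A small sketch (hypothetical function name, integer q ≥ 1 assumed) avoids the division by using the factorization $(a^{q+1} - b^{q+1})/(a - b) = \sum_{j=0}^{q} a^j b^{q-j}$:

```python
def kep_scalar_flux(u_minus, u_plus, q):
    """Energy conserving flux for f(u) = u^q / q, with q a positive integer.

    Uses (a^(q+1) - b^(q+1)) / (a - b) = sum_{j=0..q} a^j * b^(q-j), so the
    expression stays well defined when u_plus == u_minus.
    """
    total = sum(u_plus ** j * u_minus ** (q - j) for j in range(q + 1))
    return total / (q * (q + 1))


print(kep_scalar_flux(1.0, 2.0, 1))  # q = 1: arithmetic mean (central flux)
print(kep_scalar_flux(1.0, 2.0, 2))  # q = 2: (u-^2 + u- u+ + u+^2) / 6
print(kep_scalar_flux(3.0, 3.0, 3))  # equal traces: reduces to f(u) = u^3 / 3
```

For q = 1 this is the central flux of the linear advection case with a = 1, and for q = 2 it gives $(u^{-2} + u^- u^+ + u^{+2})/6$ for the Burgers equation.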


Linear compositions will give f?k−1,k for any polynomial flux function. A more general

extension can actually be made to any smooth flux function and we refer the reader to

Jameson’s original work [19] (see section 3, The one dimensional scalar conservation law).

The derivation is again extremely similar to the one given in this chapter. The extension

from finite volumes to discontinuous Galerkin is the same as the one described in this

chapter.

All the above results can be derived in multiple dimensions and lead to the same flux

functions. The results are quite easy to obtain following the 1D approach but involve long

and tedious calculations.

Chapter 3

Establishing connections between

the two approaches

In the previous chapter we found a kinetic energy stable DG scheme for the Euler equations. In chapter 1 we described a method for which stability is based on DG stability. A question follows immediately: can we combine the two approaches? Answering this question is actually not easy at all. Of course, one can implement the Flux Reconstruction method using the correction functions introduced by Vincent, Castonguay and Jameson and the Kinetic Energy Conserving flux at the interfaces. Intuitively, this approach should have improved stability compared to the DG method using a central scheme, based on the results observed for 1D scalar conservation laws. However, the mathematics show us that this is a far more complex problem and it is tough to conclude on the stability of the resulting scheme using previous arguments. In a first part, we extend the VCJ schemes (ESFR in the case of scalar conservation laws) to vectorial conservation laws and show a condition that leads to stability. In a second part, we show how Kinetic Energy does not satisfy this sufficient condition and why it is tough to conclude on the stability of the resulting scheme.

3.1 Extension of Vincent-Castonguay-Jameson scheme to vectorial conservation equations

For the general vectorial conservation equation $\frac{\partial w}{\partial t} + \frac{\partial}{\partial x}f(w) = 0$, the FR method can be described as follows. We consider an element $\Omega_k$ mapped to [−1, 1]. The solution $w^h_k$ can once again be expanded in a polynomial basis; this time however, $w^h_k$ is a vector of dimension n. For example, when considering the Euler equations, n = 3 since $w = [\rho, \rho u, \rho E]^T$. The flux is taken to be $f^h_k = f^D_k + f^C_k$ where

$$\begin{aligned}
f^D_k(x) &= f(w_k)(x), \\
f^C_k(x) &= \left[f^\star_{k-1,k} - f^D_k(-1)\right]g_L(x) + \left[f^\star_{k,k+1} - f^D_k(1)\right]g_R(x) \\
&= f_{CL}\cdot g_L(x) + f_{CR}\cdot g_R(x).
\end{aligned}$$

“D” stands for discontinuous, “C” stands for correction. $f^\star_{k-1,k}$ is a numerical flux as defined in previous chapters. In practice, f can be an extremely complex function and $f^D_k$ needs to be projected back onto $\mathbb{R}_p[X]$ for the implementation of the method, leading to aliasing errors. For simplicity, we will assume that $f^D_k$ is in $\mathbb{R}_p[X]$ and that no aliasing errors are made. $g_L$ and $g_R$ are scalar flux correction functions (polynomials of degree at most p + 1). They are chosen to approximate zero in some sense and satisfy

$$g_L(-1) = 1, \quad g_L(1) = 0, \qquad g_R(-1) = 0, \quad g_R(1) = 1.$$

It follows that $f_k$ is continuous on Ω and for all k

$$f_k(x_k) = f_{k-1}(x_k) = f^\star_{k-1,k}.$$
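The interface continuity stated above follows from the four endpoint conditions alone, which the scalar sketch below illustrates. It uses the linear functions $g_L(x) = (1 - x)/2$ and $g_R(x) = (1 + x)/2$ purely as placeholders that satisfy those conditions; they are not the VCJ correction functions, and the discontinuous flux and interface values are arbitrary.

```python
def g_left(x):
    """Placeholder correction function with g_L(-1) = 1 and g_L(1) = 0."""
    return 0.5 * (1.0 - x)


def g_right(x):
    """Placeholder correction function with g_R(-1) = 0 and g_R(1) = 1."""
    return 0.5 * (1.0 + x)


def reconstructed_flux(x, f_disc, f_star_left, f_star_right):
    """f_k(x) = f^D(x) + [f*_left - f^D(-1)] g_L(x) + [f*_right - f^D(1)] g_R(x)."""
    corr_left = f_star_left - f_disc(-1.0)
    corr_right = f_star_right - f_disc(1.0)
    return f_disc(x) + corr_left * g_left(x) + corr_right * g_right(x)


f_disc = lambda x: 2.0 + 0.5 * x - x * x  # arbitrary discontinuous flux on [-1, 1]
print(reconstructed_flux(-1.0, f_disc, 1.7, 0.9))  # equals the left interface flux
print(reconstructed_flux(1.0, f_disc, 1.7, 0.9))   # equals the right interface flux
```

Because neighboring elements share the same interface value $f^\star$, the reconstructed flux takes a single value at each interface regardless of which admissible correction functions are used.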

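The flux reconstruction method is defined by

$$\frac{\partial w_k}{\partial t} = -\frac{\partial}{\partial x}f_k$$

or equivalently,

$$\frac{\partial w_k}{\partial t} = -\frac{\partial}{\partial x}f^D_k - f_{CL}\cdot g'_L - f_{CR}\cdot g'_R \tag{3.1}$$

with $g' = \frac{\partial g}{\partial x}$.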

Suppose we know a numerical flux $f^\star_{k-1,k}$ that leads to a stable DG scheme in an energy E (a convex function of w). By this, we mean that $\frac{\partial E}{\partial t}$ is well behaved ($\partial E/\partial t \le 0$ or E follows the variation of a real bounded quantity, similarly to what was done in chapter 2 with kinetic energy).

We can derive the Vincent-Castonguay-Jameson method (ESFR for linear advection¹) by considering

$$\int_{\Omega_k}\left(E^T_{,w}\,(3.1) + c\,\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}(3.1)\right), \qquad \text{with } c \in \mathbb{R},$$

and then choosing $g_L$ and $g_R$ to obtain adequate cancellations leading to stability (for more details on this derivation, we refer the reader to the original paper by Vincent et al.).

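First consider

$$\begin{aligned}
&\int_{\Omega_k} E^T_{,w}\,(3.1) \\
\Leftrightarrow\ &\int_{\Omega_k} E^T_{,w}\frac{\partial w_k}{\partial t}\,dx = \int_{\Omega_k}\left(-E^T_{,w}\frac{\partial}{\partial x}f^D_k - E^T_{,w}\left(f_{CL}\cdot g'_L + f_{CR}\cdot g'_R\right)\right)dx \\
\Leftrightarrow\ &\int_{\Omega_k}\frac{\partial E_k}{\partial t}\,dx = \left(\int_{\Omega_k}\frac{\partial E_k}{\partial t}\,dx\right)_{DG} + \int_{\Omega_k}\frac{\partial}{\partial x}(E^T_{,w})\left(f_{CL}\cdot g_L + f_{CR}\cdot g_R\right)dx
\end{aligned}$$

and then

$$\begin{aligned}
&\int_{\Omega_k} c\,\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}(3.1) \\
\Leftrightarrow\ &\int_{\Omega_k} c\,\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}\left(\frac{\partial w_k}{\partial t}\right)dx = -c\int_{\Omega_k}\frac{\partial^p}{\partial x^p}(E^T_{,w})\left(f_{CL}\cdot g_L^{(p+1)} + f_{CR}\cdot g_R^{(p+1)}\right)dx.
\end{aligned}$$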

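Adding these two relations we obtain

$$\begin{aligned}
\int_{\Omega_k}\frac{\partial E_k}{\partial t} + c\,\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}\left(\frac{\partial w_k}{\partial t}\right)dx
&= \left(\int_{\Omega_k}\frac{\partial E_k}{\partial t}\,dx\right)_{DG} \\
&\quad + f^T_{CL}\int_{\Omega_k} g_L\frac{\partial}{\partial x}(E_{,w}) - c\,g_L^{(p+1)}\frac{\partial^p}{\partial x^p}(E_{,w})\,dx \\
&\quad + f^T_{CR}\int_{\Omega_k} g_R\frac{\partial}{\partial x}(E_{,w}) - c\,g_R^{(p+1)}\frac{\partial^p}{\partial x^p}(E_{,w})\,dx.
\end{aligned}$$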

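Assuming that all components of $E_{,w}$ are polynomials of degree at most p (this is the case if $E_{,w} = A\cdot w$, with $A \in \mathbb{R}^{n\times n}$ for example), Vincent defines $g_L$ and $g_R$ such that

$$\int_{\Omega_k} g_L\frac{\partial}{\partial x}(E_{,w}) - c\,g_L^{(p+1)}\frac{\partial^p}{\partial x^p}(E_{,w})\,dx = 0$$

and

$$\int_{\Omega_k} g_R\frac{\partial}{\partial x}(E_{,w}) - c\,g_R^{(p+1)}\frac{\partial^p}{\partial x^p}(E_{,w})\,dx = 0$$

leading to the final result

$$\int_{\Omega_k}\frac{\partial E_k}{\partial t} + c\,\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}\left(\frac{\partial w_k}{\partial t}\right)dx = \left(\int_{\Omega_k}\frac{\partial E_k}{\partial t}\,dx\right)_{DG}. \tag{3.2}$$

¹ We call VCJ scheme the Flux Reconstruction method using the particular correction functions introduced by Vincent [38]. In the particular case of scalar conservation laws, the resulting scheme is provably energy stable (ESFR).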

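So far, nothing much can be said about the stability of the method. However, suppose that E can be written $E = \tfrac{1}{2}w^TAw$, with $A \in \mathbb{R}^{n\times n}$ and $A > 0$ (note that in that case $E_{,w} = A\cdot w$), then

$$\begin{aligned}
\frac{\partial^p}{\partial x^p}(E^T_{,w})\,\frac{\partial^p}{\partial x^p}\left(\frac{\partial w_k}{\partial t}\right)
&= \left[\frac{\partial^p}{\partial x^p}(A\cdot w_k)\right]^T\left[\frac{\partial}{\partial t}\left(\frac{\partial^p w_k}{\partial x^p}\right)\right] \\
&= w_k^{(p)T}A\,\frac{\partial}{\partial t}w_k^{(p)}, \qquad \text{with } w_k^{(p)} = \frac{\partial^p}{\partial x^p}w_k \\
&= \frac{\partial}{\partial t}\left(\tfrac{1}{2}\,w_k^{(p)T}A\,w_k^{(p)}\right).
\end{aligned}$$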

With this, equation 3.2 takes a more pleasant form:

$$\int_{\Omega_k}\frac{\partial}{\partial t}\left(E_k + \tfrac{1}{2}c\,w_k^{(p)T}A\,w_k^{(p)}\right)dx = \left(\int_{\Omega_k}\frac{\partial E_k}{\partial t}\,dx\right)_{DG}.$$

Summing over all the elements leads to

$$\left(\frac{\partial}{\partial t}\int_\Omega\tilde{E}\,dx\right)_{ESFR} = \left(\frac{\partial}{\partial t}\int_\Omega E\,dx\right)_{DG}, \qquad \text{where } \tilde{E} = E + \tfrac{1}{2}c\,w^{(p)T}A\,w^{(p)}.$$

For c > 0, $\tilde{E}$ defines a new energy (actually, it might still be an energy for some negative values of c, as was observed in the original work by Vincent et al.). This new energy has the same variation as E evaluated using DG. We assumed these variations to be well behaved, hence proving the stability of the method. This sufficient condition can be summarized in the following theorem:

Theorem 1. If the numerical fluxes of a DG method can be chosen such that the resulting scheme is stable in an energy $E = \tfrac{1}{2}w^TAw$, with $A > 0$, then the corresponding Vincent-Castonguay-Jameson scheme will also be stable in an energy $\tilde{E} = E + \tfrac{1}{2}c\,w^{(p)T}A\,w^{(p)}$ if we ignore aliasing errors.

3.2 Difficulties associated with Kinetic Energy

In chapter 2, we showed how a particular choice of numerical fluxes led to a kinetic energy conserving scheme (DG-KEP). However, the kinetic energy k cannot be written in quadratic form $w^TAw$ with A constant. Instead, the Jacobian and Hessian of kinetic energy are rather complex.

$$E = k = \tfrac{1}{2}\rho u^2, \qquad
J = k_{,w} = \begin{pmatrix} -\dfrac{u^2}{2} \\ u \\ 0 \end{pmatrix}, \qquad
H = k_{,ww} = \frac{1}{\rho}\begin{pmatrix} u^2 & -u & 0 \\ -u & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
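Two properties of this Hessian can be checked in a few lines: $v^T H v = (u\,v_0 - v_1)^2/\rho \ge 0$ for any vector v, and $H w = 0$ for the state itself (k is homogeneous of degree one in (ρ, ρu)), so $w^T H w$ cannot reproduce k. The sketch below uses arbitrary sample values and hypothetical function names.

```python
def kinetic_energy_hessian(w):
    """H = k_ww for k = (rho u)^2 / (2 rho), with w = (rho, rho*u, rho*E)."""
    rho, mom, _ = w
    u = mom / rho
    return [[u * u / rho, -u / rho, 0.0],
            [-u / rho, 1.0 / rho, 0.0],
            [0.0, 0.0, 0.0]]


def quadratic_form(h, v):
    """v^T H v for a 3 x 3 matrix stored as nested lists."""
    return sum(v[i] * h[i][j] * v[j] for i in range(3) for j in range(3))


w = (1.2, 0.84, 2.5)  # rho = 1.2, u = 0.7
h = kinetic_energy_hessian(w)
v = (0.3, -1.1, 4.0)
u = w[1] / w[0]
print(quadratic_form(h, v), (u * v[0] - v[1]) ** 2 / w[0])  # equal and non-negative
print(quadratic_form(h, w))  # zero up to round-off, although k(w) is not zero
```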

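We can see in the simple case p = 1 how a non constant Hessian can be problematic and how the term $\frac{\partial}{\partial x}(E^T_{,w})\,\frac{\partial}{\partial x}\left(\frac{\partial w_k}{\partial t}\right)$ does not simplify as nicely as before:

$$\begin{aligned}
\frac{\partial}{\partial x}(E^T_{,w})\,\frac{\partial}{\partial x}\left(\frac{\partial w_k}{\partial t}\right)
&= \left[\frac{\partial}{\partial w}(E_{,w})\,\frac{\partial w}{\partial x}\right]^T\frac{\partial}{\partial t}\left(\frac{\partial w}{\partial x}\right) \\
&= \frac{\partial w}{\partial x}^T E_{,ww}\,\frac{\partial}{\partial t}\left(\frac{\partial w}{\partial x}\right) \\
&= w'^T H\,\dot{w}', \qquad \text{with } w' = \frac{\partial w}{\partial x} \text{ and } \dot{w} = \frac{\partial w}{\partial t} \\
&= \frac{\partial}{\partial t}\left(\tfrac{1}{2}\,w'^T H\,w'\right) - \tfrac{1}{2}\,w'^T\dot{H}\,w'.
\end{aligned}$$

Therefore, equation 3.2 becomes

$$\frac{\partial}{\partial t}\int_\Omega\left(E + \tfrac{1}{2}c\,w'^T H\,w'\right)dx = \left(\frac{\partial}{\partial t}\int_\Omega E\,dx\right)_{DG} + \int_\Omega\tfrac{1}{2}c\,w'^T\dot{H}\,w'\,dx.$$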


Although one can show that $H \geq 0$ for kinetic energy, nothing can be said about the sign of the last term in the above equation, as it involves the time derivative of H and therefore depends directly on ρ, u and their variations. Variations of the VCJ energy could be much larger than the variations of kinetic energy for DG, and the scheme could simply end up being unstable.
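The expressions given for the Jacobian and Hessian of the kinetic energy can be checked numerically. The following Python sketch is only an illustration: it assumes the state ordering w = (ρ, ρu, ρE), writes $k = m^2/(2\rho)$ with $m = \rho u$, and uses hypothetical function names and an arbitrary sample state. It compares the analytic forms with central finite differences and evaluates the quadratic form $x^T H x$, which reduces to $(u x_1 - x_2)^2/\rho \geq 0$.

```python
# Conservative variables w = (rho, m, e) with m = rho*u (assumed ordering).
# The kinetic energy k = m^2 / (2 rho) does not depend on the total energy e.

def kinetic_energy(w):
    rho, m, _ = w
    return 0.5 * m * m / rho

def analytic_jacobian(w):
    rho, m, _ = w
    u = m / rho
    return [-0.5 * u * u, u, 0.0]

def analytic_hessian(w):
    rho, m, _ = w
    u = m / rho
    return [[u * u / rho, -u / rho, 0.0],
            [-u / rho, 1.0 / rho, 0.0],
            [0.0, 0.0, 0.0]]

def fd_jacobian(f, w, h=1e-6):
    """Central finite-difference gradient of the scalar function f."""
    grad = []
    for i in range(len(w)):
        wp, wm = list(w), list(w)
        wp[i] += h
        wm[i] -= h
        grad.append((f(wp) - f(wm)) / (2.0 * h))
    return grad

def fd_hessian(w, h=1e-6):
    """Central differences of the analytic Jacobian, column by column."""
    n = len(w)
    hess = [[0.0] * n for _ in range(n)]
    for j in range(n):
        wp, wm = list(w), list(w)
        wp[j] += h
        wm[j] -= h
        jp, jm = analytic_jacobian(wp), analytic_jacobian(wm)
        for i in range(n):
            hess[i][j] = (jp[i] - jm[i]) / (2.0 * h)
    return hess

def quadratic_form(H, x):
    """Evaluate x^T H x for a 3x3 matrix H."""
    return sum(x[i] * H[i][j] * x[j] for i in range(3) for j in range(3))

w = [1.2, 0.6, 2.5]          # arbitrary sample state: rho = 1.2, u = 0.5
J = analytic_jacobian(w)
H = analytic_hessian(w)
print(J, fd_jacobian(kinetic_energy, w))
print(quadratic_form(H, [0.3, -2.0, 1.0]))   # non-negative for any x
```

The check confirms that H is positive semi-definite but state dependent, so its time derivative does not vanish in general.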

3.3 Conclusion

Although this result might appear a bit disappointing, the analysis presented in this chapter gives us interesting insights into the working mechanism of both methods and the way they are connected. We also discovered sufficient conditions on the Hessian of the DG energy to guarantee stability of the VCJ scheme. Experimentally, we nevertheless tried to implement the VCJ-KEP scheme and test it on the problem of advection of a vortex. We focused particularly on the g2-KED scheme (combining g2 correction functions and the kinetic energy decreasing scheme introduced in chapter 2), hoping to get the best results. Although some improvements were made compared to the central scheme, which would never be used anyway, the results were systematically better when using a Lax-Friedrichs-Rusanov type flux. Best results were obtained with the KED scheme for the dissipation coefficient α = 0.4. In that case, kinetic energy remained almost perfectly constant in time. However, some spurious oscillations could be observed in the flow field. Finally, we should remember that aliasing errors (due to the projection of the flux back onto $\mathbb{R}_p[X]$) were ignored in this analysis. A deeper analysis of how aliasing errors influence stability was made by Jameson et al. [22].

Part II

Implementation

Chapter 4

Multi CPUs and GPU implementations of the Finite Volume Code

We mentioned in the introduction that achieving Direct Numerical Simulation of flows requires meshes counting of the order of $Re^3$ degrees of freedom. Even for simulations at relatively low Reynolds number, say around Re = 1000, the required mesh counts in excess of a few billion degrees of freedom (∼ $10^{10}$ dof). If we assume that each degree of freedom requires of the order of 10 floating point operations to be updated, that is a total of $10^{11}$ operations that need to be made in order to update the solution from $t_n$ to $t_{n+1}$. Evaluating $10^6$ time steps in 1 hour requires a computing power of the order of 10 TFLOPS ($10^{13}$ floating point operations per second). Although this estimate of the operation count is extremely approximate, it gives us a good idea of the kind of problems we have to solve. To date, no single processor is capable of achieving such a speed. However, by parallelizing the code onto multiple processing units, it is possible to obtain this kind of performance with today’s computers.
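The order-of-magnitude estimate above can be reproduced in a few lines. The Python sketch below only restates the arithmetic of the paragraph; the function name and argument names are hypothetical.

```python
def required_flops(dof, flops_per_dof, time_steps, wall_seconds):
    """Sustained floating point rate needed to finish the run in wall_seconds."""
    return dof * flops_per_dof * time_steps / wall_seconds

dof = 1e10                      # degrees of freedom quoted in the text
ops_per_step = dof * 10         # about 10 operations per degree of freedom
rate = required_flops(dof, 10, 1e6, 3600.0)

print(f"operations per time step: {ops_per_step:.1e}")
print(f"required rate: {rate:.2e} FLOPS")
```

The required rate comes out at roughly 2.8 × 10^13 operations per second, which is of the order of 10^13 as stated.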

There are two main reasons why time explicit Finite Volume methods are excellent candidates for parallelization. The first one is that time explicit numerical methods in general do not impose an order in which the degrees of freedom have to be processed to go from time step $t_n$ to time step $t_{n+1}$. Since any order is possible, all the degrees of freedom can in fact be processed at the same time by independent threads, as long as each thread has access to all the necessary data at time $t_n$. This kind of parallelization is particularly well suited to systems with shared memory and a large number of cores, which is the case of GPUs.

The second reason why time explicit Finite Volume methods can be further parallelized is the compactness of the stencils used in the schemes. To update cell o from time $t_n$ to $t_{n+1}$, one does not need to know the current solution in the entire computational domain but only in a few cells in the neighborhood of cell o. This property allows us to decompose the domain into subdomains on which the computations can be performed independently by separate computing units. This parallelization is particularly suitable for systems with distributed memory (such as classical computer clusters or GPU clusters).

This chapter is not a tutorial on parallel computation or on CUDA. There are many books and tutorials on the internet that explain this topic very well. There are also very good classes taught at Stanford that explain the fundamentals (and more) of parallel computing: [CME 342] Parallel Methods in Numerical Analysis or [CS 315A] Parallel Computer Architecture and Programming are good examples. Here, we try instead to describe how the finite volume code works in parallel and the challenges we faced to make it work. In section 4.1, we describe the domain decomposition approach to parallelization (distributed computing) and explain basic concepts of communications between subdomains using halos. In section 4.2, we look at the parallelization of the code on GPUs, which are devices with shared memory counting a very large number of cores. In practice, the two approaches are combined and the codes are designed to run on GPU clusters.

4.1 Distributed computing and domain decomposition

A time explicit compact finite volume scheme is intrinsically a local method. Updating the solution in cell o from time $t_n = n\Delta t$ to time $t_{n+1} = (n+1)\Delta t$ only requires the knowledge of the solution at time $t_n$ in cell o and in its direct neighbors {p | p and o share an edge}. The consequence of this geometric locality is that when one tries to evaluate the solution of cell o at $t_{n+1}$, access to the entire computational domain is not needed. It is therefore very easy to split the global computational domain into subdomains on which computations are performed independently by different nodes, as depicted in figure 4.1.

Of course, when one tries to update the solution in a cell placed at the boundary of one of these domains, some of the direct neighbors of this cell might belong to the computational domain of another node. The classic approach to solving this problem is to maintain on each node an extended region around the computational domain that holds the values of the neighboring cells from neighboring computational domains at time $t_n$. This region is called the halo of the computational domain and enables communications between the various nodes. For a simple compact finite volume scheme, the halo contains only the nearest neighbors of the boundary elements of the computational domain. When the stencil of the scheme is larger (for example when using the JST scheme for artificial dissipation), the halo needs to be bigger and it also contains the neighbors of the neighbors of the boundary cells.

Figure 4.1: Decomposition of the computational domain around an airfoil

Algorithm 1: Domain decomposition method using halos

n ← 0;
while n ≤ MAX ITERATIONS do
    node i evaluates the solution at $t_{n+1}$ on computational domain i;
    update the halos;
    n ← n + 1;
end while

Figure 4.2: Computational domain for 1 CPU and its halo

Figure 4.3: Computational domains for multiple CPUs and their overlapping halos

Depending on the network latencies and bandwidth, it might not be optimal to update the halo at every single time step. For example, if the network has large access latencies but a very high bandwidth, one can use a larger halo and update it less often. For a compact scheme, a halo containing the neighbors of neighbors needs to be updated only every 2 time steps. If it also includes the neighbors of the neighbors of the neighbors, it needs to be updated only every 3 time steps, and so on (the formal proof is easy to make by induction). For an extended halo counting up to the kth level of neighbors, the algorithm becomes as follows:

Algorithm 2: Domain decomposition method using extended halos

n ← 0;
while n ≤ MAX ITERATIONS do
    node i evaluates the solution at $t_{n+1}, \cdots, t_{n+k}$ on computational domain i;
    update the halos;
    n ← n + k;
end while
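The extended-halo algorithm can be illustrated on a toy problem. The following Python sketch is a minimal illustration under stated assumptions, not the thesis code: it uses a hypothetical 1D explicit three-point update (`step`), two subdomains, and plain list copies standing in for the MPI halo exchange. It checks that k local steps between halo updates reproduce the single-domain result when the halo is k cells wide.

```python
def step(u, nu=0.25):
    """One explicit 3-point update; the two end cells are left unchanged."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + nu * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new

def run_reference(u0, n_steps):
    """Advance the whole domain on a single node."""
    u = list(u0)
    for _ in range(n_steps):
        u = step(u)
    return u

def run_decomposed(u0, n_steps, k):
    """Two subdomains with halos of width k, exchanged every k steps."""
    n = len(u0)
    half = n // 2
    assert n_steps % k == 0 and half >= k and n - half >= k
    left = list(u0[:half + k])      # owned cells [0, half) + right halo
    right = list(u0[half - k:])     # left halo + owned cells [half, n)
    for _ in range(n_steps // k):
        for _ in range(k):          # k local steps without communication
            left = step(left)
            right = step(right)
        # Halo update: copy the neighbour's owned cells (stand-in for MPI).
        left[half:half + k] = right[k:2 * k]
        right[0:k] = left[half - k:half]
    return left[:half] + right[k:]

u0 = [float((3 * i) % 11) for i in range(40)]   # arbitrary initial data
reference = run_reference(u0, 12)
for k in (1, 2, 3, 4):
    decomposed = run_decomposed(u0, 12, k)
    error = max(abs(a - b) for a, b in zip(reference, decomposed))
    print(k, error)
```

With k = 1 this reduces to Algorithm 1. Each local step invalidates one more halo cell, starting from the outer edge, so after k steps exactly the k halo cells are stale while all owned cells remain correct.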

The transfer of data required to update the halos in the algorithm is made using the Message Passing Interface (MPI) communication protocol for parallel computations. MPI is a very common tool in scientific computing and has been developed since the early nineties.

Although the bulk of data transfer between nodes is done for updating the halos of the various computational domains, many other elements of the program need to be modified to work in parallel. When considering airfoil and wing simulations (see chapters 6 and 7), the mesh is generated on node 0 only. It is then split into subdomains that are sent to the other nodes. Also, during the computations, the maximum permissible time step is evaluated based on the spectral radius of each cell. The actual time step is taken to be the smallest of all permissible time steps for the various cells. Each node has to compute the permissible time step on its computational domain and then a reduction/minimization operation determines the global permissible time step (mpiAllReduce, with argument mpi_min). Similarly, to evaluate the lift and drag generated on a wing, each node computes its own contribution to these quantities and the global values are reconstructed through another reduction operation (mpiAllReduce, with argument mpi_sum).
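The two reductions described above can be mimicked without any MPI library. In the Python sketch below, the per-node data, the relation dt = CFL / (spectral radius) and all names are assumptions made for illustration; in an MPI code the final `min` and `sum` would be performed by `MPI_Allreduce` with the `MPI_MIN` and `MPI_SUM` operations.

```python
def local_time_step(spectral_radii, cfl=1.0):
    """Permissible step on one subdomain: limited by its stiffest cell.
    The relation dt = cfl / radius is an assumed, simplified model."""
    return min(cfl / r for r in spectral_radii)

# Hypothetical data held by three nodes.
radii_per_node = [[10.0, 40.0, 25.0], [80.0, 5.0], [20.0, 50.0, 30.0]]
lift_per_node = [0.12, 0.30, -0.02]

local_dts = [local_time_step(r) for r in radii_per_node]
global_dt = min(local_dts)          # role of the minimum reduction
global_lift = sum(lift_per_node)    # role of the sum reduction

print(local_dts, global_dt, global_lift)
```

Reducing the per-node minima gives the same time step as a minimum taken over all cells at once, which is why only one scalar per node has to be communicated.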

4.2 An implementation using GPUs and CUDA-FORTRAN

GPUs and CUDA

GPGPU (General-purpose computing on graphics processing units) is the technique of using GPUs (Graphics Processing Units) to perform numerical simulations and scientific computing. Traditionally, GPUs are used for rendering images on a computer screen. For a good user experience, they must be able to generate dozens of frames per second, which means processing tens of millions of polygons per second. Modern GPUs have hundreds of cores and they can process as many polygons at the same time. However, unlike a general purpose CPU, each of these cores can only perform basic arithmetic operations and has a very limited amount of cache/registers it can use. Foreseeing the huge potential this kind of architecture has in scientific computing, manufacturers (and in particular nVidia) started to develop graphics cards specifically dedicated to these kinds of applications. At the same time, various programming layers/libraries were developed to help the community program on these devices. The most important ones are OpenCL (Open Computing Language), DirectCompute (Microsoft computing GPU API) and CUDA (Compute Unified Device Architecture for nVidia’s GPUs).

In 2010, the Aerospace Computing Lab at Stanford received from nVidia a C1060 card and 3 Fermi C2050 cards. At around the same time, the Portland Group (PGI) released a FORTRAN compiler supporting CUDA extensions. We therefore decided to implement some of our codes using CUDA-FORTRAN to see what kind of speedup was achievable for a finite volume code. We should note that at this time, the CUDA-FORTRAN compiler did not support all the features available in CUDA-C. For example, it was not possible to use texture memory or concurrent kernel execution.


Figure 4.4: nVidia Fermi architecture. Each small green square is a CUDA core.

The nVidia C2050 card is based on the Fermi architecture, which counts up to 512 CUDA cores distributed over 16 streaming multiprocessors. Contrary to the domain decomposition approach, where the granularity was rather large (each node takes care of an entire subdomain), the parallelization using CUDA has a much finer granularity. In this case, each CUDA core deals with a single edge or a single cell. The code is decomposed into a succession of elementary operations. These operations are coded in “kernels”, which can be seen as CUDA functions, called by the host and executed on the device (the GPU card). A kernel is run simultaneously by multiple threads. All the simulations we are interested in are done using double precision floating point variables. For this kind of computation, a single C2050 has a peak performance of 515 GFLOPS. This sounds like a very high number since current high end CPUs are only capable of a few dozen GFLOPS. However, feeding the data to the cores at the rate needed to sustain performance close to peak can be extremely challenging for many computations. For the C2050, the computations/memory transfer ratio is around 4. That means that for each byte transferred, 4 single precision operations must be done. In the early attempts we made to develop a CUDA finite volume code, this ratio was only of the order of 0.1!


Implementation in the finite volume code

As we mentioned earlier, the CUDA parallelization of our finite volume code is made at a very fine grain. Each kernel deals with the processing of the scheme at the cell or edge level. To perform the computations needed for a single time step, more than a dozen kernels need to be called. In the case of our 2D code, the kernel hierarchy is shown in figure 4.5.

uflo
viscf
VISCF: viscous coefficients, cell
step
GET TSTEP: permissible time step, cell
RAD CUT: spectral radius across the mesh cut, cell
euler
RESET WN: initialize multiple steps time stepping scheme, cell
eflux
EFLUX: inviscid fluxes, edge
nsflux
GET U: velocity and temperature, cell
GET Q: viscous stress tensor, cell
VISFLUX: viscous fluxes, edge
dflux
GET SWITCH: evaluate the switch P, switch H and E, cell
GET DP I: variations of P, I direction, cell
DISSIP I: artificial dissipation I direction, edge
GET DP J: variations of P, J direction, cell
DISSIP J: artificial dissipation J direction, edge
RPLC H BY E: replace enthalpy by energy, cell
UPDATE FLOW: accumulate fluxes and update solution, cell
bc wall
VIS BC WALL: viscous wall BC, cell
bc far
BC FAR I0: outflow at I0, cell
BC FAR IB: outflow at IB, cell
BC FAR JB: inflow at JB, cell
halo
HALO I0: halo at I=0, cell
HALO IB: halo at I=IB, cell
HALO JB: halo at J=JB, cell
HALO J0 WING: halo on the wing J=0, cell
HALO J0 CUT: halo across the cut J=0, cell
move
MOVE MESH: move the mesh, vertex

Figure 4.5: Kernels hierarchy. Kernels are in capital letters

It is interesting to see how routines of the CPU code have to be broken down to allow computations on the GPU. Long functions have to be reduced in size not only to minimize register use and to maximize occupancy, but also to allow synchronization between the various parts of the code. For example, in the viscous flux routine, one needs to evaluate the velocity and temperature everywhere before their gradients can be evaluated. Separating this computation into two kernels helps us ensure that we are done computing U and T before starting to evaluate the gradients (also, our version of CUDA-FORTRAN did not support concurrent kernel execution, which helps for these synchronizations). One must also be careful with new problems that can arise when using shared memory architectures. For example, in the original CPU code, fluxes are “accumulated” directly in the routine eflux. In the GPU code, each thread computes the flux across an edge shared by 2 cells, which then has to be accumulated in these 2 cells. However, since a cell has 2 edges in each direction, 2 threads could try to access the same memory location at the same time to sum up the fluxes. This is a race condition that needs to be dealt with carefully. The solution adopted is to store the fluxes in the different directions in independent arrays so that only at the end, when all the various fluxes are computed, a single thread per cell sums all the fluxes (in the kernel UPDATE FLOW).

Some operations are not parallel though, like computing the shape of the skeleton for a

deformed wing (see chapter 7 for more detail on this operation).
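The race-free flux accumulation described earlier in this section can be sketched in plain Python. This is an illustration of the data layout only: the loops run sequentially, `edge_flux` is a hypothetical stand-in for a numerical flux, and the function names are not those of the thesis code. The "scatter" version mimics the CPU routine, where each edge adds its flux into two cells; the "gather" version stores one flux per edge in per-direction arrays and lets each cell sum its own edges.

```python
def edge_flux(a, b):
    """Hypothetical numerical flux across an edge (simple average)."""
    return 0.5 * (a + b)

def scatter_accumulate(u):
    """CPU style: each edge adds its flux into the two adjacent cells."""
    ni, nj = len(u), len(u[0])
    res = [[0.0] * nj for _ in range(ni)]
    for i in range(ni - 1):
        for j in range(nj):
            f = edge_flux(u[i][j], u[i + 1][j])
            res[i][j] += f        # two edges can target the same cell:
            res[i + 1][j] -= f    # unsafe if edges are processed concurrently
    for i in range(ni):
        for j in range(nj - 1):
            f = edge_flux(u[i][j], u[i][j + 1])
            res[i][j] += f
            res[i][j + 1] -= f
    return res

def gather_accumulate(u):
    """GPU style: edge passes write one slot each, a cell pass sums them."""
    ni, nj = len(u), len(u[0])
    # "Edge kernels": one independent write per edge, one array per direction.
    fi = [[edge_flux(u[i][j], u[i + 1][j]) for j in range(nj)]
          for i in range(ni - 1)]
    fj = [[edge_flux(u[i][j], u[i][j + 1]) for j in range(nj - 1)]
          for i in range(ni)]
    # "Cell kernel": each cell reads its own edges and writes only its own slot.
    res = [[0.0] * nj for _ in range(ni)]
    for i in range(ni):
        for j in range(nj):
            r = 0.0
            if i < ni - 1:
                r += fi[i][j]
            if i > 0:
                r -= fi[i - 1][j]
            if j < nj - 1:
                r += fj[i][j]
            if j > 0:
                r -= fj[i][j - 1]
            res[i][j] = r
    return res

u = [[float((7 * i + 3 * j) % 5) for j in range(4)] for i in range(5)]
print(scatter_accumulate(u) == gather_accumulate(u))
```

The gather form costs extra storage for the flux arrays, but no two writers ever touch the same memory location.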

Results

Although it is difficult to compare the speedup obtained using 2 different architectures, and not always meaningful (the code could be poorly optimized on one architecture, hence biasing the results, or one architecture could be older than the other one...), we give here a few results to give an idea of the kind of performance that is achievable.

We conducted several time measurements for the 2D code, both for multi CPUs (to check how well the code would scale) and for GPUs. Experiments were made on a coarse mesh counting 131,072 cells (nx × ny = 1024 × 128) and a fine mesh counting 2,097,152 cells (nx × ny = 4096 × 512). In the tables below, we show the time needed to perform 10,000 time steps. Total time includes the loading of the steady state data in CGNS format, the actual flow computations and the writing of the output file in CGNS format again. A significant number of time steps is considered so that the loading and writing phases do not represent an important part of the total runtime.

The multi CPUs times are a good indication that the distributed code scales decently, even for a quite coarse mesh (speedup of 3.2 between 4 and 16 cores on the coarse mesh). A single C1060 provides a speedup of 1.7 compared to 16 core2 cores and the C2050 gives a speedup of 2.5. On the fine mesh, similar speedups are achieved between 16 core2 cores and the GPUs (1.5 for the C1060 and 2.4 for the C2050). We can explain these somewhat lower values by the fact that the distributed code scales better on the fine mesh due to a better computations/transfers ratio.

Cores                 1        4        16
Intel core2 E6850     35’52    15’28    4’51
Intel core i7 860     23’45    11’06

Table 4.1: CPU times coarse mesh (minutes’seconds)

Card     1
C1060    2’54
C2050    1’58

Table 4.2: Single GPU times coarse mesh (minutes’seconds)

Machine                              Time
8 Intel core2 E6850 (16 cores)       32’17
1 C1060                              22’10
1 C2050                              13’25

Table 4.3: CPU and GPU times on fine mesh (minutes’seconds)
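The speedups quoted in this section follow directly from the measured times. The short Python sketch below recomputes them; the dictionary keys and function names are hypothetical, and the times are assumed to be written as minutes'seconds.

```python
def to_seconds(t):
    """Convert a time written as minutes'seconds (e.g. "15'28") to seconds."""
    minutes, seconds = t.split("'")
    return 60 * int(minutes) + int(seconds)

def speedup(slow, fast):
    """Ratio of two run times given as minutes'seconds strings."""
    return to_seconds(slow) / to_seconds(fast)

coarse = {"cpu4": "15'28", "cpu16": "4'51", "C1060": "2'54", "C2050": "1'58"}
fine = {"cpu16": "32'17", "C1060": "22'10", "C2050": "13'25"}

cpu_scaling = speedup(coarse["cpu4"], coarse["cpu16"])
coarse_c1060 = speedup(coarse["cpu16"], coarse["C1060"])
coarse_c2050 = speedup(coarse["cpu16"], coarse["C2050"])
fine_c1060 = speedup(fine["cpu16"], fine["C1060"])
fine_c2050 = speedup(fine["cpu16"], fine["C2050"])

for name, value in [("4 -> 16 cores", cpu_scaling),
                    ("C1060 coarse", coarse_c1060),
                    ("C2050 coarse", coarse_c2050),
                    ("C1060 fine", fine_c1060),
                    ("C2050 fine", fine_c2050)]:
    print(f"{name}: {value:.1f}")
```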

Then again, these various speedups should be seen as rough estimates of the potential of GPUs for this kind of computation. Although promising, they are not as large as the ones that can be observed for high order codes, for example. The intrinsic problem of the finite volume code is that it is memory limited (a ratio of instructions / bytes transferred > 4 is needed to approach peak performance). A minimal amount of computation is done on each piece of data before a new set of data is loaded and new operations are performed. In practice, that means that effort should be made to hide memory latencies and to improve bandwidth usage (avoiding useless data transfers). Future optimizations of the GPU code could try to maximize the concurrency of threads (by achieving good occupancy, optimizing the threadblock size and reducing the number of registers being used; one can also play with the amount of L1 cache/shared memory to prevent register spills). The second possible path for optimization is to improve bandwidth usage and have coalesced data access.


4.3 Conclusion

While the use of parallel architectures has been the obvious solution to perform large computations for almost 30 years, the future of scientific computing is a bit unclear. GPUs allow us to achieve orders of magnitude speedup compared to current CPUs, but their cost per FLOPS remains high and the work required to convert a CPU code to a GPU code can be considerable. Furthermore, GPU programming still being an emerging technology, it is unclear whether the CUDA model will spread to all GPU applications or whether another language working with all GPUs (and not just nVidia’s) will dominate the market. Scientists are primarily interested in developing new methods to solve ever more complex problems. If one has to entirely rewrite a flow solver every two years, every time a new architecture becomes available, this could be a huge setback to development. Another difficulty that arises with GPU clusters is that parallelization is made at various levels of granularity. Each GPU thread operates at the cell or edge level (fine grain) but the whole GPU deals with a subset of the domain (large grain). Handling so many layers in the abstraction and the hardware can become a real headache for programmers.

Part III

Numerical applications - Finite Volume Method

Chapter 5

Shocktube and shock-vortex interaction

This chapter describes the first 2D applications made using the finite volume kinetic energy preserving scheme (FV-KEP): a viscous Sod shocktube and the interaction between a moving vortex and a shock. The goal is to show the usefulness of the scheme and to validate its properties before moving to more complex geometries or flows.

The two test cases chosen are relevant as they involve a large range of phenomena: unsteady flows, boundary layers, expansion fans, contact discontinuities, shocks and all sorts of interactions between them. Simulations are made for the viscous Navier-Stokes equations. Indeed, the FV-KEP scheme alone is not able to capture shocks but, most importantly, the long term goal is high fidelity simulation of real complex flows (Direct Numerical Simulation).

In a first part, we briefly give the semi discrete formulation of the FV-KEP scheme. A thorough derivation of the scheme is given in chapter 2 for the more general discontinuous Galerkin method. In a second part, we describe the computation of the flow in a shocktube using the FV-KEP scheme and the first order Roe scheme. A comparison is made based on the results obtained. We also carefully study the pseudosteady area. We define the “pseudosteady area” as the zone containing the contact discontinuity, between the expansion and the shockwave. In the inviscid approximation, the flow is uniform and steady in this area. When viscosity is present, some interesting patterns can be observed. In a third part, we study the interaction between a moving vortex and a steady shock. Again, the numerical protocol is described and results are discussed.

5.1 Finite Volume Kinetic Energy Preserving Scheme for viscous flow

We describe briefly the FV-KEP scheme in multiple dimensions. A derivation for the 1D

case is given in chapter 2.

Continuous Model

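First, consider the three-dimensional Navier-Stokes equations in their conservative form:

\[
\frac{\partial u}{\partial t} + \frac{\partial}{\partial x_i} f^i(u) = 0, \tag{5.1}
\]

where

\[
u = \begin{pmatrix} \rho \\ \rho v^1 \\ \rho v^2 \\ \rho v^3 \\ \rho E \end{pmatrix}
\quad \text{and} \quad
f^i = \begin{pmatrix} \rho v^i \\ \rho v^i v^1 + p\delta^{i1} - \sigma^{i1} \\ \rho v^i v^2 + p\delta^{i2} - \sigma^{i2} \\ \rho v^i v^3 + p\delta^{i3} - \sigma^{i3} \\ \rho v^i H - v^j \sigma^{ij} - q^j \end{pmatrix}.
\]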

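The viscous stress tensor $\sigma^{ij}$ is given for a Newtonian fluid by $\sigma^{ij} = \lambda \delta^{ij} \frac{\partial v^k}{\partial x_k} + \mu \left( \frac{\partial v^i}{\partial x_j} + \frac{\partial v^j}{\partial x_i} \right)$. Often in aerodynamics, λ is taken to be equal to $-\frac{2}{3}\mu$. The heat flux is proportional to the temperature gradient (Fourier’s Law): $q^j = -\kappa \frac{\partial T}{\partial x_j}$.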
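A small numerical example may help to see what the choice $\lambda = -\frac{2}{3}\mu$ implies: in three dimensions the trace of $\sigma^{ij}$ is $(3\lambda + 2\mu)\, \partial v^k/\partial x_k$, which then vanishes. The Python sketch below evaluates the Newtonian stress tensor for an arbitrary velocity gradient; the function name, argument layout (`grad_v[i][j]` standing for $\partial v^i/\partial x_j$) and sample values are assumptions made for illustration.

```python
def viscous_stress(grad_v, mu, lam=None):
    """sigma_ij = lam * delta_ij * div(v) + mu * (dv_i/dx_j + dv_j/dx_i).
    grad_v[i][j] holds dv_i/dx_j; lam defaults to -2/3 mu."""
    if lam is None:
        lam = -2.0 / 3.0 * mu
    n = len(grad_v)
    div = sum(grad_v[k][k] for k in range(n))
    return [[(lam * div if i == j else 0.0)
             + mu * (grad_v[i][j] + grad_v[j][i])
             for j in range(n)] for i in range(n)]

# Arbitrary sample velocity gradient (not taken from any simulation).
grad_v = [[0.3, -1.0, 0.2],
          [0.5, 0.1, 0.0],
          [-0.4, 0.7, -0.6]]
sigma = viscous_stress(grad_v, mu=2.0)
trace = sum(sigma[i][i] for i in range(3))
print(sigma)
print("trace:", trace)
```

The tensor is symmetric by construction, and its trace is zero up to round-off for this choice of λ.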

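An equation for the kinetic energy $k = \frac{1}{2} \rho (v^i)^2$ can be derived by combining the continuity and the momentum equations. Indeed,

\[
\frac{\partial k}{\partial t} = \frac{\partial}{\partial t}\left(\frac{1}{2} \rho (v^i)^2\right) = v^i \frac{\partial}{\partial t}\left(\rho v^i\right) - \frac{(v^i)^2}{2} \frac{\partial \rho}{\partial t}.
\]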

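It follows, by substituting $\frac{\partial}{\partial t}\left(\rho v^i\right)$ and $\frac{\partial \rho}{\partial t}$ by their corresponding fluxes, that:

\[
\frac{\partial k}{\partial t} + \frac{\partial}{\partial x_j}\left[ v^j \left( p + \rho \frac{(v^i)^2}{2} \right) - v^i \sigma^{ij} \right] = p \frac{\partial v^j}{\partial x_j} - \sigma^{ij} \frac{\partial v^i}{\partial x_j}. \tag{5.2}
\]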


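We assume that we are interested in a domain Ω fixed in space. ∂Ω denotes the boundary of Ω. By integrating (5.2) over the domain Ω, we get a global conservation law for kinetic energy:

\[
\frac{\partial}{\partial t} \int_{\Omega} k\, dV = -\int_{\partial\Omega} \left[ v^j \left( p + \rho \frac{(v^i)^2}{2} \right) - v^i \sigma^{ij} \right] n^j\, dS + \int_{\Omega} \left( p \frac{\partial v^j}{\partial x_j} - \sigma^{ij} \frac{\partial v^i}{\partial x_j} \right) dV. \tag{5.3}
\]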

Definition 1. A numerical scheme to solve the viscous Navier-Stokes equations is said to

be Kinetic Energy Preserving if it satisfies a discrete analog of (5.3).

Here we have assumed that the domain contains no discontinuity. If a shockwave is present

in Ω, the relation (5.3) does not hold anymore.

Semi-discrete approach

Now, we consider a finite volume discretization of the governing equations in the domain Ω. The generic cell is a polyhedral control volume o. Each cell has one or more neighbors. The face separating cell o and cell p has an area $A_{op}$, and we define $n^i_{op}$ to be the unit normal to this face, directed from o to p. Evidently $n^i_{op} = -n^i_{po}$. We also define $S^i_{op} = A_{op} n^i_{op}$. $S^i_{op}$ can be interpreted as the projected face area in the coordinate direction i.

[Sketch: face between cells o and p, with area $A_{op}$, unit normal $\vec{n}_{op}$ and directed area $\vec{S}_{op} \equiv \vec{n}_{op} A_{op}$.]

Boundary control volumes are closed by an outer face of directed area $S^i_o = -\sum_p S^i_{op}$ (a control volume is delimited by a closed surface).


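In this framework, the semi-discrete finite volume approximation of the governing equations takes the form:

\[
\mathrm{vol}_o \frac{\partial u_o}{\partial t} + \sum_{p\ \mathrm{neighbor}} f^i_{op} \cdot n^i_{op} A_{op} = 0
\]

or

\[
\mathrm{vol}_o \frac{\partial u_o}{\partial t} + \sum_{p\ \mathrm{neighbor}} f^i_{op} \cdot S^i_{op} = 0. \tag{5.4}
\]

For a boundary control volume b, another contribution to the fluxes, $f^i_b \cdot S^i_b$, comes from the outer face.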

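Now we assume that $u_o$ and $f^i_{op}$ take the form:

\[
u_o = \begin{pmatrix} \rho_o \\ \rho_o v^1_o \\ \rho_o v^2_o \\ \rho_o v^3_o \\ \rho_o E_o \end{pmatrix}
\quad \text{and} \quad
f^i_{op} = \begin{pmatrix} (\rho v^i)_{op} \\ (\rho v^i v^1)_{op} + (p\delta^{i1} - \sigma^{i1})_{op} \\ (\rho v^i v^2)_{op} + (p\delta^{i2} - \sigma^{i2})_{op} \\ (\rho v^i v^3)_{op} + (p\delta^{i3} - \sigma^{i3})_{op} \\ (\rho v^i H)_{op} - (v^j \sigma^{ij} + q^j)_{op} \end{pmatrix}. \tag{5.5}
\]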

We exhibited a set of sufficient conditions on the elements of $f^i_{op}$ that lead to a Kinetic Energy Preserving (KEP) scheme.

Proposition 3. If the elements of $f^i_{op}$ defined in (5.5) satisfy the following conditions:

a- $(\rho v^i v^j)_{op} = \frac{1}{2} (\rho v^i)_{op} (v^j_p + v^j_o)$

b- $(p\delta^{ij} - \sigma^{ij})_{op} = \frac{1}{2} (p\delta^{ij} - \sigma^{ij})_o + \frac{1}{2} (p\delta^{ij} - \sigma^{ij})_p$

and if the fluxes at the boundaries are evaluated such that:

c- $f^i_b = f^i(u_b)$ where b is a boundary control volume

then the semi discrete finite volume scheme (5.4) satisfies the discrete global variation law for kinetic energy.

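Indeed in that case, the discrete kinetic energy $k_o$ satisfies the following relation:

\[
\frac{d}{dt} \sum_o \mathrm{vol}_o k_o = -\sum_b S^j_b \left( v^j_b \left( p_b + \rho_b \frac{(v^i_b)^2}{2} \right) - v^i_b \sigma^{ij}_b \right) + \sum_o \left( p_o \sum_p \frac{v^i_o + v^i_p}{2} S^i_{op} - \sigma^{ij}_o \sum_p \frac{v^i_o + v^i_p}{2} S^j_{op} \right),
\]

which is indeed a discretization of (5.3).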

Condition a of the previous proposition is not very restrictive and allows some degrees of freedom in the construction of the fluxes defined in (5.5). Let us denote by $\bar{g}_{op}$ the arithmetic average of the quantity g between cell o and cell p: $\bar{g}_{op} = (g_o + g_p)/2$. We can rewrite condition a as

\[
(\rho v^i v^j)_{op} = (\rho v^i)_{op}\, \bar{v}^j_{op}. \tag{5.6}
\]

The average $(\rho v^i)_{op}$ can be evaluated by any means ($(\rho v^i)_{op} = \bar{\rho}_{op} \bar{v}^i_{op}$ or $\overline{\rho v^i}_{op}$ for example) and $(\rho v^i v^j)_{op}$ is deduced using relation (5.6) to satisfy condition a.
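The kinetic energy property can be verified numerically in a simplified setting. The following Python sketch is an illustration only, under assumptions that are not in the text: one space dimension, a uniform periodic grid, no viscosity, and arbitrary smooth sample data. It builds the continuity and momentum fluxes with arithmetic averages (conditions a and b), forms the semi-discrete rate of change of $k_o$ from $v_o\, d(\rho v)_o/dt - \frac{1}{2} v_o^2\, d\rho_o/dt$, and compares its sum with the discrete pressure work $\sum_o p_o (v_{o+1} - v_{o-1})/2$.

```python
import math

def kep_rates(rho, v, p, dx):
    """Semi-discrete d(rho)/dt and d(rho v)/dt on a periodic 1D grid,
    with arithmetic averages at the face between cells o and o+1."""
    n = len(rho)
    f_rho = [0.0] * n
    f_mom = [0.0] * n
    for o in range(n):
        q = (o + 1) % n
        rho_bar = 0.5 * (rho[o] + rho[q])
        v_bar = 0.5 * (v[o] + v[q])
        p_bar = 0.5 * (p[o] + p[q])
        f_rho[o] = rho_bar * v_bar             # (rho v)_op
        f_mom[o] = f_rho[o] * v_bar + p_bar    # condition a, plus averaged p
    # Index o - 1 wraps around for o = 0, which gives the periodic closure.
    drho = [-(f_rho[o] - f_rho[o - 1]) / dx for o in range(n)]
    dmom = [-(f_mom[o] - f_mom[o - 1]) / dx for o in range(n)]
    return drho, dmom

def kinetic_energy_balance(rho, v, p, dx):
    """Return (sum of vol * dk/dt, discrete pressure work)."""
    n = len(rho)
    drho, dmom = kep_rates(rho, v, p, dx)
    lhs = sum(dx * (v[o] * dmom[o] - 0.5 * v[o] ** 2 * drho[o])
              for o in range(n))
    rhs = sum(p[o] * 0.5 * (v[(o + 1) % n] - v[o - 1]) for o in range(n))
    return lhs, rhs

n = 16
theta = [2.0 * math.pi * o / n for o in range(n)]
rho = [1.0 + 0.3 * math.sin(t) for t in theta]
v = [0.5 + 0.2 * math.cos(t) for t in theta]
p = [1.0 + 0.1 * math.sin(t) for t in theta]

lhs, rhs = kinetic_energy_balance(rho, v, p, dx=1.0 / n)
print(lhs, rhs)
```

The convective contributions telescope exactly, so the two numbers agree to round-off even though the fields are far from uniform.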

5.2 Direct Numerical Simulation of the two-dimensional flow in a shocktube

The shocktube is characterized by its left and right initial states, both at rest at the beginning of the simulation. The left state is characterized by its pressure $p_l$, its density $\rho_l$ and its temperature $T_l$. The right state is also characterized by its pressure $p_r$, its density $\rho_r$ and its temperature $T_r$. Pressure, density and temperature are related by the perfect gas law.

The length of the shocktube is L, the height is h. The aspect ratio of the shocktube is defined by α = h/L. We assume that the walls of the shocktube are rigid and adiabatic.

[Sketch: shocktube of length L and height h = αL, bounded by walls, showing the computational domain.]


Viscosity is evaluated using Sutherland’s formula

\[
\mu(T) = \frac{C\, T^{3/2}}{T + S}.
\]

For air, at reasonable temperatures, $C = 1.456 \times 10^{-6}$ kg/(m s $\sqrt{\mathrm{K}}$) and S = 110.4 K.

We define the velocity $V_l = \sqrt{p_l/\rho_l}$, proportional to the speed of sound in the left region, and the Reynolds number $Re = \frac{\rho_l L V_l}{\mu_l}$, where $\mu_l = \mu(T_l)$. The Prandtl number is given by $Pr = \frac{\mu C_p}{\kappa}$. It was taken to be equal to 0.75.
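Sutherland's formula and the Reynolds number defined above are straightforward to evaluate. The Python sketch below uses the constants quoted in the text; the function names and the dimensional left state used in the example (density and pressure of air near ambient conditions) are hypothetical values chosen only for illustration.

```python
import math

C_SUTHERLAND = 1.456e-6   # kg / (m s sqrt(K)), value quoted for air
S_SUTHERLAND = 110.4      # K

def sutherland_viscosity(T):
    """Dynamic viscosity mu(T) = C T^(3/2) / (T + S)."""
    return C_SUTHERLAND * T ** 1.5 / (T + S_SUTHERLAND)

def reference_velocity(p_l, rho_l):
    """V_l = sqrt(p_l / rho_l), proportional to the speed of sound."""
    return math.sqrt(p_l / rho_l)

def reynolds_number(rho_l, p_l, T_l, L):
    """Re = rho_l L V_l / mu(T_l)."""
    return rho_l * L * reference_velocity(p_l, rho_l) / sutherland_viscosity(T_l)

def tube_length_for_reynolds(Re, rho_l, p_l, T_l):
    """Length L that gives the requested Reynolds number."""
    return Re * sutherland_viscosity(T_l) / (rho_l * reference_velocity(p_l, rho_l))

mu_300 = sutherland_viscosity(300.0)
# Hypothetical dimensional left state, for illustration only.
L = tube_length_for_reynolds(25000.0, rho_l=1.177, p_l=101325.0, T_l=300.0)
print(f"mu(300 K) = {mu_300:.3e} kg/(m s)")
print(f"L for Re = 25000: {L:.3e} m")
```

At 300 K the formula gives a viscosity of about 1.84 × 10^-5 kg/(m s), close to the usual tabulated value for air.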

Numerical computations are done for the Kinetic Energy Preserving scheme using these averaging formulas for the convective terms:

\[
\begin{aligned}
(\rho v^i)_{op} &= \bar{\rho}_{op}\, \bar{v}^i_{op} \\
(\rho v^i v^j)_{op} &= \bar{\rho}_{op}\, \bar{v}^i_{op}\, \bar{v}^j_{op} \\
(\rho v^i H)_{op} &= \bar{\rho}_{op}\, \bar{v}^i_{op}\, \bar{H}_{op}
\end{aligned} \tag{5.7}
\]

Note that condition a of proposition (1) does not require a specific form for $(\rho v^i H)_{op}$. We just choose it to be consistent with the continuity and momentum fluxes.

Viscous stress is evaluated in each cell by introducing a complementary mesh, for which

cell vertices are the centers of the original control volumes.

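Time integration is performed using the Total Variation Diminishing Runge-Kutta 3 like scheme proposed by Shu [13]. For a semi discrete scheme in the form

\[
\frac{\partial u}{\partial t} + R(u) = 0, \tag{5.8}
\]

this 3-stage scheme advances from time n to time n + 1 by

\[
\begin{aligned}
u^1 &= u^n - \Delta t\, R(u^n) \\
u^2 &= \frac{3}{4} u^n + \frac{1}{4} u^1 - \frac{1}{4} \Delta t\, R(u^1) \\
u^{n+1} &= \frac{1}{3} u^n + \frac{2}{3} u^2 - \frac{2}{3} \Delta t\, R(u^2)
\end{aligned}
\]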
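The three stages translate directly into code. The Python sketch below is a minimal scalar illustration; the function name `rk3_step` and the linear test residual R(u) = u are assumptions chosen for the example. For this linear problem, one step multiplies u by $1 - \Delta t + \Delta t^2/2 - \Delta t^3/6$, the third order Taylor expansion of $e^{-\Delta t}$.

```python
import math

def rk3_step(u, dt, residual):
    """One step of the three-stage scheme for du/dt + R(u) = 0."""
    u1 = u - dt * residual(u)
    u2 = 0.75 * u + 0.25 * u1 - 0.25 * dt * residual(u1)
    return u / 3.0 + 2.0 / 3.0 * u2 - 2.0 / 3.0 * dt * residual(u2)

def linear_residual(u):
    """Test problem du/dt + u = 0, whose exact solution is exp(-t)."""
    return u

dt = 0.01
u = 1.0
for _ in range(100):          # integrate up to t = 1
    u = rk3_step(u, dt, linear_residual)

print(u, math.exp(-1.0), abs(u - math.exp(-1.0)))
```

The same function works unchanged for any residual acting on a scalar; a vector-valued version would apply the same linear combinations component by component.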

This time discretization does not guarantee the preservation of kinetic energy in time. One could use a Crank-Nicolson semi implicit scheme as suggested by Jameson [19] to ensure conservation in time, but the computational costs would be significantly greater.

Finally, results are checked using a classic first order Roe scheme [36], advanced explicitly in time by the same Runge-Kutta 3 scheme.

Simple shocktube and comparison with Roe Scheme

First, we run the simulation for the case

Left State        Right State
$p_L = 1$         $p_R = 0.1$
$\rho_L = 1$      $\rho_R = 0.125$
$u_L = 0$         $u_R = 0$
$T_L = 300$ K     $T_R = 300$ K

The Reynolds number is Re = 25000 and the aspect ratio is α = 0.6. This odd shaped shocktube guarantees that the boundary layers will not affect the flow at the centerline too much. The grid size is 4096 cells in the x-direction and 256 cells in the y-direction (only half of the domain is computed using this mesh; the second half is obtained by symmetry). The grid is uniform in the x-direction but stretched in the y-direction such that $\Delta y_{min}/L = \alpha/4000$.

Figure 5.1 shows, at time $t = 0.2136\, L/V_l$, the variations of nondimensional pressure, density, velocity and energy at the centerline. Figure 5.2 presents a comparison of the pressure profile along the centerline for the Roe scheme and the KEP scheme. We can notice that the KEP scheme provides sharper results than the Roe scheme, which indicates that it introduces less dissipation. Results are especially convincing at the shockwave.

Figure 5.3 is a global picture of the x-velocity in the shocktube. The shape of the boundary layer and the curved aspect of the shockwave near the walls are in agreement with the usual results observed in a viscous shocktube [27, 28]. Lighter colors coincide with faster flow.

Study of nonclassical effects in the pseudosteady flow area

In the previous part, α was chosen to be equal to 0.6 for a Reynolds number of 25000 in order to observe phenomena similar to the inviscid case on the centerline. Actually, if we had plotted the pressure on a line closer to the wall, the picture would have been quite different. The almost complete absence of dissipation introduced by the KEP scheme allowed us to capture some unexpected features of the flow in the pseudosteady region, where the contact discontinuity is located (see figure 5.4).


Figure 5.1: Variation of state variables along the centerline. Re = 25000, α = 0.6. Panels, each plotted against x/L: (a) Pressure, (b) Density, (c) x-Velocity, (d) Energy.

Figure 5.2: Comparison of pressures on the centerline for the KEP scheme and the Roe scheme at two locations. Panels, each plotted against x/L: (a) At the start of the expansion, (b) Through the shockwave.

Figure 5.5 is a sketch of the pattern observed in the flow for the case described above.


Figure 5.3: Distribution of nondimensional x-velocity in the shocktube (axes: x/L, y/L)

A (+) indicates a pressure wave or pressure point, where the pressure is larger than in the

inviscid case. A (-) indicates a depression, where the pressure is lower than in the inviscid

computation.

Figure 5.4: Distributions of nondimensional velocities and pressure in the pseudosteady area (contact discontinuity area) in the case Re = 25000, α = 0.6, t = 0.2136L/Vl. Panels: (a) x-velocity, (b) y-velocity, (c) Pressure.


Figure 5.5: Pressure pattern observed in the pseudosteady area of the flow (sketch showing the expansion, the shockwave and the centerline). A (+) is an overpressure compared to the inviscid case while a (−) corresponds to a depression. Re = 25000, α = 0.6.

The first obvious pattern that can be observed is the pressure waves developing at the

base of the expansion, near the walls. These waves are starting in the boundary layer and

are curved towards the direction of the flow. Figure 5.7 shows the shape of pressure waves

in the pseudosteady area for 3 different values of α. When α is varied, the shape of the

waves remains the same. As a consequence, when α is reduced too much, the waves end up

crossing each other (case α = 0.3). If α is further reduced (case α = 0.2 on the figure),

waves will reflect on the walls.

On the other side of the pseudosteady area, near the shockwave this time, we can observe

depression waves. These are visible on figures 5.4 and 5.7 again. They seem to start in the

boundary layer near the shockwave and extend upstream in the pseudosteady area. These

waves are much smoother than the ones previously described. Their S shape is particularly

obvious in the case α = 0.6, but the way these waves interfere for smaller values of α is not

very clear.

Finally, a third pressure pattern can be observed in the flow, this time not visible on

the previous figures. Just after the shockwave (after the shockwave is in the pseudosteady

region), in the boundary layer, a high pressure point can be observed, as shown on figure

5.6. Figure 5.6 represents the variation of pressure along the wall for the case α = 0.2,

Re = 25000. Interestingly, this high pressure point is located at the root of the


Figure 5.6: Pressure distribution along the walls of the shocktube, plotted against x/L. α = 0.2, Re = 25000. Panels: (a) Wall pressure KEP, (b) Shockwave wall pressure (KEP and Roe).

depression wave described above. The small bump observed in figure 5.6-a for x/L ≃ 0.6

coincides with the reflection of the pressure waves described earlier. Note in figure 5.6-b the

overshoot in pressure located near the shockwave. Once again, it seems that the KEP

scheme is less dissipative than the Roe scheme.

These various pressure patterns can be interpreted as acoustic waves generated by the

deflection of the flow due to the non constant thickness of the boundary layers. In Figure

5.7, the boundary layer profiles are the same (same Reynolds number) and the pictures are

taken at similar instants. It explains why the pressure waves look the same, as they are

generated by the same deflection in the flow and had the same time to propagate. More

simulations would probably be required to understand the phenomenon in greater depth.

5.3 Shock vortex interaction

The second application of the FV-KEP scheme is the computation of the interaction between

a moving vortex and a static shock. More than studying the physics of the flow, the interest

is once again to understand better how the scheme behaves when solving for complex flows.

Problem setup

This case is thoroughly described by Jiří Furst [10] in his PhD thesis. We will describe it

more briefly and refer the reader to the thesis for more details. The shock vortex interaction


Figure 5.7: Pressure wave patterns in the pseudosteady flow area for α = 0.2, α = 0.3 and α = 0.6. Re = 25000. Panels: (a) α = 0.2, ymin/L = α/2000; (b) α = 0.3, ymin/L = α/2400; (c) α = 0.6, ymin/L = α/4000.


is a phenomenon that has been studied by various authors [10, 8, 35, 31, 9, 23]. Following

Furst, we can consider a vortex satisfying the general polytropic property $p/\rho^n = \text{const}$:

n = 0 is an isobaric vortex without initial rotation [31]

n = 1 is an isothermal vortex [9]

n = γ is an isentropic vortex [23]

n = ∞ is a constant density vortex

In our work, we use n = γ and therefore model an isentropic vortex. This vortex is

superimposed on a steady state flow involving a shockwave, as can be seen in figure 5.8.

Figure 5.8: Shock Vortex interaction problem setup - © J. Furst

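The supersonic inlet is defined by $(\rho_1, u_1, v_1, p_1) = (1, \sqrt{\gamma}\,M_1, 0, 1)$ and the subsonic flow

downstream of the shock can be deduced using the Rankine-Hugoniot relations:

\[
\rho_2 = \rho_1 \frac{(\gamma+1)M_1^2}{2+(\gamma-1)M_1^2}, \qquad
p_2 = p_1 \frac{2\gamma M_1^2 + 1 - \gamma}{\gamma+1}, \qquad
u_2 = u_1 \frac{2+(\gamma-1)M_1^2}{(\gamma+1)M_1^2}.
\]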
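As an illustration, the Rankine-Hugoniot relations above can be evaluated with a few lines of Python. This is only a sketch: the function name `post_shock_state` and the default value γ = 1.4 are assumptions made for the example and are not taken from the solver used in this work.

```python
import math

def post_shock_state(M1, gamma=1.4, rho1=1.0, p1=1.0):
    """Return (rho2, u2, p2) behind a steady normal shock.

    The inlet velocity is u1 = M1 * sqrt(gamma * p1 / rho1), which reduces
    to sqrt(gamma) * M1 for the nondimensional inlet rho1 = p1 = 1.
    gamma = 1.4 is an assumed default, not a value quoted in the text.
    """
    u1 = M1 * math.sqrt(gamma * p1 / rho1)
    rho2 = rho1 * (gamma + 1.0) * M1**2 / (2.0 + (gamma - 1.0) * M1**2)
    p2 = p1 * (2.0 * gamma * M1**2 + 1.0 - gamma) / (gamma + 1.0)
    u2 = u1 * (2.0 + (gamma - 1.0) * M1**2) / ((gamma + 1.0) * M1**2)
    return rho2, u2, p2

if __name__ == "__main__":
    # Inlet Mach number of the test case described later in this section.
    print(post_shock_state(1.1))
```

A quick sanity check is that the returned state conserves the mass flux ρu and the momentum flux p + ρu² across the shock.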


The composite vortex is defined by an inner core of uniform vorticity ($0 \le r < r_1$)

surrounded by an annular region of uniform and oppositely directed vorticity ($r_1 \le r \le r_2$).

The tangential velocity inside the vortex is given by

\[
v_\theta = v_{max}\frac{r}{r_1}, \quad 0 \le r < r_1, \qquad
v_\theta = v_{max}\left(A r + \frac{B}{r}\right), \quad r_1 \le r \le r_2 .
\]

A and B are chosen so that $v_\theta$ is continuous at $r = r_1$ and vanishes at $r = r_2$. Therefore,

\[
A = \frac{r_1}{r_1^2 - r_2^2}, \qquad B = -\frac{r_1 r_2^2}{r_1^2 - r_2^2}.
\]
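The velocity profile of the composite vortex can be sketched as follows. The function names and the radii used in the example call are hypothetical and chosen only for illustration; the velocity is taken to be zero outside $r_2$, which is an assumption consistent with the profile vanishing at $r = r_2$.

```python
def vortex_coefficients(r1, r2):
    """Coefficients A and B of the annular region r1 <= r <= r2."""
    d = r1**2 - r2**2
    return r1 / d, -r1 * r2**2 / d

def v_theta(r, v_max, r1, r2):
    """Tangential velocity of the composite vortex at radius r."""
    if r < r1:
        # Inner core of uniform vorticity: solid body rotation.
        return v_max * r / r1
    if r <= r2:
        # Annular region of opposite vorticity.
        A, B = vortex_coefficients(r1, r2)
        return v_max * (A * r + B / r)
    # Assumed: the flow is undisturbed outside the vortex.
    return 0.0

if __name__ == "__main__":
    # Illustrative radii only (not the values used in the simulations).
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(r, v_theta(r, v_max=1.0, r1=1.0, r2=2.0))
```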

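The momentum equation in the vortex simplifies to

\[ v_\theta^2 = \frac{r}{\rho}\frac{\partial p}{\partial r}. \tag{5.9} \]

We can differentiate the polytropic relation $p/\rho^n = \text{const}$ and the perfect gas law $p = \rho R T$

to get

\[ \frac{1}{n}\frac{\partial p}{\partial r} = \frac{p}{\rho}\frac{\partial \rho}{\partial r} \tag{5.10} \]

and

\[ \frac{\partial p}{\partial r} = \frac{p}{\rho}\frac{\partial \rho}{\partial r} + R\rho\frac{\partial T}{\partial r}. \tag{5.11} \]

Substituting relation (5.10) in relation (5.11) leads to

\[ \frac{1}{\rho}\frac{\partial p}{\partial r} = \frac{n}{n-1}\, R\, \frac{\partial T}{\partial r}. \]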

This last result can be plugged into the momentum equation (5.9) to give

\[ \frac{\partial T}{\partial r} = \frac{n-1}{n}\,\frac{1}{R}\,\frac{v_\theta^2}{r}, \]

which can be easily integrated. The final results are given in Furst’s thesis [10]. It is

interesting to define the strength of the vortex by the non dimensional number $M_n = v_{max}/\sqrt{nRT_\infty}$.
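The radial temperature equation can also be integrated numerically, which gives a convenient check of any closed-form result. The sketch below assumes that the temperature equals $T_\infty$ at the outer edge $r = r_2$ of the vortex and uses a simple midpoint rule; the function names, the gas constant and the radii are illustrative assumptions only.

```python
import math

def v_theta(r, v_max, r1, r2):
    """Tangential velocity of the composite vortex (zero outside r2)."""
    if r < r1:
        return v_max * r / r1
    if r <= r2:
        d = r1**2 - r2**2
        return v_max * (r1 / d * r - r1 * r2**2 / d / r)
    return 0.0

def midpoint(f, a, b, steps=2000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def temperature(r, T_inf, n, R_gas, v_max, r1, r2):
    """T(r) from dT/dr = (n-1)/(n R) v_theta^2 / r, assuming T(r2) = T_inf."""
    if r >= r2:
        return T_inf
    g = lambda s: v_theta(s, v_max, r1, r2) ** 2 / s
    # Split the integral at r1, where the velocity profile changes form.
    integral = midpoint(g, max(r, r1), r2)
    if r < r1:
        integral += midpoint(g, r, r1)
    return T_inf - (n - 1.0) / (n * R_gas) * integral

if __name__ == "__main__":
    n, R_gas, T_inf = 1.4, 287.0, 300.0           # assumed gas properties
    v_max = 0.5 * math.sqrt(n * R_gas * T_inf)    # vortex strength M_n = 0.5
    print(temperature(0.0, T_inf, n, R_gas, v_max, r1=1.0, r2=2.0))
```

In the inner core the integrand is linear in r, so the temperature difference between the edge of the core and the center is $(n-1)\,v_{max}^2/(2nR)$, which the numerical integration reproduces.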


Numerical simulation

Once again, such an inviscid simulation involving a shockwave cannot be done with the

Kinetic Energy Conserving scheme directly. One can add a shock capturing operator or

instead consider a viscous flow. Following what we did before, the latter is chosen. Starting

with the inviscid initial conditions described above (t < 0), we assume that the flow becomes

instantly viscous at t = 0. We test a somewhat mild case, with an inlet Mach number

M1 = 1.1, a vortex Mach strength Mn = 0.5 and a Reynolds number of 5000. This choice

is made so as to limit the number of points required in the mesh (see footnote 1). The resulting mesh has

a size of 4000 × 500 cells.

Figure 5.9: Pressure distribution. (a) t = 0, initial condition; (b) t = 0.32

Figure 5.9 represents the pressure field at t = 0 and t = 0.32. Note how the shock loses

sharpness after viscosity is added to the flow. Figure 5.10 is a numerical Schlieren (density

gradient magnitude) representing the interaction of the vortex with the shock at various

instants. Although detailed results are not readily available for comparison, the obtained

solution can be compared qualitatively with the work of Ellzey or Furst. The features

observed are in excellent agreement with their computations. Our solution is clean and no

spurious oscillation can be observed in the vicinity of the shock, even after the interaction

with the vortex.

Footnote 1: We know that the monotonicity of the scheme is linked to having a cell Reynolds number close to unity. Although this was actually proved by Jameson for the KEP scheme for Burgers’ equation [19], this result is more empirical when considering the Euler equations.


Figure 5.10: Numerical Schlieren (density gradient magnitude) at various times. (a) t = 0.06; (b) t = 0.18; (c) t = 0.26; (d) t = 0.45

5.4 Conclusions

This chapter describes our first attempts to solve for complex unsteady flows in more than

one dimension using the kinetic energy conserving scheme for finite volumes. It has been seen

that it allows a successful description of shockwaves, expansion fans, boundary layers

and vortices.

Experimentally, it has been confirmed that the number of grid points required to ensure


numerical stability is proportional to the local cell Reynolds number $Re_o = \frac{\rho V \sqrt[3]{\mathrm{vol}_o}}{\mu}$. In

the viscous Sod shocktube case, and for a global Reynolds number of 25000, stability was

ensured using grids as large as 4096 mesh cells in the x-direction by 256 mesh cells in the

y-direction. For the shock vortex interaction case, we considered a weak shock and a Reynolds

number of 5000. We used 4000 points in the x direction to obtain a high resolution of the

flow. This might seem to be a lot, but the KEP fluxes are extremely simple and cheap

to evaluate on each cell face. Furthermore, the use of an explicit code makes it easily

parallelizable with a large scalability, by simple domain decomposition. For example, when

computing the flow in the shocktube for a global Reynolds number of 25000 and α = 0.2,

using a 4096 × 512 cell mesh and ymin/L = α/2000, the computation took “only” 9 hours

on a small cluster of 8 dual core CPUs (8 Intel E6850 Core 2 Duo CPUs).
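For reference, the local cell Reynolds number used in this discussion is straightforward to evaluate. The helper below is a hypothetical illustration and not part of the code described in this work; the argument names are assumptions.

```python
def cell_reynolds(rho, speed, cell_volume, mu):
    """Local cell Reynolds number rho * V * vol^(1/3) / mu."""
    return rho * speed * cell_volume ** (1.0 / 3.0) / mu

if __name__ == "__main__":
    # Arbitrary illustrative values in consistent units.
    print(cell_reynolds(rho=1.0, speed=2.0, cell_volume=8.0, mu=4.0))
```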

Chapter 6

Direct Numerical Simulations of

Plunging Airfoils

Over the last century, steady airfoils have been thoroughly studied yielding important ex-

perimental and numerical results. On the contrary, unsteady flows around airfoils have

not been well characterized due to their higher complexity. It is still a challenge to obtain

accurate empirical data and current computational capabilities are not sufficient to gain

high resolution of the phenomenon. In recent years, unsteady airfoils have been gaining a

lot of interest, especially towards the use of flapping flight in the development of micro air

vehicles (MAVs).

The flow around a plunging airfoil is characterized by the generation of vortices that

strongly interact in the wake of the airfoil. In order to study such an unsteady flow numer-

ically and capture the phenomena present in the wake with good resolution, it is crucial to

use a scheme that introduces as little dissipation as possible. When the artificial dissipa-

tion of the scheme becomes too large, there is significant uncertainty whether the damping

observed is due to natural viscosity or numerical dissipation.

The FV-KEP scheme detailed in previous chapters seems to be a perfect candidate for

this study. By enforcing the global balance of kinetic energy along with the conservation

of mass, momentum and total energy, it exhibits improved stability compared to a regular

central scheme, removing almost completely the need of adding artificial viscosity.

In this chapter, we use the KEP scheme to perform computations of flows around plung-

ing airfoils at low Mach and low Reynolds number. Section 6.1 of this chapter describes

how the computations were performed and what changes we made to account for mesh mo-

tions. In Section 6.2, we reproduce experiments conducted by Platzer and Jones. Results

are presented and discussed.

6.1 Numerical Methodology

The Kinetic Energy Preserving scheme described in previous chapters is used to compute the

flow around a plunging airfoil. Computations are done for a NACA 0012 airfoil oscillating

in a uniform flow. The transverse motion of the airfoil is given by h(t) = h · cos(ωt). The freestream flow is characterized by a Mach number M∞, a far field temperature T∞ and a far field density ρ∞.

Viscosity is evaluated using Sutherland’s formula

\[ \mu(T) = \frac{C\,T^{3/2}}{T + S}. \]

For air, at reasonable temperatures, $C = 1.456 \times 10^{-6}\ \mathrm{kg/(m\,s\,\sqrt{K})}$ and $S = 110.4\ \mathrm{K}$.

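The Reynolds number is based on the chord length of the airfoil L and the free stream

velocity $V_\infty = M_\infty c_\infty$:

\[ Re = \frac{\rho_\infty L V_\infty}{\mu_\infty}, \quad \text{where } \mu_\infty = \mu(T_\infty). \tag{6.1} \]

The Prandtl number is given by

\[ Pr = \frac{\mu C_p}{\kappa}; \tag{6.2} \]

it was taken to be equal to 0.75.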
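The definitions above translate directly into code. In the sketch below, the Sutherland constants are the ones quoted in the text, while the perfect gas sound speed $c_\infty = \sqrt{\gamma R T_\infty}$ and the values γ = 1.4, R = 287 J/(kg K) and $C_p$ = 1004.5 J/(kg K) are assumptions made for the example. The function names are hypothetical.

```python
import math

C_SUTHERLAND = 1.456e-6   # kg/(m s sqrt(K)), value quoted in the text
S_SUTHERLAND = 110.4      # K, value quoted in the text

def sutherland_viscosity(T):
    """Dynamic viscosity mu(T) = C T^(3/2) / (T + S)."""
    return C_SUTHERLAND * T**1.5 / (T + S_SUTHERLAND)

def reynolds_number(rho_inf, mach_inf, T_inf, chord, gamma=1.4, R_gas=287.0):
    """Chord-based Reynolds number, equation (6.1).

    The sound speed assumes a perfect gas; gamma and R_gas are assumed values.
    """
    c_inf = math.sqrt(gamma * R_gas * T_inf)
    V_inf = mach_inf * c_inf
    return rho_inf * chord * V_inf / sutherland_viscosity(T_inf)

def thermal_conductivity(T, cp=1004.5, prandtl=0.75):
    """Conductivity kappa = mu Cp / Pr from equation (6.2); cp is assumed."""
    return sutherland_viscosity(T) * cp / prandtl

if __name__ == "__main__":
    print(sutherland_viscosity(288.0))
    print(reynolds_number(rho_inf=1.2, mach_inf=0.2, T_inf=288.0, chord=1.0))
```

In practice one would pick ρ∞ or L so that the returned value matches the target Reynolds number of the simulation.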

Computational Domain

Simulations are done on a structured “C -mesh” counting 4096 × 512 cells. The computa-

tional domain extends roughly 30 chord lengths downstream and 20 chord lengths upstream,

as can be seen on figure 6.1. The mesh is subject to rigid body motion and moves with the

airfoil.


(a) Global domain (b) Mesh detail near the trailing edge

Figure 6.1: Computational domain

Numerical fluxes

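The convective fluxes have to be modified to account for the motion of the mesh. First, let

us consider the hyperbolic system of equations $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x_i} f^i(u) = 0$ and integrate it over the

moving domain Ω(t). Using the divergence theorem, we have

\[ \int_{\Omega(t)} \frac{\partial u}{\partial t}\,dV + \int_{\partial\Omega(t)} f^i(u)\, n_i\, dS = 0. \tag{6.3} \]

However, since

\[ \int_{\Omega(t)} \frac{\partial u}{\partial t}\,dV = \frac{\partial}{\partial t}\int_{\Omega(t)} u\,dV - \int_{\partial\Omega(t)} u\, v^i n_i\, dS, \tag{6.4} \]

where $v^i$ is the speed of the boundary of the domain, it follows that

\[ \frac{\partial}{\partial t}\int_{\Omega(t)} u\,dV + \int_{\partial\Omega(t)} \left(f^i(u) - u\,v^i\right) n_i\, dS = 0. \tag{6.5} \]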

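Denote $v^i_{op}$ the velocity of the edge separating cells o and p. The convective fluxes are now

given by

\[
f^i_{op,\ \text{convective}} =
\begin{pmatrix}
\left[\rho (v^i - v^i_{op})\right]_{op} \\
\left[\rho (v^i - v^i_{op})\, v^1\right]_{op} + p_{op}\,\delta_{i1} \\
\left[\rho (v^i - v^i_{op})\, v^2\right]_{op} + p_{op}\,\delta_{i2} \\
\left[\rho (v^i - v^i_{op})\, v^3\right]_{op} + p_{op}\,\delta_{i3} \\
\left[\rho (v^i - v^i_{op})\, E + p\, v^i\right]_{op}
\end{pmatrix}. \tag{6.6}
\]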

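We then used the following averaging formulas (a bar denotes the arithmetic mean $\bar{u}_{op} = \frac{u_o + u_p}{2}$):

\[
\begin{aligned}
\left[\rho (v^i - v^i_{op})\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op}), \\
\left[\rho (v^i - v^i_{op})\, v^j\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op})\,\bar{v}^j_{op}, \\
\left[\rho (v^i - v^i_{op})\, E + p\, v^i\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op})\,\bar{E}_{op} + p_{op}\,\bar{v}^i_{op}.
\end{aligned} \tag{6.7}
\]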

The Kinetic Energy Conserving scheme does not require a specific form for the energy

flux. We chose to define it in a consistent manner with the continuity and momentum fluxes

(see previous chapters).
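To make the structure of the moving-mesh flux concrete, the sketch below evaluates the convective flux through a single face, contracted with the face normal. It is a minimal illustration under stated assumptions: the face pressure is taken as the arithmetic mean of the two cell pressures, E is the total energy per unit mass, and the data layout (tuples) and function name are hypothetical. It is not the implementation used for the results in this chapter.

```python
def kep_moving_face_flux(cell_o, cell_p, normal, face_velocity):
    """Convective flux through the face between cells o and p.

    Each cell state is (rho, (v1, v2, v3), p, E). `normal` holds the
    components n_i and `face_velocity` the velocity of the face.
    Returns (mass, momentum_1, momentum_2, momentum_3, energy).
    """
    rho_o, v_o, p_o, E_o = cell_o
    rho_p, v_p, p_p, E_p = cell_p
    # Arithmetic means across the face.
    rho_b = 0.5 * (rho_o + rho_p)
    v_b = tuple(0.5 * (a + b) for a, b in zip(v_o, v_p))
    E_b = 0.5 * (E_o + E_p)
    p_b = 0.5 * (p_o + p_p)   # assumed form of the face pressure
    # Normal velocity of the fluid relative to the moving face.
    vn_rel = sum((v - w) * n for v, w, n in zip(v_b, face_velocity, normal))
    vn = sum(v * n for v, n in zip(v_b, normal))
    mass = rho_b * vn_rel
    momentum = tuple(mass * v + p_b * n for v, n in zip(v_b, normal))
    energy = mass * E_b + p_b * vn
    return (mass,) + momentum + (energy,)

if __name__ == "__main__":
    state = (1.0, (2.0, 0.0, 0.0), 1.0, 4.5)
    print(kep_moving_face_flux(state, state, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
    print(kep_moving_face_flux(state, state, (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)))
```

When the face moves with the fluid, the mass flux vanishes and only the pressure terms remain, as expected for a Lagrangian face.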

Viscous stress is evaluated in each cell by introducing a complementary dual mesh, for which


Time integration

Time integration is done using a TVD Runge Kutta second order multistage time stepping

scheme [13]. For a semi discrete law in the form

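\[ \frac{\partial u}{\partial t} + R(u, t) = 0, \tag{6.8} \]

the scheme advances from time n to time n + 1 by

\[
\begin{aligned}
u^1 &= u^n - \Delta t\, R(u^n, t^n), \\
u^{n+1} &= \tfrac{1}{2} u^n + \tfrac{1}{2} u^1 - \tfrac{1}{2} \Delta t\, R(u^1, t^{n+1}).
\end{aligned}
\]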

The explicit dependence in time of the operator R is due to the mesh motion. This

means that both the location and velocity of the mesh need to be updated at tn+1 before

evaluating R(u1, tn+1). The advantage of using this particular second order Runge-Kutta


scheme is that the mesh is updated only once per time step.
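The two-stage update, with its single mesh update per time step, can be sketched as follows. The function name `ssp_rk2_step` and the `update_mesh` callback are hypothetical; the state is stored as a plain list for simplicity.

```python
def ssp_rk2_step(u, t, dt, residual, update_mesh=None):
    """Advance du/dt + R(u, t) = 0 by one step of the two-stage TVD scheme.

    `residual(u, t)` returns R as a list; `update_mesh(t)` is an optional
    callback moving the mesh (position and velocity) to time t.
    """
    r0 = residual(u, t)
    u1 = [ui - dt * ri for ui, ri in zip(u, r0)]
    if update_mesh is not None:
        # The mesh is moved once, to t^{n+1}, before the second stage.
        update_mesh(t + dt)
    r1 = residual(u1, t + dt)
    return [0.5 * ui + 0.5 * u1i - 0.5 * dt * r1i
            for ui, u1i, r1i in zip(u, u1, r1)]

if __name__ == "__main__":
    # Model problem du/dt = -u, i.e. R(u, t) = u.
    u, t, dt = [1.0], 0.0, 0.1
    for _ in range(10):
        u = ssp_rk2_step(u, t, dt, lambda v, s: list(v))
        t += dt
    print(u)  # close to exp(-1) for this model problem
```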

As mentioned in chapter 5, this time discretization does not guarantee the preservation of

kinetic energy in time. One could use a Crank-Nicolson semi implicit scheme as suggested

by Jameson [19] to ensure conservation in time, but the computational cost would increase

considerably.

Far field artificial dissipation

As the mesh coarsens in the far field, roughly 5 chord lengths away from the airfoil, small

spurious oscillations associated with acoustic waves can be observed. It was shown in

previous work [19, 20] that the number of cells required to ensure a non oscillatory solution

and stability was governed by the local cell Reynolds number, which has to be of the order

of unity to guarantee these properties. This result was observed as well in the previous

chapters. However, covering the entire computational domain with cells as fine as the ones

in the wake is currently too expensive.

A small amount of dissipation, based on the Jameson-Schmidt-Turkel (JST) scheme

[21, 37] was added in the far field to control these unphysical oscillations and prevent the

explosion of the number of grid points. Furthermore, at the very edges of the computational

domain, where the cells are the biggest, artificial dissipation is larger and behaves like a

“sponge” that prevents the reflection of the acoustic waves into the computational domain.

The dissipation introduced in the far field is derived from the JST scheme by dropping

the lower order diffusive term and conserving the higher order term to control odd/even

modes. It shall be noted that no dissipation was introduced in the near field of the airfoil,

an area encompassing about 70% of the cells. If we consider the conservation equation

$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = 0$, the truncation error introduced by our second order scheme can be seen

as a continuous term in a modified differential equation

\[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = O(\Delta x^2, \Delta t^2). \]

The idea is to introduce an extra diffusive term that will modify the truncation error

\[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = -\Delta x^p \lambda^{(4)} \frac{\partial^4 u}{\partial x^4} + O(\Delta x^2, \Delta t^2) \]

with $\lambda^{(4)} \ge 0$ and $p \ge 2$ to preserve the order of the scheme. If

\[ \frac{\partial u}{\partial t} + \frac{1}{\Delta x}\left[h_{i+\frac12} - h_{i-\frac12}\right] = 0 \]

is a finite volume semi-discretization of the equation where $h_{i\pm\frac12}$ is the numerical flux, we

can introduce a correction $d_{i\pm\frac12}$ to the flux to obtain the desired property. This can be done

by taking

\[ d_{i+\frac12} = \alpha_{i+\frac12}\,\varepsilon^{(4)} \left(-u_{i+2} + 3u_{i+1} - 3u_i + u_{i-1}\right) \]

similarly to what is done in the JST scheme. $\alpha_{i+\frac12}$ is proportional to the spectral radius

of the local Jacobian matrix and $\varepsilon^{(4)}$ is a switch to add dissipation only where needed, as

described in the original JST scheme. $d_{i+\frac12}$ is proportional to $\Delta x^3 \left.\frac{\partial^3 u}{\partial x^3}\right|_i$.
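A minimal one-dimensional sketch of this flux correction is given below. For simplicity it treats the scaling $\alpha_{i+\frac12}$ and the switch $\varepsilon^{(4)}$ as constants, which is an assumption made for the example (in the scheme described here they vary locally). The function name is hypothetical, and how the correction is combined with the numerical flux is not shown.

```python
def fourth_difference_correction(u, alpha, eps4):
    """Return d_{i+1/2} for the interior faces i = 1 .. len(u) - 3.

    u is a list of cell values; alpha and eps4 are taken constant here
    (assumption), whereas the text uses local values.
    """
    return [alpha * eps4 * (-u[i + 2] + 3.0 * u[i + 1] - 3.0 * u[i] + u[i - 1])
            for i in range(1, len(u) - 2)]

if __name__ == "__main__":
    smooth = [float(i**2) for i in range(8)]          # quadratic data
    sawtooth = [(-1.0) ** i for i in range(8)]        # odd/even mode
    print(fourth_difference_correction(smooth, 1.0, 1.0 / 32.0))
    print(fourth_difference_correction(sawtooth, 1.0, 1.0 / 32.0))
```

The correction vanishes for data that is at most quadratic in the cell index and is largest for the odd/even mode it is meant to control.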

Figure 6.2: Artificial dissipation is added only in the darker area

6.2 Results

We now present the results obtained using the numerical methods described above. We

chose the various parameters describing the motion of the plunging airfoil to match the

ones studied by Jones and Platzer[32, 33]. Flows around plunging airfoils can be classified

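according to their Strouhal number $Sr = \frac{\omega h L}{V_\infty} = hk$, where h is the plunging amplitude nondimensionalized by the chord L and $k = \omega L / V_\infty$ is the reduced frequency. In all the results presented, the Mach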

number is taken to be M∞ = .2 and the Reynolds number is Re = 1850.
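As a quick check of the parameter sets used in the following cases, the Strouhal number can be recomputed from the nondimensional amplitude and the reduced frequency. The helper names below are hypothetical.

```python
def strouhal(h, k):
    """Strouhal number Sr = h k (h in chords, k = omega L / V_inf)."""
    return h * k

def strouhal_dimensional(omega, amplitude, V_inf):
    """Same quantity from dimensional data, with amplitude = h L."""
    return omega * amplitude / V_inf

if __name__ == "__main__":
    # (h, k) pairs quoted in this section.
    for h, k in [(0.08, 3.6), (0.1, 6.0), (0.2, 3.0), (0.12, 12.3)]:
        print(h, k, round(strouhal(h, k), 3))
```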


Drag production at low Strouhal number - Sr = 0.29

The amplitude of the plunging motion is h = 0.08 and the reduced frequency k = 3.6

resulting in a Strouhal number Sr = 0.288. For such a Strouhal number the resulting

flow exerts drag on the airfoil. Figure 6.3 shows the density contours: not only is the flow

pattern clear in this picture, but it also shows that there are some non-negligible

compressibility effects, even at such a low Mach number. Figure 6.4 represents the vorticity

of the flow. As can be seen, the vortical structure of the wake is in agreement with the

results obtained by Jones et al., cf. figure 6.5. Figure 6.6 is a plot of the evolution in time

of the lift coefficient $C_\ell$ and the drag coefficient $C_d$. A positive $C_d$ indicates that the fluid

exerts drag on the airfoil in the flow direction. On the abscissa, the non dimensional time

is defined by $t/t_0$, with $t_0 = L/V_\infty$.

Figure 6.3: Density field in the airfoil’s wake - Sr = 0.29

Thrust generation at high Strouhal number - Sr = 0.60

As the Strouhal number is increased the flow pattern changes and thrust is generated. In

this section, we present the results obtained for a Strouhal number of Sr = 0.60. Figures

6.7 and 6.8 depict the computed density and vorticity fields for h = 0.1 and k = 6.0. The

flow pattern is in excellent agreement with experimental data by Jones et al. corresponding

to this Strouhal number (see figure 6.9) and we observe on figure 6.10 that thrust is indeed

generated. However, it should be noted that Jones’ experimental result was obtained for


Figure 6.4: Vorticity distribution in the airfoil’s wake - Sr = 0.29

Figure 6.5: Streak lines, experimental data by Jones and Platzer - Sr = 0.29

h = 0.2 and k = 3.0. When we tried to compute the flow for these values, we still observed

generation of thrust, but the flow pattern was quite different. The large amplitude of the

motion (for h = 0.2) creates large leading edge vortices that interact strongly with the

trailing edge vortices in the wake, as can be seen on figure 6.11.

Lift and thrust generation - Sr = 1.5

This is by far the most interesting case, as lift and thrust are generated by the oscillating

airfoil. The nondimensional plunging amplitude is h = .12 and the reduced frequency

k = 12.3, resulting in a Strouhal number Sr = 1.48. Figures 6.12 and 6.13 depict the

density and the vorticity of the flow in the airfoil’s wake. The dual-mode vortex street

described by Jones et al. [32] is clearly visible. Our numerical computations are again in

excellent agreement with the experimental results of Jones et al., as shown in figure 6.14.


Figure 6.6: Lift and Drag history - Sr = 0.29 (time history of Cl and Cd against nondimensional time)

Figure 6.7: Density field in the airfoil’s wake - Sr = .60, h = .1, k = 6.

6.3 Conclusion

This chapter shows how to solve with high resolution the flow around plunging airfoils using

a finite volume formulation. The method proposed uses the FV-Kinetic Energy Preserving

scheme described in earlier chapters and a modified version of the JST artificial dissipation

model in the far field to ensure a non oscillatory solution. The resulting code proved to

be robust and to introduce extremely little dissipation. Applications with coarser grids containing “only”


Figure 6.8: Vorticity distribution in the airfoil’s wake - Sr = .60, h = .1, k = 6.

Figure 6.9: Streak lines, experimental data by Jones and Platzer - Sr = 0.60, h = .2, k =3.

1024 × 256 cells (results not presented in the present document) showed that even though

the far field results were largely degraded, the time history curves of $C_\ell$ and $C_d$ were still

relatively close to the ones obtained with fine grids. On the other hand, it appears that the

code can be easily modified to deal with more complex airfoil motions, like a combination

of pitching and plunging motions obeying non sinusoidal variations. These remarks lead us

to think that this code would be particularly well suited to the study of optimal motions of

rigid airfoils (in the sense of propulsion efficiency) at relatively low Reynolds numbers.


Figure 6.10: Lift and Drag history - Sr = 0.60, h = .1, k = 6. (Cl and Cd against nondimensional time)

Figure 6.11: Vorticity field in the airfoil’s wake - Sr = .60, h = .2, k = 3.


Figure 6.12: Density field in the airfoil’s wake - Sr = 1.5

Figure 6.13: Vorticity distribution in the airfoil’s wake - Sr = 1.5


Figure 6.14: Streak lines, experimental data by Jones and Platzer - Sr = 1.5

Figure 6.15: Lift and Drag history - Sr = 1.5 (Cl and Cd against nondimensional time)

Chapter 7

3D Flapping wings

The study of flapping flight has long fascinated scientists. Today, many hope that a better

understanding of the phenomenon will allow the development of efficient flapping MAVs.

However, experimental studies are not easy and until recent years, high fidelity computer

simulations were limited to two dimensional flows and most of the three dimensional results

were obtained using simplistic panel codes.

The flow around a flapping wing is characterized by the generation of vortices that

strongly interact in the wake. In order to study such an unsteady flow numerically and to

capture the phenomena present in the wake with good resolution, it is crucial to use a scheme

that introduces as little dissipation as possible. A solution some have adopted is to use high

order discontinuous methods like the discontinuous Galerkin method to limit the amount of

numerical dissipation introduced [29, 30]. However, the cost of these methods is still very

high and we preferred to use a second order finite volume kinetic energy conserving scheme

instead. The scheme was used in previous work for 2D simulations and proved to be accurate

and robust. Given the good results shown in this chapter, we foresee the possibility of using the

code for optimization purposes. Coupled with a nonlinear optimization solver like SNOPT,

our code can be used as a black box to optimize quantities like the propulsive efficiency.

In Section 7.1, the finite volume flow solver is described and results on Jameson’s

kinetic energy conserving scheme are briefly summarized. In Section 7.2, the parameters

used to prescribe the motion of a wing in 3 dimensions are explained in detail. Finally, the

last section contains various examples of possible wing motions and the flow around wings for

interesting cases.

7.1 Numerical Methodology

The vortex dominated flows produced around a low speed flapping wing are not trivial to

simulate. Most of the classical schemes developed for flow simulations excel in the prediction

of high speed steady flows but fail to compute unsteady low speed regimes accurately. Very

often, available techniques prove to be either too dissipative (low order numerical dissipa-

tion is added to improve the stability of the algorithm) or too expensive (reconstructions

are made to obtain higher order accuracy). Choosing an appropriate scheme for such sim-

ulations cannot be done randomly as one must avoid as much as possible the two pitfalls,

dissipation and cost. Currently, the community tackles this kind of problem with high

order discontinuous methods such as the discontinuous Galerkin method or the spectral

difference method when looking for high resolution, or with panel codes when looking for

speed and efficiency [30, 32, 33]. Here again, we perform the numerical simulation using

our FV-KEP scheme described in earlier chapters.

Numerical fluxes

The convective fluxes then have to be modified to account for the motions and deformations

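of the mesh. Consider the system of equations $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x_i} f^i(u) = 0$ and integrate it over the

moving domain Ω(t). Using the divergence theorem, we have

\[ \int_{\Omega(t)} \frac{\partial u}{\partial t}\,dV + \int_{\partial\Omega(t)} f^i(u)\, n_i\, dS = 0. \tag{7.1} \]

However, since

\[ \int_{\Omega(t)} \frac{\partial u}{\partial t}\,dV = \frac{\partial}{\partial t}\int_{\Omega(t)} u\,dV - \int_{\partial\Omega(t)} u\, v^i n_i\, dS, \tag{7.2} \]

where $v^i$ is the speed of the boundary of the domain, it follows that

\[ \frac{\partial}{\partial t}\int_{\Omega(t)} u\,dV + \int_{\partial\Omega(t)} \left(f^i(u) - u\,v^i\right) n_i\, dS = 0. \tag{7.3} \]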

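Denote $v^i_{op}$ the velocity of the edge separating cells o and p. The convective fluxes are now

given by

\[
f^i_{op,\ \text{convective}} =
\begin{pmatrix}
\left[\rho (v^i - v^i_{op})\right]_{op} \\
\left[\rho (v^i - v^i_{op})\, v^1\right]_{op} + p_{op}\,\delta_{i1} \\
\left[\rho (v^i - v^i_{op})\, v^2\right]_{op} + p_{op}\,\delta_{i2} \\
\left[\rho (v^i - v^i_{op})\, v^3\right]_{op} + p_{op}\,\delta_{i3} \\
\left[\rho (v^i - v^i_{op})\, E + p\, v^i\right]_{op}
\end{pmatrix}. \tag{7.4}
\]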

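In the code, we used the following averaging formulas (a bar denotes the arithmetic mean

$\bar{u}_{op} = \frac{u_o + u_p}{2}$):

\[
\begin{aligned}
\left[\rho (v^i - v^i_{op})\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op}), \\
\left[\rho (v^i - v^i_{op})\, v^j\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op})\,\bar{v}^j_{op}, \\
\left[\rho (v^i - v^i_{op})\, E + p\, v^i\right]_{op} &= \bar{\rho}_{op}\,(\bar{v}^i_{op} - v^i_{op})\,\bar{E}_{op} + p_{op}\,\bar{v}^i_{op}.
\end{aligned} \tag{7.5}
\]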

Viscous stress was evaluated in each cell by introducing a complementary mesh, for which


Time Integration

Time integration was done using a TVD Runge Kutta second order multistage time stepping

scheme[13]. For a semi discrete law in the form

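\[ \frac{\partial w}{\partial t} + R(w, t) = 0, \quad \text{where } w = u \cdot \mathrm{vol}, \tag{7.6} \]

the scheme advances from time n to time n + 1 by

\[
\begin{aligned}
w^1 &= w^n - \Delta t\, R(w^n, t^n), \\
w^{n+1} &= \tfrac{1}{2} w^n + \tfrac{1}{2} w^1 - \tfrac{1}{2} \Delta t\, R(w^1, t^{n+1}).
\end{aligned}
\]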

The explicit dependence in time of the operator R is due to mesh motion. This means

that both the location and velocity of the mesh need to be updated at tn+1 before evaluating

R(w1, tn+1). Also, notice that using a second order ERK, they only need to be updated

once per time step.


This time discretization does not guarantee the preservation of kinetic energy in time.

One could use a Crank-Nicolson semi implicit scheme as suggested by Jameson [18] to

ensure conservation in time, but the computational costs would increase enormously. Also

it shall be noted that this particular ERK time advancement scheme does not enforce the

so called discrete geometric conservation law (GCL) for a deforming mesh. Practically,

the error remains of second order and since the time steps taken are quite small, the non

enforcement of the GCL did not appear to be crucial.

Artificial Dissipation

Although the kinetic energy conserving scheme brings additional stability compared to a

classical central scheme, some artificial dissipation was added on the coarser meshes to

enhance stability. Furthermore, dissipation was also introduced in the far field to prevent

the reflection of pressure waves on the domain boundary.

Two different kinds of artificial dissipation were considered.

Artificial dissipation based on the Jameson-Schmidt-Turkel (JST) scheme

The dissipation introduced is derived from the JST scheme [21, 37] by dropping the lower

order diffusive term and conserving the higher order term to control odd/even modes. If we

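consider the conservation equation $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = 0$, the truncation error introduced by our

second order scheme can be seen as a continuous term in a modified differential equation

\[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = O(\Delta x^2, \Delta t^2). \]

The idea is to introduce an extra diffusive term that will modify the truncation error

\[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = -\Delta x^p \lambda^{(4)} \frac{\partial^4 u}{\partial x^4} + O(\Delta x^2, \Delta t^2) \]

with $\lambda^{(4)} \ge 0$ and $p \ge 2$ to preserve the order of the scheme. If

\[ \frac{\partial u}{\partial t} + \frac{1}{\Delta x}\left[h_{i+\frac12} - h_{i-\frac12}\right] = 0 \]

is a finite volume semi-discretization of the equation where $h_{i\pm\frac12}$ is the numerical flux, we

can introduce a correction $d_{i\pm\frac12}$ to the flux to obtain the desired property. This can be done

by taking

\[ d_{i+\frac12} = \alpha_{i+\frac12}\,\varepsilon^{(4)} \left(-u_{i+2} + 3u_{i+1} - 3u_i + u_{i-1}\right) \]

similarly to what is done in the JST scheme. $\alpha_{i+\frac12}$ is proportional to the spectral radius

of the local Jacobian matrix and $\varepsilon^{(4)}$ is a switch to add dissipation only where needed, as

described in the original JST scheme. $d_{i+\frac12}$ is proportional to $\Delta x^3 \left.\frac{\partial^3 u}{\partial x^3}\right|_i$.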

Kinetic Energy Decreasing scheme

It was shown by the authors that an extra term can be added to the Kinetic Energy

Conserving flux to create a Kinetic Energy Decreasing scheme. Although this technique is

not sufficient to capture shockwaves, it does provide a significant amount of stabilization in

areas where the mesh is too coarse.

All the examples presented in the following sections use minimum amounts of dissipation

using the modified JST scheme.

7.2 Parameters and Wing Deformations

As mentioned before, the goal of this code is to allow the optimization of the flapping motion

for low speed micro air vehicles. Since this motion cannot be completely arbitrary, we need

to define a set of parameters that will describe the movements of the wing. The objective of

this section is to introduce all the motion parameters and to show how we can reconstruct

the deformed wing from them.

Motion Parameters

The basic motion of the wing is described by a wing skeleton, consisting of a simple artic-

ulated beam. Two aspects of the motion are parametrized: the vertical flapping and the

twisting. The wing skeleton has a length s, equal to the span of the original wing. It is

divided into n rigid sub elements of equal length ` = s/n as depicted in the picture below

for the case n = 3. Note that node 0 is fixed and the wing does not twist or flap at its root.

In the sketch below, note that we are “facing the bird”.


Figure 7.1: Wing skeleton (nodes 0 to 3, elements 1 to 3)

Flapping motion

Each element i can rotate with respect to the previous element in a vertical plane by an

angle θi. This means that if element 1 has a flapping angle of 10 and element 2 has a

relative angle of −3, then element 2 really has an absolute angle of 7. Note that node 0 is

fixed and never moves. Once all the angles θi are set, we fit a natural cubic spline through

the nodes to reconstruct the flapping motion of the wing. The process is shown on figure

(7.2).

Figure 7.2: Flapping motion reconstruction. (a) Flapping angles θ1, θ2, θ3; (b) Spline fitting
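The accumulation of relative flapping angles into absolute angles and node positions can be sketched as below. Several choices are assumptions made for the illustration: angles are taken in degrees, the root node sits at the origin, the spanwise coordinate is called z and the vertical one y, and each rigid element keeps its length ℓ = s/n. The spline fit through the nodes is not included, and the function name is hypothetical.

```python
import math
from itertools import accumulate

def skeleton_nodes(relative_angles_deg, span):
    """Return (absolute angles, node positions) of the wing skeleton.

    relative_angles_deg[i] is the angle of element i+1 with respect to the
    previous element; node 0 is fixed at the origin.
    """
    n = len(relative_angles_deg)
    ell = span / n                                   # element length s / n
    absolute = list(accumulate(relative_angles_deg)) # cumulative angles
    nodes = [(0.0, 0.0)]                             # (z, y) of node 0
    for angle in absolute:
        z, y = nodes[-1]
        a = math.radians(angle)
        nodes.append((z + ell * math.cos(a), y + ell * math.sin(a)))
    return absolute, nodes

if __name__ == "__main__":
    # Example from the text: 10 for element 1, then -3 relative for element 2.
    absolute, nodes = skeleton_nodes([10.0, -3.0], span=2.0)
    print(absolute)
    print(nodes)
```

The same accumulation applies to the twist angles defined at the nodes.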

Twisting motion

As for the flapping, the twist is defined discretely at each node i with respect to the previous one by an angle αi. No twist is applied at node 0. Therefore a twist angle of 5° at node 1 and a twist angle of −2° at node 2 mean an absolute twist of 3° at node 2. After the twist angles are set, we fit a natural cubic spline through these twist values along the span. The process is shown in figure 7.3.


Figure 7.3: Twisting motion reconstruction. (a) Twisting angles α1, α2, α3; (b) Spline fitting.

Parametrization in time

The angles introduced above are not constant in time, and we expect the flapping and twisting process to be cyclic. We therefore need a representation in time for all the parameters considered. Suppose the motion of the wing is smooth and periodic with period T. The variation in time of each angle can be expressed as a Fourier sine expansion

$$\alpha_i(t) = a_0 + \sum_{k=1}^{\infty} a_k \sin(k\omega t + \phi_k), \qquad \omega = \frac{2\pi}{T}$$

In practice, we limit the sum to the first N modes. The variation in time of angle αi is then entirely defined by a0, a1, · · · , aN, φ1, · · · , φN. A wing parametrized by n nodes and the first N Fourier modes will then require a total of n(2N + 1) + 1 parameters. Ideally, we want to start the simulation from a steady state and accelerate smoothly to the periodic regime. This is done for t in [0, T/2] by multiplying all angles by sin²(ωt) (we suppose that the flow around the airfoil has already reached a quasi-steady solution at t = 0).
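As an illustration of the truncated expansion, the following minimal sketch evaluates one angle from its N amplitudes and phases. The function name and argument layout are assumptions made here, and the start-up ramp is deliberately left out.

```python
import math

def angle_at(t, a0, amplitudes, phases, period):
    """Truncated Fourier sine expansion of one angle.

    amplitudes = [a_1, ..., a_N] and phases = [phi_1, ..., phi_N];
    the sum runs over the first N modes only.
    """
    omega = 2.0 * math.pi / period
    total = a0
    for k, (a_k, phi_k) in enumerate(zip(amplitudes, phases), start=1):
        total += a_k * math.sin(k * omega * t + phi_k)
    return total

# One mode of unit amplitude and zero phase, period T = 1.
quarter_period_value = angle_at(0.25, 0.0, [1.0], [0.0], 1.0)
```

With a single mode (N = 1) each angle needs only an offset, an amplitude and a phase, which is the setting used in the optimization examples of this chapter.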

Reconstructing the wing - Stretching the mesh

Once the skeleton has been formed and the splines fitted through the various control points, it is possible to deform the mesh to reconstruct the actual flapping wing. We use a Cartesian H-C mesh. The nodes of the mesh are therefore given by coordinates x(i, j, k), y(i, j, k) and z(i, j, k). Figure 7.5(a) represents the original undeformed mesh.

The motion of the wing is reconstructed by sliding (flapping) and rotating (twisting) the parallel z planes forming the mesh by the amounts given by the wing skeleton. In order to preserve the thickness of the wing, the mesh needs to be dilated in the y direction by taking y(i, j, k) ← y(i, j, k)/cos(θ), where θ is the local slope of the flapping wing, as depicted in figure 7.4. Since the deformed wing does not extend as far as the flat undeformed original wing, the z planes also need to be compacted. The final result can be observed in figure 7.5(b).
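The dilation step can be sketched for a single z plane as follows. This is only an illustration: the function name is hypothetical, the slope is given in degrees, and y is assumed to be measured from the wing reference line within the plane.

```python
import math

def dilate_plane(y_values, slope_deg):
    """Scale the y coordinates of one z plane by 1/cos(theta)."""
    factor = 1.0 / math.cos(math.radians(slope_deg))
    return [y * factor for y in y_values]

# A section of thickness h = 1 on a wing sloped at 60 degrees
# occupies h / cos(60 deg) = 2 along y in its z plane.
dilated = dilate_plane([0.0, 0.5, 1.0], 60.0)
```

Projecting the dilated height back onto the wing normal multiplies it by cos(θ), which recovers the original thickness h.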

Figure 7.4: Dilation of the mesh (labels: wing thickness h, dilated height h/cos(θ), local slope θ, z planes, axes y and z, directions to wing root and to wing tip)

Figure 7.5: Deformation of the mesh. (a) Undeformed; (b) After stretching.

7.3 Numerical Experiments

Examples of complex wing motions

Although the wing parametrization is rather simple, it allows for very complex deformations

of the airfoil. We give two examples of wing deformations that we obtained using our code

for n = 4. Figure 7.6 (a) is a wing for which only flapping angles are prescribed.

Conversely, figure 7.6 (b) represents a wing for which only twist was prescribed.


(a) Wiggles

Flap angles    Twist angles
θ1 = 30        α1 = 0
θ2 = −60       α2 = 0
θ3 = 60        α3 = 0
θ4 = −60       α4 = 0

(b) Twist

Flap angles    Twist angles
θ1 = 0         α1 = 15
θ2 = 0         α2 = −30
θ3 = 0         α3 = 30
θ4 = 0         α4 = −30

Figure 7.6: Characteristic motions of the wing

An example of high resolution simulation

Figure 7.7 shows an example of simulation obtained on a 1024 × 256 × 256 H-C mesh (a

total of 67,108,864 cells). The wing motion is parametrized by a single point (n = 1) at the

tip. This solution is actually the one found to maximize the propulsive efficiency for n = 1

and N = 1 at Mach 0.2 and for a Reynolds number of 2000 based on the wing root chord.

With 67M cells, we are close to achieving a DNS (a true DNS would require more than 100M cells for this particular case). Our flow solver is able to resolve complex, intricate vortex structures. Although it is almost impossible to study such a complex flow field just by looking at it, it is interesting to see how the wing tip vortices interact with the vortices shed from the trailing edge of the wing. Solving the flow on such a large mesh also allowed us to check that our parallel implementation using MPI scales almost linearly up to 1024 CPUs (the maximum number that was available to us).

Figure 7.7: High resolution example - Vorticity isosurfaces colored by pressure distribution

The flow solver as part of an optimization process

As mentioned before, the ultimate goal of this work is to use the flow solver as a black box for the optimization of the propulsive efficiency of a flapping wing. Coupled with a nonlinear optimization solver like SNOPT [12], the code is used to provide function evaluations of the propulsive efficiency for various flapping parameter inputs. Of course, the quality of the output depends directly on the resolution of the mesh. Although the objective of this section is not to discuss the results of the optimization process, it is interesting to study how the lift and drag history of the airfoil are modified by coarsening and refining the mesh. This gives us a valuable indication of the number of grid points needed to perform meaningful optimization. In all our numerical experiments, the meshes used ranged from very coarse (128 × 32 × 32) to quite fine (1024 × 256 × 256). It might be a bit surprising to see that the time history of lift and drag does not change much across these meshes. In figure 7.8, we plot the lift and drag history for 3 different meshes (128 × 32 × 32 in red, 256 × 64 × 64 in green and 384 × 96 × 96 in blue) for the optimal case N = 1 and n = 1. Although the coarsest solution differs slightly from the other two, the 256 × 64 × 64 mesh already provides an excellent estimate of the propulsive efficiency.

The quickest way to converge to an optimal solution is to adopt a multi-fidelity approach: starting with the coarsest mesh, we refine it as the optimization process converges. For robustness, we prefer to start with the 256 × 64 × 64 mesh (the coarser mesh does provide good results in the example presented here, but this might not be the case when the flow field induced by the airfoil motion is extremely complicated). In practice, very small changes are made to the solution when the mesh is further refined, and most of the work is therefore done on the 256 × 64 × 64 mesh.

Each SNOPT optimization cycle requires multiple function evaluations. In our case, a function evaluation consists of running the code on the 256 × 64 × 64 mesh (1,048,576 cells in total) for some particular input parameters and computing the propulsive efficiency. Using 256 CPUs, a function evaluation takes approximately 3 to 4 hours. Results of the optimization process are presented in detail in a paper by Culbreth [7]. We present here a particular one for N = 1 and n = 2. Specifically, we search for an optimal solution for the propulsive efficiency in the form presented in table 7.1.

Flap angles            Twist angles
θ1(t) = a1 sin(ωt)     α1(t) = b1 sin(ωt + φ1)
θ2(t) = 0              α2(t) = b2 sin(ωt + φ2)

Table 7.1: Assumed form of an optimal solution

Note that there are 6 unknowns: a1, b1, b2, φ1, φ2 and ω. The wing is assumed to fly forward

at Mach 0.2 and the Reynolds number based on the root wing chord L is 2000. The

optimization process required O(100) function evaluations and we find the values in table

7.2 to be optimal.

Parameters
a1 = 47.1
b1 = 45.5
b2 = 12.9
φ1 = 88.9
φ2 = 80.7
ω = 0.76

Table 7.2: Results of the optimization using SNOPT

In figure 7.9, we show the solution obtained, computed this time on a 384 × 96 × 96 mesh. The computation of 2 flapping cycles took around 20 hours on 256 CPUs. Note that a non-negligible amount of that time was used to output and process the movie files. It is interesting to see at the end of the optimization process how the wing aligns itself with the incoming flow to minimize separation. However, we can still observe some separation of the flow on the wing, which shows that simplistic panel codes might not be suitable for solving such an optimization problem.

Figure 7.8: Lift and drag time history for the optimal case n = 1, N = 1 for various meshes. (a) Lift: CL versus time; (b) Drag: CD versus time; curves for the 128x32x32, 256x64x64 and 384x96x96 meshes.

Conclusion

This chapter describes a finite volume code we developed for flow simulations around 3D

flapping wings at low Reynolds numbers and presents some first results obtained using

it. Using Jameson’s Kinetic Energy Conserving scheme we were able to make high-fidelity

simulations that required minimum amounts of artificial dissipation. We also designed a

complete set of parameters to describe the motion of the wing to reduce the search space

to a domain of finite dimension for optimization problems.

The first optimization results we obtained are in favor of the choice we made of using a

second order finite volume code: a panel code would probably be ineffective at finding an

optimal solution since it might involve complex flows and separations while a high order

method for structured or unstructured mesh would make each function evaluation too costly

to obtain a result in a decent time.


Figure 7.9: Optimal solution N = 1, n = 2 (snapshots at t = 0.21, 6.95, 13.70, 20.44, 27.18, 33.93 and 40.68)

Conclusion

A Kinetic Energy Preserving Discontinuous Galerkin scheme (DG-KEP) has been developed for the Euler equations. We showed that choosing the numerical flux in a certain manner leads to a scheme for which the total numerical kinetic energy behaves in the same way as the total kinetic energy of the real flow. We observed that the best results were obtained for the case p = 0 in the DG method, that is, for a finite volume scheme (FV-KEP). In this case, the Kinetic Energy Preserving scheme is a special central scheme that is much more stable and robust than the regular flux-average central scheme. Simulations with this particular central scheme do not need the addition of artificial dissipation, and if the flow is viscous, a cell Reynolds number of the order of unity will lead to an almost completely oscillation-free solution. The important point is that, although a cell Reynolds number of the order of unity everywhere requires an extremely fine mesh for the FV-KEP scheme to behave nicely, the total numerical cost remains well below that of a regular central scheme with artificial dissipation on a coarser mesh, because the evaluation of the KEP flux is extremely cheap.

Extensive testing of the code for various cases confirmed the qualities of our scheme. The

2D viscous shocktube case is interesting as we see that the scheme is able to capture both

a shockwave and acoustic waves directly. These good properties of the FV-KEP scheme

allowed us to develop a code to compute the flow around pitching and plunging airfoils

as well as around fully deformable 3D wings. The code offers an extremely good tradeoff

between speed and accuracy and is particularly well suited for our optimization problems.

Future work

Although our work focused mostly on low Reynolds number simulations, which enabled us to perform Direct Numerical Simulations of the flows, the FV-KEP scheme appears to be a good candidate for Large Eddy Simulations of turbulent flows.

Other interesting work lies in the development of GPU codes using CUDA. More features are available in FORTRAN, and fast-evolving architectures could be the key to enabling DNS of flapping wings at more realistic Reynolds numbers (10,000-50,000 based on the wing base chord) in realistic amounts of time.

Appendix A

Odd/Even decoupling phenomenon

in DG

The odd/even decoupling is a well-known phenomenon that appears in finite difference and finite volume methods when using central schemes. In this section we focus on solving the linear advection equation $\partial u/\partial t + a\,\partial u/\partial x = 0$. Theoretically, the only steady solution should be obtained for $\partial u/\partial x = 0$, i.e. u = const. However, if one uses a central scheme, there exists a set of spurious non-constant solutions that nevertheless lead to a zero residual.

Example - Central scheme, finite differences

Here, for all i, we have

$$\delta_x(u_i) = \frac{u_{i+1} - u_{i-1}}{2\Delta x} = 0$$
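A small numerical check makes the decoupling concrete. The sketch below is an illustration only: it applies the central difference to a sawtooth mode on an arbitrarily chosen grid, and compares with a one-sided (upwind) difference.

```python
def central_difference(u, dx):
    """Central difference at interior points: (u[i+1] - u[i-1]) / (2 dx)."""
    return [(u[i + 1] - u[i - 1]) / (2.0 * dx) for i in range(1, len(u) - 1)]

def upwind_difference(u, dx):
    """One-sided difference (u[i] - u[i-1]) / dx, the upwind choice for a > 0."""
    return [(u[i] - u[i - 1]) / dx for i in range(1, len(u))]

# Sawtooth mode +1, -1, +1, ... on 9 points with spacing 0.25.
sawtooth = [(-1.0) ** i for i in range(9)]
central_residual = central_difference(sawtooth, dx=0.25)
upwind_residual = upwind_difference(sawtooth, dx=0.25)
```

The central residual vanishes at every interior point even though the sawtooth is far from constant, whereas the upwind difference sees jumps of magnitude 2/Δx.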


We can show that this phenomenon still exists when considering a DG approach to solving the problem. From now on, we suppose the polynomial basis to be the Legendre polynomials $P = \{P_0, P_1, \cdots, P_p\}$. These polynomials satisfy the following properties

$$P_i(-1) = (-1)^i, \qquad P_i(1) = 1, \qquad \int_{-1}^{1} P_i P_j \, dx = \frac{2}{2i+1}\,\delta_{ij}$$

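For linear advection, the DG method is

$$\frac{d\mathbf{u}}{dt} = M^{-1}\left(-a S \mathbf{u} + f_{CL}\cdot\Phi(-1) - f_{CR}\cdot\Phi(1)\right) = R_{DG}(\mathbf{u}), \quad \text{the DG residual}$$

Therefore,

$$R_{DG}(\mathbf{u}) = 0 \;\Leftrightarrow\; a S \mathbf{u} = f_{CL}\cdot\Phi(-1) - f_{CR}\cdot\Phi(1) \;\Leftrightarrow\; \forall i,\ \int_{-1}^{1} a\,u_h'\,P_i\,dx = f_{CL}\cdot P_i(-1) - f_{CR}\cdot P_i(1)$$

In particular, $u_h' \in \mathbb{R}_{p-1}[X]$, so for $i = p$

$$f_{CL}\cdot P_p(-1) - f_{CR}\cdot P_p(1) = 0$$

leading to

$$f_{CR} = (-1)^p f_{CL}$$

It follows for other i:

$$\int_{-1}^{1} a\,u_h'\,P_i\,dx = f_{CL}\cdot P_i(-1) - f_{CR}\cdot P_i(1) = f_{CL}(-1)^i - f_{CR} = \left[(-1)^i - (-1)^p\right] f_{CL}$$

Eventually, we have the condition

$$R_{DG}(\mathbf{u}) = 0 \;\Leftrightarrow\; \forall i,\ \int_{-1}^{1} a\,u_h'\,P_i\,dx = \left[(-1)^i - (-1)^p\right] f_{CL}$$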


Note. If we use a fully upwind flux, then

$$a > 0 \Rightarrow f_{CR} = 0, \qquad a < 0 \Rightarrow f_{CL} = 0$$

In both cases, since $f_{CR} = (-1)^p f_{CL}$, we have $f_{CR} = f_{CL} = 0$. It follows that for all i, $\int_{-1}^{1} a\,u_h'\,P_i\,dx = 0$, implying that u is constant. As it does for finite volumes and finite differences, upwinding prevents the odd/even phenomenon.

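It is now possible to find an exact expression of the spurious modes. Suppose $f_{CL} \neq 0$ (no full upwinding, $0 \le \alpha < 1$ in equation 1.6) and $R_{DG}(\mathbf{u}) = 0$. Suppose $u_h'$ takes the form

$$u_h' = \sum_{i=0}^{p-1} \gamma_i P_i \qquad (\gamma_p = 0)$$

As a consequence,

$$\forall i \le p, \quad \gamma_i\,\frac{2a}{2i+1} = \left[(-1)^i - (-1)^p\right] f_{CL}$$

If p is even, then

$$\gamma_i = \begin{cases} 0, & i \text{ is even} \\ -\dfrac{2i+1}{a}\, f_{CL}, & i \text{ is odd} \end{cases}$$

and if p is odd,

$$\gamma_i = \begin{cases} \dfrac{2i+1}{a}\, f_{CL}, & i \text{ is even} \\ 0, & i \text{ is odd} \end{cases}$$

Recalling the following property of the Legendre polynomials

$$\frac{d}{dx}P_{n+1} = (2n+1)P_n + (2(n-2)+1)P_{n-2} + (2(n-4)+1)P_{n-4} + \cdots$$

we conclude

$$R_{DG}(\mathbf{u}^k) = 0 \;\Leftrightarrow\; u^k_h = (-1)^{p+1}\,\frac{f_{CL}}{a}\,P_p + \lambda, \quad \lambda \in \mathbb{R}$$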
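The Legendre derivative property used in the last step can be checked exactly for the first few degrees. The sketch below is an illustration and not part of the original derivation: it builds the polynomials from the standard three-term recurrence with rational arithmetic, and the function names are chosen here.

```python
from fractions import Fraction

def legendre_polys(n_max):
    """Coefficient lists (lowest degree first) of P_0 .. P_n_max, built from
    the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]                 # x * P_n
        prev = polys[n - 1] + [Fraction(0)] * 2         # P_{n-1}, zero padded
        polys.append([((2 * n + 1) * a - n * b) / (n + 1)
                      for a, b in zip(x_pn, prev)])
    return polys[: n_max + 1]

def derivative(coeffs):
    """Differentiate a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def identity_holds(n):
    """Check d/dx P_{n+1} = (2n+1) P_n + (2(n-2)+1) P_{n-2} + ... exactly."""
    polys = legendre_polys(n + 1)
    rhs = [Fraction(0)] * (n + 1)
    for k in range(n, -1, -2):
        for j, c in enumerate(polys[k]):
            rhs[j] += (2 * k + 1) * c
    return derivative(polys[n + 1]) == rhs

checked = [identity_holds(n) for n in range(8)]
```

Because the arithmetic is exact, the comparison is an equality of coefficients rather than a tolerance test.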


Example - Central scheme, p = 3

Consider the case $a = 1$, $f_{CL} = +1$, $f_{CR} = -1$ and the solution $u = P_3$ (plotted on $[-1, 1]$).

Example - Central scheme, p = 4

Consider the case $a = 1$, $f_{CL} = \pm 1$, $f_{CR} = \pm 1$ and the solution $u = \mp P_4$ (plotted on $[-1, 1]$).

Appendix B

Order of finite volume KEP scheme

We show that a large family of smooth symmetric and consistent mean operators can be

used to generate an infinite number of 2nd order compact finite volume schemes on uniform

structured grids. As a corollary, a simple proof is given to confirm that Jameson’s Kinetic

Energy Conserving scheme is indeed of second order accuracy in space. In a first part,

we introduce the mean operators considered and describe some of their properties. In a

second part, application to finite volumes is shown. We conclude this appendix with the

application of these concepts to Jameson’s kinetic energy conserving scheme for the Euler

equations.

B.1 Mean functions

M is a mean function such that

$$M(u, v) = M(v, u), \qquad M(u, u) = u \tag{B.1}$$

We also assume that M is $C^\infty$.

Proposition 1. If M is a mean operator defined as above, then $\frac{\partial M}{\partial u}(u, u) = \frac{1}{2}$.

Proof. Let $\psi : (x, y) \mapsto \psi(x, y)$ be a differentiable function of two variables such that

i- ψ is symmetric: ψ(x, y) = ψ(y, x)

ii- ψ is consistent: ψ(x, x) = x

Now define $\phi : x \mapsto (x, x)$ and $g : x \mapsto \psi \circ \phi(x)$. It follows immediately that $g : x \mapsto x$ and that $\frac{dg}{dx}(x) = 1$. Also,

$$\frac{dg}{dx}(x) = \nabla\phi(x) \times \nabla\psi(\phi(x)) = \frac{\partial\psi}{\partial x}(x, x) + \frac{\partial\psi}{\partial y}(x, x)$$

Since ψ is symmetric, we have $\frac{\partial\psi}{\partial x}(x, x) = \frac{\partial\psi}{\partial y}(x, x) = \frac{1}{2}$.

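Proposition 2. If $M_1, M_2, \ldots, M_k$ are means defined as in (B.1), and if $\sum \alpha_k = 1$, then $M = \sum \alpha_k M_k$ is also such a mean.

The most classical mean operators are given by the $\ell^p$ norm on vectors of 2 elements. For example,

$$M_p(u, v) = \frac{1}{2^p}\left(u^{1/p} + v^{1/p}\right)^p, \quad p \in \mathbb{R}$$

$$M_1(u, v) = \frac{1}{2}(u + v)$$

$$M_2(u, v) = \frac{1}{4}\left(\sqrt{u} + \sqrt{v}\right)^2$$

$$2M_2(u, v) - M_1(u, v) = \sqrt{uv}$$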
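These properties are easy to verify numerically. The sketch below is only an illustration with sample values chosen here: it checks symmetry and consistency for the three means above and estimates the derivative on the diagonal by a central finite difference.

```python
import math

def m1(u, v):
    """Arithmetic mean M_1."""
    return 0.5 * (u + v)

def m2(u, v):
    """M_2 = (sqrt(u) + sqrt(v))^2 / 4, for positive arguments."""
    return 0.25 * (math.sqrt(u) + math.sqrt(v)) ** 2

def geometric(u, v):
    """Affine combination 2 M_2 - M_1 (weights sum to 1), equal to sqrt(u v)."""
    return 2.0 * m2(u, v) - m1(u, v)

def diagonal_derivative(mean, u, eps=1e-6):
    """Central finite-difference estimate of dM/du at the point (u, u)."""
    return (mean(u + eps, u) - mean(u - eps, u)) / (2.0 * eps)

derivatives = [diagonal_derivative(mean, 2.0) for mean in (m1, m2, geometric)]
```

All three estimates are close to 1/2, as stated by Proposition 1, and the geometric mean illustrates Proposition 2 with weights 2 and −1.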

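When using such a mean:

$$M(v, w + \Delta w) = M(v, w) + \Delta w \cdot \frac{\partial M}{\partial w}(v, w) + \frac{\Delta w^2}{2} \cdot \frac{\partial^2 M}{\partial w^2}(v, w) + \ldots$$

$$\Delta w = \Delta x \cdot \frac{\partial w}{\partial x} + \frac{\Delta x^2}{2} \cdot \frac{\partial^2 w}{\partial x^2} + \ldots$$

and using the notation

$$x_{i+1} = x_i + \Delta x, \qquad u_i = u(x_i), \qquad u'_i = \frac{\partial u}{\partial x}(x_i)$$

we then have

$$M(u_i, u_{i+1}) = M(u_i, u_i) + \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots = u_i + \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots \equiv u_{i+\frac{1}{2}}$$

Note that because of the symmetry of M we have

$$M(u_{i-1}, u_i) = M(u_i, u_{i-1}) = u_i - \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots \equiv u_{i-\frac{1}{2}}$$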

B.2 Use of mean operators in finite volumes approximation

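For $u_{i+\frac{1}{2}}$ and $u_{i-\frac{1}{2}}$ defined by

$$u_{i+\frac{1}{2}} = u_i + \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots$$

$$u_{i-\frac{1}{2}} = u_i - \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots$$

we have a second order approximation to the first derivative:

$$\frac{1}{\Delta x}\left(u_{i+\frac{1}{2}} - u_{i-\frac{1}{2}}\right) = u'_i + O(\Delta x^2)$$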
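The second order behaviour can be observed numerically with a mean other than the arithmetic one. The following sketch is an illustration under assumptions chosen here: the geometric mean, the test function u(x) = 2 + sin(x), the evaluation point x = 0.3 and the two step sizes are all arbitrary.

```python
import math

def geometric_mean(u, v):
    return math.sqrt(u * v)

def face_difference(u, x, dx, mean):
    """(u_{i+1/2} - u_{i-1/2}) / dx with face values built from the mean."""
    right = mean(u(x), u(x + dx))
    left = mean(u(x - dx), u(x))
    return (right - left) / dx

def u(x):
    return 2.0 + math.sin(x)      # smooth and strictly positive

def error(dx, x=0.3):
    """Absolute error against the exact derivative cos(x)."""
    return abs(face_difference(u, x, dx, geometric_mean) - math.cos(x))

# Halving dx should divide the error by about 4 for a second order scheme.
observed_order = math.log(error(0.1) / error(0.05), 2)
```

The symmetry of the mean is what cancels the Δx² terms of the two face values in the difference.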

Product of approximates

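Let

$$u_{i\pm\frac{1}{2}} = u_i \pm \frac{\Delta x}{2} \cdot u'_i + \beta_i \Delta x^2 + \ldots$$

$$v_{i\pm\frac{1}{2}} = v_i \pm \frac{\Delta x}{2} \cdot v'_i + \alpha_i \Delta x^2 + \ldots$$

Then

$$u_{i\pm\frac{1}{2}} \cdot v_{i\pm\frac{1}{2}} = u_i v_i \pm \frac{\Delta x}{2}\left(u'_i v_i + u_i v'_i\right) + \left(u_i \alpha_i + v_i \beta_i + \frac{1}{4} u'_i v'_i\right)\Delta x^2 + \ldots$$

$$= (uv)_i \pm \frac{\Delta x}{2}(uv)'_i + \gamma_i \Delta x^2 + \ldots \equiv (uv)_{i\pm\frac{1}{2}}$$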

Proposition 3. The product of second order approximates (obtained by the means defined

above for example) is a second order approximate of the product.

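It is then evident by induction that if we have n variables $u^1, u^2, \ldots, u^n$ and n approximations defined by $u^k_{i+\frac{1}{2}} = M_k(u^k_i, u^k_{i+1})$, where $M_1, M_2, \ldots, M_n$ are means that satisfy relation (B.1), then:

$$u^1_{i+\frac{1}{2}}\, u^2_{i+\frac{1}{2}} \cdots u^n_{i+\frac{1}{2}} = \left(u^1_i u^2_i \cdots u^n_i\right) + \frac{\Delta x}{2}\left(u^1 u^2 \cdots u^n\right)'_i + \Gamma_i \Delta x^2 + \ldots \equiv \left(u^1 u^2 \cdots u^n\right)_{i+\frac{1}{2}}$$

and

$$\frac{1}{\Delta x}\left(\left(u^1 u^2 \cdots u^n\right)_{i+\frac{1}{2}} - \left(u^1 u^2 \cdots u^n\right)_{i-\frac{1}{2}}\right) = \left(u^1 u^2 \cdots u^n\right)'_i + O(\Delta x^2)$$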

B.3 Application to the FV-KEP Scheme

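The one dimensional Euler equations are given by:

$$\frac{\partial}{\partial t}\begin{pmatrix} \rho \\ \rho u \\ \rho E \end{pmatrix} + \frac{\partial}{\partial x}\begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u H \end{pmatrix} = 0$$

and the flux is defined by

$$f_{i+\frac{1}{2}} = \begin{pmatrix} (\rho u)_{i+\frac{1}{2}} \\ (\rho u^2 + p)_{i+\frac{1}{2}} \\ (\rho u H)_{i+\frac{1}{2}} \end{pmatrix}$$

The KEP scheme requires that $(\rho u^2)_{i+\frac{1}{2}} = (\rho u)_{i+\frac{1}{2}}\, u_{i+\frac{1}{2}}$, where

$$u_{i+\frac{1}{2}} = \frac{1}{2}(u_i + u_{i+1}) = M_1(u_i, u_{i+1}), \quad M_1 \text{ defined in part I.}$$

Therefore we can take any second order approximation for $(\rho u)_{i+\frac{1}{2}}$ and the scheme will be second order accurate (cf. product of approximates above).

For example, one can take

$$(\rho u)_{i+\frac{1}{2}} = M(\rho_i u_i, \rho_{i+1} u_{i+1}) \quad \text{or} \quad (\rho u)_{i+\frac{1}{2}} = M(\rho_i, \rho_{i+1}) \cdot N(u_i, u_{i+1})$$

where M and N are means defined in part I.

Consistently, we can define $(\rho u H)_{i+\frac{1}{2}}$ as $(\rho u)_{i+\frac{1}{2}}\, M(H_i, H_{i+1})$.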
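One admissible instance of this flux can be sketched as follows. It is an illustration, not the implementation used in this work: all means are taken to be arithmetic, the interface pressure is assumed to be the arithmetic mean of the two cell pressures, and an ideal gas with γ = 1.4 is assumed for the total enthalpy.

```python
def arithmetic_mean(a, b):
    return 0.5 * (a + b)

def total_enthalpy(rho, u, p, gamma=1.4):
    """H = E + p/rho with E = p/((gamma-1) rho) + u^2/2 (ideal gas assumption)."""
    return gamma * p / ((gamma - 1.0) * rho) + 0.5 * u * u

def kep_flux(left, right, gamma=1.4):
    """Interface flux from primitive states (rho, u, p) on each side."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    u_face = arithmetic_mean(u_l, u_r)                 # M_1, required by KEP
    rho_u = arithmetic_mean(rho_l, rho_r) * u_face     # product of two means
    p_face = arithmetic_mean(p_l, p_r)                 # assumption made here
    h_face = arithmetic_mean(total_enthalpy(rho_l, u_l, p_l, gamma),
                             total_enthalpy(rho_r, u_r, p_r, gamma))
    # Momentum flux uses (rho u^2)_face = (rho u)_face * u_face.
    return (rho_u, rho_u * u_face + p_face, rho_u * h_face)

state = (1.2, 0.5, 1.0)
uniform_flux = kep_flux(state, state)
```

For identical left and right states the function returns the exact Euler flux, which is the consistency property required of the means.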

Isentropic vortex test case

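Let us consider an isentropic vortex test case for which an exact solution is known and given by

$$u = 1 - \beta e^{(1-r^2)}\,\frac{y - y_0}{2\pi}$$

$$v = \beta e^{(1-r^2)}\,\frac{x - x_0}{2\pi}$$

$$\rho = \left[1 - \left(\frac{\gamma - 1}{16\gamma\pi^2}\right)\beta^2 e^{2(1-r^2)}\right]^{\frac{1}{\gamma - 1}}$$

$$p = \rho^\gamma$$

where $r = \sqrt{(x - t - x_0)^2 + (y - y_0)^2}$, $x_0 = 5$, $y_0 = 0$, $\beta = 5$ and $\gamma = 1.4$. The considered domain is $(x, y) \in [0, 10] \times [-5, 5]$. The KEP scheme was implemented using $(\rho u)_{i+\frac{1}{2}} = M_1(\rho_i, \rho_{i+1}) \times M_1(u_i, u_{i+1})$.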
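For reference, the exact solution can be evaluated pointwise with a few lines of code. The sketch below transcribes the formulas as printed above (in particular, v is written with x − x0); the function name and default arguments are choices made for this illustration.

```python
import math

def isentropic_vortex(x, y, t, x0=5.0, y0=0.0, beta=5.0, gamma=1.4):
    """Return (u, v, rho, p) of the isentropic vortex at point (x, y), time t."""
    r2 = (x - t - x0) ** 2 + (y - y0) ** 2
    bump = beta * math.exp(1.0 - r2)                   # beta * e^(1 - r^2)
    u = 1.0 - bump * (y - y0) / (2.0 * math.pi)
    v = bump * (x - x0) / (2.0 * math.pi)
    rho = (1.0 - (gamma - 1.0) / (16.0 * gamma * math.pi ** 2) * bump ** 2) \
        ** (1.0 / (gamma - 1.0))
    p = rho ** gamma
    return u, v, rho, p

center = isentropic_vortex(5.0, 0.0, 0.0)     # vortex core at t = 0
corner = isentropic_vortex(0.0, -5.0, 0.0)    # corner of the domain
```

At the core the density dips below its free-stream value of 1, while at the corner of the domain the perturbation is negligible.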

As can be seen in table B.1 and in figure B.1, the scheme is indeed of second order. Figure B.2 depicts the exact solution (density and vorticity) at t = 0 and t = 1. Figure B.3 shows the computed solution for various mesh sizes: a coarse mesh (50 × 50 cells, h = 0.2), an intermediate mesh (100 × 100 cells, h = 0.1) and a fine mesh (200 × 200 cells, h = 0.05).

Quantity   h                h/2              h/4              Slope
ρ          3.3661 × 10^−3   8.2139 × 10^−4   2.0429 × 10^−4   2.026
p          3.7658 × 10^−3   9.3822 × 10^−4   2.3441 × 10^−4   2.003

Table B.1: ℓ2-error for various mesh sizes
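The order of accuracy can be read off the table by comparing errors on successive meshes. The sketch below is an illustration using the tabulated values; it computes pairwise orders, which is not necessarily the same fit as the one used for the slope column.

```python
import math

# l2-errors from table B.1 on meshes of size h, h/2 and h/4.
errors = {
    "rho": [3.3661e-3, 8.2139e-4, 2.0429e-4],
    "p": [3.7658e-3, 9.3822e-4, 2.3441e-4],
}

def observed_orders(values):
    """log2 of the error ratio between successive meshes (h halved each time)."""
    return [math.log(coarse / fine, 2) for coarse, fine in zip(values, values[1:])]

orders = {name: observed_orders(values) for name, values in errors.items()}
```

Each refinement divides the error by a factor close to 4, so the pairwise orders stay close to 2 for both density and pressure.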

Figure B.1: Convergence of the error in the ℓ2-norm for ρ and p (error versus h for the isentropic vortex case; curves for density and pressure)


Figure B.2: Exact solution of the isentropic vortex case at t = 0 and t = 1. (a) Density, t = 0; (b) Vorticity, t = 0; (c) Density, t = 1; (d) Vorticity, t = 1.

Figure B.3: Numerical solution at t = 1, mesh size h = 0.2 (panels: mesh detail, density and vorticity for mesh sizes h, h/2 and h/4)

Bibliography

[1] P.Castonguay, P.E. Vincent and A. Jameson, A New Class of High-Order Energy Stable

Flux Reconstruction Schemes for Triangular Elements, Journal of Scientific Comput-

ing, DOI: 10.1007/s10915-011-9505-3, 2011

[2] Cockburn, B., Shu, C., TVB Runge-Kutta local projection discontinuous Galerkin

finite element method for conservation laws II: general framework, Mathematics of

Computation, 186 411-435 1989

[3] Cockburn, B., Lin, S., Shu, C., TVB Runge-Kutta local projection discontinuous

Galerkin finite element method for conservation laws III: one dimensional systems,

Journal of Computational Physics, 84(1) 90-113 1989

[4] Cockburn, B., Shu, C., The Runge-Kutta local projection discontinuous Galerkin finite

element method for conservation laws IV: the multidimensional case, Mathematics of

Computation, 190 545-581 1990

[5] Cockburn, B., Shu, C., The Runge-Kutta local projection P1-discontinuous Galerkin

finite element method for scalar conservation law, RAIRO - Modelisation Mathematique

et Analyse Numerique, 25 337-361 1991

[6] Cockburn, B., Shu, C., Runge-Kutta discontinuous Galerkin methods for convection

dominated problems, Journal of scientific computing, 16(3) 173-261 2001

[7] M. Culbreth, Y. Allaneau and A. Jameson, High-Fidelity Optimization of Flapping

Airfoils and Wings, AIAA paper - Hawaii 2011

[8] Dosanjh, D., Weeks, T., Interaction of a starting vortex as well as a vortex street with

a traveling shock wave AIAA Journal, 3(2):216-223, february 1965


[9] Ellzey, J., Henneke, M., Picone, J., Oran, E., The interaction of a shock with a vortex: Shock distortion and the production of acoustic waves, Physics of Fluids, 7(1):172-184, January 1995

[10] Furst, J., Modelisation numerique d’ecoulements transsoniques avec des schemas TVD

et ENO, PhD Thesis, University of Aix-Marseille II

[11] Wang, Z.J., Gao, H., A unifying lifting collocation penalty formulation including the discontinuous Galerkin, spectral volume/difference methods for conservation laws on mixed grids, Journal of Computational Physics, 228 8161-8186 2009

[12] P. E. Gill, W. Murray and M. A. Saunders, SNOPT An SQP Algorithm for Large Scale

Constrained Optimization, SIAM Review, Vol. 47, No 1, pp. 99-131.

[13] Gottlieb, S., Shu, C.W., Total Variation Diminishing Runge-Kutta Schemes, Mathe-

matics of Computation 221, 73-85 1998

[14] Hesthaven, J.S., Warburton, T., Nodal Discontinuous Galerkin Methods, Algorithms, Analysis and Application, Springer

[15] Huynh, H.T., A Flux Reconstruction Approach to High-Order Schemes Including Dis-

continuous Galerkin Methods, AIAA 2007-4079

[16] Huynh, H.T., A Reconstruction Approach to High-Order Schemes Including Discon-

tinuous Galerkin for Diffusion, AIAA-2009-403

[17] Huynh, H.T., High-Order Methods Including Discontinuous Galerkin by Reconstruc-

tions on Triangular Meshes, AIAA 2011-44

[18] Jameson, A., A Proof of the Stability of the Spectral Difference Method for All Orders

of Accuracy, Journal of Scientific Computing, 45 348-358 2010

[19] Jameson, A., The Construction of Discretely Conservative Finite Volume Schemes that

Also Globally Conserve Energy or Entropy, Stanford University ACL Report 2007-1,

Journal of Scientific Computing, Vol. 34, 2008, pp. 152-187

[20] Jameson, A., Formulation of Kinetic Energy Preserving Conservative Schemes for Gas Dynamics and Direct Numerical Simulation of One-dimensional Viscous Compressible Flow in a Shock Tube Using Entropy and Kinetic Energy Preserving Schemes, Stanford University ACL Report 2007-2, Journal of Scientific Computing, Vol. 34, 2008, pp. 188-208

[21] A. Jameson, W. Schmidt and E Turkel, Numerical Solutions of the Euler Equations

by Finite Volume Methods Using Runge-Kutta Time-Stepping Schemes, AIAA paper

81-1259, June 1981

[22] Jameson, A., Vincent, P. and Castonguay, P., On the Non-Linear Stability of Flux

Reconstruction Schemes, Journal of Scientific Computing, doi:10.1007/s10915-011-

9490-6, April 2011

[23] Jiang, G.-S, Shu, C.-W., Efficient implementation of weighted ENO schemes, Journal

of Computational Physics, 126(1):202-228, 1996

[24] Kopriva, D.-A, Kolias, J.-H., A conservative staggered-grid Chebyshev multidomain

method for compressible flows, Journal of Computational Physics, 125 244-261 1996

[25] Leffell, J., Pulliam, T., Grid and Time Step Requirements to Accurately & Efficiently

Resolve Flow around a Rigid Flapping Airfoil using OVERFLOW, AIAA paper 2011-

573

[26] Liang, C., Ou, K., Premasuthan, S., Jameson, A., Wang, Z. J., High-order accurate

simulations of unsteady flow past plunging and pitching airfoils, Computers and Fluids,

doi:10.1016/j.compfluid.2010.09.005, 2010

[27] Mirels, H., Test Time in Low-Pressure Shock Tubes, The physics of Fluids, vol. 6-9,

p. 1201-1214, 1963

[28] Mirels, H., Flow Nonuniformities in Shock Tubes Operating at Maximum Test Times, The Physics of Fluids, Vol. 9-10, p. 1907-1912, 1966

[29] K. Ou, P. Castonguay and A. Jameson, 3D Wing Simulation with High-Order Spectral

Difference Method on Deformable Mesh, AIAA Paper, Orlando 2011

[30] P.-O. Persson, D. J. Willis and J. Peraire, The Numerical Simulation of Flapping Wings

at Low Reynolds Numbers, AIAA paper 2010-724

[31] Picone, J., Oran, E., Boris, J., Young, T., Theory of vorticity generation by shock wave

and flame interaction, AIAA, 94:429-448, 1985


[32] K. D. Jones, C. M. Dohring and M. F. Platzer, Experimental and Computational

Investigation of the Knoller-Betz Effect, AIAA Journal, Vol. 36, No. 7, July 1998

[33] K. D. Jones, T. C. Lund and M. F. Platzer, Experimental and Computational Inves-

tigation of Flapping Wing Propulsion for Micro Air Vehicles, Progress in Astronautics

and Aeronautics, Vol. 195, pp. 307-339. 2001

[34] Reed, W., Hill, T., Triangular mesh methods for the neutron transport equation, Los

Alamos Report LA-UR-73-479, 1973

[35] Ribner, H., Cylindrical sound wave generated by shock-vortex interaction AIAA Jour-

nal, 23(11):1708-1715, november 1985

[36] Roe, P.L., Approximate Riemann Solvers, Parameter Vectors and Difference Schemes, Journal of Computational Physics, Vol. 43, p. 357-372, 1981

[37] R. C. Swanson and E. Turkel, Artificial Dissipation and Central Difference Schemes

for the Euler and Navier-Stokes Equations, AIAA paper 87-1107

[38] Vincent, P., Castonguay, P., Jameson, A., A new class of High-Order Energy Stable

Flux Reconstruction Schemes, Journal of Scientific Computing, DOI: 10.1007/s10915-

010-9420-z 2010

[39] Liu, Y., Vinokur, M., Wang, Z.J., Discontinuous Spectral Difference Method for Con-

servation Laws on Unstructured Grids, Journal of Computational Physics 216 780-801

2006

[40] Wang, Z. J., Liu, Y., May, G., Jameson, A., Spectral difference method for unstructured

grids II: Extension to the Euler equations, Journal of Scientific Computing, 32 45-71

2007

[41] Wang, Z. J., Spectral (finite) volume method for conservation laws on unstructured

grids: basic formulation, Journal of Computational Physics, 178 210-251 2002
