
LECTURE NOTES ON CALCULUS OF VARIATIONS AND PARTIAL DIFFERENTIAL EQUATIONS (AN INTRODUCTORY COURSE)

MARGARIDA BAIA, DM, IST; LECTURE 1-14 (2S-17/18); DRAFT

Lecture 1.

1. Introduction

The calculus of variations is a mathematical discipline that may most simply be described as a general theory for studying extreme and critical points.

In this introductory course we will focus on the origins of the calculus of variations: the study of the extrema¹ of real-valued functionals defined on infinite-dimensional function (vector) spaces.² Namely, our goal is to study what is historically known as the fundamental problem of the calculus of variations (see Section 1.2). We will see³ that studying extrema of functionals generalizes the problem of studying extrema of functions of several variables in differential calculus⁴ (the variables will themselves be functions, and we will be seeking the extrema of “functions of functions”). However, this study is more complex for functionals and requires new tools beyond those of differential calculus: functionals are generally defined by definite integrals involving functions (and their derivatives), and these functions are often subject to boundary conditions, other constraints and smoothness requirements, depending on the problem. One example of such a functional is

\[ I(u) = \int_a^b \sqrt{1 + [u'(x)]^2}\, dx, \]

defined for all continuously differentiable functions $u = u(x)$ on the interval $[a, b] \subset \mathbb{R}$; its value at each such function $u$ is the length of the graph of $u$. One could be interested, for instance, in finding the minimum of $I$ restricted to some subclass of functions.
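As a small numerical illustration (a sketch, not part of the original notes), the length functional can be approximated with the trapezoidal rule. The name `arc_length`, the interval $[0, 1]$ and the two sample curves are choices made here for illustration only.

```python
import numpy as np

def arc_length(du, a, b, n=10_001):
    """Approximate I(u) = int_a^b sqrt(1 + u'(x)^2) dx with the trapezoidal rule.

    `du` is the derivative u' given as a vectorized callable (an assumption of
    this sketch, so that no numerical differentiation is needed).
    """
    x = np.linspace(a, b, n)
    f = np.sqrt(1.0 + du(x) ** 2)
    h = (b - a) / (n - 1)
    return h * (0.5 * f[0] + f[1:-1].sum() + 0.5 * f[-1])

# Two curves joining (0, 0) and (1, 1): the straight line u = x and the parabola u = x^2.
length_line = arc_length(lambda x: np.ones_like(x), 0.0, 1.0)
length_parabola = arc_length(lambda x: 2.0 * x, 0.0, 1.0)
```

Among these two admissible curves the straight line gives the smaller value of $I$, in line with the geodesic exercise of Section 1.4.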

The reason for naming this subject “calculus of variations” has to do with the “technique of variation” that is employed to obtain necessary conditions for the existence of extreme points of a given functional, as we will study in Subsection 2.4.1 (for the study of critical points that are not extrema, as well as related existence questions for nonlinear PDE, we refer to e.g. Evans [22], Rabinowitz [43], Struwe [49], Willem [52]). The calculus of variations has roots in many areas, from geometry to optimization to mechanics, and we refer to Section 1.5 for historical surveys on this subject. Its enormous growth (rigorous foundation and understanding) turned into a satisfactory theory only around the first half of the 20th century, which makes any attempt to describe it completely as a single theory most likely an impossible mission.

¹ Maximum/minimum, local or global. More generally, we could study critical points.

² A functional is a mapping from a vector space into its scalar (or coefficient) field; see e.g. Rudin [45]. Other references: Brezis [13] and Kreyszig [39].

³ These notes are based on the given references and do not claim to be an original work. Every lecture corresponds to a week of classes. Reports of typos or mistakes are welcome by email.

⁴ See e.g. Apostol [4, 5].

As for applications, we have all had the opportunity to observe Nature’s and Humanity’s propensity to “minimize” effort and “maximize” benefit. Indeed, variational principles involving this kind of functionals or energies (for instance Fermat’s (or least time) principle, the least-action principle, the law of maximal entropy, Hamilton’s principle) form one of the most wide-ranging means of formulating mathematical models governing the equilibrium configurations of physical systems. In economics, management and finance, problems such as the minimization of cost, the maximization of profit or the minimization of investment risk are of fundamental interest. As examples from other fields, we mention: Aeronautics (maximization of the lift of an airplane wing; optimum flight profiles from the point of view of kinetic energy and of energy/fuel consumption); Mechanical engineering (maximization of the strength of a column, a dam or an arch); Electrical engineering (optimal design of electronic filters and reflector antennas); Sport and related equipment design (minimization of the air resistance of a bicycle helmet, optimum shape of a ski, optimum shape of a boat hull); Computer Vision (image segmentation, image morphing and image denoising models are based on minimization problems).

The calculus of variations is also used in other areas of mathematics, for instance differential geometry (geodesic, minimal surface and isoperimetric problems) and differential equations (study of the existence of solutions of ordinary and partial differential equations, even in some situations where these cannot be found analytically, such as the “three-body problem”, see Section 1.5), and it provides a basis for optimal control theory, with many applications to the problems mentioned above. We refer to Section 1.5 for further reading regarding applications.

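1.1. Notation: This section contains the general notation used during the course (unless otherwise specified).

- A typical point in $\mathbb{R}^N$ ($N$-dimensional real Euclidean space equipped with a Cartesian system of coordinates with origin at $0_{\mathbb{R}^N}$) is $x = (x_1, ..., x_N)$.

- Euclidean norm in $\mathbb{R}^N$: $||x||_2 = \sqrt{\sum_{i=1}^N x_i^2}$ (most of the time denoted by $||\cdot||$).

- $\Omega \subset \mathbb{R}^N$ represents an open set.

- Given $F : \mathbb{R}^3 \to \mathbb{R}^3$, $F = (F_1, F_2, F_3)$,
\[ \nabla \cdot F \equiv \operatorname{div} F := \frac{\partial F_1}{\partial x_1} + \frac{\partial F_2}{\partial x_2} + \frac{\partial F_3}{\partial x_3} \]
and
\[ \nabla \times F = \operatorname{curl}(F) := \left( \frac{\partial F_3}{\partial x_2} - \frac{\partial F_2}{\partial x_3},\ \frac{\partial F_1}{\partial x_3} - \frac{\partial F_3}{\partial x_1},\ \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \right). \]

- $C(\Omega)$ and $C(\Omega; \mathbb{R}^d)$ stand for the sets of continuous functions $u : \Omega \to \mathbb{R}$ and $u : \Omega \to \mathbb{R}^d$, respectively.

- $C(\overline{\Omega})$ is the set of continuous functions $u : \Omega \to \mathbb{R}$ which can be continuously extended to $\overline{\Omega}$.

- A vector $\alpha = (\alpha_1, ..., \alpha_N)$ with integer entries $\alpha_i \ge 0$ is called a multi-index of order $|\alpha| := \sum_{i=1}^N \alpha_i$.

- $C^k(\Omega)$ is the set of functions $u : \Omega \to \mathbb{R}$ which have all partial derivatives $D^\alpha u := \dfrac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}$, with $|\alpha| \le k$, continuous.

- $C^k(\overline{\Omega})$ is the subset of $C^k(\Omega)$ of those functions whose derivatives up to order $k$ can be extended continuously to $\overline{\Omega}$.

- $C^\infty(\Omega) = \bigcap_{k=0}^\infty C^k(\Omega)$ (infinitely differentiable functions).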


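- $C_c^\infty(\Omega)$ stands for those functions in $C^\infty(\Omega)$ with compact support, i.e. $C_c^\infty(\Omega) = \{ u \in C^\infty(\Omega) : u(x) = 0\ \forall x \in \Omega \setminus K, \text{ for some compact set } K \subset \Omega \}$.

- Obvious adaptation for $C(\overline{\Omega}; \mathbb{R}^d)$, $C^k(\Omega; \mathbb{R}^d)$, $C^k(\overline{\Omega}; \mathbb{R}^d)$, $C^\infty(\Omega; \mathbb{R}^d)$, $C_c^\infty(\Omega; \mathbb{R}^d)$.

- We write $A = (a_{i,j})_{i=1,...,d;\ j=1,...,N}$ to mean a $d \times N$ matrix (or, by identification, an element in $\mathbb{R}^{d \times N}$) with $(i,j)$th entry $a_{i,j} \in \mathbb{R}$.

- $\operatorname{tr}(A)$ and $\operatorname{cof}(A)$ stand for the trace and cofactor matrix of $A \in \mathbb{R}^{d \times N}$.

- If $A = (a_{i,j})_{i=1,...,d;\ j=1,...,N} \in \mathbb{R}^{d \times N}$ then we denote $|A| = \left( \sum_{i=1}^d \sum_{j=1}^N a_{i,j}^2 \right)^{1/2}$.

- $Du \equiv \nabla u = \left( \dfrac{\partial u_i}{\partial x_j} \right)_{i=1,...,d;\ j=1,...,N} \in \mathbb{R}^{d \times N}$ for $u : \Omega \to \mathbb{R}^d$, $u = (u_1, ..., u_d)$ (gradient vector if $d = 1$).

- $D^2 u = \left( \dfrac{\partial^2 u}{\partial x_i \partial x_j} \right)_{i,j=1,...,N}$ (Hessian matrix).

- $\Delta u = \sum_{i=1}^N \dfrac{\partial^2 u}{\partial x_i^2}$, $u : \Omega \to \mathbb{R}$ (Laplacian of $u$).

- a.e. (almost everywhere with respect to the Lebesgue measure in $\mathbb{R}^N$).

- By $\infty$ we mean $+\infty$.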

1.2. Model variational problem and connection to the study of PDE.

As mentioned in the introduction, in this course we will focus mainly on what is historically known as the fundamental (or simplest) problem of the calculus of variations, i.e. on minimization problems (also known as variational problems) of the form

\[ \inf_{u \in \mathcal{A}} I(u) \tag{1.1} \]

for integral functionals $I : X \to \mathbb{R}$ (or, possibly, $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$) defined by

\[ I(u) = \int_\Omega f(x, u(x), \nabla u(x))\, dx \]

(typically referred to as a variational integral⁵), where $\Omega \subset \mathbb{R}^N$ is an open and bounded set, the integrand $f : \Omega \times \mathbb{R}^d \times \mathbb{R}^{d \times N} \to \mathbb{R}$, $f = f(x, u, \xi)$ (e.g. a density; usually called the Lagrangian of $I$), is a continuous function⁶, $X$ is a space of functions $u : \Omega \to \mathbb{R}^d$ and $\mathcal{A}$ (the admissible class of functions) consists of functions $u \in X$ possibly satisfying suitable boundary conditions and/or further constraints.

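Solving problem (1.1) means minimizing the integral $I(u)$ among all functions $u \in \mathcal{A}$, that is, finding $u_0 \in \mathcal{A}$ such that

\[ I(u_0) \le I(u), \quad \forall u \in \mathcal{A}. \]

We write $m := \inf_{u \in \mathcal{A}} I(u)$ for the infimum of the integral (its minimal value, when it is attained). If a minimizer $u_0$ exists we will write $I(u_0) = \min_{u \in \mathcal{A}} I(u)$ (or $u_0 = \operatorname{argmin}_{u \in \mathcal{A}} I(u)$).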

The first attempts to solve problem (1.1) date from the beginnings of infinitesimal calculus, for the case $N = d = 1$, and are based on the study of the Euler-Lagrange equations associated with $I$ (see Section 2 and Section 2.4).

⁵ Also referred to as the action in the setting of optimal control theory. More general integrals, depending on higher-order derivatives, could be considered.

⁶ More general integrands could be considered, e.g. Carathéodory functions, but the analysis is technically harder.


Contrary to what happens in the study of minimum problems involving functions of several variables, a major difficulty here is the choice of the class $\mathcal{A}$, which depends on the choice of $X$. A natural choice for this type of problem seems to be $X = C^1(\overline{\Omega}; \mathbb{R}^d)$ (so that $I$ is well defined) but unfortunately, as we will see, this space is not the best choice for guaranteeing the existence of a minimizer. As a simple example, consider

\[ I(u) = \int_0^1 \left( [u'(x)]^2 - 1 \right)^2 dx \]

and $\mathcal{A} = \{ u \in C^1([0, 1]) : u(0) = 0,\ u(1) = 0 \}$.

In general, we will work within the framework of Sobolev spaces (see Subsection ?? for a short review of this class of spaces, if needed).⁷
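To see numerically why this example is delicate (a sketch under assumptions made here, not a construction taken from the notes): $I \ge 0$, and $I(u) = 0$ would force $|u'| = 1$ everywhere, which a $C^1$ function vanishing at both endpoints cannot satisfy, since a continuous $u'$ would then be constantly $1$ or constantly $-1$. Still, $I$ can be made arbitrarily small within $\mathcal{A}$. The family below, defined through $u_K'(x) = \operatorname{clip}(K \sin(2\pi x), -1, 1)$, and the name `energy` are illustrative choices.

```python
import numpy as np

def energy(K, n=200_001):
    """Return (I(u_K), u_K(1)) for the C^1 function u_K with u_K(0) = 0 and
    derivative u_K'(x) = clip(K*sin(2*pi*x), -1, 1), using the trapezoidal rule."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    du = np.clip(K * np.sin(2.0 * np.pi * x), -1.0, 1.0)   # continuous, so u_K is C^1
    integrand = (du ** 2 - 1.0) ** 2
    I = h * np.sum(0.5 * (integrand[1:] + integrand[:-1]))
    u_end = h * np.sum(0.5 * (du[1:] + du[:-1]))           # u_K(1) = int_0^1 u_K' dx
    return I, u_end

# Steeper transitions (larger K) give smaller energy, but never exactly zero.
values = {K: energy(K) for K in (10.0, 100.0, 1000.0)}
```

Each $u_K$ satisfies the boundary conditions (up to rounding), and the computed energies decrease toward $0$ as $K$ grows, which suggests that the infimum $0$ is approached but not attained in $\mathcal{A}$.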

Connection to PDE: Assuming that we are able to find a minimizer $u_0 \in \mathcal{A}$ for problem (1.1), let us give the idea behind its application to the study of (stationary) partial differential equations. Imagine that we want to solve (in the same class $\mathcal{A}$) one of these equations which, for simplicity, we write in the abstract form

\[ L(u) = 0, \tag{1.2} \]

where $L(\cdot)$ denotes a given, possibly nonlinear, differential operator and $u$ is the unknown. For instance, we could have

\[ L(u) = \Delta u \quad \text{if studying the Laplace equation } \Delta u = 0. \]

Assume that, in “some sense” to be explained later, the operator $L(\cdot)$ is the “derivative” of $I(\cdot)$. Then, as we will see (for the first time in Subsection 2.4.1), finding the solutions of (1.2) reduces to studying the critical points of $I(\cdot)$. Consequently, $u_0$ will solve (1.2).
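A purely formal computation may help to fix ideas (a sketch only: smoothness of $u$ and test functions $\varphi \in C_c^\infty(\Omega)$ are assumed here, and the factor $\frac12$ is chosen for convenience; the rigorous version is the subject of Subsection 2.4.1). For $I(u) = \frac12 \int_\Omega |\nabla u|^2\, dx$:

```latex
\begin{align*}
\frac{d}{d\varepsilon}\, I(u + \varepsilon\varphi)\Big|_{\varepsilon = 0}
  &= \frac{d}{d\varepsilon}\, \frac12 \int_\Omega
     |\nabla u + \varepsilon \nabla\varphi|^2 \, dx \,\Big|_{\varepsilon = 0} \\
  &= \int_\Omega \nabla u \cdot \nabla\varphi \, dx \\
  &= -\int_\Omega (\Delta u)\, \varphi \, dx
     \qquad \text{(integration by parts; $\varphi$ vanishes near $\partial\Omega$).}
\end{align*}
```

If this quantity vanishes for every such $\varphi$, then $\Delta u = 0$ in $\Omega$. In this formal sense the Laplace operator is, up to sign, the “derivative” of this functional.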

1.3. Some motivational examples. We start, as a motivation, with some specific examples of minimization problems of the type (1.1). See Subsection 1.5 for other examples and references.

1.3.1. Dido’s isoperimetric problem.

Origins of the problem (adapted from the problem/legend described in Virgil’s epic Aeneid): Given a rope (oxhide thread) of fixed length and a curve (part of the north African shoreline), determine the optimal path along which to place the rope so that the area (of land) enclosed by the rope and the curve is maximum.

Formulation of the problem: To determine the (smooth) curve $\gamma$ in the plane of fixed length $L$ such that the area enclosed by $\gamma$ and another given planar curve $\sigma$ is maximum.

For simplicity, let us assume that $\sigma$ is the line segment in $\mathbb{R}^2$ joining $(-1, 0)$ and $(1, 0)$. Let $\gamma$ be such a curve parametrized by $(x, u(x))$, $x \in [-1, 1]$, with $u(-1) = u(1) = 0$ (assuming $u = u(x)$ smooth enough⁸).

⁷ Other spaces, such as functions of Bounded Variation, could be considered depending on the problem to be studied.

⁸ Smooth, for the moment, means: function in $C^1([-1, 1])$, so that all quantities involved are well defined.


Recall that the area enclosed by this curve and the line segment, assuming $u(x) \ge 0$, is given by

\[ \int_{-1}^{1} u(x)\, dx \]

and that its length is given by

\[ \int_{-1}^{1} \sqrt{1 + [u'(x)]^2}\, dx. \]

We are thus led to find

\[ \max \left\{ \int_{-1}^{1} u(x)\, dx : u = u(x) \in \mathcal{A} \right\} \]

where

\[ \mathcal{A} = \left\{ u = u(x),\ x \in [-1, 1] : u(x) \ge 0,\ u(-1) = u(1) = 0,\ \int_{-1}^{1} \sqrt{1 + (u'(x))^2}\, dx = L \right\}, \quad L > 2. \]

At this point, note that it is always possible to reformulate a maximization problem for a function/functional $I$ as a minimization one by studying $-I$ (see Section 1.4). In the formalism of (1.1), and taking this remark into account, here $f(x, u, \xi) = -u$. The solution of this problem turns out to be an arc of a circle (see Subsection ??/in class).

Finally, we mention that the term Dido’s problem has also been used to cover the more general problem: among all closed curves in the plane of fixed perimeter (length), determine the curve that encloses the maximum area. The solution turns out to be a circle (see Subsection ??/in class).
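The constrained problem over the segment $[-1, 1]$ can be probed numerically. The sketch below (illustrative only; the half-angle `theta`, the helper names and the piecewise-$C^1$ “tent” competitor are assumptions of this example, not taken from the notes) compares the area under a circular arc with the area under a tent function of the same length.

```python
import numpy as np

def trapezoid(f, h):
    """Composite trapezoidal rule for samples f on a uniform grid of step h."""
    return h * (0.5 * f[0] + f[1:-1].sum() + 0.5 * f[-1])

def area_and_length(u, du, n=20_001):
    """Area under u and length of its graph over [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n)
    h = 2.0 / (n - 1)
    return trapezoid(u(x), h), trapezoid(np.sqrt(1.0 + du(x) ** 2), h)

# Circular arc through (-1, 0) and (1, 0) with half-angle theta (assumed value).
theta = np.pi / 3
R = 1.0 / np.sin(theta)              # radius, from the chord length 2 = 2*R*sin(theta)
c = R * np.cos(theta)                # the centre of the circle is (0, -c)
area_arc, L = area_and_length(lambda x: np.sqrt(R**2 - x**2) - c,
                              lambda x: -x / np.sqrt(R**2 - x**2))

# Tent function with the same length L: its height h_t solves 2*sqrt(1 + h_t^2) = L.
h_t = np.sqrt((L / 2.0) ** 2 - 1.0)
area_tent, L_tent = area_and_length(lambda x: h_t * (1.0 - np.abs(x)),
                                    lambda x: np.where(x < 0.0, h_t, -h_t))
```

Both curves have the same length, and the arc encloses the larger area, which is consistent with (but of course does not prove) the statement that the maximizer is an arc of a circle.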

1.3.2. Newton problem of optimal aerodynamic profile.

See Section 1.4.

1.3.3. Brachistochrone problem.

Problem: Given two points $A$ and $B$ in a vertical plane (but on different vertical lines), assign a path to a moving body $M$ (with mass $m > 0$) along which the body, starting from $A$, will arrive at the point $B$, falling under its own gravity, in the least time possible.

For simplicity assume $A = (a_1, b_1)$ and $B = (a_2, b_2)$ with $a_i, b_i > 0$, $a_2 > a_1$ and $b_1 > b_2$, in a Cartesian coordinate system with gravity acting in the direction of the negative $y$-axis. We want to find a (smooth) curve $\gamma$ parametrized by $(x, u(x))$, $x \in [a_1, a_2]$ (assuming $u = u(x)$ smooth enough⁹), with $u(x) < b_1$ for $a_1 < x \le a_2$, such that a particle with mass $m$ slides from rest at $A$ to $B$ in the shortest time among all such curves.

If this particle moves without friction, the law of conservation of mechanical energy guarantees that the sum of its kinetic and potential energy remains constant along the whole path. If, in addition, it starts from rest, this implies that for all $x \in (a_1, a_2)$

\[ \frac{1}{2} m v(x)^2 + m g u(x) = m g b_1, \]

where $g$ is the (constant) gravitational acceleration on Earth and $v = v(x)$ is the speed of the particle at $(x, u(x))$. Thus for all $x \in (a_1, a_2)$ we have

\[ v(x) = \sqrt{2g(b_1 - u(x))}, \]

from which the total time of descent of the particle is

\[ \frac{1}{\sqrt{2g}} \int_{a_1}^{a_2} \sqrt{\frac{1 + [u'(x)]^2}{b_1 - u(x)}}\, dx. \tag{1.3} \]

⁹ As before, smooth here means: function in $C^1([a_1, a_2])$.

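We are thus led to find

\[ \min \left\{ \frac{1}{\sqrt{2g}} \int_{a_1}^{a_2} \sqrt{\frac{1 + [u'(x)]^2}{b_1 - u(x)}}\, dx : u = u(x) \in \mathcal{A} \right\} \]

where

\[ \mathcal{A} = \{ u = u(x),\ x \in [a_1, a_2] : u(x) < b_1 \text{ for } a_1 < x \le a_2,\ u(a_1) = b_1,\ u(a_2) = b_2 \}. \]

In the formalism of (1.1) we remark that here $f(x, u, \xi) = \dfrac{1}{\sqrt{2g}} \sqrt{\dfrac{1 + \xi^2}{b_1 - u}}$. The solution of this problem turns out to be a portion of a cycloid¹⁰ (brachistochrone) (see Subsection ??).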
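A numerical check of (1.3) can be sketched as follows (illustrative assumptions: the value of `G`, the endpoints, the helper `descent_time` and the midpoint rule are choices made here). Writing the curve as $(x(t), u(t))$ turns (1.3) into $\frac{1}{\sqrt{2g}} \int \sqrt{x'(t)^2 + u'(t)^2} / \sqrt{b_1 - u(t)}\, dt$, which is convenient for the cycloid of footnote 10.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def descent_time(x_dot, u_dot, u, b1, t_end, n=10_000):
    """Midpoint-rule value of (1.3) for a parametrized curve (x(t), u(t)), 0 <= t <= t_end.
    Midpoints avoid t = 0, where b1 - u vanishes."""
    h = t_end / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.hypot(x_dot(t), u_dot(t)) / math.sqrt(b1 - u(t))
    return total * h / math.sqrt(2.0 * G)

# Endpoints chosen so that the cycloid of footnote 10 has k = 1 and T = pi.
a1, b1 = 1.0, 3.0
a2, b2 = a1 + math.pi, b1 - 2.0
k = 1.0

t_cycloid = descent_time(
    lambda t: k * (1.0 - math.cos(t)),        # x'(t)
    lambda t: -k * math.sin(t),               # u'(t)
    lambda t: b1 - k * (1.0 - math.cos(t)),   # u(t)
    b1, math.pi)

# Straight segment from A to B, parametrized with t^2 to tame the endpoint singularity.
t_line = descent_time(
    lambda t: 2.0 * (a2 - a1) * t,
    lambda t: -2.0 * (b1 - b2) * t,
    lambda t: b1 - (b1 - b2) * t * t,
    b1, 1.0)
```

For these endpoints the cycloid time equals $\pi \sqrt{k/g}$ and is smaller than the time along the straight segment.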

1.3.4. Minimization of the Dirichlet integral.

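Problem: To find

\[ \inf \left\{ \int_\Omega |\nabla u|^2\, dx : u : \Omega \to \mathbb{R},\ u = g \text{ on } \partial\Omega \right\}. \]

Here $f(x, u, \xi) = |\xi|^2$. The integral $\int_\Omega |\nabla u|^2\, dx$ is called the Dirichlet integral (see Section ?? and Subsection ??) and it appears in several applications, such as: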

↦ Problem in linear elasticity¹¹. Let us assume, for simplicity, the one-dimensional case, and let us consider an elastic string which, at rest, is described by the segment $[-1, 1] \subset \mathbb{R}$ on the $x$-axis. If we load the string and denote by $u = u(x)$ its vertical displacement (assumed smooth enough¹²), then, according to the simplest model of linear elasticity (Hooke’s law), the potential elastic energy of the string is given by

\[ \frac{k}{2} \int_{-1}^{1} |u'(x)|^2\, dx \]

where $k$ is a positive constant characteristic of the material of which the string is made. Assuming that the string is fixed at its end points and that the load is uniformly distributed, the shape of the string is obtained by minimizing the total energy of the system. Let $b$ denote a positive constant giving the uniform load distribution. We are then led to find

\[ \min \left\{ \int_{-1}^{1} \left[ \frac{k}{2} |u'(x)|^2 + b\, u(x) \right] dx : u = u(x) \in \mathcal{A} \right\} \]

where

\[ \mathcal{A} = \{ u = u(x),\ x \in [-1, 1] : u(-1) = 0,\ u(1) = 0 \}. \]

Here $f(x, u, \xi) = \frac{k}{2} \xi^2 + b\, u$.

¹⁰ A cycloid is the elongated arch that traces the path of a fixed point on a circle as the circle rolls along a straight line in two dimensions. In this case, this line is the horizontal line $y = b_1$ and the circle rolls along its underside. The parametric form of this curve is $x(t) = a_1 + k(t - \sin t)$ and $u(t) = b_1 - k(1 - \cos t)$, $t \in [0, T]$, where the constants $T$ and $k$ are determined by the conditions $x(T) = a_2$ and $u(T) = b_2$.

¹¹ Elasticity theory is one of the most important theories of continuum mechanics (see Section 1.5 for some references on this topic). The main physical characteristic of a purely elastic material is that the energy stored in the body at a given instant (a scalar quantity, often called strain energy) depends only upon the shape of the body at that instant. Consequently, returning the body to its initial shape recovers any change in energy (absence of dissipation). Apart from the strain energy, the stress (a measure of force per unit area) and the elasticity (resistance to changes in shape) also depend upon the current shape of the body. For example, metals will soften and polymers may stiffen as they are deformed to levels approaching failure. Linear elastic constitutive relations model the elastic behavior of a material that is subjected to very small strains (strain: amount of deformation an object experiences compared to its original size and shape) and have linear relations between strain and stress.

↦ Study of Laplace’s equation¹³ and, more generally, Poisson’s equation¹⁴; see Subsection ??. These equations appear in many contexts:
(a) Potential theory, e.g. in the Newtonian theory of gravity, electrostatics¹⁵, heat flow, and potential flows in fluid mechanics.
(b) Riemannian geometry, e.g. the Laplace-Beltrami operator.
(c) Stochastic processes, e.g. the stationary Kolmogorov equation for Brownian motion.
(d) Complex analysis, e.g. the real and imaginary parts of an analytic function of a single complex variable are harmonic.
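For the loaded string example above, the minimization can be sketched on a grid (a minimal illustration; the finite-difference discretization, the parameter values and the name `string_shape` are assumptions of this sketch). Setting the gradient of the discretized energy to zero gives a tridiagonal linear system, and the result can be compared with $u(x) = \frac{b}{2k}(x^2 - 1)$, which solves $k u'' = b$ with $u(\pm 1) = 0$, the Euler-Lagrange equation of this energy (see Section 2).

```python
import numpy as np

def string_shape(stiffness=2.0, load=1.0, n=200):
    """Minimize the discretized energy
        sum_i stiffness/(2h) * (u[i+1] - u[i])^2  +  load * h * sum_i u[i]
    over grid functions on [-1, 1] with u[0] = u[n] = 0."""
    x = np.linspace(-1.0, 1.0, n + 1)
    h = 2.0 / n
    m = n - 1                                   # number of interior nodes
    # Gradient = 0  <=>  (stiffness/h) * (2u_i - u_{i-1} - u_{i+1}) = -load * h
    K = (stiffness / h) * (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(K, -load * h * np.ones(m))
    return x, u

x, u = string_shape(stiffness=2.0, load=1.0)
u_exact = (1.0 / (2.0 * 2.0)) * (x ** 2 - 1.0)  # (b / 2k)(x^2 - 1) with k = 2, b = 1
```

With the sign convention of the energy above, the computed displacement is non-positive and agrees with the parabola at the grid nodes.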

1.3.5. Problems in Hyperelasticity; see also [44].¹⁶

Consider a continuous body which occupies a domain¹⁷ $\Omega \subset \mathbb{R}^3$ (we refer to $\Omega$ as a reference configuration of this body).

A deformation (or configuration) of this body is a map $u : \Omega \to \mathbb{R}^3$, where $u(x)$ denotes the deformed position of the material point $x$ ($u(\Omega)$: deformed configuration of the body), that is assumed to be a differentiable bijection¹⁸ and orientation-preserving¹⁹, that is,

\[ \det \nabla u(x) > 0, \quad x \in \Omega. \]

¹² Smooth here means again: function in $C^1([-1, 1])$.

¹³ $\Delta u = 0$: the prototype of an elliptic partial differential equation; many of its qualitative properties are shared by more general elliptic PDEs.

¹⁴ $\Delta u = f$: non-homogeneous version of Laplace’s equation.

¹⁵ Formally the theory of electricity is similar to the theory of gravity, since the forces of interaction between charges and between masses separated by a distance have a similar form; nevertheless, electric and gravitational phenomena differ greatly.

¹⁶ This topic may not be covered; it will depend on how the course develops.

¹⁷ Connected open set.

¹⁸ Invertibility: to avoid interpenetration of matter.

¹⁹ $\nabla u$ is called the deformation gradient; its determinant locally represents the volume after deformation per unit original volume.


A hyperelastic material²⁰ is an elastic material for which the stress-strain relationship follows from the existence of a scalar-valued volumetric strain energy function in the reference configuration, encapsulating all information regarding the material behavior. Under the assumption that the body is homogeneous (i.e. the material response is the same at each point), this material is characterized by an elastic energy of the form

\[ \int_\Omega W(\nabla u)\, dx \]

where $W : \mathbb{R}^{3 \times 3} \to [0, \infty)$ is the strain-energy (or stored-energy) density of the material. In applications $W$ is often given in terms of the so-called Green-St. Venant strain tensor

\[ E = \frac{1}{2} \left( \nabla v + \nabla v^T + \nabla v^T \nabla v \right), \]

where $v(x) = u(x) - x$ (displacement).

Properties²¹ of $W$:

(1) $W(I) = 0$ (the undeformed state costs no energy).
(2) Invariance under change of frame: $W(R\xi) = W(\xi)$ for all rotations $R$ and all $\xi \in \mathbb{R}^{3 \times 3}$.
(3) $W(\xi) \to \infty$ as $\det \xi \downarrow 0$ (infinite compression costs infinite energy).
(4) $W(\xi) \to \infty$ as $|\xi| \to \infty$ (very large deformations cost infinite energy).
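Properties (1) and (2) can be checked numerically for the Green-St. Venant strain tensor introduced above (a sketch; the displacement gradient `grad_v`, the rotation angle and the helper names are hypothetical values chosen for illustration). With $F = \nabla u = I + \nabla v$ one has $E = \frac12 (F^T F - I)$, so replacing $F$ by $RF$ leaves $E$ unchanged.

```python
import numpy as np

def green_st_venant(grad_v):
    """E = 1/2 (grad v + grad v^T + grad v^T grad v) for a 3x3 displacement gradient."""
    return 0.5 * (grad_v + grad_v.T + grad_v.T @ grad_v)

def rotation_z(angle):
    """Rotation by `angle` about the third coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical displacement gradient, used only for illustration.
grad_v = np.array([[0.10, 0.02, 0.00],
                   [0.00, -0.05, 0.03],
                   [0.01, 0.00, 0.04]])
F = np.eye(3) + grad_v                     # deformation gradient of u(x) = x + v(x)
E = green_st_venant(grad_v)

# Superposing a rotation R gives the deformation gradient R F,
# i.e. the displacement gradient R F - I.
R = rotation_z(0.7)
E_rotated = green_st_venant(R @ F - np.eye(3))
```

Any density that depends on $\nabla u$ only through $E$, for instance the illustrative choice $W = |E|^2$ (not a model proposed in these notes, and one that does not satisfy property (3)), therefore satisfies (1) and (2).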

In the presence of an external body force field $b : \Omega \to \mathbb{R}^3$ (e.g. gravity) the total elastic energy of the system is given by

\[ I(u) = \int_\Omega \left[ W(\nabla u) - b \cdot u \right] dx. \]

Example: Ogden materials.²² In these models $W$ is of the form

\[ W(\xi) = \sum_{i=1}^{m} a_i \operatorname{tr}\left[ (\xi^T \xi)^{\gamma_i / 2} \right] + \sum_{j=1}^{l} b_j \operatorname{tr}\left[ \operatorname{cof}\left[ (\xi^T \xi)^{\delta_j / 2} \right] \right] + \varphi(\det \xi), \quad \xi \in \mathbb{R}^{3 \times 3}, \]

where $a_i > 0$, $\gamma_i \ge 1$, $b_j > 0$, $\delta_j \ge 1$ (material constants) and $\varphi : \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ is a convex function with $\varphi(s) \to \infty$ as $s \downarrow 0$ and $\varphi(s) = \infty$ for $s \le 0$ ($W$ is an example of what is called a polyconvex integrand; see e.g. Dacorogna [18]). Depending on the material constants, this kind of model includes Neo-Hookean solids²³ and Mooney-Rivlin materials²⁴.

²⁰ Most materials undergo strains that qualify as large deformations and, in addition, in most of them the relationship between stress and strain is nonlinear. There are many potential strain energy functions, depending on the problem/model under study. Hyperelasticity (sometimes written hyper-elasticity) is the material model most suited to the analysis of elastomers.

²¹ Postulates.

²² Model used to describe the nonlinear stress-strain behavior of complex materials such as rubbers, polymers, and biological tissue.

The main problem in hyperelasticity is to minimize the energy $I$ among functions subject to appropriate boundary conditions. We will use the Direct Method of the calculus of variations (see Section ??) to address this problem.

In the setting of linearized elasticity (for homogeneous, isotropic media) the elastic energy is usually given in terms of the so-called linearized strain tensor

\[ E = \frac{1}{2} \left( \nabla v + \nabla v^T \right) \]

as

\[ I(v) = \frac{1}{2} \int_\Omega 2\mu |E(v)|^2 + \left( k - \frac{2}{3}\mu \right) |\operatorname{tr} E(v)|^2 - b \cdot u \, dx \]

where $\mu > 0$ and $k > 0$ are material constants.

1.3.6. Problems in periodic homogenization: applications within the Γ-convergence theory.²⁵

Roughly speaking, the aim of homogenization theory is to describe the behavior of microscopically heterogeneous composite physical structures by means of homogeneous structures with global characteristics equivalent to the initial ones.

In many physical situations the heterogeneities are very small in comparison with the region in which the structure is to be studied, and they are evenly distributed, so that they can be modeled by a periodic distribution whose period is a small parameter. In practice, one is interested in the global behavior of these structures when the heterogeneities are very small. From the mathematical point of view, we are led to characterize the asymptotic behavior of (systems of) ordinary or partial differential equations with oscillating periodic coefficients of period a small parameter $\varepsilon$, as $\varepsilon$ tends to zero.

A well-known model problem in periodic homogenization, frequently used to describe thermal as well as electrical or linear elasticity properties of a periodic composite medium, is based on the following linear second-order partial differential equation:

\[ -\operatorname{div}\left( A\left( \frac{x}{\varepsilon} \right) \nabla u_\varepsilon \right) = g \quad \text{on } \Omega. \]

Here $\Omega$ is the (material) domain in $\mathbb{R}^N$ ($N \ge 1$), $A$ is a scalar or tensor-valued function with periodic coefficients, and $u_\varepsilon$ and $g$ are scalar or vector-valued functions in some appropriate function spaces. One wishes to know the asymptotic behavior of the solutions $u_\varepsilon$ as $\varepsilon \to 0$. They converge, under appropriate hypotheses, to a solution of a “homogenized” differential equation of the form

\[ -\operatorname{div}(A_{hom}(\nabla u)) = g \quad \text{on } \Omega. \]
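In one space dimension the effect can be computed by hand, and a short sketch makes it concrete (the coefficient `a`, the boundary data and the name `effective_flux` are assumptions of this example, not taken from the notes). For $-(a(x/\varepsilon) u')' = 0$ on $(0, 1)$ with $u(0) = 0$, $u(1) = 1$, the flux $q = a(x/\varepsilon) u'$ is constant, so $q = 1 / \int_0^1 dx / a(x/\varepsilon)$. It is a classical fact that in this one-dimensional setting the homogenized coefficient is the harmonic mean of $a$ over a period, not its arithmetic mean.

```python
import math

def a(y):
    """1-periodic coefficient, chosen here only for illustration."""
    return 2.0 + math.sin(2.0 * math.pi * y)

def effective_flux(eps, n=100_000):
    """Flux q = a(x/eps) u'(x) of the solution of -(a(x/eps) u')' = 0 on (0, 1)
    with u(0) = 0, u(1) = 1, i.e. q = 1 / int_0^1 dx / a(x/eps) (midpoint rule)."""
    h = 1.0 / n
    integral = sum(h / a((i + 0.5) * h / eps) for i in range(n))
    return 1.0 / integral

q_eps = effective_flux(eps=1.0 / 8.0)
harmonic_mean = math.sqrt(3.0)    # 1 / int_0^1 dy / (2 + sin(2*pi*y))
arithmetic_mean = 2.0             # int_0^1 (2 + sin(2*pi*y)) dy
```

For $\varepsilon = 1/8$ the interval contains a whole number of periods, and the computed flux already coincides with the harmonic mean $\sqrt{3}$, which is strictly below the arithmetic mean.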

²³ Models that can be used for predicting the nonlinear stress-strain behavior of materials undergoing large deformations.

²⁴ Model where the strain energy density function is a linear combination of two invariants of the left Cauchy-Green deformation tensor $C = \nabla v \nabla v^T$.

²⁵ This topic may not be covered; it will depend on how the course develops.


Starting with the use of asymptotic expansion methods, homogenization techniques evolved toward more general situations through other concepts, in particular the notion of Γ-convergence due to De Giorgi.

From a variational point of view, the theory of periodic homogenization rests on the study of a family of minimum problems depending on a small parameter $\varepsilon > 0$,

\[ \min \left\{ \int_\Omega f_\varepsilon(x, u(x), \nabla u(x))\, dx + \int_\Omega u g\, dx : u = \phi \text{ on } \partial\Omega \right\}, \tag{1.4} \]

where the functions $f_\varepsilon$ (the elastic energy density) are increasingly oscillating in the first variable as $\varepsilon$ tends to zero, and $u$ (the deformation), $g$ (the density of applied body forces) and $\phi$ are scalar or vector-valued functions in some Sobolev space. In the example above, if $A = (A_{ij})$ and $u$ is a scalar function, $f_\varepsilon(x, \nabla u) = \sum A_{ij}\left( \frac{x}{\varepsilon} \right) \nabla_i u \nabla_j u$, where $\nabla_i = \partial / \partial x_i$.

More general minimum problems can be considered, but in this Introduction we restrict to this case for simplicity. The homogenization of the family of minimum problems leads to an “effective homogenized minimum problem” (not depending on $\varepsilon$)

\[ \min \left\{ \int_\Omega f_{hom}(x, u(x), \nabla u(x))\, dx + \int_\Omega u g\, dx : u = \phi \text{ on } \partial\Omega \right\} \tag{1.5} \]

such that a sequence of minimizers of (1.4) converges, as $\varepsilon$ tends to zero, to a limit $u$ which is a minimizer of (1.5). The fundamental property of De Giorgi’s notion of Γ-convergence, and its main link to the other homogenization techniques, is that, under certain growth and compactness properties of $f_\varepsilon$ and some regularity of $g$, it implies that sequences of minimizers have this convergence property.

Due to the properties of Γ-convergence, the convergence of minimizers (or almost minimizers) of (1.4) to minimizers of (1.5) follows from the Γ-convergence of the family

\[ I_\varepsilon(u) = \int_\Omega f_\varepsilon(x, u(x), \nabla u(x))\, dx \tag{1.6} \]

to the homogenized functional

\[ I_{hom}(u) = \int_\Omega f_{hom}(x, u(x), \nabla u(x))\, dx. \]

This functional provides the macroscopic, or averaged, description of the periodic body by capturing the limiting behavior of the equilibrium states of $\{I_\varepsilon\}_\varepsilon$. The effective energy density $f_{hom}$ is to be determined.

Other applications of Γ-convergence include the derivation of models for thin structures in elasticity.

1.4. Exercises/Reading Projects.

• Problems with constraints and the same type of boundary conditions as Example 1.3.1:

- Use Green’s Theorem to mathematically formulate Dido’s more general problem.

• Problems with the same type of boundary conditions as Example 1.3.3 or 1.3.4:

- (Geodesic problem on a plane) Formulate mathematically: Find a (smooth) curve joining two given points in the plane, with the shortest possible length.²⁶

- (Geodesic problem on a sphere) Formulate mathematically: Find a (smooth) curve, joining two given points on a sphere, with the shortest possible length.²⁷ In general: try with a general (smooth) surface of revolution.

- Derive formula (1.3).

- (The problem of minimal surfaces of revolution) Formulate mathematically: Find the (smooth) curve joining two given points in the $xy$-plane that generates, by rotation about the $x$-axis, the surface of smallest possible area.²⁸

- Related problem to the one above (“soap-film” problem): finding a surface of least area whose boundary is a given plane simple²⁹ closed curve or Jordan curve (Plateau’s problem). Formulate this problem mathematically for the simple case of a parametric surface.

- Read the mathematical formulation³⁰ of “Fermat’s principle” (e.g. in Buttazzo, Giaquinta and Hildebrandt [12]).

• Problems with different restrictions:

- Read the mathematical formulation³¹ of “Newton problem of optimal aerodynamic profile” (e.g. in Buttazzo, Giaquinta and Hildebrandt [12]; see also Tonelli [50]).

• Other related problems:

- Consider an electric charge density $\rho : \mathbb{R}^3 \to \mathbb{R}$, immovable (time independent), in three-dimensional vacuum. Let $E : \mathbb{R}^3 \to \mathbb{R}^3$ be the electric field produced by this charge³². It can be seen that the total electric energy of $\rho$ is given by

\[ I = \frac{\varepsilon_0}{2} \int_{\mathbb{R}^3} (\nabla \cdot E)\, \varphi\, dx \]

where $\varphi : \mathbb{R}^3 \to \mathbb{R}$ is an electric potential field of $E$ (i.e. such that $\nabla \varphi = E$) with normalizing condition $\varphi(0) = 0$.

(i) Show that $(\nabla \cdot E)\varphi = \nabla \cdot (E\varphi) - E \cdot (\nabla \varphi)$.

(ii) Use the Divergence Theorem to show that $I = \frac{c_0}{2} \int_{\mathbb{R}^3} |\nabla \varphi|^2\, dx$.

- Prove some of the basic properties of any extremum problem (assume the minimum/maximum exists):

(i) If $\mathcal{A} \subset \mathcal{B}$ then
\[ \min_{u \in \mathcal{A}} I(u) \ge \min_{u \in \mathcal{B}} I(u). \]

(ii) It holds that
\[ \min_{u \in \mathcal{A}} I(u) = -\max_{u \in \mathcal{A}} [-I(u)]. \]

(iii) Attention:
\[ \min_{u \in \mathcal{A}} [I(u) + F(u)] \ge \min_{u \in \mathcal{A}} I(u) + \min_{u \in \mathcal{A}} F(u). \]

(iv) Linearity: for real numbers $b$ and $a > 0$,
\[ \min_{u \in \mathcal{A}} [a I(u) + b] = a \min_{u \in \mathcal{A}} I(u) + b. \]

(v) What happens under composition with a monotone function?

(Extra: And if inf and sup are considered instead?)

²⁶ It is well known that the solution is a line segment. In general one can be interested in finding the shortest path between pairs of points on other, more general surfaces (or manifolds). These paths are called geodesics and are obtained by minimizing length.

²⁷ Suggestion: use spherical coordinates.

²⁸ The solutions of this problem, when they exist, are catenoids (surfaces generated by rotating a catenary, or hanging chain, curve). More precisely, the generating curves are of the type $u(x) = \lambda \cosh\left( \frac{x + \mu}{\lambda} \right)$ for some constants $\lambda > 0$ and $\mu$ to be determined.

²⁹ I.e. that does not intersect itself.

³⁰ Not the resolution at this point.

³¹ Not the resolution at this point.

³² For simplicity we ignore units here and we assume functions to be smooth enough for all derivations/computations to make sense.
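Before proving properties (i) to (iv), it can help to test them on a toy case. The sketch below uses finite sets and simple functions as stand-ins for the admissible classes and functionals (all names and values are hypothetical); it is a sanity check, not a proof.

```python
# Finite stand-ins for the admissible classes and functionals (all hypothetical).
B = [-2.0, -1.0, 0.0, 0.5, 1.0, 3.0]
A = [u for u in B if u >= 0.0]              # A is a subset of B

def I(u):
    return (u + 1.0) ** 2

def F(u):
    return abs(u - 3.0)

min_I_A = min(I(u) for u in A)              # (i): compare with the minimum over B
min_I_B = min(I(u) for u in B)
max_neg_I_A = max(-I(u) for u in A)         # (ii)
min_sum_A = min(I(u) + F(u) for u in A)     # (iii)
min_F_A = min(F(u) for u in A)
a, b = 3.0, -4.0                            # (iv), with a > 0
min_affine_A = min(a * I(u) + b for u in A)
```

In this toy case the inequalities in (i) and (iii) are strict, because the minimizers of the individual terms sit at different points.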

1.5. Further reading for Section 1.

• For historical references on the calculus of variations see Goldstine [36], Giaquinta and Hildebrandt [33], Freguglia and Giaquinta [25], Chapter 6 in Buttazzo, Giaquinta and Hildebrandt [12], the introductory chapters in Tonelli [50] and Giusti [35, 34]. For some motivational examples see also the introductory chapter in Kot [38]. For general historical references on mathematics see Boyer [10, 11].

• Other references:

- Menger on the calculus of variations (based on an article by Karl Menger) (http://www-history.mcs.st-and.ac.uk/Extras/CalculusofVariations.html)

- (*) Fonseca and Leoni [28] (http://cvgmt.sns.it/media/doc/paper/2438/PCAMCalcVarFinal.pdf)

- (*) Ball [6].

- The Feynman Lectures on Physics Volume II (The Principle of Least Action, Chapter 19; Electrostatics, Chapter 4; Electrostatic energy, Chapter 8; Maxwell Equation, Chapter 18) (http://www.feynmanlectures.caltech.edu/). See also Feynman, Leighton and Sands [24].

- (*) About elasticity theory: see Ciarlet [14] and Gurtin [37]. Some engineering notes: (http://www.brown.edu/Departments/Engineering/Courses/En221/Notes/Elasticity/Elasticity.htm), (http://www.umich.edu/ bme332/ch6consteqelasticity/bme332consteqelasticity.htm)

- (*) About Γ-convergence: see De Giorgi and Dal Maso [31], De Giorgi and Franzoni [32] for the origins of this notion; Braides [13] and Dal Maso [20] for a comprehensive treatment and bibliography on the subject; Focardi [26] for a survey oriented to physical applications.

- (*) Introduction to periodic homogenization: first chapter of Allaire [2].

- New insight into optimization and variational problems in the 17th century, E. Stein and K. Wiechmann (http://congress.cimne.com/femclass42/files/37SteinWiechmann.pdf)

- The isoperimetric problem, Viktor Blasjo (https://www.maa.org/sites/default/files/pdf/uploadlibrary/22/Ford/blasjo526.pdf)

- (*) The isoperimetric inequality, R. Osserman (http://www.ams.org/journals/bull/1978-84-06/S0002-9904-1978-14553-4/S0002-9904-1978-14553-4.pdf)

- (*) A survey on the Newton problem of optimal profiles, G. Buttazzo (http://cvgmt.sns.it/media/doc/paper/1339/BUTTAZZO.PDF)

- (*) A new solution to the three-body problem, R. Montgomery (http://www.ams.org/notices/200105/fea-montgomery.pdf)

(*) Some of the concepts appearing in this article/book are not expected to be understood at this point. Some of them will be addressed as the course develops, and for this reason a rigorous reading of this article/book is only recommended much later.

Lecture 2.

2. Classical (or indirect) Methods

The idea of these methods³³ (in contrast with the direct method³⁴ that we will study in Section ??) is the following: assuming the existence of a solution of a variational problem of the form (1.1), one first derives, under certain a priori smoothness conditions, a system (or an equation, depending on the dimensions $N$ and $d$) of second-order differential equations, the Euler-Lagrange equations associated with the functional $I$, which must necessarily be satisfied by the solution (but are not sufficient). Any solution of these equations is called a critical (or stationary) point of $I$. As in the analogous case of studying extrema of a function of several variables, sufficient conditions are then needed to determine the nature of these critical points.

We mainly restrict here to the case of one-dimensional problems, that is, when $N = d = 1$, but we also plan to address (even if in less detail) the case when $N = 1$ or $d = 1$ (known as scalar problems). Some comments will be made concerning the vectorial case.

As a motivation we start by reviewing the study of extrema of a function of several variables.

2.1. Finding extrema of functions of several variables (e.g [4, 5]).

Given Ω ⊂ RN an open set, let f : Ω→ R be a differentiable function.

Necessary condition for f to have an extreme point: Recall that if x0 ∈ Ω is an extremeof f then Df(x0) = 0 (i.e x0 is a critical point of f). Indeed, being an extreme implies thatfor all v ∈ RN , the function

g(ε) = f(x0 + εv)

ε ∈ h : x0 + hv ∈ Ω, is such that

g′(0) = 0.

³³ Covering roughly from the end of the 17th century (related ideas) or the beginning of the 18th century (first developments) to the end of the 19th century.

³⁴ Developed afterwards.

To determine whether $x_0$ is indeed a local minimum, a local maximum or a saddle point, using the definition directly is often quite difficult; a sufficient condition is obtained by studying the Hessian of $f$. Assume that $f \in C^2$ and that we are interested in the problem of finding

\[ \inf_{x \in \Omega} f(x). \]

Sufficient condition for having a minimizer: Assuming that $Df(x_0) = 0$, if $D^2 f(x_0)$ is a positive definite matrix then $x_0$ is a point of (strict) local minimum of $f$, that is, there exists $\rho > 0$ such that if $0 < \|x - x_0\| \le \rho$ then $f(x) > f(x_0)$.

Conversely: recall that if $x_0$ is a local minimum, that is, there exists $\rho > 0$ such that $\|x - x_0\| \le \rho$ implies $f(x) \ge f(x_0)$, then it can be seen that $D^2 f(x_0)$ is a positive semi-definite matrix.

As we will see, these last two properties relate the “minimality” and “convexity” properties of $f$. Our interest in convex functions relies on the fact that the same relation holds in our general case.
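The second-order conditions above can be tried out on small examples. The following is a minimal Python sketch, assuming a function of two variables whose Hessian entries at the critical point are already known; the helper name `classify_critical_point` and the three sample functions are illustrative choices, not part of these notes.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point of a C^2 function of two variables
    from the entries of its symmetric Hessian [[fxx, fxy], [fxy, fyy]].

    Uses Sylvester's criterion: the matrix is positive definite
    if and only if fxx > 0 and its determinant is > 0.
    """
    det = fxx * fyy - fxy * fxy
    if det > 0 and fxx > 0:
        return "strict local minimum"   # Hessian positive definite
    if det > 0 and fxx < 0:
        return "strict local maximum"   # Hessian negative definite
    if det < 0:
        return "saddle point"           # Hessian indefinite
    return "inconclusive"               # only semi-definite: no conclusion


# f(x, y) = x^2 + x*y + y^2: Df(0, 0) = 0, Hessian [[2, 1], [1, 2]].
print(classify_critical_point(2.0, 1.0, 2.0))
# f(x, y) = x^2 - y^2: Df(0, 0) = 0, Hessian [[2, 0], [0, -2]].
print(classify_critical_point(2.0, 0.0, -2.0))
# f(x, y) = x^4 + y^4: minimum at (0, 0), but the Hessian vanishes there.
print(classify_critical_point(0.0, 0.0, 0.0))
```

The last call illustrates the gap between the sufficient condition (positive definite) and the necessary one (positive semi-definite).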

2.2. Remarks on convex analysis: part I. (see Section 2.3)

Definition 2.1 (convex set). A set $\Omega \subset \mathbb{R}^N$ is said to be convex if for every $x, y \in \Omega$ and every $\theta \in [0, 1]$ we have $\theta x + (1 - \theta) y \in \Omega$.

Trivial examples: the empty set, a single point (singleton), $\mathbb{R}^N$ itself, affine sets (a line, a hyperplane), a half-space, the interior of a sphere or an ellipsoid, etc.

In general: the intersection of convex sets is convex, but the union may fail to be convex (see Section 2.2.1).

Definition 2.2 (convex hull). The convex hull of a set $\Omega \subset \mathbb{R}^N$, $\mathrm{conv}(\Omega)$, is the set of all convex combinations of points in $\Omega$, i.e.,

\[ \mathrm{conv}(\Omega) = \left\{ \sum_{i=1}^{k} \theta_i x_i \;:\; k \in \mathbb{N},\ x_i \in \Omega,\ \theta_i \ge 0,\ \sum_{i=1}^{k} \theta_i = 1 \right\}. \]

For examples see Section 2.2.1.
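For a finite set of points in the plane, the convex hull is a polygon whose vertices can be computed directly. Below is a minimal Python sketch using Andrew's monotone chain algorithm (a standard technique chosen here for illustration, not a method taken from these notes); the sample point set is hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of conv(points), in counter-clockwise order,
    computed with Andrew's monotone chain algorithm."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                       # build the lower chain, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):             # build the upper chain, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]      # endpoints are shared, drop duplicates


# Hypothetical sample: corners of the unit square, its centre,
# and the midpoint of the bottom side.
sample = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.5, 0)]
print(convex_hull(sample))   # only the four corners remain
```

Points that are convex combinations of the others (here the centre and the side midpoint) are discarded, in line with Definition 2.2.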

Definition 2.3. Let $\Omega \subset \mathbb{R}^N$ be a convex set. A function $f : \Omega \to \mathbb{R}$ is said to be:

- convex if

\[ f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y) \]

for all $x, y \in \Omega$ and $\theta \in [0, 1]$.

- strictly convex if

\[ f(\theta x + (1 - \theta) y) < \theta f(x) + (1 - \theta) f(y) \]

for all $x, y \in \Omega$, $x \neq y$, and $\theta \in (0, 1)$.

Geometrically, the function $f$ is convex if the line segment between any two points $(x, f(x))$ and $(y, f(y))$ on its graph lies above or on its graph (strictly convex: the open line segment lies strictly above the graph).

Equivalently, $f$ is convex if its epigraph, $\mathrm{epi}(f) = \{(x, \alpha) \in \Omega \times \mathbb{R} : f(x) \le \alpha\}$ (i.e. the set of points on or above the graph of the function), is a convex set.

A function is said to be (strictly) concave if −f is (strictly) convex.
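The defining inequality can be probed numerically on a finite sample. The sketch below is only a sanity check under stated assumptions (a uniform grid on $[-2, 2]$ and three values of $\theta$, both chosen arbitrarily): finding no violation does not prove convexity, whereas a single violation disproves it. The helper name `convexity_violation` is hypothetical.

```python
import math


def convexity_violation(f, xs, thetas, tol=1e-12):
    """Return a triple (x, y, theta) at which
    f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
    fails by more than tol, or None if no sampled triple violates it."""
    for x in xs:
        for y in xs:
            for theta in thetas:
                lhs = f(theta * x + (1 - theta) * y)
                rhs = theta * f(x) + (1 - theta) * f(y)
                if lhs > rhs + tol:
                    return (x, y, theta)
    return None


xs = [-2.0 + 0.25 * i for i in range(17)]   # grid on [-2, 2]
thetas = [0.25, 0.5, 0.75]

print(convexity_violation(abs, xs, thetas))                  # no violation found
print(convexity_violation(lambda x: x * x, xs, thetas))      # no violation found
print(convexity_violation(lambda x: -math.exp(-x), xs, thetas))  # a violating triple
```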

Some one-dimensional examples:


(1) Convex functions: affine functions $f(x) = ax + b$, $x \in \mathbb{R}$ ($a, b \in \mathbb{R}$); $g(x) = |x|$, $x \in \mathbb{R}$; $h(x) = x^2$, $x \in \mathbb{R}$; $i(x) = e^x$, $x \in \mathbb{R}$; $k(x) = -\log(x)$, $x > 0$.

(2) $g(x) = -e^{-x}$, $x \in \mathbb{R}$, is not convex.

Question: for which of the above functions is it easy to check directly from the definition (rather than geometrically) that they are indeed convex? Which ones are strictly convex? See Section 2.2.1.

Some examples of multivariate convex functions:

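(1) Affine functions: $f(x) = a^T x + b$, $x \in \mathbb{R}^N$ ($a \in \mathbb{R}^N$, $b \in \mathbb{R}$) or $g(x) = \mathrm{tr}(A^T x) + b$, $x \in \mathbb{R}^{d \times N}$ ($A \in \mathbb{R}^{d \times N}$, $b \in \mathbb{R}$) are convex but not strictly convex.

(2) Some quadratic functions: $f(x) = x^T A x + c^T x + d$, $x \in \mathbb{R}^N$ ($A \in \mathbb{R}^{N \times N}$ symmetric, $c \in \mathbb{R}^N$, $d \in \mathbb{R}$ given) is convex if and only if $A$ is a positive semi-definite matrix, and is strictly convex if and only if $A$ is a positive definite matrix (Why? See Section 2.2.1)³⁵.

(3) Any norm in $\mathbb{R}^N$ (or $\mathbb{R}^{d \times N}$). Why?³⁶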

Some trivial properties of convex functions:

- If $f, g$ are convex functions and $\alpha, \beta > 0$ then $\alpha f + \beta g$ is also convex.
- The composition of a convex function with an affine one is convex.
- The pointwise supremum of a family of convex functions is a convex function.

Other properties of convex functions (see [27], [47]): a convex function on an open convex set is continuous, and it is differentiable and twice differentiable almost everywhere³⁷.

Proposition 2.4 (Convexity along lines). Let $\Omega \subset \mathbb{R}^N$ be a convex set. A function $f : \Omega \to \mathbb{R}$ is convex if and only if, for every $x \in \Omega$ and $v \in \mathbb{R}^N$, the function $g(t) = f(x + tv)$, defined on the set $\{t \in \mathbb{R} : x + tv \in \Omega\}$, is convex.

Proof. See Section 2.2.1.

Theorem 2.5. Let $\Omega \subset \mathbb{R}^N$ be an open and convex set and let $f : \Omega \to \mathbb{R}$ be a $C^2$-function. Then the following conditions are equivalent:

i) $f$ is convex;

ii) \[ f(x) \ge f(y) + \nabla f(y)^T (x - y) \]

for all $x, y \in \Omega$ (which expresses the geometrical fact that the graph of $f$ lies above its tangent hyperplane at the point $y$);

iii) $D^2 f$ is positive semi-definite in $\Omega$.

Proof. In class.

Remark 2.6. For i) $\Leftrightarrow$ ii) it is enough to assume that $f$ is $C^1$.

Remark 2.7. For strictly convex functions: a $C^1$ function $f$ is strictly convex in $\Omega$ if and only if

\[ f(x) > f(y) + \nabla f(y)^T (x - y) \]

for all $x, y \in \Omega$, $x \neq y$. In addition, if $f$ is a $C^2$ function such that $D^2 f$ is positive definite in $\Omega$, then $f$ is strictly convex in $\Omega$. The converse is not true. Why? See Section 2.2.1.

³⁵ Maybe: other criteria are needed.

³⁶ Recall the basic properties of a norm.

³⁷ With respect to the Lebesgue measure.


Theorem 2.8. Let $\Omega \subset \mathbb{R}^N$ be an open and convex set and let $f : \Omega \to \mathbb{R}$ be a convex $C^1$-function. Then any critical point $x_0 \in \Omega$ of $f$ is an absolute minimum of $f$ in $\Omega$.

Proof. In class.

Remark 2.9. The existence of a critical point is still required: e.g. $f(x) = e^x$, $x \in \mathbb{R}$, is convex but has no critical point, and indeed it does not have a minimizer in $\mathbb{R}$.

Lemma 2.10. Let $\Omega \subset \mathbb{R}^N$ be an open and convex set and let $f : \Omega \to \mathbb{R}$ be a strictly convex function. Assuming that there exists a minimizer of $f$ in $\Omega$, then this minimizer is unique.

Theorem 2.11 (Jensen inequality: discrete version). Let $\Omega \subset \mathbb{R}^N$ be a convex set and let $f : \Omega \to \mathbb{R}$ be a convex function. Given points $\{x_i\}_{i=1,\dots,k}$ in $\Omega$ and positive numbers $\{\theta_i\}_{i=1,\dots,k}$ with $\sum_{i=1}^{k} \theta_i = 1$, then

\[ f\left( \sum_{i=1}^{k} \theta_i x_i \right) \le \sum_{i=1}^{k} \theta_i f(x_i). \]

Proof. See Section 2.2.1.
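The discrete Jensen inequality is easy to check on concrete data. The following Python sketch computes the "Jensen gap" (right-hand side minus left-hand side) for two convex functions of one variable; the points, the weights and the helper name `jensen_gap` are arbitrary illustrative choices.

```python
import math


def jensen_gap(f, points, weights):
    """Return sum_i theta_i f(x_i) - f(sum_i theta_i x_i).

    For a convex f and weights theta_i >= 0 summing to 1, the
    discrete Jensen inequality says that this gap is >= 0.
    """
    if abs(sum(weights) - 1.0) > 1e-12 or min(weights) < 0:
        raise ValueError("weights must be nonnegative and sum to 1")
    mean = sum(t * x for t, x in zip(weights, points))
    return sum(t * f(x) for t, x in zip(weights, points)) - f(mean)


points = [1.0, 2.0, 4.0]
weights = [0.5, 0.25, 0.25]

gap_exp = jensen_gap(math.exp, points, weights)                 # exp is convex
gap_log = jensen_gap(lambda x: -math.log(x), points, weights)   # -log is convex on x > 0
print(gap_exp, gap_log)   # both gaps are nonnegative
```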

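Theorem 2.12 (More general version of Jensen inequality; see Section ?? in Appendix). Let $\Omega \subset \mathbb{R}^N$ be open and bounded and $f : \mathbb{R} \to \mathbb{R}$ a $C^1$ convex function³⁸. Given $u \in L^1(\Omega)$, then

\[ f\left( \frac{1}{|\Omega|} \int_\Omega u(x)\, dx \right) \le \frac{1}{|\Omega|} \int_\Omega f(u(x))\, dx. \]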

Proof. In class.
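A quick sanity check of the integral form, with data chosen here purely for illustration (they are assumptions, not taken from the notes): $\Omega = (0,1)$, $u(x) = x$ and $f(t) = t^2$.

```latex
% Illustrative choice: \Omega = (0,1), so |\Omega| = 1, u(x) = x, f(t) = t^2.
\[
f\!\left( \frac{1}{|\Omega|} \int_\Omega u(x)\, dx \right)
  = \left( \int_0^1 x \, dx \right)^{2} = \frac{1}{4}
  \;\le\;
\frac{1}{|\Omega|} \int_\Omega f(u(x))\, dx
  = \int_0^1 x^2 \, dx = \frac{1}{3}.
\]
```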

2.2.1. Exercises.

(1) Show that the intersection of convex sets is convex but the union may fail to be convex.

(2) Find the convex hull of $\Omega = \{(1, 0), (1, 2), (-1, 2), (2, 4), (2, 3), (6, 1)\}$ and of $S = ([2, 6] \times [0, 4]) \setminus ([3, 4] \times [0, 1])$.

(3) Show by definition that $f(x) = x^2$ is a strictly convex function.

(4) (Convexity along lines) Let $\Omega \subset \mathbb{R}^N$ be a convex set. Show that a function $f : \Omega \to \mathbb{R}$ is convex if and only if, for every $x \in \Omega$ and $v \in \mathbb{R}^N$, the function $g(t) = f(x + tv)$, defined on the set $\{t \in \mathbb{R} : x + tv \in \Omega\}$, is convex.

(5) Show that $f(x) = x^T A x + c^T x + d$, $x \in \mathbb{R}^N$ ($A \in \mathbb{R}^{N \times N}$ symmetric, $c \in \mathbb{R}^N$, $d \in \mathbb{R}$ given) is convex if and only if $A$ is a positive semi-definite matrix, and is strictly convex if and only if $A$ is a positive definite matrix.

(6) Show: a $C^1$ function $f$ is strictly convex in $\Omega$ if and only if

\[ f(x) > f(y) + \nabla f(y)^T (x - y) \]

for all $x, y \in \Omega$, $x \neq y$. In addition, if $f$ is a $C^2$ function such that $D^2 f$ is positive definite in $\Omega$, then $f$ is strictly convex in $\Omega$. The converse is not true.

(7) Study the convexity of the following functions: $f(x, y) = \dfrac{x^2}{y}$, $(x, y) \in \mathbb{R}^2$, $y \neq 0$, and $g(x) = \log(e^{x_1} + \dots + e^{x_N})$, $x \in \mathbb{R}^N$.

³⁸ Still true for less regular functions $f$.


(8) Let $\Omega \subset \mathbb{R}^N$ be an open and convex set and let $f : \Omega \to \mathbb{R}$ be a strictly convex function. Assuming that there exists a minimizer of $f$ in $\Omega$, prove that this minimizer is unique.

(9) (Linear least squares problem) Let $A \in \mathbb{R}^{N \times N}$ be such that $\det(A) \neq 0$, and let $b \in \mathbb{R}^N$. Justify that $x = (A^T A)^{-1} A^T b$ is the only absolute minimizer of

\[ \min_{x \in \mathbb{R}^N} \|Ax - b\|^2. \]

(10) Convexity of some functions is very useful to prove some important properties. Prove the so-called Young inequality by using the fact that the exponential function is a strictly convex function in $\mathbb{R}$. Recall the Young inequality:

\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}, \qquad \forall a, b \ge 0, \]

whenever $p, q \in (1, \infty)$ and $1/p + 1/q = 1$; equality holds if and only if $a^p = b^q$.

(11) Show that $f(x) = |x|^p$ ($p > 0$), $x \in \mathbb{R}^N$, is convex if and only if $p \ge 1$, and strictly convex if and only if $p > 1$. In particular, if $N = 1$, $a, b \ge 0$, $p \ge 1$, then by the convexity of $f$

\[ \left( \frac{a}{2} + \frac{b}{2} \right)^{p} \le \frac{a^p}{2} + \frac{b^p}{2}, \]

or equivalently,

\[ (a + b)^p \le 2^{p-1} (a^p + b^p). \]

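(12) Show that $f(x) = \sqrt{1 + |x|^2}$, $x \in \mathbb{R}^N$, is strictly convex.

(13) Prove the discrete version of Jensen inequality. As an application of this inequality show that³⁹, given positive numbers $x_i$, $i = 1, \dots, k$,

\[ \frac{x_1 + \dots + x_k}{k} \ge \sqrt[k]{x_1 \cdots x_k}. \]

Suggestion: use the fact that $-\log(x)$, $x > 0$, is convex.

(14) (Extended-value function). Given $f : \Omega \subset \mathbb{R}^N \to \mathbb{R}$ define $\tilde{f} : \mathbb{R}^N \to \mathbb{R} \cup \{\infty\}$ as $\tilde{f}(x) = f(x)$ if $x \in \Omega$ and $\tilde{f}(x) = \infty$ if $x \in \mathbb{R}^N \setminus \Omega$. Show that the following properties are equivalent:

i) for all $x, y \in \mathbb{R}^N$ and $\theta \in [0, 1]$

\[ \tilde{f}(\theta x + (1 - \theta) y) \le \theta \tilde{f}(x) + (1 - \theta) \tilde{f}(y) \]

(inequality in $\mathbb{R} \cup \{\infty\}$);

ii) $\Omega$ is convex and for all $x, y \in \Omega$ and $\theta \in [0, 1]$

\[ f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y). \]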

2.3. Recommended reading.

Properties of convex functions:

- See e.g. chapters two and three in [9]. Other references: [47], [27].

- A. D. Alexandrov, Almost everywhere existence of the second differential of a convex function and some properties of convex surfaces connected with it, Leningrad State Univ. Annals [Uchenye Zapiski] Math. Ser. 6 (1939), 3-35.

- R. Howard, Alexandrov's theorem on the second derivatives of convex functions via Rademacher's theorem on the first derivatives of Lipschitz functions (lecture notes) (http://people.math.sc.edu/howard/Notes/alex.pdf)

³⁹ There are many other applications of this type.

Lecture 3.

2.4. Study of the minimization problem (1.1) for $N = d = 1$.

In this case problem (1.1) reduces to considering $\Omega = (a, b) \subset \mathbb{R}$ and a continuous integrand $f : [a, b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $f = f(x, u, \xi)$. As for the space $X$, in this part we consider $X = C^1([a, b])$.

In analogy with what is done for functions of several variables, we start by studying necessary conditions to have a minimizer for problem (1.1).

2.4.1. Necessary conditions for the existence of (smooth) minimizers: fixed and free boundary value problems. Euler-Lagrange equation.

Fixed and free boundary value problems: in these problems $A$ could be (compare with the examples given in the introduction and other examples in Section 1.4)

\[ A = \{u \in C^1([a, b]) : u(a) = \alpha,\ u(b) = \beta\}, \quad \alpha, \beta \in \mathbb{R} \quad \text{(Dirichlet boundary conditions)} \]

or

\[ A = \{u \in C^1([a, b]) : u(a) = \alpha\}, \quad \alpha \in \mathbb{R}, \]

or

\[ A = \{u \in C^1([a, b]) : u(b) = \beta\}, \quad \beta \in \mathbb{R}, \]

or

\[ A = C^1([a, b]) \quad \text{(free boundary case)}. \]

2.4.1.1. First case: Dirichlet boundary conditions.

Let us consider first the case where we impose Dirichlet boundary conditions (see Examples 1.3.2 and 1.3.3 and other examples in Section 1.4). Our objective is to study

\[ \inf_{u \in A} I(u) \tag{2.1} \]

where $I : C^1([a, b]) \to \mathbb{R}$ is given by

\[ I(u) = \int_a^b f(x, u(x), u'(x))\, dx \]

and

\[ A = \{u \in C^1([a, b]) : u(a) = \alpha,\ u(b) = \beta\}, \quad \alpha, \beta \in \mathbb{R}. \]


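2.4.1.1.1. First necessary condition: Weak form of the Euler-Lagrange equation (or vanishing of the first variation of I).⁴⁰

Throughout this part we assume $f$ to be a $C^1$-function⁴¹. Assume now that $u_0 \in A$ is a solution of problem (2.1), that is,

\[ I(u_0) \le I(u), \quad \forall u \in A. \]

If we consider a variation of $u_0$,

\[ u_0 + \varepsilon v, \quad \varepsilon \in \mathbb{R}, \]

with $v$ such that $u_0 + \varepsilon v \in A$, then, in particular,

\[ I(u_0) \le I(u_0 + \varepsilon v). \]

Since $u_0 + \varepsilon v \in A$ whenever $v \in C^1([a, b])$ vanishes at both endpoints, this inequality holds for all $\varepsilon \in \mathbb{R}$ and all $v \in C^1([a, b])$ with $v(a) = 0$, $v(b) = 0$. Given such a function $v$ and defining

\[ \Phi(\varepsilon) = I(u_0 + \varepsilon v) \]

we have that $\Phi(0) \le \Phi(\varepsilon)$ for all $\varepsilon \in \mathbb{R}$, and thus $0$ is a minimum point of $\Phi$. If $\Phi$ is differentiable, this implies that

\[ \Phi'(0) = \frac{d}{d\varepsilon} I(u_0 + \varepsilon v) \Big|_{\varepsilon = 0} = 0. \tag{2.2} \]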

At this point recall:

Theorem 2.13 (Differentiation under the integral sign; see e.g. Dieudonné [21]). Given $[a, b] \subset \mathbb{R}$ and an open set $\Omega \subset \mathbb{R}^N$, let $f : \Omega \times [a, b] \to \mathbb{R}$ be a continuous function. Then $g(x) = \int_a^b f(x, t)\, dt$ is continuous in $\Omega$. If, in addition, the partial derivative $\frac{\partial f}{\partial x}$ exists and is continuous on $\Omega \times [a, b]$, then $g$ is continuously differentiable on $\Omega$ and

\[ g'(x) = \int_a^b \frac{\partial f}{\partial x}(x, t)\, dt. \]

By Theorem 2.13 the function $\Phi$ is indeed differentiable and (2.2) is equivalent to

\[ \int_a^b \left[ \frac{\partial f}{\partial u}(x, u_0(x), u_0'(x))\, v(x) + \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x))\, v'(x) \right] dx = 0 \]

for all $v \in C^1([a, b])$ with $v(a) = 0$, $v(b) = 0$. In particular

\[ \int_a^b \left[ \frac{\partial f}{\partial u}(x, u_0(x), u_0'(x))\, v(x) + \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x))\, v'(x) \right] dx = 0 \tag{2.3} \]

for all $v \in C^1_c((a, b))$.

⁴⁰ Analogously for a maximum.

⁴¹ For a slightly more general case see [12].


The integral equality (2.3) is known as the weak form of the Euler-Lagrange equation associated with $I$.

A function $u_0 \in C^1([a, b])$ satisfying (2.3) is called a weak-critical (or weak-stationary) point of $I$ (or a weak solution of the Euler-Lagrange equation associated with $I$).

The term

\[ \frac{d}{d\varepsilon} I(u_0 + \varepsilon v) \Big|_{\varepsilon = 0} \]

(or, here, equivalently, the left-hand side of equality (2.3)) is called the first variation of $I$ at $u_0$ in the direction of $v$ and is sometimes denoted by $\partial I(u_0, v)$.
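The vanishing of the first variation can be observed numerically. The sketch below assumes, purely for illustration, the integrand $f(x, u, \xi) = \xi^2$ on $[0, 1]$, the function $u_0(x) = x$ and the direction $v(x) = \sin(\pi x)$ (which vanishes at both endpoints); it compares a central difference quotient for $\Phi'(0)$ with the left-hand side of (2.3), using a midpoint quadrature rule. All function names are hypothetical.

```python
import math


def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def dirichlet_energy(du, a=0.0, b=1.0):
    """I(u) = int_a^b [u'(x)]^2 dx, given the derivative du = u'."""
    return integrate(lambda x: du(x) ** 2, a, b)


# u0(x) = x, so u0'(x) = 1; v(x) = sin(pi x), so v'(x) = pi cos(pi x).
du0 = lambda x: 1.0
dv = lambda x: math.pi * math.cos(math.pi * x)


def Phi(eps):
    """Phi(eps) = I(u0 + eps * v)."""
    return dirichlet_energy(lambda x: du0(x) + eps * dv(x))


h = 1e-4
first_variation = (Phi(h) - Phi(-h)) / (2 * h)   # central difference for Phi'(0)

# Left-hand side of (2.3) for f = xi^2: f_u = 0 and f_xi = 2 * xi.
weak_form = integrate(lambda x: 2.0 * du0(x) * dv(x), 0.0, 1.0)

print(first_variation, weak_form)   # both are close to 0
```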

Remark 2.14. Minimizers of a functional $I$ of the above form are not necessarily $C^1$. Example: the functional

\[ I(u) = \int_0^1 \left( [u'(x)]^2 - 1 \right)^2 dx \]

does not have a minimizer over $A = \{u \in C^1([0, 1]) : u(0) = 0,\ u(1) = 0\}$. Why? (in class).
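One possible way to see this is sketched below; it is only an outline, and the argument given in class may differ. The sequence $u_n$ and the zigzag $w$ are auxiliary objects introduced here for illustration.

```latex
\begin{enumerate}
  \item $I(u) \ge 0$ for every $u \in A$, and $I(u) = 0$ if and only if
        $|u'(x)| = 1$ for all $x \in [0,1]$.
  \item If $u \in C^1([0,1])$ and $|u'| \equiv 1$, then by continuity
        $u' \equiv 1$ or $u' \equiv -1$, hence $|u(1) - u(0)| = 1 \neq 0$.
        So no $u \in A$ satisfies $I(u) = 0$.
  \item The zigzag $w(x) = \tfrac12 - |x - \tfrac12|$ satisfies
        $w(0) = w(1) = 0$ and $|w'| = 1$ a.e., but $w \notin C^1([0,1])$.
  \item Let $u_n(x) = \int_0^x g_n(s)\,ds$, where $g_n$ is continuous,
        $g_n = 1$ on $[0, \tfrac12 - \tfrac1n]$, $g_n = -1$ on
        $[\tfrac12 + \tfrac1n, 1]$ and $g_n$ is affine in between.
        Then $u_n \in A$, $|u_n'| \le 1$ and
        \[
          0 < I(u_n) \le \frac{2}{n} \longrightarrow 0 .
        \]
\end{enumerate}
% Conclusion: \inf_{u \in A} I(u) = 0, and by step 2 it is not attained in A.
```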

2.4.1.1.2. Stronger condition for smooth minimizers: Euler-Lagrange equation associated with I.⁴²

Throughout this part we assume $f$ to be a $C^2$-function. We start with an auxiliary result: the first version of the so-called Lemma of the Calculus of Variations.

Lemma 2.15 (Lemma of the Calculus of Variations; d = 1 smooth version). Let $\Omega \subset \mathbb{R}^N$ be an open set and $u \in C(\Omega)$ be such that

\[ \int_\Omega u(x) \phi(x)\, dx = 0, \quad \forall \phi \in C_c^\infty(\Omega). \]

Then $u = 0$ on $\Omega$.

Proof. In class.

Lemma 2.16 (Lemma of the Calculus of Variations, d = 1 more general version). Let $\Omega \subset \mathbb{R}^N$ be an open set and $u \in L^1_{loc}(\Omega)$ be such that for any $\phi \in C_c^\infty(\Omega)$

\[ \int_\Omega u(x) \phi(x)\, dx = 0. \]

Then $u = 0$ a.e. in $\Omega$.

Proof. For the easier case where $u \in L^2(\Omega)$ see Theorem 1.24 in Dacorogna [18]. For a proof in $L^1(\Omega)$ with $N = 1$ see Lemma 1.4 in Buttazzo, Giaquinta and Hildebrandt [12]. For this more general statement see Corollary 3.26 in Adams [1].

We are now ready to derive the Euler-Lagrange equation associated with $I$.

⁴² Analogously for a maximum.


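Theorem 2.17 (Euler-Lagrange equation). Given $f \in C^2([a, b] \times \mathbb{R} \times \mathbb{R})$, if problem (2.1) admits a minimizer $u_0 \in A \cap C^2([a, b])$ then necessarily

\[ \frac{d}{dx} \left[ \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x)) \right] = \frac{\partial f}{\partial u}(x, u_0(x), u_0'(x)) \quad \text{for all } x \in [a, b] \tag{2.4} \]

or equivalently, by the chain rule,

\[ \frac{\partial^2 f}{\partial x \partial \xi}(x, u_0(x), u_0'(x)) + \frac{\partial^2 f}{\partial u \partial \xi}(x, u_0(x), u_0'(x))\, u_0'(x) + \frac{\partial^2 f}{\partial \xi^2}(x, u_0(x), u_0'(x))\, u_0''(x) = \frac{\partial f}{\partial u}(x, u_0(x), u_0'(x)) \quad \text{for all } x \in [a, b]. \tag{2.5} \]

Proof. In class.

Remark 2.18. If $\frac{\partial^2 f}{\partial \xi^2}(x, u_0(x), u_0'(x)) \neq 0$ then, solving (2.5) for $u_0''$,

\[ u_0''(x) = \frac{ \dfrac{\partial f}{\partial u}(x, u_0(x), u_0'(x)) - \dfrac{\partial^2 f}{\partial x \partial \xi}(x, u_0(x), u_0'(x)) - \dfrac{\partial^2 f}{\partial u \partial \xi}(x, u_0(x), u_0'(x))\, u_0'(x) }{ \dfrac{\partial^2 f}{\partial \xi^2}(x, u_0(x), u_0'(x)) }. \]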

We call equation (2.4) the Euler-Lagrange equation associated with the functional $I$. Note that in the case we are analyzing, $N = d = 1$, it is a (in general non-linear) second order ODE for the function $u_0$, and thus its general solution depends on two constants, which are to be determined by the boundary conditions in $A$. Note also that in general the boundary value problem

\[ \begin{cases} \dfrac{d}{dx} \left[ \dfrac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x)) \right] = \dfrac{\partial f}{\partial u}(x, u_0(x), u_0'(x)) & \text{for all } x \in [a, b], \\[2mm] u_0(a) = \alpha, \quad u_0(b) = \beta \end{cases} \]

may not have a solution, and even when it does have a solution, this solution may not be unique.

A solution of (2.4) is called a critical (or stationary) point of $I$ (sometimes also an extremal or Lagrange curve).

Let us obtain the Euler-Lagrange equation in the following cases:

• $f(x, u, \xi) = f(\xi)$
• $f(x, u, \xi) = f(x, \xi)$
• $f(x, u, \xi) = f(u, \xi)$

In class.
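As an illustration of the first case (a sketch, not necessarily the derivation given in class), consider the arclength integrand from the introduction, $f(\xi) = \sqrt{1 + \xi^2}$, with the Dirichlet conditions $u_0(a) = \alpha$, $u_0(b) = \beta$:

```latex
\begin{align*}
\frac{\partial f}{\partial u} &= 0, \qquad
\frac{\partial f}{\partial \xi}(\xi) = \frac{\xi}{\sqrt{1+\xi^2}}, \\
\text{(2.4):}\quad
\frac{d}{dx}\left[ \frac{u_0'(x)}{\sqrt{1+[u_0'(x)]^2}} \right] &= 0
\;\Longrightarrow\;
\frac{u_0'(x)}{\sqrt{1+[u_0'(x)]^2}} = c, \quad |c| < 1, \\
u_0'(x) &= \frac{c}{\sqrt{1-c^2}} \quad \text{(a constant)}, \\
u_0(x) &= \alpha + \frac{\beta-\alpha}{b-a}\,(x-a).
\end{align*}
```

So, for this integrand, the only critical point in $A$ is the straight line joining $(a, \alpha)$ to $(b, \beta)$.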

We remark that Theorem 2.17 is not an existence result: in general a solution of (2.4) (even of class $C^2$) is not a minimizer of problem (2.1). Compare the following examples:

i) $I(u) = \int_0^1 \left( [u'(x)]^2 - 1 \right)^2 dx$, $A = \{u \in C^1([0, 1]) : u(0) = 0,\ u(1) = 0\}$.

ii) $I(u) = \int_0^1 e^{-[u'(x)]^2}\, dx$, $A = \{u \in C^1([0, 1]) : u(0) = 0,\ u(1) = 0\}$.

iii) $I(u) = \int_0^1 [u'(x) - 1]^2\, dx$, $A = \{u \in C^1([0, 1]) : u(0) = 0,\ u(1) = 1\}$.

In class.


Lemma 2.19. Given $f \in C^2([a, b] \times \mathbb{R} \times \mathbb{R})$, assume that $u_0 \in A \cap C^2([a, b])$ is a solution of the Euler-Lagrange equation (2.4). Then

\[ \frac{d}{dx} \left[ f(x, u_0(x), u_0'(x)) - u_0'(x) \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x)) \right] = \frac{\partial f}{\partial x}(x, u_0(x), u_0'(x)) \tag{2.6} \]

for all $x \in [a, b]$.

Proof. In class.

Equation (2.6) is called the Du Bois-Reymond equation or the second form of the Euler-Lagrange equation.

Corollary 2.20 (Other necessary condition: Second form of the Euler-Lagrange equation). Given $f \in C^2([a, b] \times \mathbb{R} \times \mathbb{R})$, assume that $u_0 \in A \cap C^2([a, b])$ is a minimizer of problem (2.1). Then (2.6) holds for all $x \in [a, b]$.

Remark 2.21. In general a solution of (2.6) is not a solution of (2.4) (see the counter-example in Section 2.4.2). From the proof of Lemma 2.19 it is, however, easy to see that if the integrand does not depend on $x$, that is, $f = f(u, \xi)$, $f \in C^2$, then any non-constant solution $u_0 \in A \cap C^2([a, b])$ of the Du Bois-Reymond equation (2.6) is also a solution of the Euler-Lagrange equation (2.4) (so that both necessary conditions are equivalent for non-constant solutions). We also note that in this case (2.6) implies

\[ f(u_0(x), u_0'(x)) - u_0'(x) \frac{\partial f}{\partial \xi}(u_0(x), u_0'(x)) = c, \quad c \in \mathbb{R}. \]

Assuming that $u_0 \in A \cap C^2([a, b])$ is a solution of the Euler-Lagrange equation (2.4) and denoting

\[ \phi(u, \xi) := f(u, \xi) - \xi \frac{\partial f}{\partial \xi}(u, \xi) \]

it follows that $\phi(u_0(x), u_0'(x)) = c$, $c \in \mathbb{R}$, for all $x \in (a, b)$ (conservation law). A function with this property is called a first integral of the Euler-Lagrange equation (2.4).
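The conservation law can be checked on a concrete integrand. The Python sketch below assumes, for illustration only, $f(u, \xi) = \xi^2/2 - u^2/2$, for which the Euler-Lagrange equation (2.4) reads $u'' = -u$ and $u_0(x) = \sin x$ is a solution; the function names are hypothetical.

```python
import math


def f(u, xi):
    """Illustrative integrand f(u, xi) = xi^2/2 - u^2/2 (no x-dependence)."""
    return 0.5 * xi ** 2 - 0.5 * u ** 2


def f_xi(u, xi):
    """Partial derivative of f with respect to xi."""
    return xi


def first_integral(u, xi):
    """phi(u, xi) = f(u, xi) - xi * f_xi(u, xi)."""
    return f(u, xi) - xi * f_xi(u, xi)


# For this f, equation (2.4) is u'' = -u; u0(x) = sin(x) solves it,
# with u0'(x) = cos(x).
values = [first_integral(math.sin(x), math.cos(x))
          for x in (0.0, 0.3, 1.0, 2.5)]
print(values)   # every entry equals -1/2 up to rounding
```

Here $\phi(u_0, u_0') = -\tfrac12\left([u_0']^2 + u_0^2\right)$, which is constant along $u_0 = \sin$.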

Example. Application in mechanics (Notation: $x = x(t)$ instead of $u = u(x)$)

Let

\[ I(x) = \int_{t_1}^{t_2} \left( \frac{m}{2} |\dot{x}(t)|^2 - V(x(t)) \right) dt \]

be the action of a motion $x = x(t)$, $t \in [t_1, t_2]$, of a point mass $m$ in a conservative force field $F = -V'$ with potential energy $V$. If $E$ represents the total energy of $x$, then it can be seen that $E(x(t), \dot{x}(t)) = c$, $c \in \mathbb{R}$, along the solutions of Newton's law (see Section 2.4.2).

Another observation worth mentioning about Theorem 2.17 is that minimizers, when they exist in $A$, are not necessarily of class $C^2$.


Example. The functional

\[ I(u) = \int_{-1}^{1} u^2(x) [2x - u'(x)]^2\, dx \]

has a minimizer over $A = \{u \in C^1([-1, 1]) : u(-1) = 0,\ u(1) = 1\}$ that is of class $C^1$ but not of class $C^2$. Indeed, it can be seen that this minimization problem has a unique minimizer in $A$. In class.

In general, it is not easy to establish directly either the existence or the uniqueness of minimizers.

In analogy with the case of functions of several variables (see Lemma 2.10 and Theorem 2.8) we have the following results.

Lemma 2.22 (uniqueness). If $(u, \xi) \mapsto f(x, u, \xi)$ is a strictly convex function⁴³ for all $x \in [a, b]$, then if problem (2.1) (for any of the above classes $A$) has a minimizer, this minimizer is unique.

Proof. In class.

Lemma 2.23 (sufficiency). Assume that $f \in C^2$ and that $(u, \xi) \mapsto f(x, u, \xi)$ is a convex function for all $x \in [a, b]$. Then any solution $u_0 \in A \cap C^2([a, b])$ of the Euler-Lagrange equation (2.4) is a minimizer of problem (2.1).

Proof. In class.
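The sufficiency result can be illustrated numerically. The sketch below assumes, for illustration only, the convex integrand $f(x, u, \xi) = \xi^2 + u^2$ on $[0, 1]$ with $u(0) = 0$, $u(1) = 1$; its Euler-Lagrange equation is $u'' = u$, solved by $u_0(x) = \sinh(x)/\sinh(1)$, and integration by parts gives $I(u_0) = \coth(1)$. The competitors and function names are hypothetical choices.

```python
import math


def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def I(u, du):
    """I(u) = int_0^1 ([u'(x)]^2 + [u(x)]^2) dx for a given u and u' = du."""
    return integrate(lambda x: du(x) ** 2 + u(x) ** 2, 0.0, 1.0)


# Solution of the Euler-Lagrange equation u'' = u with u(0) = 0, u(1) = 1.
u0 = lambda x: math.sinh(x) / math.sinh(1.0)
du0 = lambda x: math.cosh(x) / math.sinh(1.0)

I_u0 = I(u0, du0)

# Competitors in A: the straight line, and u0 plus perturbations
# eps * sin(k pi x), which vanish at both endpoints.
competitors = [I(lambda x: x, lambda x: 1.0)]
for eps in (0.5, 0.1, -0.1):
    for k in (1, 2, 3):
        u = lambda x, e=eps, k=k: u0(x) + e * math.sin(k * math.pi * x)
        du = lambda x, e=eps, k=k: (du0(x)
                                    + e * k * math.pi * math.cos(k * math.pi * x))
        competitors.append(I(u, du))

print(I_u0, min(competitors))   # I(u0) is below every sampled competitor
```

A finite list of competitors is of course no proof; it only shows the lemma at work on this example.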

2.4.2. Exercises.

(1) (Notation: $x = x(t)$ instead of $u = u(x)$) Let

\[ I(x) = \int_{t_1}^{t_2} \left( \frac{m}{2} |\dot{x}(t)|^2 - V(x(t)) \right) dt \]

be the action of a motion $x = x(t)$, $t \in [t_1, t_2]$, of a point mass $m$ in a conservative force field $F = -V'$ with potential energy $V$. If $E(x, v) = \frac{m}{2} |v|^2 + V(x)$ represents the total energy of a motion $x = x(t)$, show that $E(x(t), \dot{x}(t)) = c$, $c \in \mathbb{R}$, along the solutions of Newton's law.

(2) Generalize Theorem 2.17 to the case $u : [a, b] \to \mathbb{R}$, $I(u) = \int_a^b f(x, u(x), u'(x), u''(x))\, dx$ and $A = \{u \in C^2([a, b]) : u(a) = \alpha_1,\ u'(a) = \alpha_2,\ u(b) = \beta_1,\ u'(b) = \beta_2\}$.

(3) Obtain the Euler-Lagrange equation associated with

\[ I(u) = H \int_{-1}^{1} |u''(x)|^2\, dx \]

where $H > 0$. Discuss the existence of minimizers for

\[ \inf_{u \in A} I(u) \]

where $A = \{u \in C^2([-1, 1]) : u(-1) = u'(-1) = 0,\ u(1) = u'(1) = 0\}$ (a simplified model of a thin elastic beam; $H$ is a positive constant depending on the material of the beam and $u = u(x)$ represents the vertical deflection of the beam at a point $x$).

⁴³ Thus continuous.

(4) Assume that $f = f(\xi)$. Use Jensen's inequality to give an alternative proof of the fact that if $f$ is convex then $u_0(x) = \frac{\beta - \alpha}{b - a}(x - a) + \alpha$ is a minimizer for our general minimization problem.⁴⁴

(5) Let $\Omega \subset \mathbb{R}^N$ and $u \in L^1_{loc}(\Omega)$ be such that

\[ \int_\Omega u(x) \phi(x)\, dx = 0 \quad \text{for any } \phi \in C_c^\infty(\Omega) \text{ with } \int_\Omega \phi(x)\, dx = 0. \]

Show that there exists a constant $K$ such that $u = K$ a.e. in $\Omega$. Suggestion: Use the Fundamental Lemma of the Calculus of Variations to show that $u = \int_\Omega u(y) f(y)\, dy$ a.e. for any $f \in C_c^\infty(\Omega)$ with $\int_\Omega f(y)\, dy = 1$.

(6) (Du Bois-Reymond's lemma) Let $\Omega = (a, b) \subset \mathbb{R}$ and $u \in L^1_{loc}(\Omega)$ be such that for any $\phi \in C_c^\infty(\Omega)$

\[ \int_\Omega u(x) \phi'(x)\, dx = 0. \]

Show that there exists a constant $K$ such that $u = K$ a.e. in $\Omega$. Suggestion: Use the previous exercise.

(7) Use Du Bois-Reymond's lemma above to show that if $u_0 \in C^1([a, b])$ is a solution of the weak Euler-Lagrange equation associated with the functional $I$ in problem (2.1) (with $f$ of class $C^1$) then there exists a constant $c \in \mathbb{R}$ such that

\[ \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x)) = c + \int_a^x \frac{\partial f}{\partial u}(t, u_0(t), u_0'(t))\, dt. \]

This equation is called Du Bois-Reymond's equation or the Euler equation in integrated form. Conclude that

\[ \frac{d}{dx} \left[ \frac{\partial f}{\partial \xi}(x, u_0(x), u_0'(x)) \right] = \frac{\partial f}{\partial u}(x, u_0(x), u_0'(x)) \quad \text{for all } x \in (a, b). \]

Note however that this is not equivalent to equation (2.5). Why?

2.5. Recommended reading.

(1) For general reading on the Classical (or indirect) Method see [18], [12], [35], [48], [30],[40], [7]. See also the lecture notes [40] and [17].

⁴⁴ Recall that $u_0$ is a solution of the Euler-Lagrange equation for $I$.

References

[1] R. A. Adams, Sobolev spaces, Academic Press, New York, 1975.
[2] G. Allaire, Shape optimization by the homogenization method, Springer, New York, 2002.
[3] L. Ambrosio, N. Fusco and D. Pallara, Functions of bounded variation and free discontinuity problems, Oxford Mathematical Monographs, Clarendon Press, Oxford, 2000.
[4] T. M. Apostol, Calculus, Vol. 1, One-variable calculus, with an introduction to linear algebra, Wiley, New York, 1967.
[5] T. M. Apostol, Calculus, Vol. 2, Multi-variable calculus and linear algebra with applications to differential equations and probability, Wiley, New York, 1969.
[6] J. M. Ball, The calculus of variations and materials science, Quarterly of Applied Mathematics, Vol. LVI, No. 4 (1998), 719-740.
[7] O. Bolza, Lectures on the Calculus of Variations, Chelsea Publishing Company, New York, 1904.
[8] A. Braides, Γ-convergence for beginners, Oxford Lecture Series in Mathematics and its Applications, 22, Oxford University Press, Oxford, 2002.
[9] S. Boyd and L. Vandenberghe, Convex optimization, Cambridge University Press, 2004.
[10] C. B. Boyer, The history of the calculus and its conceptual development, Dover, New York, 1949.
[11] C. B. Boyer, A history of mathematics, Wiley, 1968.
[12] G. Buttazzo, M. Giaquinta and S. Hildebrandt, One dimensional variational problems, an introduction, Oxford Press, Oxford, 1998.
[13] H. Brezis, Analyse Fonctionnelle, théorie et applications, Masson, Paris, 1983.
[14] Ph. G. Ciarlet, Mathematical elasticity, Vol. 1, Three dimensional elasticity, North-Holland, 1988.
[15] E. A. Coddington, An introduction to ordinary differential equations, Dover, 1961.
[16] E. A. Coddington and N. Levinson, Theory of ordinary differential equations, McGraw-Hill, 1963.
[17] R. Cristoferi, Calculus of Variations, Lecture Notes, Spring Term 2016.
[18] B. Dacorogna, Introduction to the Calculus of Variations, Imperial College Press, 2004.
[19] B. Dacorogna, Direct Methods in the Calculus of Variations, Applied Mathematical Sciences, 78, Springer-Verlag, Berlin, 1989.
[20] G. Dal Maso, An introduction to Γ-convergence, Progress in Nonlinear Differential Equations and their Applications, 8, Birkhäuser Boston, Inc., Boston, MA, 1993.
[21] J. Dieudonné, Infinitesimal calculus, Hermann, 1971.
[22] L. Evans, Partial differential equations, Amer. Math. Soc., Providence, 1998.
[23] L. C. Evans and R. F. Gariepy, Measure theory and fine properties of functions, CRC Press, 1992.
[24] R. Feynman, R. Leighton and M. Sands, The Feynman Lectures on Physics, Vol. II, Addison-Wesley, 1966.
[25] P. Freguglia and M. Giaquinta, The early period of the Calculus of Variations, Birkhäuser, 2016.
[26] M. Focardi, Γ-convergence: a tool to investigate physical phenomena across scales, Mathematical Models and Methods in Applied Sciences, 2012.
[27] I. Fonseca and G. Leoni, Modern Methods in the Calculus of Variations: Lp-Spaces, Springer Monographs in Mathematics, Springer, New York, NY, 2007.
[28] I. Fonseca and G. Leoni, Calculus of Variations, The Princeton Companion to Applied Mathematics, Princeton University Press, 2015.
[29] G. Folland, Real Analysis: Modern techniques and their applications, Wiley-Interscience, 1984.
[30] I. M. Gelfand and S. V. Fomin, Calculus of Variations, Prentice-Hall, Englewood, 1963.
[31] E. De Giorgi and G. Dal Maso, Γ-convergence and calculus of variations, Mathematical theories of optimization (Genova 1981), 121-143, Lecture Notes in Math. 979, Springer, Berlin, 1983.
[32] E. De Giorgi and T. Franzoni, Su un tipo di convergenza variazionale, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 58 (1975), No. 6, 842-850.
[33] M. Giaquinta and S. Hildebrandt, Calculus of variations I, II, Grundlehren math. Wiss. 310, 311, Springer, Berlin, 1996.
[34] E. Giusti, Metodi Diretti nel Calcolo delle Variazioni, Unione Matematica Italiana, Bologna, 1994.
[35] E. Giusti, Direct Methods in the Calculus of Variations, World Scientific, Singapore, 2003.
[36] H. H. Goldstine, A history of the calculus of variations from the 17th to the 19th century, Springer, Berlin, 1980.
[37] M. E. Gurtin, An introduction to continuum mechanics, Academic Press, New York, 1981.
[38] M. Kot, A first course in the calculus of variations, Student mathematical library, Vol. 72, AMS, Rhode Island, 2014.
[39] E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, New York, 1978.
[40] E. Lenzmann, Calculus of Variations, Lecture Notes, Fall Term 2014.
[41] G. Leoni, A First Course in Sobolev Spaces (See also: Lecture Notes by this author)
[42] M.I.T Open Course, Convex Analysis
[43] P. H. Rabinowitz
[44] F. Rindler, Introduction to the Modern Calculus of Variations, Lecture Notes, Spring Term 2015.
[45] W. Rudin, Functional analysis, McGraw-Hill, New York, 1973.
[46] H. L. Royden, Real analysis, Macmillan Publishing Company, New York, 1988.
[47] R. T. Rockafellar, Convex analysis, Princeton University Press, 1988.
[48] H. Sagan, Introduction to the Calculus of Variations, Dover, New York, 1969.
[49] Struwe ...
[50] L. Tonelli, Fondamenti di Calcolo delle Variazioni, Vol. 1, Zanichelli, Bologna, 1921.
[51] L. Tonelli, Fondamenti di Calcolo delle Variazioni, Vol. 2, Zanichelli, Bologna, 1923.
[52] M. Willem
