finite element methods for non-linear continua · 2019. 5. 7. · an inﬁnitesimal material area...

ME280B

Finite Element Methods for Non-linearContinua

Panos Papadopoulos

Department of Mechanical Engineering

University of California, Berkeley

2019 edition

Copyright c©2019 by Panos Papadopoulos

Introduction

This is a set of notes written as part of teaching ME280B, a graduate course on non-linear

finite elements in the Department of Mechanical Engineering at the University of California,

Berkeley.

Berkeley, California P. P.

January 2019

i

Contents

1 Review of Continuum Mechanics 1

1.1 Kinematics of Deformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Balance Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.2.1 Conservation of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.2.2 Balance of Linear Momentum . . . . . . . . . . . . . . . . . . . . . . 11

1.2.3 Balance of Angular Momentum . . . . . . . . . . . . . . . . . . . . . 13

1.2.4 Balance of Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

1.3 Invariance under Superposed Rigid-body Motions . . . . . . . . . . . . . . . 16

1.4 Closure of the Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 Consistent Linearization 19

2.1 Sources of Non-linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.1.1 Geometric non-linearity . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.1.2 Material non-linearity . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.1.3 Non-linearity in the balance laws . . . . . . . . . . . . . . . . . . . . 20

2.2 Gateaux and Frechet Differentials . . . . . . . . . . . . . . . . . . . . . . . . 21

2.3 Consistent Linearization in Continuum Mechanics . . . . . . . . . . . . . . . 25

3 Incremental Formulations 36

3.1 Strong and Weak Form of the Balance Laws . . . . . . . . . . . . . . . . . . 36

3.2 Lagrangian and Eulerian Methods . . . . . . . . . . . . . . . . . . . . . . . . 43

3.3 Semi-discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

3.4 Total and Updated Lagrangian Methods . . . . . . . . . . . . . . . . . . . . 48

3.5 Corotational Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.6 Arbitrary Lagrangian-Eulerian Methods . . . . . . . . . . . . . . . . . . . . 62

3.7 Eulerian Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

ii

4 Solution of Non-linear Field Equations 85

4.1 Generalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

4.2 The Newton-Raphson Method and its Variants . . . . . . . . . . . . . . . . . 90

4.3 The Newton-Raphson Method in Non-linear Finite Elements . . . . . . . . . 96

4.4 Continuation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

4.4.1 Force control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

4.4.2 Displacement control . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

4.4.3 Arc-length control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

4.5 Computational Treatment of Constraints . . . . . . . . . . . . . . . . . . . . 126

5 Constitutive Modeling of Deformable Continua 138

5.1 Incompressible Newtonian Viscous Fluid . . . . . . . . . . . . . . . . . . . . 138

5.2 Nonlinear Elasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

iii

List of Figures

1.1 A body B and its subset S . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Reference and current configurations of a body B at times t0 and t, respectively. 2

1.3 Mapping of an infinitesimal material line element dX from the reference to

the current configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.4 Mapping of an infinitesimal material surface element dA to its image da in

the current configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.5 Interpretation of the right polar decomposition. . . . . . . . . . . . . . . . . . 6

1.6 Angular momentum of an infinitesimal volume element. . . . . . . . . . . . . 14

1.7 Configurations associated with motions χ and χ+ differing by a superposed

rigid-body motion χ+. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.1 Non-linear mapping between Banach spaces . . . . . . . . . . . . . . . . . . . 21

2.2 Configurations of the body for consistent linearization . . . . . . . . . . . . . 25

3.1 Reference and current configuration . . . . . . . . . . . . . . . . . . . . . . . 37

3.2 Partition of boundary ∂R of R . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.3 Tangent vector at a point in the current configuration . . . . . . . . . . . . . 39

3.4 A surface S traversing the configuration R at time t . . . . . . . . . . . . . . 43

3.5 Finite element mesh schematic . . . . . . . . . . . . . . . . . . . . . . . . . . 44

3.6 Lagrangian finite element surface . . . . . . . . . . . . . . . . . . . . . . . . 45

3.7 Eulerian finite element mesh with a moving solid or fluid . . . . . . . . . . . 46

3.8 Time discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

3.9 Configurations at times t0, tn and tn+1 . . . . . . . . . . . . . . . . . . . . . 48

3.10 Partition of element boundary ∂Ωe0 . . . . . . . . . . . . . . . . . . . . . . . 52

3.11 Configurations for updated Lagrangian formulation . . . . . . . . . . . . . . . 57

3.12 Fixed vs. corotational coordinate system . . . . . . . . . . . . . . . . . . . . 61

3.13 General corotational coordinate system . . . . . . . . . . . . . . . . . . . . . 61

iv

3.14 Configurations of ALE method . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.15 Velocity constraint on the boundary of an ALE mesh . . . . . . . . . . . . . 64

3.16 Configurations of ALE method with coincident body and mesh boundaries . . 65

3.17 Mesh region PM in the current configuration . . . . . . . . . . . . . . . . . . 65

3.18 Neumann boundary conditions in ALE methods (R and M are shown apart

from each other for clarity) . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

3.19 Operator-split approach for ALE implementation (Rn+1, resulting from La-

grangian step andMn+1, resulting from Eulerian remapping are shown apart

from each other for clarity) . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

3.20 Projection method from one mesh (denoted “-”) to another (denoted “+”) . . 76

3.21 Volumetric and shear distortion indicators for a patch of 4-node quadrilaterals 77

3.22 Smooth map between equilateral triangles in the natural space (ξ, η) and gen-

eral triangles in the physical space (x, y) . . . . . . . . . . . . . . . . . . . . 78

3.23 Smooth map between squares in the natural space (ξ, η) and general quadrilat-

erals in the physical space (x, y) . . . . . . . . . . . . . . . . . . . . . . . . . 79

3.24 Smooth map from a square in the natural space (ξ, η) and a general quadrilat-

eral domain in the physical space (x, y) . . . . . . . . . . . . . . . . . . . . . 80

3.25 Finite difference stencil for linear triangles with unit spacing in the natural

domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

3.26 Finite difference stencil for bilinear quadrilaterals with unit spacing in the

natural domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

4.1 Ball of radius δ centered at x0 . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.2 Convex and non-convex sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.3 Contraction mapping in S . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4.4 One-dimensional geometric interpretation of the Newton-Raphson method . . 90

4.5 Failure of the Newton-Raphson method due to singular J(x(k)) . . . . . . . . 91

4.6 Failure of the Newton-Raphson method due to the large distance of x(k) from

the solution x∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

4.7 One-dimensional geometric interpretation of the modified Newton-Raphson

method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

4.8 Balls B(x∗; ρ) and B(x∗; δ) around the solution x∗ . . . . . . . . . . . . . . . 94

4.9 Functions with different radii of attractions for the Newton-Raphson method 95

4.10 Convergence under the Newton-Kantorovich conditions . . . . . . . . . . . . 96

v

4.11 Pressure-type follower traction . . . . . . . . . . . . . . . . . . . . . . . . . . 98

4.12 Body force in a gravitational field . . . . . . . . . . . . . . . . . . . . . . . . 98

4.13 The Kelvin-Voigt solid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

4.14 Schematic of line search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

4.15 Schematic of quasi-Newton method in one-dimension . . . . . . . . . . . . . 109

4.16 Limit and bifurcation points in (u, λ)-space . . . . . . . . . . . . . . . . . . . 114

4.17 Snapping of a deep arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

4.18 Solution in neighborhood of point (u(s), λ(s)) . . . . . . . . . . . . . . . . . 115

4.19 Relation between the loading vector c and the null eigenvector ψ of KT . . . 119

4.20 Force control in (u, λ)-space . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

4.21 Range of validity of force control in (u, λ)-space . . . . . . . . . . . . . . . . 121

4.22 Displacement control in (ul, λ)-space . . . . . . . . . . . . . . . . . . . . . . 122

4.23 Range of validity of displacement control in (ul, λ)-space . . . . . . . . . . . 122

4.24 Arc-length in (u, τλ)-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

4.25 Spherical arc-length in (u, τλ)-space . . . . . . . . . . . . . . . . . . . . . . . 123

4.26 Normal-plane arc-length in (u, τλ)-space . . . . . . . . . . . . . . . . . . . . 124

4.27 Contact of deformable solid with rigid foundation . . . . . . . . . . . . . . . 127

4.28 Penalty functional and the enforcement of the constraint c = 0 . . . . . . . . 134

vi

Chapter 1

Review of Continuum Mechanics

1.1 Kinematics of Deformation

Define a body B as a collection of material points and identify a typical material point

with P , as in Figure 1.1.

S

B

P

Figure 1.1. A body B and its subset S .

At some time t, define the configuration of B as the region in the three-dimensional Eu-

clidean point space E3 occupied by B, and denote this configuration R and its boundary ∂R.The material point P in the configuration of B at time t is identified with a vector x in

three-dimensional Euclidean vector space E3 drawn from any fixed origin o, see Figure 1.2.

Let x be resolved with respect to a fixed orthonormal basis ei as

x = xiei ; xi = x · ei , (1.1)

where “·” denotes dot-product between two vectors and the Einsteinian summation conven-

tion in enforced on repeated indices. The configuration of B at time t will be referred to as

the current or present configuration.

1

2 Review of Continuum Mechanics

RR0

x

X

O

oEAei

Figure 1.2. Reference and current configurations of a body B at times t0 and t, respectively.

Let the same body B occupy at some other time t = t0 (often set, without loss of

generality, to t0 = 0) the region R0 in E3 with boundary ∂R0. The material point P at time

t0 is identified with another vector X in E3 drawn, in general, from another fixed origin O,

see Figure 1.2. Let X be now resolved with respect to a fixed orthonormal basis EA as

X = XAEA ; XA = X · EA . (1.2)

The configuration of B at time t0 will be referred to as the reference configuration.

The motion χ : E3 × R→ E3 is a mapping of points X in the reference configuration to

points in the current configuration x at given time t, that is,

x = χ (X, t) ; xi = χi (XA, t) . (1.3)

The velocity and acceleration fields are respectively defined at time t as

v = χ (X, t) =∂χ (X, t)

∂t; vi = χi =

∂χi (XA, t)

∂t(1.4)

and

a = χ (X, t) =∂2χ (X, t)

∂t2; ai = χi =

∂2χi (XA, t)

∂t2. (1.5)

where ˙( ) denotes the material time derivative, that is, the time derivative of a quantity for

a fixed material point.

The fundamental measure of relative deformation is the deformation gradient tensor F

which may be viewed as a linear mapping of an infinitesimal material line element dX of the

reference configuration to an infinitesimal line element dx in the current configuration, that

is,

dx = FdX ; dxi = FiAdXA , (1.6)

ME280B May 7, 2019

Kinematics of Deformation 3

x

dx

X

dX

RR0

Figure 1.3. Mapping of an infinitesimal material line element dX from the reference to the

current configuration.

see Figure 1.3. Assuming that there is a motion χ, where

χ : (X, t)→ x = χ(X, t) , (1.7)

one readily concludes from (1.6) that

F =∂χ

∂X. (1.8)

The deformation gradient is a two-point tensor, in the sense that

F = FiAei ⊗ EA , (1.9)

where “⊗” denotes tensor product of two vectors. As seen from (1.8), the deformation

gradient F has one “leg” (the first) in the current configuration and the other “leg” in the

reference configuration.

An infinitesimal material volume element dV in the reference configuration is likewise

mapped to an infinitesimal volume element dv in the current configuration, such that

dv = JdV , (1.10)

where J = detF is the Jacobian of the motion. Appealing to the inverse function theorem of

calculus, it can be established that a continuously differentiable (in X) motion χ is invertible

at a given time t if, and only if,

detF = det∂χ

∂X6= 0 , (1.11)

for all X ∈ R0. Upon assuming, on physical grounds, that the sign of a given material

volume should be always preserved, it can be concluded from (1.10) that invertibility of χ

necessitates that J > 0.

May 7, 2019 ME280B


An infinitesimal material area element dA in the reference configuration is mapped to

an infinitesimal area element da in the current configuration according to Nanson’s formula,

which states that

n da = JF−TN dA , (1.12)

where N (resp. n) is a unit vector normal to the surface dA (resp. da), see Figure 1.4.

xX

R0 R

dX1

dX2

dx1

dx2

N n

Figure 1.4. Mapping of an infinitesimal material surface element dA to its image da in the

current configuration.

Other useful measures of deformation are the right and left Cauchy-Green deformation

tensors C and B defined, respectively, as

C = FTF ; CAB = FiAFiB (1.13)

and

B = FFT ; Bij = FiAFjA . (1.14)

It is clear from the preceding component representations that C is a referential tensor,

while B is a spatial tensor. Both tensors are symmetric, which means that their components

form can be represented in the form of symmetric matrices.

Let M and m be unit vectors in the direction of dX and dx, respectively, such that

dX = MdS , dx = mds , (1.15)

and define the ratio

λ =ds

dS(1.16)

as the stretch of the material line element dX. It can be easily established with the aid

of (1.8), (1.15), and (1.16) that

λm = FM (1.17)

ME280B May 7, 2019


and, upon also using (1.13), that

λ2 = M ·CM ; λ2 = MACABMB . (1.18)

Likewise, (1.17) can be alternatively expressed as

1

λM = F−1m , (1.19)

hence, with the aid of (1.14), it follows that

1

λ2= m ·B−1m ;

1

λ2= miB

−1ij mj . (1.20)

Two important measures of strain can be obtained by resolving the difference ds2 − dS2

as

ds2 − dS2 = dx · dx− dX · dX= (FdX) · (FdX)− dX · dX= dX · (CdX)− dX · dX = dX · (C− I) dX

= dX · 2EdX (1.21)

or, alternatively,

ds2 − dS2 = dx · dx− dX · dX= dx · dx−

(F−1dx

)·(F−1dx

)

= dx · dx− dx ·B−1dx = dx ·(i−B−1

)dx

= dx · 2edx , (1.22)

where I and i are the referential and spatial identity tensors, respectively. These measures

are the (relative) Lagrangian strain tensor E, given with the aid of (1.21) by

E =1

2(C− I) ; EAB =

1

2(CAB − δAB) . (1.23)

and the (relative) Eulerian (or Almansi) strain tensor e, defined from (1.22) as

e =1

2

(i−B−1

); eij =

1

2

(δij − B−1

ij

). (1.24)

The polar decomposition theorem states that F, being non-singular by virtue of J =

detF > 0, can be uniquely resolved into

F = RU = VR ; FiA = RiBUBA = VijRjA . (1.25)

May 7, 2019 ME280B


Clearly, R is a two-point tensor, U is a referential tensor, and V is a spatial tensor.

More specifically, R is a proper-orthogonal (or indexrotation—seealsotensor!proper orthog-

onalrotation) tensor, that is, one which satisfies RTR = I, RRT = i, and detR = +1. In

addition, U and V are symmetric positive-definite tensors referred to as the right and left

stretch tensors, respectively. In light of these definitions, equations (1.25)1,2 are correspond-

ingly referred to as the right and left polar decompositions of F.

Starting from (1.17) it can be seen that

dx = R (UdX) , (1.26)

which implies that dX is first stretched and rotated byU, and is subsequently further rotated

by R, see Figure 1.5. A corresponding interpretation may be readily formulated for the left

dxdX dX′

U R

Figure 1.5. Interpretation of the right polar decomposition.

polar decomposition.

Along its principal directions, U effects stretch without any rotation, which implies that,

if dX lies along such a principal direction, then

UdX = λdX (1.27)

or, upon recalling (1.15)1,

UM = λM . (1.28)

With reference to the principal stretches λI , I = 1, 2, 3 and the associated principal direc-

tions MI , I = 1, 2, 31 of the preceding linear eigenvalue problem, one may use the spectral

representation theorem to write

U =3∑

I=1

λIMI ⊗MI , (1.29)

1The set of eigenvectors MI may be taken to be orthonormal without loss of generality.

ME280B May 7, 2019


from which it follows that

F = RU =3∑

I=1

λI (RMI)⊗MI (1.30)

and also, given the definition of C and the orthogonality of R, that

C = FTF = (RU)T (RU) = U2 =3∑

I=1

λ2IMI ⊗MI . (1.31)

Generalized referential measures of deformation can be introduced in the form

Cm =3∑

I=1

λ2mI MI ⊗MI m 6= 0 (1.32)

logC =3∑

I=1

log λ2IMI ⊗MI (Hencky strain) (1.33)

with associated measures of generalized referential strain

E(m) =

1m

(Cm/2 − I

)if m 6= 0

12lnC if m = 0

. (1.34)

Note that m = 2 in (1.34) recovers the Lagrangian strain tensor.

It is readily concluded from (1.25) and (1.29) that

V =3∑

I=1

λI (RMI)⊗ (RMI) (1.35)

and also, given the definition of B and the orthogonality of R, that

B = FFT = (VR) (VR)T = V2 =3∑

I=1

λ2I (RMI)⊗ (RMI) . (1.36)

There exist several ways to extract the polar factors of F. In particular, focusing attention

to the right polar decomposition, one may:

(a) Extract the stretch U directly from F in closed form2, and subsequently obtain R as

R = FU−1 . (1.37)

2A. Hoger and D.E. Carlson, “Determination of the stretch and rotation in the polar decomposition of

the deformation gradient”, Q. Appl. Math., 42:113–117, (1984).

May 7, 2019 ME280B

https://www.jstor.org/stable/43637254?seq=1#metadata_info_tab_contents

https://www.jstor.org/stable/43637254?seq=1#metadata_info_tab_contents


(b) Extract the rotation R directly from F in closed form3, and subsequently obtain U as

U = RTF . (1.38)

(c) Use a numerical method to solve the matrix equation

[CAB] = [FiAFiB] = [UACUCB] (1.39)

for the components [UAB] of U to within the required tolerance, then evaluate R as

in (a). A computationally efficient algorithm for solving this matrix equation employs

the Newton-Raphson method, according to which

U(i+1) =1

2

(U(i) +CU(i)−1

), (1.40)

with initial guess

U(0) = I (or mI, m > 0) . (1.41)

It can be shown that with this choice of U(0), all U(i) are positive-definite and also

U(i) converges to U at a quadratic rate4

(d) Solve the linear eigenvalue problem (1.31) for the eigenpairs (λ2I ,MI), form U us-

ing (1.29), and then calculate R as R = FU−1.

Lastly, attention is focused on mixed space-time derivatives of the motion, which give

rise to measures of deformation rate. Invoking the chain rule, write

F =d

dt

∂χ

X=

∂

∂X

dχ

dt=

∂v

∂X=

∂v

∂x

∂x

∂X= LF . (1.42)

In (1.42), L is the spatial velocity gradient, defined as

L =∂v

∂x; Lij =

∂vi∂xj

(1.43)

The tensor L may be uniquely decomposed into a symmetric tensor D, termed the rate-

of-deformation tensor, and a skew-symmetric tensor W, termed the vorticity or indexspin

tensor—seealsovorticity tensorspin tensor, such that

L = D+W . (1.44)

3P. Papadopoulos and J. Lu, “On the direct determination of the rotation tensor from the deformation

gradient”, Math. Mech. Solids., 2:17–26, (1997).4N.J. Higham, “Newton’s method for the matrix square root”, Math. Comp., 46:537–549, (1986).

ME280B May 7, 2019

http://mms.sagepub.com/content/2/1/17.abstract

http://mms.sagepub.com/content/2/1/17.abstract

http://www.ams.org/journals/mcom/1986-46-174/S0025-5718-1986-0829624-5/

Balance Laws 9

The physical meaning of D can be established by observing that (1.17), (1.42), and (1.44)

imply that

˙lnλ =λ

λ= m ·Dm . (1.45)

Regarding W, first note that owing to its skew-symmetry, it can be put into a one-to-one

correspondence with a vector w of E3, such that for any vector z in E3,

Wz = w × z . (1.46)

The vector w is called the axial vector of W. Then, considering a unit vector m along a

principal direction of D, it can be established with the aid of (1.17), (1.42), (1.44), and (1.45)

that

m = Wm = w ×m . (1.47)

This shows that the axial vector of W can be thought of as the angular velocity of a line

element positioned along a principal direction of D.

Using (1.42), as well as the earlier definitions of the deformation and strain tensors, it

can be readily concluded that

C =˙

FTF = 2FTDF ; CAB = 2FiADijFjB , (1.48)

E = FTDF ; EAB = FiADijFjB , (1.49)

and also

B =˙

FFT = LB+BLT ; Bij = LikBkj + BikLjk , (1.50)

e =1

2

(B−1L+ LTB−1

); eij =

1

2

(B−1

ik Lkj + LkiB−1kj

). (1.51)

It is also easy to show that

dJ

dt= J div v = J trL . (1.52)

1.2 Balance Laws

The underlying physical principal of continuum mechanics are the conservation of mass,

the balance of linear and angular momentum, and the balance of energy. There are briefly

reviewed in this section.

May 7, 2019 ME280B


Preliminary to the introduction of the balance laws, three important results from calculus

and real analysis are stated without proof. The first is the divergence theorem, according

to which, given a continuously differentiable real-valued function φ = φ(x, t) : P × R → R

defined at time t on some subset P of E3 with boundary ∂P , then∫

P

gradφ dv =

∫

∂P

φn da , (1.53)

where gradφ is the gradient of the function φ, defined as

gradφ =∂φ

∂xiei . (1.54)

Versions of this theorem may be easily deduced from the above for vector and tensor func-

tions. The second result is known as the Reynolds’ transport theorem, and states that if φ is

differentiable in both of its variables, then

d

dt

∫

P

φ dv =

∫

P

(φ+ φ div v) dv , (1.55)

where

φ =∂φ

∂t+∂φ

∂x· v . (1.56)

Using 1.56 and the divergence theorem, it is easy to show that (1.55) may be recast as

d

dt

∫

P

φ dv =

∫

P

[∂φ

∂t+ div(φv)

]dv =

∫

P

∂φ

∂tdv +

∫

∂P

φv · n da . (1.57)

Lastly, the localization theorem states that if φ is continuous in x and∫

P

φ dv = 0 , (1.58)

for all subsets P of R at a given time t, then φ = 0 everywhere in R at time t.

Extensive use of these three important results will be made henceforth.

1.2.1 Conservation of Mass

Start by axiomatically admitting the existence of a measure of mass m(S ) for every subset

S of the body B. Upon assuming certain regularity conditions for this measure, it can be

established that there exists a mass density ρ(x, t) such that∫

S

dm =

∫

P

ρ dv (1.59)

ME280B May 7, 2019

Balance Laws 11

for the region P ⊂ R that occupies the part S at time t.

Likewise, there exists a mass density function ρ0(X) at time t0, such that

∫

S

dm =

∫

P0

ρ0 dV (1.60)

for the region P0 ⊂ R0 that occupies the part S at time t0. The preceding two statements

reflect the conservation of mass between the configurations occupied by the body at times t

and t0. Equating the right-hand sides of these statements and recalling (1.10) leads to the

integral statement ∫

P0

ρ0 dV =

∫

P

ρ dv =

∫

P0

ρJ dV (1.61)

or ∫

P0

(ρ0 − ρJ) dV = 0 . (1.62)

Invoking the localization theorem, this implies that

ρ0 = ρJ . (1.63)

This is a local statement of mass conservation between the two configurations.

Alternatively, conservation of mass may be stated in integral form as

d

dt

∫

P

ρ dv = 0 . (1.64)

Applying Reynolds’ transport theorem in the form of (1.55) implies that

∫

P

(ρ+ ρ div v) dv = 0 . (1.65)

Appealing to the localization theorem, this leads to a local statement of mass conservation

in the form

ρ+ ρ div v = 0 . (1.66)

1.2.2 Balance of Linear Momentum

The linear momentum of the material which occupies the region P at time t is∫Pρv dv.

Admit the existence of two types of external forces, both of which quantify the interaction

of P with its surrounding environment. Specifically, consider: (a) body forces per unit mass

b = b(X, t) acting on the material particles the region S which occupies P at time t and (b)

contact forces per unit area t = t(x, t;n) acting on the boundary surface ∂P and depending

May 7, 2019 ME280B


on the orientation n of the surface. The integral statement of linear momentum balance for

the continuum occupying P at time t takes the form

d

dt

∫

P

ρv dv =

∫

P

ρb dv +

∫

∂P

t da . (1.67)

Using linear momentum balance in the preceding primitive form it can be shown that

t(x, t;n) = −t(x, t;−n) . (1.68)

This result is known as Cauchy’s lemma, and asserts that contact forces acting at a point x

on opposite sides of the same surface are equal and opposite.

Applying balance of linear momentum to a right-angled tetrahedral region (the Cauchy

tetrahedron) in conjunction with Cauchy’s lemma, it can be shown that there exists a ten-

sor T, appropriately termed the Cauchy stress tensor, such that

t = Tn ; ti = Tijnj . (1.69)

Upon using mass conservation, (1.68), and the divergence theorem, it can be concluded

from the linear momentum balance statement (1.67) that

d

dt

∫

P

ρv dv =

∫

P

ρa dv

=

∫

P

ρb dv +

∫

P

Tn da

=

∫

P

ρb dv +

∫

P

divT dv . (1.70)

Upon invoking the localization theorem, this results in

divT+ ρb = ρa ; Tij,j + ρbi = ρai . (1.71)

The preceding statement of linear momentum balance can be also resolved on the geometry

of the reference configuration, where, upon again using mass conservation, Cauchy’s lemma

ME280B May 7, 2019

Balance Laws 13

and the divergence theorem, in conjunction with Nanson’s formula (1.12), one finds that

d

dt

∫

P

ρv dv =d

dt

∫

P0

ρ0v dV

=

∫

P

ρb dv +

∫

∂P

Tn da

=

∫

P0

ρ0b dV +

∫

∂P0

TJF−TN dA

=

∫

P0

ρ0b dV +

∫

∂P0

PN dA

=

∫

P0

ρ0b dV +

∫

∂P0

DivP dV , (1.72)

where P, defined as

P = JTF−T ; PiA = JTijF−1Aj , (1.73)

is the first Piola-Kirchhoff stress tensor. Upon again invoking the localization theorem, it

follows that the local statement of linear momentum balance may be expressed in referential

form as

DivP+ ρ0b = ρ0a ; PiA,A + ρ0bi = ρ0ai . (1.74)

Recalling (1.76), one may define the traction p resolved on the geometry of the reference

configuration as

p = PN ; pi = PiANA . (1.75)

where the (differential) force df across a surface is given by

df = tda = pdA . (1.76)

1.2.3 Balance of Angular Momentum

The angular momentum of the material which occupies the region P at time t is∫Px×ρv dv,

see Figure 1.6.

The integral statement of angular momentum balance for the continuum occupying P at

time t isd

dt

∫

P

x× ρv dv =

∫

P

x× ρb dv +∫

∂P

x× t da . (1.77)

Upon invoking mass conservation and linear momentum balance, as well as using the diver-

gence and localization theorems, it follows from the above that

T = TT ; Tij = Tji , (1.78)

May 7, 2019 ME280B


O

x

v

P

Figure 1.6. Angular momentum of an infinitesimal volume element.

namely angular momentum balance requires that T be symmetric. An equivalent local

statement of angular momentum balance may be formulated in terms of the first Piola-

Kirchhoff stress. Indeed, upon using (1.73), this takes the form

PFT = FPT ; PiAFjA = FiAPjA . (1.79)

1.2.4 Balance of Energy

First, consider the mechanical energy balance theorem, which is a direct consequence of the

preceding three balance laws. According to it,

d

dt

∫

P

1

2ρv · v dv +

∫

P

T ·D dv =

∫

P

ρb · v dv +∫

∂P

t · v da , (1.80)

where

K (P) =

∫

P

1

2ρv · v dv (1.81)

is the kinetic energy of the continuum in P ,

S (P) =

∫

P

T ·D dv (1.82)

is the stress power , and

R(P) =

∫

P

ρb · v dv +∫

∂P

t · v da (1.83)

is the rate at which the external forces do work in P . In words, the theorem asserts that the

rate of change of the kinetic energy and the stress power of the material in P are equal to

the rate at which work is done by the external forces acting on R.Note that the stress power can be expressed as

∫

P

T ·D dv =

∫

P0

P · F dV =

∫

P0

S · E dV , (1.84)

ME280B May 7, 2019

Invariance under Superposed Rigid-body Motions 15

where S is the second Piola-Kirchhoff stress tensor. Given (1.42) and (1.73), it can be easily

concluded from (1.84) that

S = F−1P = JF−1TF−T ; SAB = F−1Ai PiB = JF−1

Ai TijF−1Bj (1.85)

or

T =1

JPFT =

1

JFSFT ; Tij =

1

JPiAFiA =

1

JFiASABFjB . (1.86)

It is immediately concluded from the above that S is a symmetric tensor.

Next, admit the existence of a scalar heat supply r = r(x, t) per unit mass, which quan-

tifies the rate at which heat is supplied (or absorbed) by the body. Also, introduce a scalar

heat flux h = h(x, t;n) per unit area across a surface ∂P with outward unit normal n. In

addition, assume that there exists a scalar function ε = ε(x, t) per unit mass, called the

internal energy which quantifies all forms of energy stored in the body other than kinetic

energy.

Now the principle of energy balance is postulated in the form

d

dt

∫

P

[1

2ρv · v + ρε

]dv =

∫

P

ρb · v dv +∫

∂P

t · v da+∫

P

ρr dv −∫

∂P

h da . (1.87)

This is a statement of the first law of thermodynamics. Using a standard procedure akin to

the one used to deduce (1.68), it can be established that

h(x, t;n) = −h(x, t;−n) . (1.88)

Likewise, using (1.88), the Cauchy tetrahedron argument can be repeated for the balance of

energy to show that there exists a heat flux vector q, such that

h = q · n . (1.89)

Now, using the divergence and localization theorems, along with the previously deduced

local forms of mass, linear momentum, and angular momentum balance, a local statement

of energy balance can be obtained from (1.87) as

ρε = ρr − div q+T ·D ; ρε = ρr − qi,i + TijDij . (1.90)

An equivalent referential local statement of energy balance may be readily deduced from (1.90)

as

ρ0ǫ = ρ0r −Div q0 +P · F ; ρ0ε = ρ0r − q0A,A + PiAFiA , (1.91)

where q0 = JF−1q.

May 7, 2019 ME280B

16 1.3. INVARIANCE UNDER SUPERPOSED RIGID-BODY MOTIONS

1.3 Invariance under Superposed Rigid-body Motions

Consider two motions χ and χ+ which map points of the same reference configuration to

points of two current configuration which at time t differ from each other by a superposed

rigid motion χ+, as in Figure 1.7.

x

x+

X

y

y+

Y

R0

R

R+

χ

χ+

χ+

Figure 1.7. Configurations associated with motions χ and χ+ differing by a superposed rigid-

body motion χ+.

It can be shown using a standard procedure that

x+ = χ+(x, t) = Q(t)x+ c(t) (1.92)

where Q is a proper orthogonal tensor and c a vector, both functions of time t only.

It follows easily from chain rule that

F+ =∂χ+

∂X=

∂χ+

∂x

∂x

∂X= QF , (1.93)

This can be used to establish the relations

R+ = QR , U+ = U , V+ = QVQT (1.94)

and also

C+ = C , E+ = E , B+ = QBQT , e+ = QeQT . (1.95)

In addition, it is easy to show that

n+ = Qn , m+ = Qm , ds+ = ds , da+ = da , dv+ = dv . (1.96)

ME280B May 7, 2019

Closure of the Theory 17

Also, taking the material time derivative of (1.92), it follows that

v+ = x+ = Qx+Qv + c

= ΩQx+Qv + c(QQT = Ω

)

= Ω(x+ − c) +Qv + c = ω × (x+ − c) +Qv + c (1.97)

Using the above, it can be readily establish that

L+ = QLQT +Ω , D+ = QDQT , W+ = QWQT +Ω . (1.98)

A physically plausible assumption is made about the transformation of the stress vector t

under a superposed rigid-body motion. Indeed, recalling that

t = Tn, (1.99)

namely that t is linear in n, assume that

t+ = Qt (1.100)

which immediately implies that

T+ = QTQT . (1.101)

From the above, it follows easily that

P+ = QP , S+ = S . (1.102)

Invariance under superposed rigid-body motions is a physically motivated postulate in

continuum mechanics, according to which the stress response remains unaltered when a rigid

motion is superposed onto the actual motion of interest. Following this postulate, assume

that

T = T(F, F, · · · ,GradF, · · · , ρ, · · ·

)(1.103)

then invariance under superposed rigid-body motion implies

T+ = T(F+, F+, · · · ,GradF+, · · · , ρ+, · · ·

), (1.104)

namely T remains unaltered by the superposed rigid motion.

Regarding invariance, it is also postulated that

r+ = r , ε+ = ε , q+ = Qq . (1.105)

The last relation implies

h+ = h . (1.106)

May 7, 2019 ME280B

18 1.4. CLOSURE OF THE THEORY

1.4 Closure of the Theory

It is instructive to summarize the governing equations of continuum mechanics in the context

of a thermomechanical theory. There are a total of 8 equations stemming from the balance

laws, that is,

1. 1 equation from balance of mass, see (1.63) or (1.66),

2. 3 equations from balance of linear momentum, see (1.71) or (1.74),

3. 3 equations from balance of angular momentum, see (1.78) or (1.79)

4. 1 equation from balance of energy, see (1.90) or (1.91).

The unknowns of the problem are the position x (or velocity v), the mass density ρ, the

stress T, the heat flux q, and the temperature θ, which is a total of 3+1+9+3+1=17.

Closure is provided in the theory by constitutive laws for the stress (6 unknowns) and

heat flux (3 unknowns).

ME280B May 7, 2019

Chapter 2

Consistent Linearization

2.1 Sources of Non-linearity

The consistent linearization of the kinematic and kinetic variables, as well as of the balance

laws themselves is of importance in two ways. First, in deriving “linearized” theories con-

sistent with general non-linear theories of continuum mechanics. For instance, consistent

linearization may be used to derive the classical theory of linear elasticity from non-linear

elasticity or a theory of infinitesimal deformations superposed on pre-existing finite deforma-

tions (“small-on-large” theories). The latter has an important application spectral analysis

of non-linear systems at a given time.

A second application of consistent linearization, which is especially relevant to finite ele-

ment analysis, is in obtaining the solution to boundary- and initial- value problem of contin-

uum mechanics using iterative methods methods that rely on instantaneous approximation

of the non-linear system by a linear counterpart.

There are three distinct sources of non-linearity when solving problems in continuum

mechanics. Each of them is briefly discussed here.

2.1.1 Geometric non-linearity

The measures of deformation may be non-linear functions of x (or v) and its gradients. For

instance, consider the (relative) Lagrangian strain tensor E defined in (1.23), and recall that

its components EAB may be expressed in terms of the components xi of the position vector

x as

EAB =1

2

(∂xi∂XA

∂xi∂XB

− δAB

). (2.1)

19

20 Consistent Linearization

This is clearly a non-linear function of xi through its derivative with respect to the referential

position X. This form of non-linearity is referred to as geometric.

Geometric non-linearity can be present even when the strains are “infinitesimal” (that

is, when they can be accurately approximated as linear in x). This could be the case when

the rotations are large.

2.1.2 Material non-linearity

The constitutive equation of the stress or heat flux may be a non-linear function of its

arguments (which are themselves generally functions of the unknowns of the problem). This

form of non-linearity is referred to as material or physical.

Material non-linearity would exist in a general isotropic non-linear elastic solid, for which

the constitutive law is of the form

T = α0i+ α1B+ α2B2 , (2.2)

where α0, α1, and α2 are functions of the scalar invariants of B. Likewise, material non-

linearity exists in a general Reiner-Rivlin fluid, for which

T = β0i+ β1D+ β2D2 , (2.3)

where β1, β2, and β3 are functions of the scalar invariants of D.

2.1.3 Non-linearity in the balance laws

The equation of linear momentum balance can exhibit non-linearity due to the convective

term in the acceleration term. Indeed, recalling (1.71) and also taking into account that

v = v(x, t), it follows that

divT+ ρb = ρv = ρ

(∂v

∂t+∂v

∂x· v)

(2.4)

The second term on the right-hand side of (2.4) is clearly non-linear in v.

Note that this form of non-linearity vanishes when linear momentum balance is written

in the referential form of (1.74), since, in this case v = v(X, t) and

DivP+ ρ0b = ρ0v = ρ0∂v

∂t. (2.5)

ME280B May 7, 2019

Gateaux and Frechet Differentials 21

2.2 Gateaux and Frechet Differentials

In this section, the Gateaux and Frechet differentials are introduced in the context of a non-

linear operator F mapping elements from a Banach space U to another Banach space V ,

as in Figure 2.1.1 As a special case, one may U = V = Rn, where R

n is equipped with the

U

U0V

F

Figure 2.1. Non-linear mapping between Banach spaces

usual Euclidean norm.2

Let F : U0 ⊂ U → V be a non-linear mapping, where U0 is an open set, and take

x ∈ U0, v ∈ U0 and ω ∈ R. The Gateaux differential DGF (x, v) of the operator F at x in

the direction v is formally defined as

DGF (x, v) = limω→0

F (x+ ωv)− F (x)ω

. (2.6)

Equivalently, the Gateaux differential may be defined as

DGF (x, v) =

[d

dωF (x+ ωv)

]

ω=0

, (2.7)

since[d

dωF (x+ ωv)

]

ω=0

=

[lim

∆ω→0

F (x+ ω +∆ωv)− F (x+ ωv)

∆ω

]

ω=0

= lim∆ω→0

[F (x+ ωv +∆ωv)− F (x+ ωv)

∆ω

]

ω=0

= lim∆ω→0

F (x+∆ωv)− F (x)∆ω

= DGF (x, v) ,

(2.8)

where continuity of F is invoked for fixed x and v.

1The treatment in this section follows closely the presentation in M.M. Vainberg, Variational Methods

for the Study of Non-linear Operators(English translation from Russian), Hold-Day, San Francisco, 1964.2It is easy to show that all finite-dimensional normed vector spaces are complete.

May 7, 2019 ME280B


The Gateaux differential DGF (x, v) is not necessarily linear in v. However, DGF (x, v) is

necessarily homogeneous in v, that is, for any α ∈ R, α 6= 0,

DGF (x, αv) = limω→0

F (x+ αωv)− F (x)ω

= α limαω→0

F (x+ αωv)− F (x)αω

= αDGF (x, v) .

(2.9)

It can be shown that if:

(a) DGF (x, v) exists in some open neighborhood U0 of x0 and is continuous in x at x = x0,

(b) DGF (x0, v) is continuous in v at v = 0,

then the Gateaux differential DGF (x0, v) is actually linear in v and one may write

DGF (x0, v) = [DGF (x0)] (v) , (2.10)

where the function DGF (x0) is called the Gateaux derivative of F at x = x0. The Gateaux

derivative is a linear operator on the Banach space U , since for a given x = x0 it maps

v ∈ U to [DGF (x0)] (v) ∈ V , that is,

DGF (x0) : v → [DGF (x0)] (v) . (2.11)

Example 2.2.1: Gateaux differential and derivative of a function in R

Let U = V = R and define F according to F (x) = x2. Then, use the definition (2.6) to write

DGF (x, v) = limω→0

F (x+ ωv)− F (x)ω

= limω→0

(x+ ωv)2 − x2ω

= limω→0

[2xv + ωv2

]= 2xv .

Clearly, in this case the Gateaux differential is linear in v, therefore the Gateaux derivative DGF (x) isa linear operator defined as

DGF (x) : v ∈ R → [DGF (x)] (v) = 2xv , (2.12)

for any given x ∈ R.

Note that the differential is not necessarily “infinitesimal” in magnitude. For a given x,

this depends on the magnitude of v.

Let F be again a non-linear mapping from the Banach space U to the Banach space V .

Also, assume that at a given point x ∈ U ,

F (x+ v) = F (x) +DFF (x, v) +O(x, v) , (2.13)

ME280B May 7, 2019

Gateaux and Frechet Differentials 23

for all v ∈ U , where DFF (x, v) is a linear mapping in v ∈ U and

lim‖v‖U→0

‖O(x, v)‖V‖v‖U

= 0 . (2.14)

Then, DFF (x, v) is called the Frechet differential of F at x and O(x, v) is the remainder

of F at x. The Frechet derivative DFF (x0) at a given point x = x0 is defined as the linear

operator on U which takes any v ∈ U and maps it to [DFF (x0)] (v) = DFF (x0, v), that is,

DFF (x0) : u ∈ U → [DFF (x0)] (v) . (2.15)

In view of (2.13) and the properties of its constituent parts, one may define the linear part

of F at x as

L[F ; v]x = F (x) +DFF (x, v) . (2.16)

A corollary of the above definition and the property of the remainder in (2.14) is that if

v 6= 0 is fixed and α > 0, then

lim‖αv‖U→0

‖O(x, αv)‖V‖αv‖U

= 0 (2.17)

implies

limα→0

‖O(x, αv)‖Vα

= 0 . (2.18)

Note that DFF (x, v) is unique as defined in (2.13). To argue this, let, by contradiction,

F (x+ v) = F (x) +DFF (x, v) +O(x, v) = F (x) +D′FF (x, v) +O′(x, v) . (2.19)

Then, clearly,

DFF (x, v)−D′FF (x, v) = O′(x, v)−O(x, v) . (2.20)

If one now replaces in the above v with αv, α 6= 0, it follows that

α [DFF (x, v)−D′FF (x, v)] = O′(x, αv)−O(x, αv) (2.21)

or

DFF (x, v)−D′FF (x, v) =

O′(x, αv)−O(x, αv)α

(2.22)

Now, taking norms of both sides and then considering the limit α→ 0 and recalling (2.18),

it follows that D′FF (x, v) = DFF (x, v).

May 7, 2019 ME280B


Example 2.2.2: Frechet differential and derivative of a function in R

Take, again, F (x) = x2 and write

F (x+ v) = (x+ v)2 = x2 + 2xv + v2 = F (x) +DFF (x, v) +O(x, v) ,

where, clearly,

DFF (x, v) = 2xv ,

which is linear in v, and also

O(x, v) = v2 ,

such that

lim|v|→0

|v|2|v| = 0 .

Thus, the Frechet differential of F at x is given in this case by

DFF (x, v) = 2xv

and the Frechet derivative DFF (x) is

DFF (x) : v ∈ R → [DFF (x)] (v) = 2xv .

From the definition of the Gateaux and Frechet derivatives, it follows that if the Frechet

derivative of F exists at a point x = x0, then so does the Gateaux derivative and, furthermore,

the two derivatives coincide. Indeed, if the Frechet derivative exists at x0, then, by definition,

one may write

F (x0 + ωv) = F (x0) + [DFF (x0)] (ωv) +O(x, ωv) , (2.23)

therefore,

limω→0

F (x0 + ωv)− F (x0)ω

= [DGF (x0)] (v) + limω→0

O(x, ωv)

ω= [DGF (x0)] (v) , (2.24)

in view of the property (2.18) of the remainder. The inverse of the above assertion does not

hold, namely the Gateaux derivative is weaker than the Frechet derivative. However, it can

be shown that if DGF (x) exists and is continuous at x, then DFF (x) exists and coincides

with DGF (x).

Note that continuity of an operator such as DGF (x) at a point x = x0 is defined using

the natural norm induced with the mapping F , namely DGF (x) is continuous at x = x0 if

for every ε > 0 there is a δ = δ(x0, ε) such that

‖DGF (x)−DGF (x0)‖UV < ε , (2.25)

ME280B May 7, 2019

Consistent Linearization in Continuum Mechanics 25

whenever

‖x− x0‖U < δ . (2.26)

Here, the natural norm is defined as

‖DGF (x)‖UV = supy∈U

‖DG(x, y)‖V‖y‖U

, (2.27)

where “sup” denotes least upper bound.

2.3 Consistent Linearization in Continuum Mechanics

In this section, linearization will be considered for quantities defined in the current configu-

ration R with respect to another configuration of a body associated with a region R.To this end, start with the region R0 occupied by the body in the reference configuration,

and define a motion χ, such that

χ : (X, t) → x = χ(X, t) = χt(X) , (2.28)

so that at time t the body occupies the region R. In addition, define another motion χ as

in (1.7), that is,

χ : (X, t) → x = χ(X, t) = χt(X) , (2.29)

where the body ends up in the region R at time t, see Figure 2.2. Both motions are assumed

EA

e1

R

R0

R

x

x

ξ

χt

χtχδ

t

Figure 2.2. Configurations of the body for consistent linearization

invertible at any fixed t, that is both mappings χt and χt are invertible.

May 7, 2019 ME280B


One may now write

x = χt(X) = χt(χ−1t (x)) = χδ

t (x) (2.30)

where χδt is a motion superposed on R, see, again, Figure 2.2. Symbolically, one may write

χt = χδt χt , (2.31)

where “” denotes functional composition, hence

(χδ

t χt

)(X) = χδ

t (χt(X)) = χδt (x) = x = χ−1

t (X) (2.32)

Next, define a vector field ξ = ξ(X, t), such that

ξ = x− x = χ(X, t)− χ(X, t) ; ξ = ξiei . (2.33)

The vector ξ is called the tangent vector field on R.The consistent linearization of kinematic and kinetic fields, as well as balance laws with re-

spect to the configuration associated with the region R is achieved by application of Gateaux

differentiation to each field (or balance law) atR in the direction of the tangent vector field ξ.

In what follows, all involved functions will be assumed as smooth as necessary to make their

Gateaux and Frechet derivatives coincide. For this reason, the common notation DF (x, ξ)

and DF (x) will be employed henceforth for differentials and derivatives, respectively.

Note that the tangent vector field ξ is not the usual displacement field u, given by

u = x−X = χ(X, t)−X . (2.34)

However, in the special case

χ(X, t) = X , (2.35)

for which R ≡ R0, the displacement and tangent vectors coincide. In this case, all lineariza-

tions are taken with respect to the reference configuration3

Start with the deformation gradient F and note that, by (1.8) and (2.7),

DF(x, ξ) =

[d

dwF(x+ wξ)

]

w=0

=

[d

dw

∂(x+ wξ)

∂X

]

w=0

=∂ξ

∂X, (2.36)

3See J. Casey, “On infinitesimal deformation measures”, J. Elast., 28:257-269, (1992), for derivations of

linearized kinematic quantities relative to the reference configuration.

ME280B May 7, 2019

http://link.springer.com/article/10.1007/BF00132214


where ξ = ξ(X, t). This implies that

L [F; ξ]x = F+∂ξ

∂X= F , F =

∂x

∂X, (2.37)

that is, F is itself linear in ξ.

Observe that

ξ = ξ(X, t) = ξ(χ−1

t (x), t)

= ξ(x, t) , (2.38)

therefore, by the chain rule,∂ξ

∂X=

∂ξ

∂x

∂x

∂X=

∂ξ

∂xF (2.39)

and the linear part of F can be alternatively written as

L [F; ξ]x = F+∂ξ

∂xF . (2.40)

Consider next the linearization of the Lagrangian strain tensor E defined in (1.23). Here,

employing again the definition (2.7), it follows that

DE(x, ξ) =

[d

dw

1

2

(FT (x+ wξ)F(x+ wξ)− I

)]

w=0

=1

2

[d

dwFT (x+ wξ)

]

w=0

F+ FT[d

dwF(x+ wξ)

]

w=0

=1

2

[(∂ξ

∂X

)T

F+ FT ∂ξ

∂X

]. (2.41)

Alternatively, in view of (2.39),

DE(x, ξ) =1

2

[F

T(∂ξ

∂x

)T

F+ FT ∂ξ

∂xF

]= F

T(∂ξ

∂x

)S

F , (2.42)

where (∂ξ

∂x

)S

=1

2

[∂ξ

∂x+

(∂ξ

∂x

)T]. (2.43)

Therefore,

L [E; ξ]x = E+ FT(∂ξ

∂x

)S

F (2.44)

and the linear part of E remains symmetric, as expected. Likewise, given (1.13) and (2.42),

it is immediately clear that

L [C; ξ]x = C+ 2FT(∂ξ

∂x

)S

F . (2.45)

May 7, 2019 ME280B


Similarly, it is simple to conclude from (1.14) and (2.40) that

L [B; ξ]x = B+∂ξ

∂xB+B

(∂ξ

∂x

)T

. (2.46)

In the special case when the linearization is performed with respect to the reference

configuration, where

x = X , ξ = u , F = I , (2.47)

it is easy to see that, given (2.39),

∂ξ

∂X=

∂ξ

∂xF =

∂ξ

∂x. (2.48)

Hence, once may conclude from (2.42) that

DE(x, ξ) = FT(∂ξ

∂x

)S

F =

(∂ξ

∂x

)S

=

(∂ξ

∂X

)S

(2.49)

and

L [E; ξ]X = E+1

2

∂ξ∂X

+

(∂ξ

∂X

)T = ε(ξ) , (2.50)

where

ε(ξ) =1

2(ξA,B + ξB,A)EA ⊗ EB (2.51)

and since, in view of (2.47)3, E = 0. Given (2.47)2 and (2.50), one may also conclude that

L [E; ξ]X = L [E;u]X = ε(u) =1

2(uA,B + uB,A)EA ⊗ EB , (2.52)

which is the usual measure of strain at infinitesimal deformations. In addition, it follows

from (2.45) and (2.45) that

L [C; ξ]X = L [C;u]X = 2ε(u) . (2.53)

and

L [B; ξ]X = L [B;u]X = 2ε(u) . (2.54)

ME280B May 7, 2019


Next, consider the linearization of J = detF with respect to the configurationR. Startingonce more with the definition (2.7),

DJ(x, ξ) =

[d

dwdetF(x+ wξ)

]

w=0

=

[d

dwdet

(F+ w

∂ξ

∂X

)]

w=0

=

[d

dwdet(F+ wG

)]

w=0

; G =∂ξ

∂X

=

[d

dwdet

wF

[F

−1G−

(− 1

w

)I

]]

w=0

=

[d

dw

w3 detF

[−(− 1

w

)3

+

(1

w

)2

IF

−1G−(− 1

w

)II

F−1

G+ III

F−1

G

]]

w=0

=

[d

dw

detF

[1 + wI

F−1

G+ w2II

F−1

G+ w3III

F−1

G

]]

w=0

= detF IF

−1G

(2.55)

where it is recalled that for any second-order tensor, say T,

det(T− λI) = −λ3 + λ2IT − λIIT + IIIT . (2.56)

In the above, IT, IIT, and IIIT are the principal invariants of T, defined as

IT = trT (2.57)

IIT =1

2

[(trT)2 − trT2

](2.58)

IIIT = detT =1

6

[(trT)3 − 3 (trT)

(trT2

)+ 2 trT3

]. (2.59)

Equation (2.55) may be expressed as

[d

dwdetF(x+ wξ)

]

w=0

= J tr

(F

−1 ∂ξ

∂X

)(2.60)

where J = detF, and the linear part of detF relative to the configuration R takes the form

L [detF; ξ]x = J + J tr

(F

−1 ∂ξ

∂X

)= J

[1 + tr

(F

−1 ∂ξ

∂X

)]. (2.61)

May 7, 2019 ME280B


Alternatively, taking into account (2.39), one may obtain an Eulerian form of the above

linearization by rewriting (2.60) as

DJ(x, ξ) = J tr

(F

−1 ∂ξ

∂X

)= J tr

[F

−1(∂ξ

∂xF

)]

= J tr

[FF

−1 ∂ξ

∂x

]; (tr [AB] = tr [BA])

= J tr

(∂ξ

∂x

)= J div ξ , (2.62)

where div ξ = ∂ξi∂xi

4, therefore

L [J ; ξ]x = J + J div ξ = J(1 + div ξ) . (2.63)

For the special case of linearization with respect to the reference configuration, equations (2.47)

and (2.51) imply that

L [J ; ξ]X = 1 + Div ξ = 1 + tr ε(ξ) = 1 + tr ε(u) . (2.64)

Also, consider the linearization ofda

dA, namely the ratio of the surface element areas of the

current over the reference configuration. To this end, start with Nanson’s formula (1.12),

from where it follows thatda

dAn = JF−TN = F∗N , (2.65)

where

F∗ = JF−T (2.66)

is the adjugate of F. Upon taking the dot-product of each side of (2.65) with itself, one finds

that

(da

dA

)2

= (F∗N) · (F∗N)

= N · (F∗TF∗N) = N · (C∗N) , (2.67)

where

C∗ = F∗TF∗ = J2C−1 . (2.68)

4In strict terms, one should use the notation divξ for this term, but the extra overbar is neglected here.

ME280B May 7, 2019


It now follows from the usual definition (2.7) that

D

[(da

dA

)2](x, ξ) =

[d

dwN ·C∗(x+ wξ)N

]

w=0

, (2.69)

where[d

dwC∗(x+ wξ)

]

w=0

=

[d

dw

J2(x+ wξ)C−1(x+ wξ)

]

w=0

= 2J

[d

dwJ(x+ wξ)

]

w=0

C−1

+ J2[d

dwC−1(x+ wξ)

]

w=0

. (2.70)

Since CC−1 = I,[d

dw

C(x+ wξ)C−1(x+ wξ)

]

w=0

= DC(x, ξ)C−1

+CDC−1(x, ξ) = 0 , (2.71)

therefore

DC−1(x, ξ) =

[d

dwC−1(x+ wξ)

]

w=0

= −C−1DC(x, ξ)C

−1(2.72)

or, in light of (2.45),

DC−1(x, ξ) = −F−1F

−T

[2F

T(∂ξ

∂x

)S

F

]F

−1F

−T= −2F−1

(∂ξ

∂x

)S

F−T

. (2.73)

Return to (2.70) and substituting in it (2.62) and (2.45), it follows that

[d

dwC∗(x+ wξ)

]

w=0

= 2J[J div ξ

]C

−1+ J

2

[−2F−1

(∂ξ

∂x

)S

F−T

]

= 2J2

[div ξC

−1 − F−1(∂ξ

∂x

)S

F−T

]. (2.74)

This means that (2.69) takes the form

D

[(da

dA

)2](x, ξ) = 2

(da

dA

)D

[da

dA

](x, ξ)

= N · 2J2

[div ξC

−1 − F−1(∂ξ

∂x

)S

F−T

]N , (2.75)

hence

D

[da

dA

](x, ξ) =

(dA

da

)N · J2

[div ξC

−1 − F−1(∂ξ

∂x

)S

F−T

]N (2.76)

May 7, 2019 ME280B


and, accordingly,

L[da

dA; ξ

]

x

=

(da

dA

)+D

[da

dA

](x, ξ) . (2.77)

When the linearization is performed with respect to the reference configuration, it follows

from (2.47) and (2.51) that (2.76) reduces to

D

[da

dA

](X, ξ) = N ·

[div ξI−

(∂ξ

∂x

)S]N

= N ·[Div ξI−

(∂ξ

∂X

)S]N

= N · [tr ε(ξ)I− ε(ξ)]N= tr ε(u)−N · ε(u)N (2.78)

and

L[da

dA; ξ

]

X

= 1 + tr ε(u)−N · ε(u)N . (2.79)

The balance laws are also subject to linearization. For instance, consider the balance of

mass in the form (1.63) and linearize it relative to the configuration R, that is, write

L [ρ0; ξ]x = L [ρJ ; ξ]x ⇒ (2.80)

which leads, with the aid of (2.62) to

ρ0 = ρJ +D[ρJ ](x, ξ)

= ρJ +Dρ(x, ξ)J + ρDJ(x, ξ)

= ρJ +Dρ(x, ξ)J + ρJ div ξ . (2.81)

Assuming that conservation of mass holds in R (which is tantamount to letting ρ be such

that ρ0 = ρJ), it follows from the above that

Dρ(x, ξ) = −ρ div ξ , (2.82)

so that

L [ρ; ξ]x = ρ+ (−ρ div ξ) = ρ(1− div ξ) . (2.83)

In view of (2.82), the special case of linearization with respect to R0 readily yields

L [ρ; ξ]X = ρ0(1− div ξ) = ρ0(1−Div ξ) = ρ0(1− tr ε(ξ)) = ρ0(1− tr ε(u)) , (2.84)

ME280B May 7, 2019


which implies that conservation of mass may be expressed as

ρ = ρ0(1− tr ε(u)) (2.85)

Note that ρ is not equal to ρ0 when linearized with respect to R0.

Lastly, consider the balance of linear momentum (1.71) in spatial form and take its linear

part with respect to R, which is

L [divT; ξ]x + L [ρb; ξ]x = L [ρu; ξ]x (2.86)

The differential of each of the three terms in the preceding equation is considered separately.

First, write the acceleration term as

D [ρu] (x, ξ) =

[d

dwρ(x+ wξ)u(x+ wξ)

]

w=0

= −ρ div ξu+ ρξ (2.87)

with the use of (2.82) and the fact that u(x) = x. Next, the body force term becomes

D [ρb] (x, ξ) =

[d

dwρ(x+ wξ)b(x+ wξ)

]

w=0

= −ρ div ξb+ ρDb(x, ξ) (2.88)

where, again, use is made of (2.82) and also of the definition

L [b; ξ]x = b+Db(x, ξ) . (2.89)

For the stress-divergence term, it can be shown that

D [divT] (x, ξ) =

[d

dwdivT(x+ wξ)

]

w=0

= div

[d

dwT(x+ wξ)

]

w=0

− T ij,kξk,jei

= div [DT(x, ξ)]− gradT · gradT ξ . (2.90)

Note that a constitutive equation for T is required in order to evaluate the first term on the

right-hand side of (2.90)3. To derive (2.90), it is best to resort to component representation,

as follows:

D [divT] (x, ξ) =

[d

dwdivT(x+ wξ)

]

w=0

=

[d

dw

∂Tij(x+ wξ)

∂(xj + wξj)

]

w=0

ei

=

[d

dw

∂Tij(x+ wξ)

∂xk

∂xk∂(xj + wξj)

]

w=0

ei

=

[d

dw

∂Tij(x+ wξ)

∂xk

]

w=0

∂xk∂xj

ei +∂T ij

∂xk

[d

dw

∂xk

∂(xj + wξj)

]

w=0

ei .

(2.91)

May 7, 2019 ME280B


Note that it is not possible to switch the order of the differential operators ddw

and ∂∂x

in (2.91)2 as the latter is explicitly dependent on ω.5 To evaluate the second term on the

right-hand side of (2.91)4, observe that

[d

dw

∂(xk + wξk)

∂xl

∂xl∂(xj + wξj)

]

w=0

=

[d

dwδkj

]

w=0

= 0 , (2.92)

therefore,

[d

dw

∂(xk + wξk)

∂xl

]

w=0

δlj + δkl

[d

dw

∂xl

∂(xj + wξj)

]

w=0

= 0 , (2.93)

hence [d

dw

∂xk

∂(xj + wξj)

]

w=0

= −ξk,j . (2.94)

Equation (2.90) follows upon substituting (2.94) into (2.91)4.

Return to the linearized statement of linear momentum balance (2.86) and taking into

account (2.87), (2.88) and (2.90), it follows that

divT+ div [DT(x, ξ)]− gradT · gradT ξ + ρb+−ρ div ξb+ ρDb(x, ξ)

= ρu+−ρ div ξu+ ρξ

. (2.95)

Assuming that linear momentum balance holds in R, the preceding equation reduces to

div [DT(x, ξ)]− gradT · gradT ξ + ρ div ξ(u− b) + ρDb(x, ξ) = ρξ (2.96)

or, equivalently,

div [DT(x, ξ)]− gradT · gradT ξ + divT div ξ + ρDb(x, ξ) = ρξ . (2.97)

Assuming that the refefence configuration R0 is initial stress- and initial acceleration-

free, it is immediately obvious from linear momentum balance that the body force vanishes

as well in this configuration. As a result,

L [T; ξ]X = DT(X, ξ) = σ , (2.98)

L [u; ξ]X = Du(X, ξ) = a , (2.99)

5This point appears to have been overlooked in T.J.R. Hughes and K.S. Pister, “Consistent linearization

in mechanics of solids and structures”, Comp. Struct., 8:391–397, (1978).

ME280B May 7, 2019

http://www.sciencedirect.com/science/article/pii/0045794978901839



and

L [b; ξ]X = Db(x, ξ) = b . (2.100)

It follows from (2.97) and the above three equations that the linearized statement of linear

momentum balance with respect to R0 reduces to

Divσ + ρ0b = ρ0a . (2.101)

May 7, 2019 ME280B

Chapter 3

Incremental Formulations

A typical procedure for the solution of continuummechanics problems using the finite element

method starts with the derivation of a weak (e.g., Galerkin) form of the governing (balance)

equations from the original strong form. In general, the weak form comprises a set of

integro-differential equations in space and time. Next, these equations are considered in a

given time domain (real or implied by the application of the external loads). The solution of

these equations is now performed by partitioning the time into time increments and deducing

the solution from the beginning of the first time increment (where this solution is assumed

to be given) to the end of the first time increment, and so on up to the end of the last

increment. This process gives rise to the term incremental formulation for the solution of

the continuum mechanics equations.

Each incremental solution requires that dependent variables of the problem be interpo-

lated in space. This reduces the original integro-differential equations in space-time to merely

differential equations in time. Upon observing that the incrementation in time induces a

natural discretization, it follows that the preceding differential equations in time are further

reduced to either implicit (generally, non-linear) algebraic equations or merely explicit equa-

tions. The solution to the former requires some iterative approach (e.g., Newton-Raphson

method or its variants), while the latter does not.

3.1 Strong and Weak Form of the Balance Laws

Consider a motion χ : R0 × R 7→ R ⊂ E3 of a body, as in (1.3), which is due to externally

applied loads. Note that the preceding definition of this motion is predicated on the existence

36

Strong and Weak Form of the Balance Laws 37

of a reference configuration R0. In this case, recall that the displacement field u is defined

as in (2.34), see also Figure 3.1. If a reference configuration is not available, then one needs

to formulate the strong and weak forms of the initial/boundary-value problem without using

(explicitly or implicitly) such a configuration.

RR0

Xx

u

EAe1

χ

Figure 3.1. Reference and current configuration

Let the boundary ∂R of R be smooth enough for a unique outward unit normal n defined

everywhere on ∂R, as in Figure 3.2. Assume further that the boundary ∂R is decomposed

into parts Γu and Γq such that at any time t

∂R = Γu ∪ Γq . (3.1)

These two parts are defined as

n

R∂R

ΓqiΓui

Figure 3.2. Partition of boundary ∂R of R

Γu = ∪3i=1Γui

, Γq = ∪3j=1Γqj , (3.2)

such that

ui = ui(x, t) on Γui(Dirichlet boundary conditions in solids) (3.3)

May 7, 2019 ME280B

38 Incremental Formulations

or

vi = vi(x, t) on Γui(Dirichlet boundary conditions in fluids) (3.4)

and

tj = tj(x, t) on Γqj (Neumann boundary conditions) (3.5)

where u (resp., v) and t are the prescribed displacements (resp., velocities) and tractions on

∂R and

Γui∩ Γqi = ∅ . (3.6)

Note that the components in (3.3-3.5) do not need to be those resolved on the basis vectors

ei associated with the current configuration.

The strong form of the general initial/boundary-value problem in continuum mechanics

can be stated as follows: given the external forces and prescribed displacements (resp.,

velocities) b, t, u (resp., v) defined as above, the initial velocity and density fields v0 and

ρ0, and a constitutive law for the Cauchy stress tensor T, such that T = TT , determine

the displacement field u such that det∂(X+ u)

∂X> 0 (or the velocity field v) and the mass

density ρ > 0, such that, for all t, linear momentum holds in the form (1.71) or (1.74) and

mass balance holds in the form (1.66) or (1.63), subject to the boundary conditions (3.3)

(or (3.4)) and (3.5), and the initial conditions

u(X, 0) = 0 , u(X, 0) = v0 in R0 (3.7)

for solids and

v(x, 0) = v0 in R (3.8)

for fluids.

At every point x of the current configuration, define a tangent vector ξ = ξ(x, t), that

is any vector which originates at x, as in Figure 3.3. Define the space of admissible tangent

vector field W as

W =ξ : R× R −→ E3 , ξi(x, t) = 0 on Γui

, (3.9)

that is, such that ξ vanishes identically where a displacement or velocity boundary condition

is enforced.

Now define the weighted residual form

R =

∫

R×I

ξ · (ρu− ρb− divT) dvdt+

∫

Γq×I

ξ · (t− t) dadt , (3.10)

ME280B May 7, 2019


R

ξ

Figure 3.3. Tangent vector at a point in the current configuration

where I = (t0, T ) is the time domain of the analysis. Note that the second integral on the

right-hand side of (3.10) is understood as being a short-hand for∑

i

∫Γq,i

ξi(ti− ti) dadt, thatis, it includes the contributions of all the components of the prescribed traction in all of their

directions.

The weak form of the general initial/boundary-value problem in continuum mechanics

reads as follows: given the external forces b and t, a constitutive law for T, such that

TT = T, find the displacement field u ∈ U (or the velocity field v ∈ V), such that

R = 0 (3.11)

for all ξ ∈ W . Here,

U =

u : R0 × R→ E3 | det

(I+

∂u

∂X

)> 0 , ui(X, t) = ui on Γui

,

u(X, 0) = 0 , u(X, 0) = v0 (3.12)

for solids and

V =v : R× R→ E3 | vi(x, t) = vi on Γui

, v(x, 0) = v0

(3.13)

for fluids. In the solids problem, mass conservation is enforced trivially, in the sense that,

given ρ0 > 0 at time t0, one may find ρ as a function of ρ0 and the deformation, according

to (1.63). Therefore, there is no need for mass conservation to be enforced through the

weak form. On the contrary, in fluid mechanics, one needs to additionally enforce mass

conservation according to (1.66). To achieve this, the weighted residual R of (3.10) must be

amended by an additional term as

R =

∫

R×I

ξ ·(ρu−ρb−divT) dvdt+

∫

Γq×I

ξ ·(t−t) dadt+∫

R×I

σ(ρ+ρ div v) dvdt , (3.14)

May 7, 2019 ME280B


such that a solution (v, p) ∈ V × Z+ is sought by enforcing (3.11) for all (ξ, σ) ∈ W × Z,where

Z = σ : R× R→ R , Z+ = σ : R× R→ R | σ > 0 . (3.15)

Additional assumptions on smoothness are required for the functions in U , V ,W ,Z in order

to guarantee the existence of the integrals which comprise R in (3.14). These requirements

will be discussed when introducing finite element counterparts of these spaces.

It is worth observing here that the balance of linear momentum (as well as the mass

balance in fluids) and the traction boundary conditions are enforced through the weighted

residual form. On the other hand, the displacement (or velocity) boundary conditions and

initial conditions are satisfied from the outset by the choice of admissible fields U or V . In

addition, balance of angular momentum is satisfied by the constitutive law for stress.

The above strong and weak form are equivalent (that is, derivable from each other) only

under the assumption of continuity of all integrands in (3.10) or (3.14). This is a fairly

severe assumption, which, when violated, leads to solutions obtained from the weak form

(termed weak solutions) that may differ from the strong solution where/when discontinuities

are encountered.

The complete proof of the preceding result may be found elsewhere. Here, only a sketch

of the proof is provided, as follows: first, it is clear by inspection that the strong form implies

the weak form. for the opposite, note that (3.10) (or (3.14)) applies for any ξ ∈ W (and,

σ ∈ Z, if applicable). Now fix a time t and choose ξ such that ξ = 0 on Γq (and also σ = 0

on R, if applicable). Next, argue, by contradiction, that if there is a point x ∈ R where

ρa−ρb−divT 6= 0, then one may choose ξ to vanish everywhere and every time except in a

neighborhood Nx of x in which ξ · (ρa−ρb−divT) > 0, which is always possible due to the

continuity of the preceding scalar quantity. It follows that∫R×I

ξ ·(ρa−ρb−divT) dvdt > 0,

which contradicts the original assumption, thus ρa− ρb− divT = 0 everywhere in R at all

times t. Then, the same argument may be repeated on Γq (as well as on R for the continuity

equation in (3.14), if applicable).

Proceed with separating the space from the time integrals in the weighted residual

form (3.10), and consider first the spatial integrals (that is, “freeze” time). In particular, set

∫

R

ξ · (ρu− ρb− divT) dv +

∫

Γq

ξ · (t− t) da = 0 (3.16)

and note that, upon using integration by parts, the divergence theorem, and the property

ME280B May 7, 2019


of ξ specified in (3.9)∫

R

ξ · divT dv =

∫

R

ξiTij,j dv =

∫

R

[(ξiTij),j − ξi,jTij ] dv

=

∫

∂R

ξiTijnj da−∫

R

ξi,jTij dv

=

∫

Γu∪Γq

ξiTijnj da−∫

R

ξi,jTij dv

=

∫

Γq

ξitida−∫

R

ξi,jTij dv

=

∫

Γq

ξ · t da−∫

R

∂ξ

∂x·T dv . (3.17)

Substituting the preceding expression to the weighted-residual form (3.16), it follows that∫

R

ξ · ρu dv +∫

R

∂ξ

∂x·T dv =

∫

R

ξ · ρb dv +∫

Γq

ξ · t da (3.18)

The full space-time integral form can be easily reconstructed from the above. Observing

the symmetry of T, the second term on the left-hand side of (3.18) can be also written as∫R

(∂ξ∂x

)s ·T dv The above expression is computationally convenient, as it involves the term

(∂ξ

∂x

)s

=1

2

[∂ξ

∂x+

(∂ξ

∂x

)T]

(3.19)

which resembles the infinitesimal strain tensor. This observation is important when attempt-

ing to extend finite element formulations originally developed for infinitesimal deformations

to finite deformations.

If ξ is identified as the variation δu1 of u, one obtains a statement of virtual work,

according to which the virtual work done by the internal forces (inertial forces and stresses)

is equal to the virtual work done by the external forces (body forces and surface tractions)

over the whole body occupying the configuration R. A corresponding statement of virtual

power is obtained from (3.18) when ξ is substituted with a virtual velocity. It is instructive

to contrast these statements to the mechanical energy balance equation for the whole body,

see (1.80).

One may assume at the outset that the traction boundary conditions (3.5) are satisfied

at the outset and do not need to be included in the weighted residual form (3.10). While

1Recall that, by definition, the variation of u vanishes where u is specified, which confirms that the space

of admissible variations of u is precisely W.

May 7, 2019 ME280B


this goes against the premise of finite element modeling (where one may one directly impose

boundary conditions on the dependent variables that enter the finite element approximation),

it is instructive to consider this option, where the weighted-residual statement is reduced∫

R

ξ · (ρa− ρb− divT) dv = 0 (3.20)

for all ξ ∈ W . Repeating the preceding procedure leads to:∫

R

ξ · ρa dv +∫

R

∂ξ

∂x·T dv =

∫

R

ξ · ρb dv +∫

Γq

ξ · t da . (3.21)

Upon now enforcing (3.5), one immediately recovers the derived weighted-residual statement

in (3.18).

A completely equivalent weighted-residual form can be derived based on the referential

statement of linear momentum balance (1.74). In this case, one observes that the tangent

vector ξ may be represented as

ξ = ξ(X, t) = ξ(x, t) . (3.22)

Next, “freeze” again time and start from the weighted residual statement

R0 =

∫

R0

ξ · (ρ0a− ρ0b−DivP) dV +

∫

Γq0

ξ · (p− p) dA = 0 , (3.23)

for all ξ ∈ W , where p = PN is the traction vector on the boundary of the body in

the current configuration, but resolved using the geometry of the reference configuration.

Repeating the same procedure as with the spatial weak form, it is easy to show that∫

R0

ξ · ρ0a dV +

∫

R0

∂ξ

∂X·P dV =

∫

R0

ξ · ρ0b dV +

∫

Γq0

ξ · p dA . (3.24)

Recalling (2.36), it is concluded that the integrand of the second term on the left-hand side

of (3.24) satisfies∂ξ

∂X·P = DF(X, ξ) ·P . (3.25)

Furthermore,

∂ξ

∂X·P =

∂ξ

∂X· (FS) =

(FT ∂ξ

∂X

)· S

=1

2

[FT ∂ξ

∂X+

(∂ξ

∂X

)T

F

]· S = DE(x, ξ) · S , (3.26)

where use is made of (1.85), (2.41), and the symmetry of S.

ME280B May 7, 2019

Lagrangian and Eulerian Methods 43

3.2 Lagrangian and Eulerian Methods

Preliminary to the presentation of any specific finite element methods, it is essential to

formally distinguish between Lagrangian and Eulerian computations. To this end, consider

a body occupying the configuration R at time and define a surface S(t), as in Figure 3.4,

which may be represented by the parametric equation

xs = xs(ξ, η, t) , (3.27)

where ξ, η are surface parameters. These parameters may be eliminated to obtain an alter-

R

S

v

vs

Figure 3.4. A surface S traversing the configuration R at time t

native representation of the surface as

f(xs, t) = 0 . (3.28)

Taking the time derivative of the above equation, while following any given point on the

surface, it follows that∂f

∂t+∂f

∂x· vs = 0 , (3.29)

where vs is the velocity of the surface, see Figure (3.4), which can be readily obtained

from (3.27). Alternatively, consider a material point which occupies a point x ∈ S at time t.

The material time derivative of f describes the rate of change of f for the given material

point. This equals

f =∂f

∂t+∂f

∂x· v , (3.30)

where v = χ is the velocity of the particle, see, again, Figure 3.4. Generally, at the fixed

time t, vs 6= v. Hence, subtracting (3.29) from (3.30), it follows that

f =∂f

∂x· (v − vs) (3.31)

May 7, 2019 ME280B


and, since the unit normal n to the surface may be written as

n =∂f∂x∣∣∂f∂x

∣∣ , (3.32)

it is concluded that the flux of mass through the surface S at time t is given by

∫

S

ρ(v − vs) · n da =

∫

S

ρf

∣∣∣∣∂f

∂x

∣∣∣∣−1

da . (3.33)

Upon invoking the localization theorem, it follows from the above that the necessary and

sufficient condition for the flux of mass through S (or any part of it) to vanishes is f = 0, in

which case

(v − vs) · n = 0 . (3.34)

This mandates that the surface S and the material particles on it travel with the same

normal velocity to S. In this case, the surface S is called a material surface. The necessary

and sufficient condition

f = 0 (3.35)

is referred to as Lagrange’s criterion of materiality.

Assume that at any given time t the region R is spatially discretized according to

R .= ∪eΩ

e, where Ωe is the domain of the typical finite element e, such that all bound-

R

Ωe

Figure 3.5. Finite element mesh schematic

ary surfaces (or edges) of each Ωe are material. Lagrangian finite element computations are

making use of finite elements with domains whose boundaries are material.

To understand how it can be guaranteed that the boundary of a finite element domain are

material, note that in Lagrangian finite element computations the motion of the element is

completely parametrized by the motion of its nodal points. Hence, the finite element system

ME280B May 7, 2019

Lagrangian and Eulerian Methods 45

is finite-dimensional, in the sense that the motion of the discretized body is fully determined

at any given time t by a finite set of parameters.

Consider a typical surface S of a finite element domain Ωe, in the context of classical

isoparametric interpolations. For a typical point X on S in the reference configuration, one

may write

X =∑

i

Ni(ξ, η)Xi (3.36)

where (ξ, η) are the natural coordinates (which coincide, here, with the surface coordinates),

S ξ

η

X

Figure 3.6. Lagrangian finite element surface

Ni are the isoparametric shape functions, and Xi the position vector of node i on S, as inFigure 3.6. The point X is mapped in the current configuration to x = χ(X, t), such that

x =∑

i

Ni(ξ, η)xi

=∑

i

Ni(ξ, η)χ(Xi, t)

=∑

i

Ni(ξ, η) [Xi + ui]

=∑

i

Ni(ξ, η)Xi +∑

i

Ni(ξ, η)ui = X+ u . (3.37)

The surface S is clearly material, since it consists of the same material points at all times

with respect to the motion χ of the nodal points i and the isoparametric interpolation.

Put differently, all xi are material points by definition, while any x on S is also material by

construction of the motion in terms of the nodal motions and the isoparametric interpolation.

Alternatively, one may consider an Eulerian finite element formulation for which the

mesh is stationary, that is vs = 0 on the boundary ∂Ωe of any element element e. In this

case, each element becomes a control volume. In view of (3.34) it is clear that the element

May 7, 2019 ME280B


region Ωe is not material. Thus, mass flux takes place across ∂Ωe, which necessitates that

mass balance be enforced through the equation

∫

Ωe

σ(ρ+ ρ div v) dv = 0 , (3.38)

for all admissible σ, as also argued earlier (see (3.14)).

Lagrangian and Eulerian methods have both advantages and disadvantages from an algo-

rithmic standpoint. For instance, when Eulerian methods are used for solids or free-surface

fluids, then

(a) A geometric detection algorithm is required to identify the “full” elements at each

time t, see Figure 3.7.

(b) A special treatment is generally needed for “partial” element at each time t, see again

Figure 3.7.

(c) The application of Dirichlet conditions on the physical boundary is not necessarily

trivial.

full element

partial element

Figure 3.7. Eulerian finite element mesh with a moving solid or fluid

Moreover, the tracking of material-based history variables requires a special procedure, since

such variables cannot be directly stored at nodal or integration points.

Lagrangian methods suffer from excessive mesh distortion, which may cause loss of

uniqueness in the isoparametric mapping, as well as from inability to directly handle loss of

materiality, as is the case, for instance, when modeling crack propagation or fragmentation.

ME280B May 7, 2019

Total and Updated Lagrangian Methods 47

3.3 Semi-discretization

As done earlier, it is customary (and practical) to separate the time from the space integrals

in (3.14), which often referred to as semi-discretization. Assuming that the solution must be

determined in the time domain (t0, T ], perform a time discretization by introducing a finite

sequence of distinct times t1, t2, ..., tN , as in Figure 3.8. Here, ∆tn = tn+1− tn is the discrete

t0 t1 t2 tn tn+1 tN−1 tN = T

∆tn

Figure 3.8. Time discretization

time step at time tn and

∆t =

N∑

n=1

∆tn−1

N=

T − t0N

(3.39)

is the average step size, with the understanding that ∆tn = O(∆t), n = 0, 1, . . . , N − 1.

The basic idea of incremental methods is to satisfy the equations of motion only at

discrete time instances t1, t2, ..., tN , where at each such tn+1, the solution to these equations

is known for all ti, i = 1, 2, . . . , n. In the limit as N → ∞, incremental methods recover a

solution in a dense subset of (t0, T ].

Incremental methods make sense on two grounds. First, they recover the history of a

deformation, which is oftentimes, in itself, useful to the analyst. Second, they are necessary

when modeling continua made of path-dependent material or which are subject to path-

dependent loading. In the latter case, iterative methods are essentially used to resolve the

path-dependency.

A weighted-residual interpretation of semi-discretization can be obtained by starting

from (3.10) (similarly, for (3.14)) and writing ξ(x, t) = θ(t)ξ(x), such that

R =

∫ T

t0

θ

[∫

R

ξ · (ρa− ρb− divT) dv +

∫

Γq

ξ · (t− t) da

]dt = 0 , (3.40)

where the time-dependent weighting function θ(t) is defined as θ(t) =∑N

i=1 θiδ(t− ti). Thisdiscretization leads naturally to the requirement that linear momentum balance and traction

boundary conditions must be enforced at each of the times t1, t2, ..., tN .

May 7, 2019 ME280B

48 3.4. TOTAL AND UPDATED LAGRANGIAN METHODS

3.4 Total and Updated Lagrangian Methods

Consider an incremental Lagrangian method and assume that the balance laws have been

already satisfied (in a weak form) and the motion χ has been determined for times t1, t2, ..., tn,

see Figure 3.9. The objective is now to determine the position of all material particles

(therefore, the configuration of the body) at t = tn+1, such that

R0

Rn

Rn+1

Figure 3.9. Configurations at times t0, tn and tn+1

∫

Rn+1

ξn+1 · ρn+1an+1 dv +

∫

Rn+1

(∂ξ

∂x

)s

n+1

·Tn+1 dv

=

∫

Rn+1

ξn+1 · ρn+1bn+1 dv +

∫

Γqn+1

ξn+1 · tn+1 da , (3.41)

for all ξn+1 ∈ W . The preceding equation may be resolved iteratively for the displacement

field un+1 ∈ U (or, equivalently, the relative displacement field un+1 − un), assuming the

data (that is, b and t) is known in (tn, tn+1].

Alternatively, one may pull-back (3.41) to the reference configuration and write

∫

R0

ξn+1 · ρ0an+1 dV +

∫

R0

∂ξn+1

∂X·Pn+1 dV

=

∫

R0

ξn+1 · ρ0bn+1 dV +

∫

Γq0

ξn+1 · pn+1 dA , (3.42)

for all ξn+1 ∈ W . Note that, since, in general, Γq depends on time, by Γq0 in (3.42),

one denotes the pull-back of Γqn+1 to the reference configuration, rather than the traction

boundary of the reference configuration itself.

ME280B May 7, 2019


Starting with the referential weak form (3.42), concentrate on a typical Lagrangian finite

element e with domain Ωe0 in the reference configuration. For this element, write at t = tn+1

u(X, tn+1)∣∣Ωe

0= ue

n+1(X).=

NEN∑

i=1

N ei u

ei n+1 ,

an+1(X, tn+1)∣∣Ωe

0= ae

n+1(X).=

NEN∑

i=1

N ei a

ei n+1 , (3.43)

ξn+1(X, tn+1)∣∣Ωe

0= ξen+1(X)

.=

NEN∑

i=1

N ei ξ

ei n+1 ,

where the element is assumed isoparametric with NEN nodes and interpolation functions

N ei (X), i = 1, 2 . . . , NEN. Special cases, such as those involving hierarchical, mixed or incom-

patible interpolations can be easily handled by extending the above definitions.

For ease of computer implementation, one may write the preceding interpolations in

matrix form as

uen+1

.=

NEN∑

i=1

N ei u

ei n+1

=

[N e

113︸︷︷︸(3×3)

N e213 · · · N e

NEN13

]

︸︷︷︸3×(3×NEN)

ue1

ue2...

ueNEN

n+1︸︷︷︸(3×NEN)×1

= [Ne][uen+1] , (3.44)

where [uen+1] is a vector that contains all nodal displacements, while 13 denotes the 3 × 3

identity matrix. Similarly, one may write

aen+1

.= [Ne][ae

n+1] , ξen+1.= [Ne][ξ

e

n+1] . (3.45)

Also, to resolve in convenient matrix form the stress Pn+1 and the derivative∂ξen+1

∂Xin (3.42),

May 7, 2019 ME280B


write their 9 components as column vectors according to

〈Pn+1〉 =

P11

P22

P33

P12

P23

P31

P13

P21

P32

n+1

,

⟨∂ξen+1

∂X

⟩=

ξe1,1

ξe2,2

ξe3,3

ξe1,2

ξe2,3

ξe3,1

ξe1,3

ξe2,1

ξe3,2

n+1

. (3.46)

Thus, one may write

⟨∂ξen+1

∂X

⟩

︸︷︷︸(9×1)

= [Be]︸︷︷︸9×(3×NEN)

[ξe

n+1]︸︷︷︸(3×NEN)×1

, (3.47)

where

[Be]︸︷︷︸9×(3×NEN)

=[[Be

1]︸︷︷︸(9×3)

[Be2] · · · [Be

NEN]]

(3.48)

and [Bei ] is such that

< Bei ξ

e

in+1> =

N ei,1ξ

ei1

N ei,2ξ

ei2

N ei,3ξ

ei3

N ei,2ξ

ei1

N ei,3ξ

ei2

N ei,1ξ

ei3

N ei,3ξ

ei1

N ei,1ξ

ei2

N ei,2ξ

ei3

n+1

(no summation on i) , (3.49)

ME280B May 7, 2019


consequently

[Bei ]︸︷︷︸

9×3

=

N ei,1

N ei,2

N ei,3

N ei,2

N ei,3

N ei,1

N ei,3

N ei,1

N ei,2

, (3.50)

where N ei,A =

∂Nei

∂XA.

Likewise, to express the spatial statement (3.41) in matrix form, one need to express the

(symmetric) stress Tn+1 and the derivative(

∂ξen+1

∂x

)sas 6-dimensional vectors according to

〈Tn+1〉 =

T11

T22

T33

T12

T23

T31

n+1

,

⟨(∂ξe

∂x

)s

n+1

⟩=

ξe1,1

ξe2,2

ξe3,3

ξe1,2 + ξe2,1

ξe2,3 + ξe3,2

ξe3,1 + ξe1,3

n+1

. (3.51)

It follows that

⟨(∂ξe

∂x

)s

n+1

⟩

︸︷︷︸6×1

= [Ben+1]︸︷︷︸

6×(3×NEN)

[ξe

n+1]︸︷︷︸(3×NEN)×1

=

[[Be

1,n+1]︸︷︷︸6×3

[Be2,n+1] · · · [Be

NEN,n+1]]

ξe1

ξe2...

ξeNEN

n+1

, (3.52)

May 7, 2019 ME280B


where

[Be

i,n+1

]︸︷︷︸

6×3

=

N ei,1

N ei,2

N ei,3

N ei,2 N e

i,1

N ei,3 N e

i,2

N ei,3 N e

i,1

n+1

(3.53)

and N ei,j =

∂Nei

∂xj, where now it is understood that the element interpolation functions are

functions of the current position of the element points, that is, N ei = N e

i (x).

Substituting the interpolations (3.44), (3.45) and (3.47), and the vectorial representation

of stress from (3.46)1 to the weak form (restricted to the domain Ωe0 of element e), one finds

that

[ξe

n+1]T

(∫

Ωe0

[Ne]Tρ0[Ne] dV

)[ae

n+1] + [ξe

n+1]T

∫

Ωe0

[Be]T < Pn+1 > dV

= [ξe

n+1]T

∫

Ωe0

[Ne]Tρ0[bn+1] dV + [ξe

n+1]T

∫

∂Ωe0∩Γq0

[Ne]T [pn+1] dA

+ [ξe

n+1]T

∫

∂Ωe0\Γq0

[Ne]Tpn+1 dA , (3.54)

where ∂Ωe0 ∩ Γq0 is the part of the element boundary which (potentially) intersects with the

Neumann boundary Γq0 and ∂Ωe0 \Γq0 is the rest of the element boundary, as in Figure 3.10.

∂Ωe0 ∩ Γq0

∂Ωe0 \ Γq0

Figure 3.10. Partition of element boundary ∂Ωe0

The preceding statement may be compactly rewritten as

[ξe

n+1]T[[Me][ae

n+1] + [Re(uen+1)]− [Fe

n+1]]

= [ξe

n+1]T

∫

∂Ωe0\Γq0

[Ne]T [pn+1] dA (3.55)

ME280B May 7, 2019


where

[Me]︸︷︷︸(3×NEN)×(3×NEN)

=

∫

Ωe0

[Ne]Tρ0[Ne] dV (3.56)

is the element mass matrix, which is clearly symmetric,

[Ren+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωe0

[Be]T < Pn+1 > dV (3.57)

is the element stress-divergence vector, and

[Fen+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωe0

[Ne]Tρ0[bn+1] dV +

∫

∂Ωe0∩Γq0

[Ne]T [pn+1] dA (3.58)

is the element external force vector.

Since ξn+1 is an arbitrary tangent vector on the current configuration, it follows that

[Me][aen+1] + [Re(ue

n+1)] = [Fen+1] +

∫

∂Ωe0\Γq0

[Ne]T [pn+1] dA . (3.59)

Under the action of the assembly operator Ae, the above equation leads to

Ae

[[Me][ae

n+1] + [Re(uen+1)]

]= A

e

[[Fe

n+1] +

∫

∂Ωe0\Γq0

[Ne]T [pn+1] dA

]. (3.60)

which, in turn, yields

[M][an+1] + [R(un+1)] = [Fn+1] , (3.61)

where

[M] = Ae[Me] (3.62)

is the global mass matrix, which is symmetric,

[Rn+1] = Ae[Re

n+1] (3.63)

is the global stress-divergence vector, and

[Fn+1] = Ae[Fe

n+1] (3.64)

is the global external force vector. In (3.61), the termAe

∫∂Ωe

0\Γq0[Ne]T [pn+1] dA, which quan-

tifies the cumulative interelement jump in tractions is neglected, as is customary in finite

May 7, 2019 ME280B


element approximations. This term is frequently used to quantify the error in the finite

element analysis a posteriori. In addition, the global displacement and acceleration vectors

are defined as

[un+1] =

u1n+1

u2n+1

· · ·uNUMNP

n+1

, [an+1] =

a1n+1

a2n+1

· · ·aNUMNPn+1

, (3.65)

where NUMNP is the total number of nodes in the mesh. Equations (3.61) constitute the semi-

discrete form of linear momentum balance, that is, the approximate equations of motion

obtained after spatial discretization. This is a (generally) non-linear system of second-order

ordinary differential equations in time, with unknowns the nodal displacements un+1. Under

quasi-static conditions (that is, when the inertia term can be neglected), the above system

of equations becomes purely algebraic. Also, Dirichlet boundary conditions are readily ac-

counted for by merely fixing the relevant displacement degrees-of-freedom on nodes which

lie on Γu and, if desired, calculating the corresponding forces upon solution of the overall

problem as reactions.

Note that the stress-divergence vector in (3.57) may well depend not only on un+1, but

also on vn+1. This would be the case when the stress response is rate-dependent. Also, the

external force vector in (3.58) may be explicitly dependent on the motion, as would be the

case, for instance, when the externally imposed traction is a follower load (e.g., a pressure

load). In this case, Fn+1 = Fn+1(un+1), so the external force is itself a function of the

solution.

If one chooses to start with the spatial statement of linear momentum balance in (3.41),

an analogous procedure would ensue that would yield the element arrays

[Me]︸︷︷︸(3×NEN)×(3×NEN)

=

∫

Ωe

[Ne]Tρ[Ne] dv (3.66)

[Ren+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωe

[Ben+1]

T < Tn+1 > dv (3.67)

and

[Fen+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωe

[Ne]Tρ[bn+1] dv +

∫

∂Ωe∩Γqn+1

[Ne]T [tn+1] da . (3.68)

The system of differential equations in (3.61) may be integrated in time using one of

several standard integration methods. The most commonly used such time integrator in

ME280B May 7, 2019


solid mechanics is the Newmark method,2 which is based on a time series expansion of

(un+1, vn+1), such as one may write in the time interval (tn, tn+1]

un+1 = un + vn∆tn +1

2[(1− 2β)an + 2βan+1] ∆t

2n

vn+1 = vn + [(1− γ)an + γan+1] ∆tn

, (3.69)

where the parameters β and γ are chosen such that

0 < γ ≤ 1 , 0 < β ≤ 1

2, (3.70)

and the brackets around each of the matrix terms is dropped for brevity. It is easy to show

that the special case β = 14, γ = 1

2corresponds to the trapezoidal rule. Equation (3.69)1 may

be solved for the acceleration an+1 as

an+1 =1

β∆t2n(un+1 − un − vn∆tn)−

1− 2β

2βan . (3.71)

Substituting the above to (3.61) yields

1

β∆t2nMun+1 +R(un+1) = Fn+1 +M

(un + vn∆tn)

1

β∆t2n+

1− 2β

2βan

︸︷︷︸known quantities

. (3.72)

The preceding system of non-linear algebraic equations for un+1 can be solved using a stan-

dard iterative method (e.g., the Newton-Raphson method or one of its variants). After un+1

is determined, an+1 and, then, vn+1 are computed from (3.71) and (3.69)2, respectively.

An alternative solution sequence would entail substituting (3.69)1 into (3.61), solving for

the acceleration an+1, then substituting it back into (3.69)1,2 to determine un+1 and vn+1.

In either approach, minor and straightforward modifications suffice to accommodate the

case where Fn+1 depends explicitly on un+1 or the case where Rn+1 depends explicitly on

both un+1 and vn+1.

The above time integration scheme is implicit, in the sense that determining the state at

time tn+1 requires the solution of a system of algebraic equations. A weighted-residual formal-

ization in time can be easily formulated results in deriving the Newmark equation (3.69).3

2N.M. Newmark, “A Method of Computation in Structural Dynamics”, J. Engr. Mech. Div. ASCE,

85:67-94, (1959).3O.C. Zienkiewicz, “A New Look at the Newmark, Houbolt and other Stepping Formulas. A Weighted

Residual Approach”, Earthquake Engin. Struct. Dyn., 5:413-418, (1977).

May 7, 2019 ME280B

https://onlinelibrary.wiley.com/doi/epdf/10.1002/eqe.4290050407

https://onlinelibrary.wiley.com/doi/epdf/10.1002/eqe.4290050407


An explicit integration scheme may be obtained from the Newmark formulae by setting

β = 0. This leads to the equations

un+1 = un + vn∆tn +1

2an∆t

2n

vn+1 = vn + [(1− γ)an + γan+1] ∆tn

. (3.73)

It is again easy to show that in the special case γ = 0.5 one recovers the classical centered-

difference method. The general explicit Newmark integrator is implemented as follows: start-

ing from the semi-discrete form (3.61), one may substitute un+1 from (3.73)1 to get

Man+1 = Fn+1 −R(un + vn∆tn +1

2an∆t

2n) . (3.74)

If M is rendered diagonal,4 then an+1 can be determined without solving equations. Sub-

sequently, the velocity vector vn+1 is computed from (3.73)2. Note that the displacement

vector un+1 is updated through (3.73)1 without using an+1.

As with the implicit Newmark scheme, it is easy to accomodate the case where Fn+1

depends on un+1. On the other hand, if Rn+1 depends on un+1, then the explicit nature

of the time-stepping may not be preserved unless one resorts to a time-shifting modifica-

tion, in which the term Rn+1(un+1, vn+1) is replaced, at an error that depends on ∆tn, by

Rn+1(un+1, vn) or Rn+1(un+1, vn + an∆tn).

Explicit time integration is computationally inexpensive, as there is no need to compute

gradients and solve coupled algebraic equations. However, explicit time integration can only

be conditionally stable, that is, the step-size must be restricted (sometimes severely) to yield

convergent solutions.

The preceding formulation is often referred to total Lagrangian. This means that the mo-

tion and deformation is always measured relative to the original fixed reference configuration

regardless of its magnitude. Alternatively, one may choose to measure the motion and de-

formation at time tn+1 relative to any previously determined (and, therefore, de facto fixed)

configuration at time t = tk, where k = 1, 2, · · · , n. The so-called updated Lagrangian formu-

lation is specifically obtained by measuring the motion and deformation at t = tn+1 relative

to the configuration at time t = tn.5 To this end, consider again a motion χ : R0×R→ E3,

as in Figure 3.11.

4See Appendix H of O.C. Zienkiewicz, R.L. Taylor, and J.Z. Zhu, The Finite Element Method: Its

Basis & Fundamentals, 7th edition, Butterworth-Heinemann, Boston, (2013), for full details on the theory

and practice of mass matrix diagonalization.5K.-J. Bathe, E. Ramm, and E.L. Wilson, “Finite element formulations for large deformation dynamic

analysis”, Int. J. Num. Meth. Engr., 9:353–386, (1975).

ME280B May 7, 2019

https://www.sciencedirect.com/book/9781856176330/the-finite-element-method-its-basis-and-fundamentals

https://www.sciencedirect.com/book/9781856176330/the-finite-element-method-its-basis-and-fundamentals

http://onlinelibrary.wiley.com/doi/10.1002/nme.1620090207/abstract

http://onlinelibrary.wiley.com/doi/10.1002/nme.1620090207/abstract


X xn

xn+1

unn+1

χn

R0

Rn

Rn+1

Figure 3.11. Configurations for updated Lagrangian formulation

Note that, by definition,

xn = χ(X, tn) (3.75)

and

xn+1 = χ(X, tn+1) , (3.76)

so that a mapping χn : Rn × R→ E3 is defined, such that for any time t

x = χ(X, t) = χ(χ−1tn (xn), t) = χn(xn, t) . (3.77)

Clearly, χn is the relative motion with respect to the (generally deformed) configuration Rn.

It follows that the deformation gradient Fnn+1 at time t = tn+1 relative to the configurationRn

is defined as

Fnn+1 =

∂χn(xn, tn+1)

∂xn

=∂xn+1

∂xn

=∂(xn + un

n+1)

∂xn

= I+∂un

n+1

∂xn

,

where unn+1 = xn+1− xn is the displacement at tn+1 relative to tn, see Figure 3.11. Invoking

the chain rule and recalling (3.77),

∂χ(X, tn+1)

∂X=

∂χn(xn, tn+1)

∂xn

∂χ(X, tn)

∂X(3.78)

or

Fn+1 = Fnn+1Fn . (3.79)

It follows from the above that, since detF > 0 for all (X, t), the same applies to Fnn+1 for all

(xn, t).

May 7, 2019 ME280B


Starting from the spatial statement of the weak form in (3.41), one may “pull-back” to

the configuration Rn to find that

∫

Rn

ξnn+1 · ρnann+1 dVn +

∫

Rn

∂ξnn+1

∂xn

·Pnn+1 dVn

=

∫

Rn

ξnn+1 · ρnbnn+1 dVn +

∫

Γqn

ξnn+1 · pnn+1 dAn , (3.80)

where

ξ = ξ(x, t) = ξ(χn t(xn), t) = ξn(xn, t) , (3.81)

hence

ξnn+1 = ξn(xn, tn+1) . (3.82)

Also, for the acceleration and body forces

a = a(x, t) = a (χn(xn, t), t) = an(xn, t) (3.83)

b = b(x, t) = b (χn(xn, t), t) = bn(xn, t) , (3.84)

therefore

ann+1 = an(xn, tn+1) , bn

n+1 = bn(xn, tn+1) . (3.85)

In addition,

ρn = ρn+1Jnn+1 , (3.86)

where Jnn+1 = detFn

n+1, which confirms that ρn in (3.86) is precisely the mass density of the

body at t = tn. Likewise,

dVn =1

Jnn+1

dvn+1 (3.87)

and

nn+1 dan+1 = Jnn+1

(Fn

n+1

)−Tnn dAn , (3.88)

where nn and nn+1 are unit normals to the same material surface at times tn and tn+1,

respectively.

Analogous definitions apply for the traction and stress. In particular, the traction vec-

tor pnn+1 is defined according to

tn+1 dan+1 = pnn+1 dAn , (3.89)

ME280B May 7, 2019


while Pnn+1 is the first Piola-Kirchhoff stress tensor obtained by resolving tractions on the

geometry of the configuration Rn, that is,

Pnn+1 = Jn

n+1Tn+1

(Fn

n+1

)−T. (3.90)

In view of the above and upon recalling (3.44), one may write for a typical Lagrangian

element which at time t = tn occupies the region Ωen

un(xn, tn+1)∣∣Ωe

n= ue n

n+1(xn).=

NEN∑

i=1

N ei u

e ni n+1 = [Ne][ue n

n+1] , (3.91)

and, likewise,

ae nn+1

.= [Ne][ae n

n+1] , ξe nn+1.= [Ne][ξ

e n

n+1] . (3.92)

Furthermore, ⟨∂ξe nn+1

∂xn

⟩

︸︷︷︸(9×1)

= [Be n]︸︷︷︸9×(3×NEN)

[ξe n

n+1]︸︷︷︸(3×NEN)×1

, (3.93)

where

[Be n]︸︷︷︸9×(3×NEN)

=[[Be n

1 ]︸︷︷︸(9×3)

[Be n2 ] · · · [Be n

NEN]]

(3.94)

and

[Be ni ]︸︷︷︸

9×3

=

N ei,1

N ei,2

N ei,3

N ei,2

N ei,3

N ei,1

N ei,3

N ei,1

N ei,2

. (3.95)

Note that here N ei = N e

i (xn) and Nei,j =

∂Nei

∂xnj

=∂Ne

i

∂XAF−1nAj.

Substituting the interpolations in (3.91), (3.92) and (3.93) to the weak form (3.80) applied

to element e yields

[ξe n

n+1]T[[Me][ae n

n+1] + [Re n(ue nn+1)]− [Fe n

n+1]]

= [ξe n

n+1]T

∫

∂Ωen\Γqn

[Ne]T [pnn+1] dAn , (3.96)

May 7, 2019 ME280B


where

[Me]︸︷︷︸(3×NEN)×(3×NEN)

=

∫

Ωe0

[Ne]Tρn[Ne] dVn , (3.97)

[Re nn+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωen

[Be n]T < Pnn+1 > dVn (3.98)

and

[Fe nn+1]︸︷︷︸

(3×NEN)×1

=

∫

Ωen

[Ne]Tρn[bnn+1] dVn +

∫

∂Ωen∩Γqn

[Ne]T [pnn+1] dAn . (3.99)

Under the action of the assembly operator Ae

over all elements one recovers the semi-

discrete form of the updated Lagrangian formulation as

Mann+1 +Rn(un

n+1) = Fnn+1 , (3.100)

where

[unn+1] =

u1nn+1

u2nn+1

· · ·uNUMNPnn+1

, [ann+1] =

a1nn+1

a2nn+1

· · ·aNUMNPnn+1

. (3.101)

The solution of the above system of non-linear second-order differential equations in time is

achieved as in the total Lagrangian formulation.

The total and updated Lagrangian formulations are completely equivalent, in the sense

that, given the same finite element mesh, they yield the exact same solution. However,

since they involve different unknown vectors ([un+1] for the total Lagrangian and [unn+1] for

the update Lagrangian method), this solution is obtained by solving two different non-linear

algebraic systems. Therefore, it is conceivable that one of the two systems may be more easily

(or, given the limitations of finite arithmetic, more accurately) solvable than the other. This

would form a practical basis for selecting one of the two methods over the other.

3.5 Corotational Methods

When solving problems in solid mechanics, it is sometimes desirable to use coordinate systems

which rotate together with a Lagrangian mesh. This is principally the case when using

structural elements (e.g., bars, beams). By way of motivation, contrast the cases of fixed vs.

corotating coordinate systems for a two-dimensional bar element, as in Figure 3.12.

ME280B May 7, 2019

Corotational Methods 61

R0 R

R

fixed

corotational

Figure 3.12. Fixed vs. corotational coordinate system

Consider the general case shown in Figure 3.13, where one may identify a rotation

field R(x, t), such that the balance laws are expressed at any point x and time t in terms

of the corotating right-hand orthonormal coordinate system e′i. This system is related to

the fixed global coordinate system ei according to

e′i = Rei , (3.102)

for i = 1, 2, 3. Any spatial vector, say v, is resolved relative to ei and e′i as

v = viei = v′ie′i . (3.103)

Rx

ei

e′i

Figure 3.13. General corotational coordinate system

In view of (3.102) and (3.103), one may write

v′i = v · e′i= vjej · e′i = vjej · (Rei) = Rjivj .

(3.104)

May 7, 2019 ME280B


In direct notation, equation (3.104) may be stated as

v′ = RTv , (3.105)

where v′ = v′iei is the vector with the corotational components ofv on the global basis (hence,

in general, v′ 6= v).

Similarly, for a spatial tensor, say T, one may write with the aid of (3.102)

T = Tijei ⊗ ej = T ′ije

′i ⊗ e′j

= T ′ij(Rei)⊗ (Rej)

= R(T ′ijei ⊗ ej)R

T ,

(3.106)

which implies that

T′ = RTTR ; T ′ij = RkiTklRlj . (3.107)

Of all terms in the weak form of linear momentum balance (3.41) typically only the

stress-divergence term is resolved using the corotational system (the other terms are resolved

directly in the global system). Ultimately, the stress-divergence term is also rotated back to

the global system before the integration of the semi-discrete equations.

Taking into account (3.105) and (3.106), and using the chain rule, the stress-divergence

term in (3.41) takes the form

∫

R

(∂ξ

∂x

)s

·T dv =

∫

R

(∂ξ

∂ξ′∂ξ′

∂x′

∂x′

∂x

)s

· (RTRT ) dv

=

∫

R

(R∂ξ′

∂x′RT

)s

· (RTRT ) dv

=

∫

R

(∂ξ′

∂x′

)s

·T′ dv .

(3.108)

It is important to emphasize that the coordinate system e′i is not inertial. This means

that if one needs to represent time-dependent effects (such as those of inertia) using a corota-

tional frame, then one needs to explicitly account for the dynamics of the coordinate system.

3.6 Arbitrary Lagrangian-Eulerian Methods

Arbitrary Lagrangian-Eulerian (ALE) methods constitute a creative compromise between

the classical Lagrangian and Eulerian methods. In ALE methods, the mesh undergoes a

ME280B May 7, 2019

Arbitrary Lagrangian-Eulerian Methods 63

motion which is generally different from the motion of the continuum. The mesh motion

is specifically designed to preserve, to the extent possible, the quality of the mesh as the

continuum undergoes significant deformation.

To set up a general ALE method, consider a body B with reference configuration R0 at

time t = 0, and identify this configuration with an initial “mesh configuration”M0 at the

same time, see Figure 3.14. Next, in addition to the usual body motion χ : R0 × R → E3,

X

R

R0 ≡M0

M

x

xM

ei

EA

χ

χM

Figure 3.14. Configurations of ALE method

admit the existence of a “mesh motion” χM :M0 × R→ E3, such that

xM = χM(XM , t) , (3.109)

where, in general, x 6= xM when X = XM , as in Figure 3.14. This mesh motion is sometimes

referred to as “arbitrary”, in the sense that it is generally different from the motion of the

body. The velocity vm of the mesh is now defined as

vM =∂χM(XM , t)

∂t. (3.110)

Two special cases arise naturally: (a) χM = χ, in which case the mesh motion is identical

to body motion and the method is Lagrangian, and (b) χM = i, hence xM = X, in which

case the mesh remains stationary, thereby rendering the method Eulerian.

Attention is focused henceforth to the case χM 6= χ and χM 6= i. Generally, creative

choices for χM can provide serious advantages over pure Lagrangian or Eulerian methods

when the continuum undergoes significant non-homogeneous deformations. The mesh mo-

tion χM is typically chosen to satisfy the following four conditions:

May 7, 2019 ME280B


(i) The ALE mesh consists of elements with better aspect ratios and narrower size distri-

bution than the corresponding Lagrangian mesh,

(ii) The element connectivities do not change (otherwise, the procedure yields a new mesh,

and is referred to as remeshing),

(iii) The mesh motion χM is assumed invertible at all times, that is,

JM = det∂χM

∂XM

6= 0 . (3.111)

As argued in Section 1.1, the inverse function theorem enables the unique mapping of a

mesh point xM to its referential counterpart XM and, through it, to a material point x

with which xM coincided at time t = 0. Actually, since JM(XM , 0) = 1 and χM is

assumed smooth, it follows from (3.111) that JM(XM , t) > 0 at all times.

Since the mesh motion is invertible, one may write

vM = vM(XM , t) = vM(χ−1M t(xM), t) = vM(xM , t) , (3.112)

therefore the mesh velocity may be readily expressed in spatial form.

(iv) In certain classes of problems, the body motion and the mesh motion are chosen such

that the normal components of their velocities coincide on the boundary of the body (or

on part of it) at all times. This is generally the case with problems in solid Mechanics

and in free-surface flow problems of fluid mechanics (at least along the free surface).

This restriction is placed to avoid the flow of material outside the mesh, which would

render it unavailable for approximation. At a given time t, let x = xM lie on the

(common) boundary ∂R = ∂M, having outward unit normal n, as in Figure 3.15.

∂R ≡ ∂M

nv

vM

Figure 3.15. Velocity constraint on the boundary of an ALE mesh

ME280B May 7, 2019


The above constraint on the mesh motion implies that

(v − vM) · n = 0 (3.113)

on ∂R ≡ ∂M. Thus, the mesh motion χM is such that it convects with the body

along its boundary. Assuming that the constraint (3.113) is enforced on the whole

(common) boundary of the mesh and body, the kinematic setting for the ALE method

is now depicted in Figure 3.16.

X

R ≡MR0 ≡M0

x

xM

eiEA

χ

χM

Figure 3.16. Configurations of ALE method with coincident body and mesh boundaries

The two main tasks in developing an ALE method are: (a) to formulate and enforce the

balance laws on the ALE finite elements, and (b) to prescribe (in an automated fashion) an

appropriate motion χM .

Consider the first of the above tasks and start with conservation of mass. Let P ⊂ R be

a region occupied by material in the current configuration an let PM ⊂M be a mesh region,

such that P ≡ PM at time t, as in Figure 3.17.

R ≡M

P ≡ PM

n

∂P ≡ ∂PM

Figure 3.17. Mesh region PM in the current configuration

May 7, 2019 ME280B


Recall from 1.2.1 that, according to the principle of mass conservation, the rate of change

of the mass of the particles which occupy the material region P at time t is equal to zero.

Starting with the spatial statement of conservation of mass in (1.64), note that

d

dt

∫

P

ρ(x, t)dv =

∫

P

dρ

dtdv

︸︷︷︸rate of change of mass

for material particles in P

+

∫

P

ρd(dv)

dtdt

︸︷︷︸rate of change of massdue to change in P

=

∫

P

(ρ+ ρ div v) dv

=

∫

P

(∂ρ

∂t+∂ρ

∂x· v + ρ div v

)dv

=

∫

P

[∂ρ

∂t+ div(ρv)

]dv

=

∫

P

∂ρ

∂tdv

︸︷︷︸rate of change of mass

in fixed region P

+

∫

∂P

ρv · n da︸︷︷︸

flux of massthrough boundary ∂P

= 0 , (3.114)

where use is made of (1.10), (1.52), and the divergence theorem. Next, proceed with the

derivation of an ALE statement of mass balance by defining the rate of change of the mass

contained in a mesh region PM at time t as

dMdt

∫

PM

ρM(xM , t)dvM , (3.115)

wheredMdt

denotes the mesh time derivative, that is, the time derivative which keeps the

mesh points fixed.

At this state, it is instructive to consider the different representations of a scalar function,

say f , using material, spatial, or mesh coordinates. Specifically, one may write

f = f(X, t) = fM(XM , t) = f(x, t) = fM(xM , t) . (3.116)

Here, it is clear that fM(xM , t) = f(xM , t) (which means that there is no distinction between

the two spatial representations, hence, in principal, one may drop the subscript “M” from

fM),6 fM(XM , t) 6= f(XM , t) (which is tantamount to observing that at time t the density

of the mesh point which was at XM at time t0 is not equal to that of the material point that

occupied XM at t0), and, alsodMdtfM(XM , t) =

∂fM∂t

. It is also clear from (3.116) that

dMf

dt=

∂fM∂t

=∂fM∂t

+∂fM∂xM

· vM . (3.117)

6It is important to emphasize, in connection to the different representations of a function in (3.116), that

the mesh velocity vM is not equal to the mesh-coordinate representation of the material velocity.

ME280B May 7, 2019


The variables (xM , t) used in ALE methods are often referred to as the mixed variables.

Following the procedure above, one may also write

dMdt

∫

PM

ρM(xM , t) dvM =

∫

PM

dM ρMdt

dvM︸︷︷︸rate of change of massfor particles in PM

+

∫

PM

ρdM(dvM)

dt︸︷︷︸rate of change of mass

because of change in PM

=

∫

PM

(dM ρMdt

+ ρ div vM

)dvM

=

∫

PM

(∂ρM∂t

+∂ρM∂xM

· vM + ρ div vM

)dvM

=

∫

PM

[∂ρM∂t

+ div(ρvM)

]dvM

=

∫

PM

∂ρM∂t

dvM +

∫

∂PM

ρvM · n daM . (3.118)

Clearly, since∂ρM∂t

above denotes the rate of change of ρ for fixed xM in the (necessarily)

fixed region PM ≡ P , it follows from (3.118) that∫

PM

∂ρM∂t

dvM =

∫

P

∂ρ

∂tdv = −

∫

∂P

ρv · n da . (3.119)

Therefore, it is concluded from (3.118) and (3.119) that

dMdt

∫

PM

ρM(xM , t) dvM =

∫

∂PM

ρ(vM − v) · n daM , (3.120)

which states that the rate of change of the mass contained in a mesh region equals to the

rate at which mass flows through the boundary of the region. Using the divergence theorem,

one may alternatively express (3.120) as

dMdt

∫

PM

ρM(xM , t) dvM =

∫

PM

div [ρ(vM − v)] dvM . (3.121)

Upon combining (3.118)2 and (3.121), it follows that∫

PM

(dM ρMdt

+ ρ div vM

)dvM =

∫

PM

ρ div(vM − v) dvM +

∫

PM

∂ρM∂xM

· (vM − v) dvM ,

(3.122)

hence, appealing to the localization theorem,

dM ρMdt

+ ρ div v =∂ρM∂xM

· (vM − v) . (3.123)

For consistency, it merits attention to check the two following two cases:

May 7, 2019 ME280B


(a) vM = v, where (3.123) reduces to the standard mass balance statementdρ

dt+ρ div v = 0

(since, in this case,dMdt

=d

dt),

(b) vM = 0, where (3.123) reduces to the standard mass balance statement∂ρ

∂t+ div(ρv) = 0

(since, in this case,dMdt

=∂

∂t).

If (vM − v) · n = 0 on ∂M, then setting PM = M it is immediately concluded

from (3.120) that

dMdt

∫

M

ρ(xM , t) dvM = 0 , (3.124)

that is, the total mass of the body is conserved automatically by the ALE method.

The local statements of mass balance (1.66) (in the spatial frame) and (3.123) (in the

mesh frame) jointly imply that

ρ = −ρ div v =dM ρMdt

− ∂ρM∂xm

· (vM − v) . (3.125)

The derivation of linear momentum balance in the mesh frame follows a similar procedure.

First, define the rate of change of linear momentum in a mesh region PM at time t as

dMdt

∫

PM

ρv dvM , (3.126)

and then evaluate the mesh time derivative to conclude that

dMdt

∫

PM

ρvdvM =

∫

PM

dMdt

(ρv) dvM +

∫

PM

ρvdM(dvM)

dt

=

∫

PM

dMdt

(ρv) dvM +

∫

PM

ρv div vM dvM

=

∫

PM

[∂

∂t(ρv) +

∂(ρv)

∂xM

· vM + ρv div vM

]dvM

=

∫

PM

∂

∂t(ρv) dvM +

∫

PM

div(ρv ⊗ vM) dvM

=

∫

PM

∂

∂t(ρv) dvM +

∫

∂PM

ρv(vM · n) daM . (3.127)

ME280B May 7, 2019


Setting PM ≡ P and writing the linear momentum balance statement in Eulerian form as

d

dt

∫

P

ρv dv =

∫

P

ρb dv +

∫

P

t da

=

∫

P

(ρv + ρv div v

)dv

=

∫

P

[∂

∂t(ρv) +

∂(ρv)

∂x· v + ρv div v

]dv

=

∫

P

∂

∂t(ρv) dv +

∫

P

div(ρv ⊗ v) dv

=

∫

P

∂

∂t(ρv) dv +

∫

∂P

ρv(v · n) da . (3.128)

As argued before for mass balance,∫

PM

∂

∂t(ρv) dvM =

∫

P

∂

∂t(ρv) dv =

∫

P

ρb dv +

∫

∂P

t da−∫

∂P

ρv(v · n) da , (3.129)

where use is also made of (3.128). It follows from (3.127), (3.129), and the divergence

theorem that

dMdt

∫

PM

ρv dvM =

∫

PM

ρb dvM +

∫

∂PM

t daM +

∫

∂PM

ρv [(vM − v) · n] daM

=

∫

PM

ρb dvM +

∫

PM

divT dvM +

∫

PM

div [ρv ⊗ (vM − v)] dvM .

(3.130)

Expanding the left- and the right-hand sides of the above leads to

∫

PM

(dM ρMdt

v + ρdM v

dt+ ρv div vM

)dvM =

∫

PM

ρbdvM +

∫

PM

divTdvM

+

∫

PM

v

[∂ρM∂xM

· (vM − v) + ρ div(vM − v)

]dvM +

∫

PM

∂vM

∂xM

ρ(vM − v) dvM (3.131)

or, upon imposing conservation of mass in the form (3.123),∫

PM

ρdM v

dtdv =

∫

PM

ρb dvM +

∫

PM

divT dvM +

∫

PM

∂vM

∂xM

ρ(vM − v) dvM . (3.132)

Hence, the localization theorem, when applied to (3.132) results in the local form of linear

momentum balance

ρdM v

dt= ρb+ divT+

∂v

∂xM

ρ (vM − v) . (3.133)

As with mass conservation, it is instructive to consider the two special cases:

May 7, 2019 ME280B


(a) vM = v, where (3.133) reduces to the standard linear momentum balance statement

ρa = ρb+ divT (since, in this case,dMdt

=d

dt),

(b) vM = 0, where (3.133) reduces to ρ

(∂v

∂t+∂v

∂x· v)

= ρb + divT (since, in this case,

dMdt

=∂

∂t).

Angular momentum balance in the mesh frame yields the symmetry of the Cauchy stress,

as usual.

An alternative procedure for the derivation of the balance laws is possible without resort-

ing to any integral statements.7 Indeed, start by recalling (3.117) and use (1.66) to conclude

that

dMρMdt

=∂ρM∂t

+∂ρM∂xM

· vM =∂ρ

∂t+∂ρM∂xM

· vM

= − div(ρv) +∂ρM∂xM

· vM

= −ρ div v − ∂ρ

∂x· v +

∂ρM∂xM

· vM

= −ρ div v − ∂ρM∂xM

· (v − vM) , (3.134)

since ρ(xM , t) = ρM(xM , t) implies that∂ρ

∂t=∂ρM∂t

and also∂ρ

∂x=∂ρM∂x

. Therefore, (3.134)

immediately implies (3.123). Linear momentum balance may be derived analogously.

To derive weak forms of the balance laws (3.123) and (3.133), σ = σM(xM , t) and ξ =

xM(xM , t) be a scalar and vector function at a point xM ofM at time t, respectively. The

weak form of (3.123) amounts to finding ρ ∈ Z+M , such that

∫

M

σ

[dM ρ

dt+ ρ div v − ∂ρ

∂xM

· (vM − v)

]dvM = 0 , (3.135)

for all σ ∈ ZM , where, in analogy to the definitions in (3.15),

ZM = σ :M× R→ R , Z+M = σ :M× R→ R | σ > 0 . (3.136)

7J. Donea, A. Huerta, J.-Ph. Ponthot, and A. Rodriguez-Ferran. Arbitrary Lagrangian-Eulerian methods.

In E. Stein and R. De Borst and T.J.R. Hughes, editors, Encyclopedia of Computational Mechanics, Volume 1:

Fundamentals, chapter 14. John Wiley, New York, 2004.

ME280B May 7, 2019

https://onlinelibrary.wiley.com/doi/full/10.1002/0470091355.ecm009


Likewise, the weighted-residual form of the linear momentum balance statement in (3.133)

including the Neumann boundary conditions at a given time t requires that the displace-

ment u ∈ UM (or velocity v ∈ VM) be such that∫

M

ξ ·[ρdM v

dt− ρb− divT− ∂v

∂xM

ρ(vM − v)

]dvM +

∫

ΓqM

ξ · (t− t) daM = 0 , (3.137)

for all ξ ∈ WM , where

UM =

u :M0 × R→ E3 | det

(I+

∂u

X

)> 0 , ui(X, t) = ui on Γui

,

u(X, 0) = 0 , u(X, 0) = v0 (3.138)

for solids,

VM =v :M× R→ E3 | vi(x, t) = vi on Γui

, v(x, 0) = v0

(3.139)

for fluids, and

WM =ξ :M× R −→ E3 , ξi(x, t) = 0 on Γui

. (3.140)

Note that ΓqM in (3.137) is the part of the exterior boundary ∂M which coincides with

the material boundary Γq at time t, as in Figure 3.18. Using integration by parts and the

R0

R

M

PM

QM

P

PQ

Q

χ

χM

Γq

ΓqM

Figure 3.18. Neumann boundary conditions in ALE methods (R andM are shown apart from

each other for clarity)

divergence theorem, as well as the definition of WM in (3.140), it follows that

∫

M

ξ · divT dvM =

∫

ΓqM

ξ · t daM −∫

M

∂ξ

∂xM

·T dvM . (3.141)

May 7, 2019 ME280B


It follows from (3.137) and (3.141) that the weak form of linear momentum balance takes

the form

∫

M

ξ · ρdM v

dtdvM +

∫

M

∂ξ

∂xM

·T dvM

=

∫

M

ξ · ρb dvM +

∫

ΓqM

ξ · t daM +

∫

M

ξ · ρ[∂v

∂xM

(vM − v)

]dvM . (3.142)

There exist two general methodologies for the finite element approximation of (3.135)

and (3.142) in an incremental formulation:

(a) Solve (3.135) and (3.142) simultaneaously at every time step,

(b) Use an operator-split procedure, where a typical ALE time step comprises two sub-

steps:

(b.1) a purely Lagrangian step (where conservation of mass is automatic),

(b.2) if needed, a sub-step in which the mesh convects according to a specified mesh

motion χM (hence, mass balance should be enforced).

In both methodologies, it is important to emphasize that convection of any history vari-

ables (attached, by definition, to material points) requires an additional algorithmic proce-

dure, since the mesh points are not material.

Starting with the first of the above methodologies, interpolate the relevant fields in an

ALE finite element with domain Ωe at time tn+1 as

u(xM , tn+1)∣∣Ωe = ue

n+1(xM).=

NEN∑

i=1

N ei u

ein+1

= [Ne][uen+1] ,

v(xM , tn+1)∣∣Ωe = ve

n+1(xM).=

NEN∑

i=1

N ei v

ein+1

= [Ne][ven+1] ,

dMv

dt(xM , tn+1)

∣∣Ωe = ve

n+1(xM).=

NEN∑

i=1

N ei v

ein+1

= [Ne][ˆven+1] ,

vM(xM , tn+1)∣∣Ωe = ve

M n+1(xM).=

NEN∑

i=1

N ei v

eM in+1

= [Ne][veM n+1] ,

ξ(xM , tn+1)∣∣Ωe = ξen+1(xM)

.=

NEN∑

i=1

N ei ξ

ein+1

= [Ne][ξe

n+1] ,

(3.143)

ME280B May 7, 2019


where N ei = N e

i (xM), i = 1, 2, . . . , NEN. In addition,

ρ(xM , tn+1)∣∣Ωe = ρen+1(xM)

.=

NEM∑

j=1

N ej ρ

ejn+1

= [Ne][ρen+1] ,

dMρ

dt(xM , tn+1)

∣∣Ωe = ρen+1(xM)

.=

NEM∑

j=1

N ej ρ

ejn+1

= [Ne][ˆρen+1] ,

σ(xM , tn+1)∣∣Ωe = σe

n+1(xM).=

NEM∑

j=1

N ej σ

ejn+1

= [Ne][σen+1] ,

(3.144)

where N ej = N e

j (xM), j = 1, 2, . . . NEM, are the element interpolation function for density and

its derivatives. These may, in general, be different from those used for the body motion and

its derivatives.

Substituting the interpolations in (3.143) and (3.144) to (3.135), when the latter is re-

stricted to the domain Ωe of an ALE element, it follows that

[σen+1]

T

∫

Ωe

[Ne]T[Ne][ˆρe

n+1] +([Ne][ρe

n+1]) (

[∆en+1][v

en+1]

)

−([ρe

n+1]T [Λe

n+1]T)[Ne]

([ve

Mn+1]− [ve

n+1])

dvM = 0 , (3.145)

where

[∆en+1] =

[N e

1,1 N e1,2 N e

1,3 · · · N eI,1 N e

I,2 N eI,3 · · · N e

NEN,1 N eNEN,2 N e

NEN,3

](3.146)

and

[Λen+1] =

N e

1,1 · · · N eI,1 · · · N e

NEM,1

N e1,2 · · · N e

I,2 · · · N eNEM,2

N e1,3 · · · N e

I,3 · · · N eNEM,3

. (3.147)

Similarly, substituting the interpolations in (3.143) and (3.144) to (3.142), when the latter

is written for the domain Ωe of an ALE element, it follows that

[ξe

n+1]T

[∫

Ωe

[Ne]T([Ne][ˆve

n+1]) (

[Ne][ρen+1]

)dvM +

∫

Ωe

[Ben+1]

T < Tn+1 > dvM

−∫

Ωe

[Ne]T [bn+1]([Ne][ρe

n+1])dvM −

∫

∂Ωe

[Ne]T [tn+1]daM

−∫

∂Ωe

[Ne]T [Γen+1][N

e][([ve

Mn+1]− [ve

n+1]) (

[Ne][ρen+1]

)dvM

]= 0 . (3.148)

Here,

[Γen+1] =

[[Ne

,1][ven+1] [Ne

,2][ven+1] [Ne

,3][ven+1]

], (3.149)

May 7, 2019 ME280B


and all derivatives in [Ben+1] and [Γe

n+1] are taken with respect to xM .

Recalling that σen+1 and ξ

e

n+1 are arbitrary and after assemblying the element equations,

as usual, one recovers a system of coupled non-linear ordinary differential equations in time

with unknowns un+1 (or vn+1) and ρn+1. Note that the mesh motion χM is assumed to be

either prescribed or deduced (an extra task that will be discussed later in this section).

The two sets of equations can be integrated either by an implicit or an explicit method,

along the lines of the corresponding discussion in Section 3.4. In the former case, one may

use, e.g., the Newmark method at the cost of repeated solving of coupled algebraic equations.

In the latter case, one may use, e.g., the centered-difference method, which, for the domain

(tn, tn+1], yields the equations

ρn+1 = ρn + ˆρn∆tn (3.150)

and also

vn+1 = vn +1

2

(ˆvn + ˆvn+1

)∆tn , un+1 = un + vn∆tn +

1

2ˆvn∆t

2n . (3.151)

For the centered-difference method to be free of equation-solving, it is necessary that the

two global associated with the element matrices

[Men+1] =

∫

Ωe

[Ne]T [Ne] dvM , [Men+1] =

∫

Ωe

ρ[Ne]T [Ne] dvM (3.152)

to be rendered diagonal. In addition, it is readily seen from (3.145) and (3.148) that an

explicit velocity estimate ve,0n+1 is needed to estimate the ALE-related flux terms before the

velocity vector is updated using (3.151)1.

Consider next the alternative operator-split algorithm, which consists of a purely La-

grangian step, followed, if needed, by an advective or Eulerian remapping sub-step, where

the nodal/elemental variables are mapped from the previous Lagrangian mesh to the new

ALE mesh, as dictated by the mesh motion. Most often in practice, there are multiple

Lagrangian steps before a single remapping sub-step. Also, the operator-split algorithm fa-

cilitates the automatic determination of the mesh motion based on the distortion encountered

in the Lagrangian mesh between two Eulerian remappings.

The advective remapping sub-step involves the following tasks:

(i) select the mesh regions to be remapped,

(ii) deduce a suitable mesh motion,

(iii) compute the advection of mass and linear momentum,

ME280B May 7, 2019


(iv) compute the advection of history variables.

By way of background, it is instructive to distinguish advective remapping from adaptive

remeshing (or mesh rezoning). In the latter, a new mesh is generated at t = tn+1 not

necessarily with the same connectivities with the old mesh from t = tn, and all variables are

projected from the old mesh. In contrast, Eulerian remapping by the mesh motion χM maps

the mesh from t = tn to its new configuration at t = tn+1, and updates all relevant variables

to account for the transport of material. Hence, Eulerian remapping entails no change in

the element connectivities.

Deferring momentarily the discussion of tasks (i) and (ii) above, concentrate on the last

two tasks of Eulerian remapping. For task (iii), two distinct approaches may be adopted.

In the first approach, the ALE equations are solved in (tn, tn+1] for a fixed motion χ (as

computed from the Lagrangian step) and for prescribed mesh motion χM to obtain updated

values for ρn+1, un+1, vn+1, etc. Indeed, assume that the vectors ρLn+1, uLn+1, v

Ln+1, etc.,

have been computed from the Lagrangian step. Also, assume that the mesh motion in

(tn, tn+1] has been somehow determined. The nodal and elemental quantities will change

Rn ≡Mn

Rn+1

Mn+1

χ = χM

χM (prescribed)

χ (known)

Figure 3.19. Operator-split approach for ALE implementation (Rn+1, resulting from Lagrangian

step and Mn+1, resulting from Eulerian remapping are shown apart from each

other for clarity)

during the remapping sub-step because the nodes are moving independently of the fixed

body, as in Figure 3.19. The new nodal and elemental values can be computed separately for

conservation of mass and balance of linear momentum, using the ALE weak forms introduced

earlier, except that here the density and body motion are already known with reference to

May 7, 2019 ME280B


the Lagrangian mesh and need only be calculated on the remapped mesh. In particular, the

discrete counterpart of the mass conservation equation (3.135) becomes

∫

M

σ

[dM ρ

dt+ ρ div vL − ∂ρ

∂xM

· (vM − vL)

]dvM = 0 , (3.153)

and can be integrated in (tn, tn+1] with given vL and vM for ρn+1. Likewise, one may

obtain vn+1 by rewriting the discrete counterpart of (3.142) as

∫

M

ξ · ρLdM v

dtdvM +

∫

M

∂ξ

∂xM

·T dvM

=

∫

M

ξ · ρLb dvM +

∫

ΓqM

ξ · t daM +

∫

M

ξ · ρL[∂v

∂xM

(vM − v)

]dvM , (3.154)

and integrating for vn+1 in (tn, tn+1] with given ρL and vM . Note that the sets of equations

resulting from (3.153) and (3.154) are uncoupled.

Projection methods are addressing the same problem in a purely geometric fashion, which

is also applicable to the advection of history variables in task (iv) above. To introduce these

methods, consider a scalar field f which must be remapped at a given time tn+1. With

“-” mesh “+” mesh

Figure 3.20. Projection method from one mesh (denoted “-”) to another (denoted “+”)

reference to Figure 3.20, let the interpolation of f on the initial mesh (denoted here as the

“-” mesh) be

f− =∑

I

ϕ−I f

−I , (3.155)

where ϕ−I are global interporation functions. After remapping, the same function is interpo-

lated on the new mesh (denoted here as the “+” mesh) as

f+ =∑

I

ϕ+I f

+I , (3.156)

ME280B May 7, 2019


where ϕ+I are the global interpolation functions on the new mesh. Assuming that f− is

given, the goal of a projection method is to determine the nodal values f+I , such that the

error between f− and f+ be minimized in some way. This is generally accomplished by

choosing f+I such that ∫

M

ψ(f+ − f−) dvM = 0 . (3.157)

Here, ψ =∑

I αIψI , where ψI are linearly independent weighting functions and αI are

arbitrary parameters. The projection condition (3.157) leads to the system of linear algebraic

equations

∫

M

ψI(f+ − f−) dvM =

∫

M

ψI

[∑

J

ϕ+J f

+J −

∑

J

ϕ−J f

−J

]dvM = 0 . (3.158)

for all I. This system can be solved for f+J assuming non-singularity of the matrix with

components

∫

M

ψIϕ+J dvM . A typical choice for this projection is ψI = ϕ+

I , which leads to a

classical least-squares problem.

Attention is now turned to tasks (i) and (ii), which pertain to the definition of a suitable

mesh motion. Preliminary to any developments, it is essential to stress that such for such

a motion to be practical, it has to be defined in an automated manner as the ALE solution

evolves. Start by considering a popular heuristic procedure employed for meshes with 4-

node quadrilateral elements.8 The idea here is to maintain a mesh which resembles as

much as possible a uniform mesh, see Figure 3.21. With reference to Figure 3.21, consider a

ideal mesh patch actual mesh patch

A1A2

A3A4

θ1 θ2

θ3θ4

Figure 3.21. Volumetric and shear distortion indicators for a patch of 4-node quadrilaterals

8D.J. Benson. An efficient, accurate, simple ALE method for nonlinear finite element programs. Comp.

Meth. Appl. Mech. Engr., 72:305–350, (1989).

May 7, 2019 ME280B



typical node in a mesh of 4-node quadrilaterals, and define a scalar measure Rv of volumetric

distortion around this node as

Rv =minI=1−4

(AI)

maxI=1−4

(AI), (3.159)

where, generally, 0 < Rv ≤ 1 and Rv = 1 for the uniform mesh. A scalar measure Rs is

shear distortion is likewise defined as

Rs = minI=1−4

(sin θi) , (3.160)

where, generally, 0 < Rs ≤ 1 and Rs = 1 for a uniform mesh. Note that Rs detects both

acute and obtuse angles. Subsequently, define an empirical non-negative function f(Rv, Rs),

such that f(1, 1) = 0 (optimal value), and a node I is “flagged” to move when f > f , where

f > 0 is a user-specified constant.

The last outstanding task is to decuce a mesh motion. The simplest possible scheme

would place node I in the coordinate-wise arithmetic mean of the locations of its immediately

neighboring nodes. A more sophisticated (and better performing) scheme is based on the so-

called equipotential relaxation originally propounded by A.M. Winslow for triangular meshes

natural domain (ξ, η) physical domain (x, y)

ξ = const ξ = const

η = const η = const

Figure 3.22. Smooth map between equilateral triangles in the natural space (ξ, η) and general

triangles in the physical space (x, y)

used in hydrocodes.9 Winslow argued that, since any triagular element domain can be

mapped uniquely into a regular equilateral triangle (common for all elements of a mesh) in

the space of natural coordinates (ξ, η), a high-quality mesh can be created by the intersections

of equally-spaced lines of “potential” ξ = constant, η = constant, together with a third set

9A.M. Winslow. Numerical solution of the quasilinear Poisson equation in a nonuniform triangle mesh.

J. Comp. Physics, 2:149–172, (1967).

ME280B May 7, 2019



of lines drawn through the intersection points, as in Figure 3.22. The Laplace-Poisson

equation (being an archetypical example of an elliptic partial differential equation) produces

a smooth set of equipotential lines, provided the boundary conditions and force are smooth.

Therefore, generating a high-quality mesh (or, improving the quality of an existing mesh) is

tantamount to solving the Laplace-Poisson equation for the natural coordinates ξ and η in

the physical domain and defining element edges at equipotential lines of ξ and η. The same

logic applies for quadrilateral meshes mapped from square elements in the natural domain,

see Figure 3.23. Equipotential relaxation has been shown to produce excellent meshes for

natural domain (ξ, η) physical domain (x, y)

ξ = const ξ = const

η = constη = const

Figure 3.23. Smooth map between squares in the natural space (ξ, η) and general quadrilaterals

in the physical space (x, y)

relatively complex two-dimensional domains.10

To derive the equations used for equipotential relaxation, start by considering the general

two-dimensional mapping from the natural to the physical domain in the form

x = x(ξ, η) , y = y(ξ, η) . (3.161)

Since the mapping is assumed to be invertible, one may also write

ξ = ξ(x, y) , η = η(x, y) (3.162)

where the individual functions x, y, ξ, η are to be determined, see Figure 3.24. The two

Laplace-Poisson problems are now defined by

∇2ξ = 0 , ∇2η = 0 , (3.163)

10J.F. Thompson, F.C. Thames, and C.W. Mastin. Automatic numerical generation of body-fitted curvi-

linear coordinate system for fields containing any number of arbitrary two-dimensional bodies, J. Comput.

Phys., 15:299-319 (1974).

May 7, 2019 ME280B

https://www.sciencedirect.com/science/article/pii/0021999174901144



where ∇2 = ∂2

∂x2 + ∂2

∂y2. At this stage, it is crucial to observe that the boundary conditions

for the equations in (3.163) are given in the physical space (put differently, it is the domain

in the physical space that is known and that needs meshing). Therefore, since boundary

conditions are enforced on the dependent variable of a differential equation, it is necessary

that (3.163) be expressed inversely, that is, to interchange the dependent and independent

variables. This is possible since x, y are invertible.

x

y

x, y

ξ, ηξ

η

1 2

34

1′ 2′

3′4′

Figure 3.24. Smooth map from a square in the natural space (ξ, η) and a general quadrilateral

domain in the physical space (x, y)

To this end, use the chain rule to write for any differentiable function f = f(x, y),

∂f

∂ξ=

∂f

∂x

∂x

∂ξ+∂f

∂y

∂y

∂ξ∂f

∂η=

∂f

∂x

∂x

∂η+∂f

∂y

∂y

∂η

(3.164)

or, in matrix form, [f,ξ

f,η

]=

[x,ξ y,ξ

x,η y,η

][f,x

f,y

]. (3.165)

Therefore, [f,x

f,y

]=

1

j

[y,η −y,ξ−x,η x,ξ

][f,ξ

f,η

], (3.166)

where j = x,ξy,η − x,ηy,ξ and j 6= 0 owing to the invertibility of x, y. Setting f = ξ and

f = η, it follows from the above equation that

ξ,x =1

jy,η , ξ,y = −1

jx,η (3.167)

and

η,x =1

jy,ξ , η,y = −1

jx,ξ . (3.168)

ME280B May 7, 2019


Repeating this process to calculate expressions for ξ,xx, ξ,yy, η,xx, and η,yy, and substituting

these in (3.163) results in the two coupled non-linear partial differential equations

αx,ξξ − 2βx,ξη + γx,ηη = 0 , αy,ξξ − 2βy,ξη + γy,ηη = 0 , (3.169)

where

α = x2,η + y2,η , β = x,ξx,η + y,ξy,η , γ = x2,ξ + y2,ξ (3.170)

and the overbars have been dropped for brevity. With reference to Figure 3.24, the boundary

conditions take the form

x = x(ξ1, η) on 1′ − 4′ , y = y(ξ1, η) on 1′ − 4′ ,

x = x(ξ2, η) on 2′ − 3′ , y = y(ξ2, η) on 2′ − 3′ ,

x = x(ξ, η1) on 1′ − 2′ , y = y(ξ, η1) on 1′ − 2′ ,

x = x(ξ, η2) on 3′ − 4′ , y = y(ξ, η2) on 3′ − 4′ ,

(3.171)

where x, y are assumed to be known on the boundary of the physical domain. The solutions

of the equations (3.169) provide directly the placements of points (x, y) at the intersection

of equipotential curves ξ = constant and η = constant.

Note that the differential equations (3.169) are more complex than the original Laplacians

in (3.163). However, the domain and boundary conditions of (3.169) are considerably simpler

than those of the original Laplacians.

In practice, equations (3.169) are solved approximately by a finite difference method using

an iterative technique. For example, consider the case of 3-node triangles with equally-spaced

equipotential lines of unit spacing in the natural domain, as in Figure 3.25. Here, it is easy to

ξր, η =const.

ξ =const., ηր

0

12

3

4

5

6

8

Figure 3.25. Finite difference stencil for linear triangles with unit spacing in the natural domain

derive the placement (x0, y0) of the center node in terms of the placement of the neighboring

May 7, 2019 ME280B


nodes according to

x,ξ|0.=

1

6[(x1 + 2x6 + x5)− (x2 + 2x3 + x4)] ,

x,η|0.=

1

6[(x2 + 2x1 + x6)− (x3 + 2x4 + x5)] ,

x,ξξ|0.= x6 − 2x0 + x3 ,

x,ηη|0.= x1 − 2x0 + x4 ,

x,ξη|0.=

1

2[(x1 + x6 + x3 + x4)− (x2 + x5 + 2x0)] .

(3.172)

with corresponding equations for y,ξ, y,η, y,ξξ, y,ηη, and y,ξη. To deduce, say, (3.172)1, use

the divergence theorem to write for the shaded region Ω7 of Figure 3.25

∫

Ω7

x,ξ dv2 =

∫

∂Ω7

xnξ da2 . (3.173)

Recalling that all triangles are equilateral with unit side (hence, the area of each triangle is

equal to 1/√3), the above equation implies that, by finite difference approximation,

61√3x,ξ

∣∣∣∣0

.=

x5 + x62

2√3+x6 + x1

2

2√3+ 0− x2 + x3

2

2√3− x3 + x4

2

2√3+ 0 , (3.174)

which reduces to (3.172)1. An even simpler formula for x,ξ|0 may be obtained by considering

the divergence theorem only along the ξ-axis, in which case

x,ξ|0.=

1

2(x6 − x3) . (3.175)

Similar derivations apply to the remaining formulae in (3.172).

ξր, η =const.

ξ =const., ηր

0

123

4

5 6 7

8

Figure 3.26. Finite difference stencil for bilinear quadrilaterals with unit spacing in the natural

domain

ME280B May 7, 2019


For the case of bilinear quadrilateral elements, as in Figure 3.26, use of the divergence

theorem for the shared region ΩBox yields

∫

Ω2

x,ξ dv2 =

∫

∂Ω

xnξ da2 , (3.176)

or

4 x,ξ|0.=

x7 + x82

+x8 + x1

2+ 0 + 0− x3 + x4

2− x4 + x5

2− 0− 0 , (3.177)

hence,

x,ξ|0.=

1

8[(x7 + 2x8 + x1)− (x3 + 2x4 + x5)] , (3.178)

with analogously derived formulae for the remaining derivatives. Again, an even simpler

approximation results from one-dimensional finite differencing, so that

x,ξ|0.=

1

2(x8 − x4)

x,η|0.=

1

2(x2 − x6) ,

x,ξξ|0.= x8 − 2x0 + x4 ,

x,ηη|0.= x2 − 2x0 + x6 ,

x,ξη|0.=

1

4(x1 + x5 − x7 − x3) ,

(3.179)

with corresponding formulae for the derivatives of y.

Upon substituting either (3.172) or (3.179) into the equations in (3.169), one obtains a

non-linear algebraic system of two equations for (x0, y0). In practical computations, Poisson

smoothing is performed using a Gauss-Seidel-like iterative procedure, where the placement

of the nodes is updated successively from one node to the next until all nodes are exhausted.

Several rounds of Poisson smoothing are typically performed until convergence of the node

placements in an appropriate measure.

A variational interpretation of Poisson smoothing is possible and provides useful in-

sights.11 Indeed, it can be easily shown that the solution of (3.163) is attained by the

minimization of the functional

Is =

∫

Ω

(|∇ξ|2 + |∇n|2

)dv . (3.180)

11J.U. Brackbill and J.S. Saltzman. Adaptive zoning for singular problems in two dimensions. J. Comp.

Physics, 46:342–368, (1982).

May 7, 2019 ME280B



In this sense, the functional Is measures the variation in mesh spacing along equipotential

curves. Within the variational setting, it is now possible to introduce additional functionals

Io and Iv, as

Io =

∫

Ω

(∇ξ · ∇η)2 dv , Iv =

∫

Ω

wjdV , (3.181)

where w is a given positive function. These measure orthogonality and (weighted) volume

variation, respectively. The minimization of a combined functional of the form I = Is +

c1Io+ c2Iv, where c1, c2 ≥ 0, may be used to simultaneously control mesh quality in all three

measures. In addition, mesh grading (e.g., when one needs to refine the mesh in a specified

area) may be introduced by solving

∇2ξ = f1 , ∇2η = f2 , (3.182)

where the sources f1, f2 are appropriately chosen to effect mesh refinemenet, as needed.

3.7 Eulerian Methods

In computational mechanics, Eulerian finite element methods are used almost exclusively

in applications of fluid dynamics. The weak forms of mass and linear momentum balance

can be readily derived from their ALE counterparts (3.135) and (3.142) by setting the mesh

velocity vM to zero and reinterpreting all mesh time derivatives as partial derivatives with

respect to time at a fixed point in space. This leads to the statements

∫

R

σ

(∂ρ

∂t+∂ρ

∂x· v + ρ div v

)dv = 0 , (3.183)

and

∫

R

ξ · ρ(∂v

∂t+∂v

∂xv

)dv +

∫

R

∂ξ

∂x· T dv =

∫

R

ξ · ρb dv +

∫

Γq

ξ · t da , (3.184)

respectively, with ρ = ρ(x, t) and v = v(x, t), and with admissible spaces and weights defined

accordingly.

The interpolation of velocity and density fields follows (3.143) and (3.144), whereN ei = N e

i (x)

are the element interpolation functions.

A specific application of Eulerian finite elements for incompressible Newtonian fluids is

presented in Section 5.1.

ME280B May 7, 2019

Chapter 4

Solution of Non-linear Field Equations

4.1 Generalities

Recall the semi-discrete form of linear momentum balance (3.61) obtained from a Lagrangian

finite element method or a corresponding form resulting from ALE or Eulerian methods.

With the use of an implicit time integrator, the above system of non-linear ordinary dif-

ferential equations in time gives rise to a system of non-linear algebraic equations for un+1

(typically, in solids) or vn+1 (typically, in fluids). In ALE or Eulerian methods, an additional

system of equations is obtained by imposing conservation of mass and ρn+1 is added to the

list of unknowns to be determined.

By way of background, consider a system of non-linear algebraic equations expressed in

column vector form as

[f(x)] =

f1(x)

f2(x)...

fN(x)

= [0] , (4.1)

where x is a vector in RN (not a position vector!) and f : Ω ⊂ R

N → RN is a non-linear

mapping, in the sense that given any real α, β and any points x1, x2 in Ω, then

f(αx1 + βx2) 6= αf(x1) + βg(x2) . (4.2)

In the interest of simplicity, no brackets are included henceforth to denote general vector

equations, therefore (4.1) would be written simply as

f(x) = 0 . (4.3)

85

86 Solution of Non-linear Field Equations

Next, consider a continuous mapping g : Ω ⊂ RN → R

N , such that x → g(x). A point

x∗ ∈ Ω is called a fixed point of g if

x∗ = g(x∗) . (4.4)

The solution of the system of non-linear algebraic equations in (4.3) can be trivially

reduced to a problem of finding the fixed points of a function, since (4.3) is equivalent to

x = x− cf(x) = g(x) , (4.5)

for any constant c 6= 0. Here, recall that the function g is continuous at x0 ∈ Ω if, given

any ε > 0, there exists a δ = δ(ε) > 0 such that

‖g(x)− g(x0)‖ < ε , (4.6)

for all x ∈ B(x0, δ) = x ∈ Ω | ‖x − x0‖ < δ, where, geometrically, B(x0, δ) is an N -

dimensional ball of radius δ centered at x0, as in Figure 4.3.

x0

x

δ

Figure 4.1. Ball of radius δ centered at x0

Let g be differentiable and its domain Ω be a convex subset of RN , which means that for

any x, y in Ω, the point (1− ω)x+ ωy is also in Ω if 0 < ω < 1, as in Figure 4.2. One may

x y

convex non-convex

Figure 4.2. Convex and non-convex sets

now invoke the mean-value theorem in RN to write for any points x, y in Ω,

‖g(x)− g(y)‖ ≤ supz∈[x,y]

‖J(z)‖‖x− y‖ , (4.7)

ME280B May 7, 2019

Generalities 87

where z is on the line segment connecting x and y, that is, z = x+ω(y−x), 0 ≤ ω ≤ 1, and

Jij(x) =∂gi(x)

∂xj(4.8)

is the Jacobian matrix of g.

Note that any vector norm can be used in the mean-value theorem, such as, for example,

the classical Euclidean 2-norm, defined as

‖x‖2 =(x21 + x22 + · · ·+ x2N

) 12 . (4.9)

Also, recall that all norms in RN are equivalent, in the sense that, given any two such norms

‖ · ‖a, ‖ · ‖b, there exist positive scalar c1 and c2, such that for any x ∈ Ω,

c1‖x‖a ≤ ‖x‖b ≤ c2‖x‖a . (4.10)

Lastly, the natural matrix norm associated with a given vector norm is defined for any given

matrix A (viewed as a linear mapping of a vector x ∈ RN to another vector Ax ∈ R

N) as

‖A‖ = sup‖x‖6=0

‖Ax‖‖x‖ = sup

‖x‖=1

‖Ax‖ . (4.11)

It is noteworthy that the classical equality form of the mean-value theorem in R, that is,

|g(x)− g(y)| = |g′(z)||x− y| , (4.12)

for at least one point z ∈ [x,y], does not generalize to RN .

Returning to (4.7), it is observed that if there is a positive constant λ, such that ‖J(x)‖ ≤λ < 1 for all x ∈ Ω, then clearly

‖g(x)− g(y)‖ ≤ λ‖x− y‖ < < ‖x− y‖ . (4.13)

Under the above condition, the operator g is termed contractive (or a contraction), in the

sense that its application to x − y strictly reduces the distance between the two points

measured by the vector norm. The following theorem provides a useful connection between

contractions and the existence of a solution to (4.3).

Theorem 4.1. (Contraction mapping) Let g map a closed Ω ⊂ RN onto itself, and be

contractive on Ω with constant λ. It follows that:

(a) there exists a unique fixed point x∗ in Ω, and

May 7, 2019 ME280B


(b) the fixed point x∗ can be determined by a fixed-point iteration, that is from

x(k+1) = g(x(k)) , k = 0, 1, 2, · · · , (4.14)

starting from any point x(0) in Ω.

Proof. Start with part (b) and note that, in view of (4.13)1,

‖x(k+1) − x(k)‖ = ‖g(x(k))− g(x(k−1))‖ ≤ λ‖x(k) − x(k−1)‖· · ·≤ λk‖x(1) − x(0)‖ = λk‖g(x(0))− x(0)‖ .

(4.15)

Then, for any positive integers m,n, with m > n, (4.15) implies that

‖x(m) − x(n)‖ = ‖x(m) − x(m−1) + x(m−1) − · · · − x(n+1) + x(n+1) − x(n)‖≤ ‖x(m) − x(m−1)‖+ ‖x(m−1) − x(m−2)‖+ · · ·+ ‖x(n+1) − x(n)‖≤ (λm−1 + λm−2 + · · ·+ λn)‖x(1) − x(0)‖= λn(λm−n−1 + λm−n−1 + · · ·+ 1)‖x(1) − x(0)‖

= λn1− λm−n

1− λ ‖x(1) − x(0)‖ ≤ λn

1− λ‖x(1) − x(0)‖ . (4.16)

Hence,

limm,n→∞

‖x(m) − x(n)‖ ≤ limn→∞

λn

1− λ‖x(1) − x(0)‖ = 0 . (4.17)

Thus, x(n) is a Cauchy sequence and since Rn is complete and Ω is closed, it follows that

limn→∞ x(n) = x∗ ∈ Ω.

Returning to part (a), write for the point x∗ obtained above

‖x∗ − g(x∗)‖ = ‖x∗ − g(x(n)) + g(x(n))− g(x∗)‖≤ ‖x∗ − g(x(n))‖+ ‖g(x(n))− g(x∗)‖≤ ‖x∗ − x(n+1)‖+ λ‖x(n) − x∗‖ . (4.18)

Taking the limit as n goes to infinity of both sides of (4.18),

limn→∞

‖x∗ − g(x∗)‖ ≤ limn→∞

‖x∗ − x(n+1)‖+ λ limn→∞

‖x(n) − x∗‖ = 0 , (4.19)

which implies that x∗ is a fixed point.

ME280B May 7, 2019

Newton’s Method and its Variants 89

To show that x∗ is unique, assume, by contradiction, that there exists a second fixed

point y∗ ∈ Ω, such that y∗ 6= x∗. Then,

‖x∗ − y∗‖ = ‖g(x∗)− g(y∗)‖ ≤ λ‖x∗ − y∗‖ < ‖x∗ − y∗‖ (4.20)

which is a contradiction, therefore x∗ = y∗.

If there exists no guarantee that g maps the closed set Ω onto itself, the following theorem

is useful.

Theorem 4.2. Let g : Ω→ RN be contractive on Ω, and assume that x(0) ∈ Ω is such that

S =

x | ‖x− g(x(0))‖ ≤ λ

1− λ‖g(x(0))− x(0)‖

⊂ Ω , (4.21)

where λ is the contraction constant. Then, the fixed-point iteration x(k+1) = g(x(k)) converges

to the unique fixed point x∗ ∈ S.

Proof. For any x ∈ S,

‖g(x)− g(x(0))‖ ≤ λ‖x− x(0)‖ = λ‖x− g(x(0)) + g(x(0))− x(0)‖≤ λ

[‖x− g(x(0))‖+ ‖g(x(0))− x(0)‖

]

≤[

λ2

1− λ + λ

]‖g(x(0))− x(0)‖ =

λ

1− λ‖g(x(0))− x(0)‖ , (4.22)

therefore g(x) ∈ S, thus g maps the closed set S onto itself. Thus, the contraction mapping

x g(x)

ΩΩ

SS

Figure 4.3. Contraction mapping in S

theorem is applicable and yields the desired result.

The preceding analysis can be carried out in the broader framework of Banach spaces

(that is, complete normed linear spaces). However, this is not necessary here since the sub-

sequent development pertains strictly to finite-dimensional spaces (which are, by necessity,

complete).

May 7, 2019 ME280B

90 4.2. THE NEWTON-RAPHSON METHOD AND ITS VARIANTS

4.2 The Newton-Raphson Method and its Variants

The Newton-Raphson method is one of the most widely used iterative methods for solving

non-linear algebraic equations. Its origins may be traced at least as far back as the first

century AD work of Heron of Alexandria, who was clearly aware of the iterative formula

x(k+1) = 12

(x(n) + a

x(k)

)for estimating the square root of a positive real a starting from any

positive guess x(0). It is now widely accepted that the classical Newton-Raphson formula

is due to Raphson, although Newton had earlier employed a related, albeit less general,

formula.1

Recalling (2.13), the idea of the Newton-Raphson method is to approximate f(x) with

its linear part about a point x0 ∈ RN , that is,

f(x) = f(x0) + J(x0)(x− x0)︸︷︷︸L[f ;x−x0]x0

+O(‖x− x0‖2

)︸︷︷︸

ignore

= 0 . (4.23)

In the above equation, J(x0) is the (Frechet or Gateaux) derivative of f at x, with components

Jij(x0) =∂fi(x0)

∂xj, (4.24)

and, similarly, J(x0)(x − x0) is the (Frechet or Gateaux) differential at x0 in the direction

x− x0. Then, one may use the iterative formula

x(k+1) = x(k) −[J(x(k))

]−1f(x(k)) (4.25)

to determine, if possible, the solution x∗ = limk→∞ x(k), starting with x(0) as the initial

guess, and assuming that det [J(x(k))] 6= 0 for all iterates k. Equation (4.25) is the classical

Newton-Raphson formula, see Figure 4.4 for a geometric interpretation.

x

x∗

x(k)x(k+1)

f

f(x)

f(x(k))

f ′(x(k)).=

f(x(k))− 0

x(k) − x(k+1)

Figure 4.4. One-dimensional geometric interpretation of the Newton-Raphson method

1For a detailed historical account, see F. Cajori, Historical Note on the Newton-Raphson Method of

Approximation, Amer. Math. Monthly, 18:29–32, (1911).

ME280B May 7, 2019

https://archive.org/details/jstor-2973939

https://archive.org/details/jstor-2973939


The three major drawbacks of the Newton-Raphson method are that (a) the deriva-

tive J(x) must be computed explicitly, which is occasionally difficult to do analytically

(although either automatic or symbolic differentiation software may be used to free the pro-

grammer from this task), (b) the condition det [J(x(k))] 6= 0 for all k = 0, 1, 2, · · · is essentialand its violation (or near-violation) results in failure of the method, see Figure 4.5, and (c)

xx∗

x(k)

f

f(x)

Figure 4.5. Failure of the Newton-Raphson method due to singular J(x(k))

convergence is not guaranteed unless the initial guess x(0) is sufficiently close to the solution

(a point that will be expounded upon later in this section), see Figure 4.6.

x xx∗ x∗x(k) x(k)x(k+1)x(k+1) x(k+2)

f f

f(x) f(x)

flip-flop divergence

Figure 4.6. Failure of the Newton-Raphson method due to the large distance of x(k) from the

solution x∗

The Newton-Raphson method can be interpreted as a fixed-point iteration (4.7), where

g(x) = x− [J(x)]−1 f(x) . (4.26)

As an alternative to the Newton-Raphson method, one may define the so-called modified

Newton method, where

g(x) = x−[J(x(0))

]−1f(x) , (4.27)

so that now

x(k+1) = x(k) −[J(x(0))

]−1f(x(k)) , k = 0, 1, 2, · · · (4.28)

May 7, 2019 ME280B


x

x∗

x(0)x(1)x(2)

f

f(x)

f(x(0))

Figure 4.7. One-dimensional geometric interpretation of the modified Newton-Raphson method

and J(x(0)) needs to be factorized only once, see Figure 4.7.

The generalized Newton method, where

g(x) = x− [A(x)]−1 f(x) , (4.29)

where A(x) : Ω 7→ RN is an invertible matrix which is typically an approximation to J(x).

The generalized Newton method encompasses a wide class of iterative methods for the so-

lution of (4.3), including the Newton-Raphson, the modified Newton method, as well as all

approximations to Newton-Raphson resulting from numerical differentiation of f .

The preceding contraction mapping theorem may be used to study the convergence of

the Newton-Raphson method. Indeed, in light of (4.26), if the resulting g(x) maps some

closed region Ω to itself and is contractive, then the Newton-Raphson method convergences

to the unique solution x∗ ∈ Ω starting from any point x(0) ∈ Ω. A sharper, and, therefore,

more interesting result, is as follows:2

Theorem 4.3. Let a non-linear operator f : Ω ⊂ RN → R

N be continuously differentiable,

and assume that there exists a point x∗ ∈ Ω, such that f(x∗) = 0 and detJ(x∗) 6= 0.

Then, there is a ρ > 0 such that for all initial guesses x(0) ∈ B(x∗; ρ), the Newton-Raphson

iteration (4.25) is well-defined and converges to x∗.

Proof. Since detJ(x∗) 6= 0, one may find an M > 0, such that

‖ [J(x∗)]−1 ‖ < M . (4.30)

In addition, since J(x) is continuous, then, by the inverse function theorem, so is [J(x∗)]−1

in a neighborhood of x∗. Then, there is a δ > 0, such that

‖ [J(x)]−1 − [J(x∗)]−1 ‖ ≤ M − ‖ [J(x∗)]−1 ‖︸︷︷︸>0

, (4.31)

2See, e.g., Section 8.1.10 in J.M. Ortega, Numerical Analysis: a Second Course, SIAM, Philadelphia,

1990.

ME280B May 7, 2019

https://epubs.siam.org/doi/book/10.1137/1.9781611971323


for all x ∈ B(x∗; δ). This, in turn, implies that

‖ [J(x)]−1 ‖ ≤ ‖ [J(x∗)]−1 ‖+M − ‖ [J(x∗)]−1 ‖ = M , (4.32)

that is, [J(x)]−1 is bounded in the ball B(x∗; δ) = x ∈ Ω | ‖x− x∗‖ ≤ δ of radius δ centeredat x∗. Then, for any x(k) ∈ B(x∗; δ) it is concluded that

x(k+1) − x∗ = x(k) −[J(x(k))

]−1f(x(k))− x∗

=[J(x(k))

]−1 J(x(k))[x(k) − x∗]− f(x(k))

. (4.33)

Since J(x) is continuous in B(x∗; δ), one may invoke the fundamental theorem of calculus3

to conclude that

f(x(k)) = f(x∗) +

[∫ 1

0

J(x∗ + ω(x(k) − x∗)

)dω

](x(k) − x∗)

=

[∫ 1

0

J(x∗ + ω(x(k) − x∗)

)dω

](x(k) − x∗) . (4.34)

Substituting the above equation in (4.33) yields

x(k+1) − x∗ =[J(x(k))

]−1∫ 1

0

[J(x(k))− J

(x∗ + ω(x(k) − x∗)

)]dω(x(k) − x∗) . (4.35)

Taking norms of both sides of (4.35) and using standard norm properties, it follows that

‖x(k+1) − x∗‖ ≤ ‖[J(x(k))

]−1 ‖[∫ 1

0

‖J(x(k))− J(x∗ + ω(x(k) − x∗)

)‖dω

]‖x(k) − x∗‖ .

(4.36)

Now, since J(x) is continuous in B(x∗; δ), there exists a ρ, 0 < ρ ≤ δ, such that, for

any x(k) ∈ B(x∗; ρ) and all ω, 0 ≤ ω ≤ 1,

‖J(x(k))− J(x∗ + ω(x(k) − x∗)

)‖ ≤ 1

2M, (4.37)

as in Figure 4.8. Indeed, taking advantage of the aforementioned continuity of J(x), one

may always find a ρ, 0 < ρ ≤ δ, such that

‖J(x(k))− J(x∗ + ω(x(k) − x∗)

)‖ = ‖J(x(k))− J(x∗) + J(x∗)− J

(x∗ + ω(x(k) − x∗)

)‖

≤ ‖J(x(k))− J(x∗)‖+ ‖J(x∗)− J(x∗ + ω(x(k) − x∗)

)‖

≤ 1

4M+

1

4M=

1

2M. (4.38)

3The one-dimensional version of the fundamental theorem of calculus, f(b) = f(a) +∫a

b

df

dxdx readily

generalizes in RN to f(b) = f(a) +

∫ 1

0df(a+ω(b−a))d(a+ω(b−a))

d(a+ω(b−a))dω

= f(a) +[∫ 1

0J (a+ ω(b− a)) dω

](b− a).

May 7, 2019 ME280B


x∗

x(k)ρ

δ

Figure 4.8. Balls B(x∗; ρ) and B(x∗; δ) around the solution x∗

Taking advantage of (4.30) and (4.37), it follows from (4.36) that

‖x(k+1) − x∗‖ ≤ M1

2M‖x(k) − x∗‖ =

1

2‖x(k) − x∗‖ . (4.39)

Note that clearly x(k+1) ∈ B(x∗; ρ) and a standard argument of induction leads to

limk→∞‖x(k+1) − x∗‖ = 0 , (4.40)

therefore limk→∞ x(k+1) = x∗.

An additional condition may be included with the original assumptions of the preceding

theorem, whereby it is assumed that there exists a constant λ > 0 such that

‖J(x)− J(y)‖ ≤ λ‖x− y‖ , (4.41)

for all x,y ∈ B(x∗; ρ). Under this condition, J is called Lipschitz continuous in B(x∗; ρ)

with Lipschitz constant λ. Under the additional assumption of Lipschitz continuity for J,

one may start again (4.36) and use (4.30) to conclude that

‖x(k+1) − x∗‖ ≤ ‖[J(x(k))

]−1 ‖∫ 1

0

‖J(x(k))− J(x∗ + ω(x(k) − x∗)

)‖dω

‖x(k) − x∗‖

≤ M

∫ 1

0

λ(1− ω)‖x(k) − x∗‖dω‖x(k) − x∗‖

= Mλ

[ω − ω2

2

]1

0

‖x(k) − x∗‖2 =1

2λM‖x(k) − x∗‖2 . (4.42)

This implies that the Newton-Raphson method converges quadratically in the sense that

‖x(k+1) − x∗‖‖x(k) − x∗‖2 ≤ 1

2λM = constant . (4.43)

ME280B May 7, 2019


The scalar ρ > 0 is referred to as the radius of attraction for the Newton-Raphson

method. Clearly, convergence to x∗ is not guaranteed when starting from x(0) /∈ B(x∗; ρ).

This, in effect, limits the step-size when using the Newton-Raphson method. The radius

of attraction depends on the spatial variation of J around the solution x∗, as depicted in

Figure 4.9. Note that, contrary to statements in some texts, quadratic convergence does

fxfx

x xx∗ x∗

f f

“large” ρ “small” ρ

Figure 4.9. Functions with different radii of attractions for the Newton-Raphson method

not require differentiability of J(x),x ∈ B(x∗; ρ). Indeed, Lipschitz continuity is a stronger

condition than continuity but weaker than differentiability.

The above theorem is local in nature, in the sense that it assumes that the solution x∗

exists, and subsequently asserts that there exists a neighborhood B(x∗; ρ) around x∗, such

that starting with any point x(0) in it, the Newton-Raphson sequence will converge to x∗.

This theorem does not allow for any computable error estimate for the quantity ‖x(k)−x∗‖.A semi-local result (that is, one for in which the existence of a solution x∗ is not assumed

at the outset) is the celebrated Newton-Kantorovich theorem:4

Theorem 4.4. Let a non-linear operator f : RN → RN be differentiable in a convex set

Ω ⊂ RN and its derivative J be Lipschitz continuous with constant λ in Ω. Suppose that

there is a point x(0) ∈ Ω, such that

‖[J(x(0))

]−1 ‖ ≤M , ‖[J(x(0))

]−1f(x(0))‖ ≤ η (4.44)

and

α = ληM ≤ 1

2. (4.45)

Also, assume that B(x(0); ρ∗) ⊂ Ω where

ρ∗ =1

λM

[1− (1− 2α)

12

], (4.46)

4See p. 421 in J.M. Ortega and W. Rheinboldt, Iterative Solution of Nonlinear Equations in Several

Variables, Academic Press, New York, 1970.

May 7, 2019 ME280B


see Figure 4.10.

x(0)

ρ∗x∗

Ω

Figure 4.10. Convergence under the Newton-Kantorovich conditions

Then, the Newton-Raphson iteration is well-defined and converges to the unique solu-

tion x∗ of f(x) = 0 in B(x(0); ρ∗). In addition, an error estimate is obtained in the form

‖x∗ − x(k)‖ ≤ (2α)2k

2kλM. (4.47)

Even sharper error estimates can be obtained for ‖x∗ − x(k)‖ by further assuming that

the second derivative of f exists everywhere and is bounded.5

4.3 The Newton-Raphson Method in Non-linear Finite

Elements

The Newton-Raphson method enjoys wide use in non-linear computational mechanics. To

demonstrate its application, consider first the system of ordinary differential equations (3.61)

emanating from the spatial discretization of a Lagrangian finite element method and the cor-

responding algebraic system (3.72) resulting from the application of discrete time integrator.

The latter may be put in the form

f(un+1) =1

β∆t2nMun+1 +R(un+1)−M

(un + vn∆tn)

1

β∆t2n+

1− 2β

2βan

− Fn+1 = 0

(4.48)

To find un+1 using the Newton-Raphson method, start with the initial guess u(0)n+1 = un,

that is, the displacement field of the converged solution at t = tn, and, recalling (4.23), write

0 = f(u(k+1)n+1 )

.= f(u

(k)n+1) +Df(u

(k)n+1)

(u(k+1)n+1 − u

(k)n+1

), (4.49)

5See p. 708 in L.V. Kantorovich and G.P.Akilov, Functional Analysis, Pergamon Press, New York, 1982.

ME280B May 7, 2019

The Newton-Raphson Method in Non-linear Finite Elements 97

where ∆u(k)n+1 = u

(k+1)n+1 − u

(k)n+1 is the (yet unknown) increment of the displacement vector

from iteration (k) to iteration (k + 1). Then,

u(k+1)n+1 = u

(k)n+1 −

[Df(u

(k)n+1)

]−1

f(u(k)n+1) , k = 0, 1, 2, ... . (4.50)

Clearly, the term Df(u(k)n+1)

(u(k+1)n+1 − u

(k)n+1

)is the differential (Frechet or Gateaux) of the

vector f at u(k)n+1 in the direction ∆u

(k)n+1. In order for the Newton-Raphson method to be

employed, the derivative Df(u(k)n+1) must be determined explicitly. This can be done at the

level of the full post-assembly algebraic system, or more conveniently, at the element level

using the original weak forms of the balance laws. To explore the latter option, recall the

discrete linear momentum balance equations (3.59) derived from (3.42), as restricted to the

element e with domain Ωe0, that is,

∫

Ωe0

ξn+1 · ρ0an+1 dV +

∫

Ωe0

∂ξn+1

∂X·Pn+1 dV

=

∫

Ωe0

ξn+1 · ρ0bn+1 dV +

∫

∂Ωe0∩Γq0

ξn+1 · pn+1 dA+

∫

∂Ωe0\Γq0

ξn+1 · pn+1 dA . (4.51)

Next, proceed with obtaining the differential in the direction ∆un+1 of each of the terms in

the above weak form, except for the last one which will not appear in the global balance

statements. For the acceleration term, one may write

D

[∫

Ωe0

ξn+1 · ρ0an+1 dV

](un+1,∆un+1) =

[d

dω

∫

Ωe0

ξn+1 · ρ0 ¨(un+1 + ω∆un+1) dV

]

ω=0

=

∫

Ωe0

ξn+1 · ρ0∆un+1 dV . (4.52)

Assuming that the body forces are known and independent of u, and also that the prescribed

tractions are deformation-independent (that is, they are dead loads), then clearly

D

[∫

Ωe0

ξn+1 · ρ0bn+1 dV +

∫

∂Ωe0∩Γq0

ξn+1 · pn+1 dA

](un+1,∆un+1) = 0 . (4.53)

An additional important assumption made above is that Γq0 is itself independent of u, that

is, the characterization of traction boundaries does not depend on the solution.

Alternatively, it is possible to have deformation-dependent tractions. For example, a fol-

lower load of the form (tn+1 = −pnn+1) may be imposed on Γq0 , so that, upon recalling (1.12)

and (1.75),

pn+1 = −pJn+1F−Tn+1N . (4.54)

May 7, 2019 ME280B


nn+1

nn+1

Nn+1

dadA

Figure 4.11. Pressure-type follower traction

where p is a constant pressure, as in Figure 4.11. These traction loads are clearly subject

to non-trivial linearization.

Similarly, it is possible to model body forces which are derived from a potential, such as

gravitational forces which are changing with position, as in Figure 4.12. These also lead to

b

b

Figure 4.12. Body force in a gravitational field

a non-trivial linearization.

Generally, the principal concern in formulating the Newton-Raphson iteration in finite

element methods is to obtain the differential of the stress-divergence term. For the case of

solids, write

D

[∫

Ωe0

∂ξn+1

∂X·Pn+1 dV

](un+1,∆un+1) =

[d

dω

∫

Ωe0

∂ξn+1

∂X·P (un+1 + ω∆un+1) dV

]

ω=0

=

∫

Ωe0

∂ξn+1

∂X·[d

dωP (un+1 + ω∆un+1)

]

ω=0

dV

=

∫

Ωe0

∂ξn+1

∂X·DP (un+1,∆un+1) dV . (4.55)

If the stress depends only on the deformation gradient (as in the case of elastic materials

discussed in Chapter 5), then

DP (un+1,∆un+1) = DP (Fn+1,∆Fn+1) =∂Pn+1

∂Fn+1

∆Fn+1 , (4.56)

ME280B May 7, 2019


where

∆Fn+1 =∂∆un+1

∂X. (4.57)

Thus

D

[∫

Ωe0

∂ξn+1

∂X·Pn+1 dV

](un+1,∆un+1) =

∫

Ωe0

∂ξn+1

∂X· ∂Pn+1

∂Fn+1

∂∆un+1

∂XdV . (4.58)

For materials which exhibit rate-dependence, e.g., when P = P(F, F), the differential

DP (un+1,∆un+1) does not capture the rate-dependence and needs to be augmented to

DP(un+1;vn+1,∆un+1; ∆vn+1) . (4.59)

Upon time discretization, this augmented differential is not necessary because the rate quan-

tities can be expressed in terms of the primary variable u through the use of a discrete time

integrator.

Example 4.3.1: By way of example, consider the simple one-dimensional Kelvin-Voigt solid (spring-dashpot in parallel), as in Figure 4.13. Here,

σ = Eε+ ηε , (4.60)

where ε is the infinitesimal strain, σ is the uniaxial stress, E is the Young’s modulus, and η is theviscosity coefficient. Clearly, at t = tn+1,

σn+1 = Eεn+1 + ηεn+1 , (4.61)

therefore

Dσ(εn+1; εn+1,∆εn+1; ∆εn+1) = E∆εn+1 + η∆εn+1 . (4.62)

Upon integrating in time using the backward Euler method, it follows that

σ σ

E

η

Figure 4.13. The Kelvin-Voigt solid

εn+1.= εn +∆tnεn+1 , (4.63)

May 7, 2019 ME280B


therefore (4.61) becomes

σn+1.= Eεn+1 + η

εn+1 − εn∆tn

. (4.64)

This, in turn, implies that the differential of the stress is

Dσ(εn+1,∆εn+1) =

(E +

η

∆tn

)∆εn+1 , (4.65)

hence the algorithmic tangent modulus equals E + η∆tn

.

It is important to recognize that in determining the differential of P in an algorith-

mic framework, one needs to first account for all approximations to rate-type quantities on

which P depends. In solid mechanics problems, this leads to a functional expression of the

form

Pn+1 = P(Fn+1, · · · |n, · · · ) , (4.66)

where the notation · · · |n is used to denote generic functional dependency on quantities at

t = tn. Sssuming sufficient smoothness of P, this leads to an algorithmically consistent

differentiation of P as

DP(Fn+1,∆Fn+1) =

(∂P

∂F

)

n+1

∆Fn+1 . (4.67)

The quantity

(∂P

∂F

)

n+1

is referred to as the algorithmic tangent modulus at t = tn+1.

Using the above results, one may obtain the linear part of the weak form (4.51) at u(k)n+1

in the direction ∆(k)un+1, as

∫

Ωe0

ξn+1 · ρ0a(k)n+1 dV +

∫

Ωe0

∂ξn+1

∂X·P(k)

n+1 dV −∫

Ωe0

ξn+1 · ρ0b dV

−∫

∂Ωe0∩Γq0

ξn+1 · pn+1 dA−∫

∂Ωe0\Γq0

ξn+1 · p(k)n+1 dA

+

∫

Ωe0

ξn+1 · ρ0∆u(k)n+1 dV +

∫

Ωe0

∂ξn+1

∂X·(∂P

∂F

)(k)

n+1

∂∆u(k)n+1

∂XdV = 0 , (4.68)

where use is made of (4.52), (4.53), and (4.67). Recalling the implicit Newmark for-

mula (3.71), one may write for element e

a(k)n+1 =

1

β∆t2n

(u(k)n+1 − un − vn∆tn

)− 1− 2β

2βan , (4.69)

ME280B May 7, 2019


therefore

∆u(k)n+1 =

1

β∆t2n∆u

(k)n+1 . (4.70)

Substituting (4.70) into (4.69) and introducing the standard element interpolations (3.43)

results in

[ξe

n+1]T[Me][a

e (k)n+1 ] + [Re(u

e (k)n+1)]− [Fe

n+1]− [Fe int (k)n+1 ]

+ [ξe

n+1]T

∫

Ωe0

[Ne]Tρ0

β∆t2n[Ne] dV +

∫

Ωe0

[Be]T

[∂P

∂F

](k)

n+1

[Be] dV

[∆ue (k)

n+1 ] = 0 , (4.71)

where[∂P∂F

](k)n+1

is a 9 × 9 matrix containing the components∂PiA

∂FjB

, consistently with the

convention adopted in (3.46). The quantity inside the large brackets in (4.71) is the element

tangent stiffness matrix Ke (k)n+1 , that is,

Ke (k)n+1 =

∫

Ωe0

[Ne]Tρ0

β∆t2n[Ne] dV

︸︷︷︸inertial stiffness

+

∫

Ωe0

[Be]T

[∂P

∂F

](k)

n+1

[Be] dV

︸︷︷︸material stiffness

. (4.72)

Likewise, the quantity inside the small brackets in (4.71) is the residual vector, which quan-

tifies the imbalance of linear momentum at the (k)-th iteration of the Newton-Raphson

method. Upon assemblying (4.71), the global Newton-Raphson equations take the form(Ma

(k)n+1 +R(u

(k)n+1)− Fn+1

)+K

(k)n+1∆u

(k)n+1 = 0 , k = 0, 1, · · · , (4.73)

where K(k)n+1 =A

eK

e (k)n+1 is the global tangent stiffness matrix.

A similar analysis may be carried out starting from the Eulerian form of linear momentum

balance (3.41) restricted to an element e with domain Ωe, such that

∫

Ωe

ξn+1 · ρn+1an+1 dv +

∫

Ωe

∂ξn+1

∂xn+1

·Tn+1 dv =∫

Ωe

ξn+1 · ρn+1bn+1 dv +

∫

∂Ωe∩Γq

ξn+1 · tn+1 da+

∫

∂Ωe\Γq

ξn+1 · tn+1 da . (4.74)

By appealing to mass conservation, it is immediately seen that

D

[∫

Ωe

ξn+1 · ρn+1an+1 dv

](un+1,∆un+1) =

∫

Ωe

ξn+1 · ρn+1Dan+1(un+1,∆un+1) dv

=

∫

Ωe

ξn+1 · ρn+1∆un+1 dv . (4.75)

May 7, 2019 ME280B


A similar procedure can be employed for the term involving the body forces.

Neglecting non-linearities due to body force or surface traction, proceed next with the

differentiation of the stress-divergence term in (4.74). First, note that, upon using (1.85)

and the chain rule, one may conclude for a solid mechanics problem that

∂ξ

∂x·T =

(∂ξ

∂XF−1

)·(1

JFSFT

)=

1

J

∂ξ

∂X· (FS) . (4.76)

Taking into account the preceding relation, as well as (2.36), the differentiation of the stress-

divergence term may be obtained as

D

[∫

Ωe

∂ξn+1

∂xn+1

·Tn+1 dv

](un+1,∆un+1) (4.77)

= D

[∫

Ωe0

1

Jn+1

∂ξn+1

∂X· (Fn+1Sn+1) Jn+1 dV

](un+1,∆un+1)

= D

[∫

Ωe0

∂ξn+1

∂X· (Fn+1Sn+1) dV

](un+1,∆un+1)

=

∫

Ωe0

∂ξn+1

∂X·[DF (un+1,∆un+1)Sn+1

]dV +

∫

Ωe0

∂ξn+1

∂X·[Fn+1DS (un+1,∆un+1)

]dV

=

∫

Ωe0

∂ξn+1

∂X·(∂∆un+1

∂XSn+1

)dV +

∫

Ωe0

∂ξn+1

∂X·[Fn+1DS (un+1,∆un+1)

]dV . (4.78)

Assuming, in analogy to earlier discussion on P, that

Sn+1 = S(En+1, · · · |n, · · · ) , (4.79)

one may write

DS (En+1,∆En+1) =

(∂S

∂E

)

n+1

∆En+1 . (4.80)

It is now possible to push-forward to the current configuration the first term on the right-

hand side of (4.77) to find that∫

Ωe0

∂ξn+1

∂X·(∂∆un+1

∂XSn+1

)dV =

∫

Ωe0

(∂ξn+1

∂xn+1

Fn+1

)·[∂∆un+1

∂xn+1

Fn+1

(Jn+1F

−1n+1Tn+1F

−Tn+1

)]dV

=

∫

Ωe

∂ξn+1

∂xn+1

·(∂∆un+1

∂xn+1

Tn+1

)dv , (4.81)

where use is again made of (1.85) and the chain rule. This term gives rise to the geometric

stiffness, namely the part of the tangent stiffness matrix which is due to the change of the

geometry in the current configuration for fixed stress Tn+1.

ME280B May 7, 2019


Repeat the above procedure for the second term in the differentiation to conclude that

∫

Ωe0

∂ξn+1

∂X·[Fn+1DS (un+1,∆un+1)

]dV

=

∫

Ωe0

∂ξn+1

∂X·Fn+1

[(∂S

∂E

)

n+1

1

2

(∆FT

n+1Fn+1 + FTn+1∆Fn+1

)]

dV

=

∫

Ωe0

(FT

n+1

∂ξn+1

∂X

)·[(

∂S

∂E

)

n+1

1

2

[(∂∆un+1

∂X

)T

Fn+1 + FTn+1

(∂∆un+1

∂X

)]]dV ,

(4.82)

where use is made of (2.36), (2.41), and the chain rule. Pushing the right-hand side of (4.82)

forward to the current configuration leads to

∫

Ωe0

∂ξn+1

∂X·[Fn+1DS (un+1,∆un+1)

]dV

=

∫

Ωe

1

Jn+1

(FT

n+1

∂ξn+1

∂xn+1

Fn+1

)·[(

∂S

∂E

)

n+1

FTn+1

(∂∆un+1

∂xn+1

)Fn+1

]dv , (4.83)

with the aid of (1.10), the chain rule, and exploiting the symmetry of the fourth-order tensor

∂S

∂Ein its third and fourth legs. This term gives rise to the material stiffness, namely the

part of the tangent stiffness matrix which is due to the non-linearity of the constitutive law

for stress at the fixed geometry of the body in the current configuration. In component

form, one may express the integral in (4.83) as 1J(FiAξi,jFjB)

(SAB

ECDFkC∆uk,lFlD

), where the

subscript “n+ 1” is omitted for brevity.

Now obtain again the linear part of the weak form of linear momentum balance (4.74) at

u(k)n+1 in the direction ∆u

(k)n+1 as

∫

Ωe

ξn+1 · ρ(k)n+1a(k)n+1 dv +

∫

Ωe

∂ξn+1

∂x(k)n+1

·T(k)n+1 dv −

∫

Ωe

ξn+1 · ρ(k)n+1bn+1 dv

−∫

∂Ωe∩Γq

ξn+1 · tn+1 da−∫

∂Ωe\Γq

ξn+1 · tn+1 da

+

∫

Ωe

ξn+1 · ρ(k)n+1∆u(k)n+1 dv +

∫

Ωe

∂ξn+1

∂x(k)n+1

·(∂∆u

(k)n+1

∂x(k)n+1

T(k)n+1

)dv

+

∫

Ωe

1

J(k)n+1

(F

(k)Tn+1

∂ξn+1

∂x(k)n+1

F(k)n+1

)·

(∂S

∂E

)(k)

n+1

F(k)Tn+1

(∂∆u

(k)n+1

∂x(k)n+1

)F

(k)n+1

dv = 0 , (4.84)

May 7, 2019 ME280B


where use is made of (4.81) and (4.83). Admitting the standard element interpolations

and recalling the implicit Newmark formula (3.71), as done earlier in this section for the

referential description, it follows that

[ξe

n+1]T[Me][a

e (k)n+1 ] + [Re(u

e (k)n+1)]− [F

e (k)n+1 ]− [F

e int (k)n+1 ]

+ [ξe

n+1]T

∫

Ωe

[Ne]Tρ(k)n+1

β∆t2n[Ne]dv

︸︷︷︸inertial stiffness

+

∫

Ωe

[Be (k)n+1 ]

T [Ae (k)n+1 ][B

e (k)n+1 ]dv

︸︷︷︸geometric stiffness

+

∫

Ωe

[Be (k)n+1 ]

T [c(k)n+1][B

e (k)n+1 ]dv

︸︷︷︸material stiffness

[∆ue (k)n+1 ] = 0 , (4.85)

where [Ae (k)n+1 ] is a 9× 9 array, such that

∂ξ(k)n+1

∂x(k)n+1

· ∂∆u(k)n+1

∂x(k)n+1

T(k)n+1 = [ξ

e

n+1]T [B

e (k)n+1 ]

T [Ae (k)n+1 ][B

e (k)n+1 ][∆u

e (k)n+1 ] (4.86)

in Ωe, while, in light of (4.83), [c(k)n+1] is a 9×9 array whose components in full tensorial form

are

c(k)ijkl =

1

J (k)F

(k)iA F

(k)jB

∂S(k)AB

∂ECD

F(k)kC F

(k)lD , (4.87)

where the subscript “n + 1” is omitted. The symmetry of T is sufficient to deduce that∂ξ

∂x· ∂∆u

∂xT =

∂∆u

∂x· ∂ξ∂x

T, which implies that the matrix [Ae (k)n+1 ] is necessarily symmetric

provided the element interpolation functions for u and ξ are identical. Indeed, it is easy to

show using the ordering convention in (3.46) that

[Ae] =

T11 0 0 T21 0 0 T31 0 0

0 T22 0 0 T32 0 0 T12 0

0 0 T33 0 0 T13 0 0 T23

T12 0 0 T22 0 0 T32 0 0

0 T23 0 0 T33 0 0 T13 0

0 0 T31 0 0 T11 0 0 T21

T13 0 0 T23 0 0 T33 0 0

0 T21 0 0 T31 0 0 T11 0

0 0 T32 0 0 T12 0 0 T22

. (4.88)

ME280B May 7, 2019


This, in turn, implies that the geometric stiffness is always symmetric. The material stiffness

is symmetric only when the matrix [c] is symmetric, which, according to (4.87) is contingent

on the symmetry of∂S

(k)AB

∂ECD

in both the first two and the last two pairs of legs. Further

dimensional reduction to the arrays used to represent the geometric and material stiffness

may be introduced by exploiting the fact that∂ξ

∂x·T =

(∂ξ

∂x

)s

·T, owing to the symmetry

of T.

Upon assembly of the element-wise equations stemming from (4.85) one readily recovers

a global system analogous to the one in (4.73).

Equation (4.74) is the starting point for the Newton-Raphson method in fluids. In con-

trast with solids, the inertial term is more complex, since

D

[∫

Ωe

ξn+1 · ρn+1an+1 dv

](vn+1,∆vn+1)

=

∫

Ωe

ξn+1 · ρn+1Dan+1(vn+1,∆vn+1) dv

=

∫

Ωe

ξn+1 · ρn+1∆vn+1 dv

=

∫

Ωe

ξn+1 · ρn+1

(∂∆vn+1

∂t+∂∆vn+1

∂xn+1

vn+1 +∂vn+1

∂xn+1

∆vn+1

)dv .

(4.89)

On the other hand, the term emanating from stress-divergence is much simplified compared

to solids and is given by

D

[∫

Ωe

∂ξn+1

∂xn+1

·Tn+1 dv

](vn+1,∆vn+1) =

∫

Ωe

∂ξn+1

∂xn+1

·DT(vn+1,∆vn+1) dv . (4.90)

This is because the Eulerian nature of the analysis implies that spatial derivatives are not

subject to linearization due to changes in the velocity unless the argument of the derivative

depends itself on the velocity. Of course, the weak form of linear momentum balance must

be supplemented by an equation that enforces mass balance in a weak form.

Revisit the Newton-Raphson method in the form of (4.73) and write it succinctly as

f(u(k)n+1) +K(u

(k)n+1)∆u

(k)n+1 = 0 , k = 0, 1, 2, · · · , (4.91)

with initial guess u(0)n+1 = un. The algorithmic steps for the Newton-Raphson method are

outlined in Algorithm 1 below.

May 7, 2019 ME280B


Algorithm 1: Newton-Raphson method

Data: ∆tn, state at time tn, maxit, tol

Set k = 0, u(0)n+1 = un;

while k ≤ maxit

Form residual f(u(k)n+1);

if ‖f(u(k)n+1)‖ < tol then

return u(k)n+1 and rest of state at time tn+1

else

Form and factorize tangent stiffness K(u(k)n+1);

Solve (4.91) for ∆u(k)n+1 and set u

(k+1)n+1 = u

(k)n+1 +∆u

(k)n+1;

Set k ← k + 1;

end

end

The Newton-Raphson iterations are terminated based on a pre-selected stopping criterion.

Such possible criteria include

(a) The residual norm criterion, ‖f(u(k)n+1)‖ < tol, which is shown in Algorithm 1,

(b) The “absolute” error criterion, ‖∆u(k)n+1‖ < tol,

(c) The “relative” error criterion,‖∆u

(k)n+1‖

‖u(k)n+1‖

< tol,

(d) The “energy” norm criterion E(k)n+1 = ∆u

(k)Tn+1 f(u

(k)n+1) < tol. In essence, E(k) is the

work done by the unbalanced forces f(u(k)n+1

)6= 0 going over the displacement incre-

ment ∆u(k)n+1.

The tolerance tol is typically set to be a multiple of the machine epsilon eps, say 10m∗epsfor some positive integer m.

The modified Newton method of (4.28) leads to

f(u(k)n+1) +K(u

(0)n+1)∆u

(k)n+1 = 0 , k = 0, 1, 2, · · · , (4.92)

with initial guess u(0)n+1 = un. This is implemented very similarly to the original Newton-

Raphson method, as shown in Algorithm 2.

ME280B May 7, 2019


Algorithm 2: Modified Newton method


Set k = 0, u(0)n+1 = un;

Form and factorize tangent stiffness K(u(0)n+1);

while k ≤ maxit




else


(k+1)n+1 = u

(k)n+1 +∆u

(k)n+1;

Set k ← k + 1;

end

end

In highly non-linear problems, it may be necessary to perform a line-search to either

accelerate convergence or to prevent divergence of the Newton-Raphson or modified Newton

iteration. Here, one obtains the increment ∆u(k)n+1 by solving (4.91) or (4.92), as usual.

However, the displacement vector update is defined as

u(k+1)n+1 = u

(k)n+1 + ηk∆u

(k)n+1 , (4.93)

where 0 < ηk ≤ 1 is a scalar parameter. The value of ηk is determined so that the scalar

function

g(η) = ∆u(k)Tn+1 f(u

(k)n+1 + η∆u

(k)n+1) (4.94)

satisfies g(ηk) = 0, that is, given u(k)n+1 and ∆u

(k)n+1, the residual f(u

(k+1)n+1 ) is orthogonal to

∆u(k)n+1, as in Figure 4.14.

u(k)n+1

u(k)n+1 +∆u

(k)n+1

u(k)n+1 + ηk∆u

(k)n+1

f(u(k)n+1)

f(u(k)n+1 +∆u

(k)n+1)

f(u(k)n+1 + ηk∆u

(k)n+1)

Figure 4.14. Schematic of line search

May 7, 2019 ME280B


The existence of such an ηk is not guaranteed for any given f , u(k)n+1, and ∆u

(k)n+1. In

practice, one typically seeks to find a ηk, such that

∣∣∣∆u(k)Tn+1 f(u

(k)n+1 + η∆u

(k)n+1)

∣∣∣ ≤ s∣∣∣∆u

(k)Tn+1 f(u

(k)n+1)

∣∣∣ , (4.95)

where s is a user-specified positive parameter less than unity (e.g., s = 0.8). Clearly, line

search involves (relatively inexpensive) residual evaluations for different values of η, until the

above condition is met. A coarse one-dimensional exhaustive search is effective in searching

for an ηk that satisfies (4.95), as shown in Algorithm 3 below.

Algorithm 3: Line search with Newton-Raphson method

Data: u(k)n+1, s, m

Form K(u(k)n+1) and f(u

(k)n+1) and solve (4.91) for ∆u

(k)n+1;

Form f(u(k)n+1 +∆u

(k)n+1);

if∣∣∣∆u

(k)Tn+1 f(u

(k)n+1 +∆u

(k)n+1)

∣∣∣ ≤ s∣∣∣∆u

(k)Tn+1 f(u

(k)n+1)

∣∣∣ thenreturn u

(k+1)n+1 = u

(k)n+1 +∆u

(k)n+1;

else

Set i = 0;

while i ≤ m

Set ηk = (m− i+ 1)/m;

Form f(u(k)n+1 + ηk∆u

(k)n+1);

if∣∣∣∆u

(k)Tn+1 f(u

(k)n+1 + ηk∆u

(k)n+1)

∣∣∣ ≤ s∣∣∣∆u

(k)Tn+1 f(u

(k)n+1)

∣∣∣ thenreturn u

(k+1)n+1 = u

(k)n+1 + ηk∆u

(k)n+1;

else

Set i← i+ 1;

end

end

end

Quasi-Newton methods are iterative schemes with convergence properties generally some-

where between those of the Newton-Raphson and the modified Newton method. The general

idea is to obtain an approximation K(k)

n+1 to the tangent stiffness matrix K(u(k)n+1), which sat-

isfies the condition

K(k)

n+1(u(k)n+1 − u

(k−1)n+1 ) = f(u

(k)n+1)− f(u

(k−1)n+1 ) , (4.96)

ME280B May 7, 2019


which is referred to as the quasi-Newton or secant property. Subsequently, one uses the

approximate stiffness to efficiently solve

K(k)

n+1∆u(k)n+1 = −f(u(k)

n+1) (4.97)

for ∆u(k)n+1 and update the displacement vector as

u(k+1)n+1 = u

(k)n+1 +∆u

(k)n+1 . (4.98)

Note that the quasi-Newton property (4.96) shows that the approximate stiffness K(k)

n+1 is

obtained by backward interpolation from the state at the k-th to the (k − 1)-st iterate, see

also Figure 4.15.

f

uu(k)u(k−1)

f(u

(k) )

f(u

(k−1) )

(dfdu

)(k) .=

f(u(k))−f(u(k−1))u(k)−u(k−1)

Figure 4.15. Schematic of quasi-Newton method in one-dimension

There exist many different ways (each corresponding to a different quasi-Newton method)

for constructing K(k)

n+1. Ideally, one expects of a quasi-Newton method that:

(a) The incremental displacement ∆u(k)n+1 can be obtained efficiently, that is, without hav-

ing to factorize K(k)

n+1. This is possible if K(k)

n+1 is obtained from K(k−1)

n+1 as a low-rank

update. In such a case, one may use the Sherman-Morrison-Woodbury formula, ac-

cording to which, given an invertible n× n matrix K and two n×m matrices G and

H with m < n, then

(K+GHT

)−1= K

−1 −K−1G(Im +HTK

−1G)−1

HTK−1

, (4.99)

where Im stands for the m × m identity matrix. Thus, the cost of finding ∆u(k)n+1 is

very small when, say, m = 1 or 2.

(b) The stiffness matrix update preserves properties such as symmetry and positive-definiteness,

if these are present in the initial stiffness K(0)

n+1.

May 7, 2019 ME280B


Proceeding constructively, assume that K(0)

n+1 is symmetric and one wants to develop a rank-

one quasi-Newton method that preserves symmetry. Then, by necessity,

K(k)

= K(k−1)

+ zzT , (4.100)

where z is a vector to be determined and the subscript n + 1 is ignored henceforth for the

sake of brevity. Since the quasi-Newton property (4.96) needs to be enforced, it follows that

K(k−1)

∆u(k−1) + zzT∆u(k−1) = f(u(k))− f(u(k−1)) , (4.101)

which implies that

z =1

zT∆u(k−1)

[f(u(k))− f(u(k−1))−K

(k−1)∆u(k−1)

](4.102)

and also

∆u(k−1)TK(k−1)

∆u(k−1) +(zT∆u(k−1)

)2=[fT (u(k))− fT (u(k−1))

]∆u(k−1) . (4.103)

The last two equations lead to

K(k)

= K(k−1)

+1

[f(u(k))− f(u(k−1))]T∆u(k−1) −∆u(k−1)TK

(k−1)∆u(k−1)

[f(u(k))− f(u(k−1))−K

(k−1)∆u(k−1)

] [f(u(k))− f(u(k−1))−K

(k−1)∆u(k−1)

]T. (4.104)

Regrettably, this quasi-Newton method does not necessarily preserve definiteness and, in

fact, can become unbounded due to the term in the denominator of the right-hand side.

Unlike the preceding symmetric case, where satisfaction of the quasi-Newton property

uniquely determines the rank-one update, more assumptions are needed in the unsymmetric

rank-one case

K(k)

= K(k−1)

+ z1zT2 , (4.105)

where z1 and z2 are vectors, and z1 6= z2.

Broyden6 further assumed that,

K(k)q(k−1) = K

(k−1)q(k−1) ; q(k−1)T∆u(k−1) = 0 , (4.106)

6C.G. Broyden, A class of methods for solving nonlinear simultaneous equations, Mathematics of Com-

putation, 19:577-593, (1965).

ME280B May 7, 2019

https://www.ams.org/journals/mcom/1965-19-092/S0025-5718-1965-0198670-6/


which asserts that the change in f predicted by K(k)

along a direction orthogonal to ∆u(k−1)

is the same with the change predicted by K(k−1)

(stated differently, K(k)

and K(k−1)

are

identical in the direction normal to ∆u(k−1)). It follows from (4.105) that

K(k)q(k−1) = K

(k−1)q(k−1) + z1z

T2 q

(k−1) , (4.107)

so that, in view of (4.106),

z2 = ∆u(k−1) , (4.108)

to within a scalar multiplier. Furthermore, the quasi-Newton property (4.96) necessitates

that [K

(k−1)+ z1∆u(k−1)T

]∆u(k−1) = f(u(k))− f(u(k−1)) , (4.109)

which, in turn, implies that

z1 =1

∆u(k−1)T∆u(k−1)

[f(u(k))− f(u(k−1))−K

(k−1)∆u(k−1)

]. (4.110)

A quasi-Newton method may be implemented as shown in Algorithm 4.

Algorithm 4: Quasi-Newton method


Set u(0)n+1 = un;

Form K(u(0)n+1) and f(u

(0)n+1) and solve (4.91) for ∆u

(0)n+1;

Set u(1)n+1 = u

(0)n+1 +∆u

(0)n+1;

while k ≤ maxit




else

Form and factorize K(k)

n+1;


(k+1)n+1 = u

(k)n+1 +∆u

(k)n+1;

Set k ← k + 1;

end

end

The BFGS method (named after Broyden-Fletcher-Goldfard-Shanno) is a quasi-Newton

method intended for problems that lead to symmetric tangent stiffness matrix. The method

makes use of two rank-one updates, such that

K(k)

= K(k−1)

+ z1zT1 + z2z

T2 , (4.111)

May 7, 2019 ME280B


where z1 and z2 are vectors with z1 6= z2. The BFGS method is employed as follows:

(i) Assuming K(k−1)

(symmetric) and u(k−1) assumed known, determine the increment

∆u(k−1) from the standard quasi-Newton formula (4.97) as

K(k−1)

∆u(k−1) + f(u(k−1)) = 0 . (4.112)

(ii) Perform a line search along u(k−1), that is determine ηk−1, 0 < ηk−1 ≤ 1, such that

u(k) = u(k−1) + ηk−1∆u(k−1) . (4.113)

(iii) Update the stiffness matrix consistently with a quasi-Newton property (4.96), where,

in particular,

K(k)

= K(k−1)

+

[f(u(k))− f(u(k−1))

] [f(u(k))− f(u(k−1))

]T

ηk−1 [f(u(k))− f(u(k−1))]T∆u(k−1)

−

[K

(k−1)∆u(k−1)

] [K

(k−1)∆u(k−1)

]T

∆u(k−1)TK(k−1)

∆u(k−1). (4.114)

The update formula (4.114) is derived from (4.111) by imposing the quasi-Newton

property (4.96) and also setting z1 to be proportional to the change in the force im-

balance f(u(k))− f(u(k−1)). It can be easily shown that if K(k−1)

is positive-definite,

then K(k)

may or may not be positive-definite but is at least positive semi-definite.

The inverse of the stiffness matrix in (4.114) may be obtained in closed form by two

successive applications of the formula in (4.99). Special case is typically needed to

ensure that the terms in (4.114) are computed in a manner than minimizes errors

associated with algebraic operations involving small numbers.

The BFGS method is implemented as shown in Algorithm 5.

ME280B May 7, 2019

Continuation Methods 113

Algorithm 5: BFGS method

Data: ∆tn, state at time tn, maxit, tol, s

Set u(0)n+1 = un;

Form K(u(0)n+1) and f(u

(0)n+1) and solve (4.91) for ∆u

(0)n+1;

Perform line search according to Algorithm 3 for given s and set

u(1)n+1 = u

(0)n+1 + η0∆u

(0)n+1;

while k ≤ maxit




else

Form K(k)

n+1 as in (4.114) and factorize;

Solve (4.97) for ∆u(k)n+1;

Perform line search and set u(k+1)n+1 = u

(k)n+1 + ηk∆u

(k)n+1;

Set k ← k + 1;

end

end

4.4 Continuation Methods

Recall the discrete equations of motion for solids, in the form of equation (3.72), and consider

a special case, in which inertial effects may be neglected (rendering the problem quasi-static)

and the external loading is proportional with proportionality factor λ = λ(s). Here, s may

be thought of an “implicit” measure of time and λ is taken to be a smooth function of s.

Under these conditions, the N -dimensional discrete equilibrium equations (3.72) reduce to

f(un+1) = R(un+1)− Fn+1 = f(un+1)− λn+1c ,

where c is a constant force vector in RN and λn+1 effects an alternative parametrization of

loading. Therefore, one may express the equilibrium equation at any load level as

f (u, λ) = f (u)− λc = 0 . (4.115)

Now identify two distinct types of stability points (u, λ) ∈ RN ×R, that is limit points (also

referred to as turning points) and bifurcation points. A one-dimensional representation of

May 7, 2019 ME280B


LB

I

II

u u

λλ

limit point bifurcation point

Figure 4.16. Limit and bifurcation points in (u, λ)-space

these points is included in Figure 4.16. The distinction between limit and bifurcation points

was originally suggested by Poincare7.

Some typical examples of stability problems from solid mechanics are shown below.

Example 4.4.1: Snapping of a deep archThis problem involves an arch made of an elastic material. The arch is loaded by a downward-pointingforce at its midpoint. As the load increases, the arch resists the load in compression until it snapsthrough, as shown in Figure 4.17. Subsequently, the arch resists the load in tension. This is an exampleof a structure whose response exhibits two limit loads (one in the pre- and the other in the post-snap-through phase).

Figure 4.17. Snapping of a deep arch

Example 4.4.2: Buckling of a cantilever columnThis is the classical buckling problem of a column made of an elastic material and loaded by a compressiveforce. It is well-known that upon reaching a critical value, the force induces buckling of the column,in which case multiple equilibrium states become possible. This is an example of a bifurcation of theoriginal equilibrium state into a set of possible such states.

7H. Poincare, Sur l’ equilibre d’ une mass fluide animee d’ un mouvement de rotation, Acta Mathematica,

7:259–380, (1885).

ME280B May 7, 2019

https://projecteuclid.org/download/pdf_1/euclid.acta/1485888300


An equilibrium path (or solution path) is defined as a set of functions u(s), λ(s), such that

f (u, λ) = 0 . (4.116)

Let f : RN × R → R

N be continuously differentiable, and also such that for a given s

associated with a solution (u(s), λ(s))

det

[∂f

∂u

(u(s), λ(s)

)]6= 0 . (4.117)

Then, by the implicit function theorem, one may conclude that there exists an open neigh-

borhood U ⊂ RN around u(s), an open neighborhood V ⊂ R around λ(s), and a unique

continuously differentiable function u : V → U , such that

f (u(λ), λ) = 0 , (4.118)

see Figure 4.18.

u

s u(s), λ(s)

Figure 4.18. Solution in neighborhood of point (u(s), λ(s))

Upon differentiating the original equation with respect to s, it follows that

∂f

∂u

du

ds+∂f

∂λ

dλ

ds= 0 . (4.119)

A solution point (u, λ) is termed a stability point if:

(a)

det

[∂f

∂u

(u, λ

)]= 0 , (4.120)

(b)

ker

[∂f

∂u

(u, λ

)]=

n∑

i=1

αiφi , n < N (4.121)

where n < N , αi ∈ R and φi ∈ RN , i = 1, 2, . . . , n, that is,

[∂f

∂u

(u, λ

)]φi = 0 , i = 1, · · · , n , (4.122)

May 7, 2019 ME280B


(c)

range

[∂f

∂u

(u, λ

)]=y ∈ R

N | φTi y = 0 , i = 1, · · · , n

. (4.123)

Note that in the context of a finite element analysis,∂f

∂u(u, λ) = K(u, λ). Also, condition (a)

is implied by condition (b), if it is stated explicitly that φi 6= 0, that is, if there exists a

non-trivial null eigenvector of∂f

∂u(u, λ). In addition, condition (c) is implied by condition (b)

when∂f

∂uis symmetric. Indeed, in this case note that, for any x ∈ R

N ,

[∂f

∂u

(u, λ

)]x = y⇒ φ

Ti

[∂f

∂u

(u, λ

)]x = xT

[∂f

∂u

(u, λ

)]Tφi

= xT

[∂f

∂u

(u, λ

)]φi = φ

Ti y = 0 , (4.124)

where use is made of (4.121) and (4.123).

Now consider (4.119) at (u, λ) = (u(s), λ(s)), in the form

∂f

∂u

(u, λ

) [duds

]

s=s

+∂f

∂λ

(u, λ

) [dλds

]

s=s

= 0 (4.125)

and dot this equation with a non-trivial null eigenvector φ of∂f

∂u(u, λ) to conclude that

φT ∂f

∂u

(u, λ

) [duds

]

s=s

+ φT ∂f

∂λ

(u, λ

) [dλds

]

s=s

= 0 . (4.126)

However, owing to condition (c), the first term on the right-hand side of (4.126) vanishes,

therefore

φT ∂f

∂λ

(u, λ

) [dλds

]

s=s

= 0 . (4.127)

One may now distinguish between two cases in connection with (4.127):

(i) φT ∂f

∂λ

(u, λ

)= 0 (or, equivalently, φ

Tc = 0)

In this case, (u, λ) is a bifurcation point. Note that here it is typically true that[dλ

ds

]

s=s

6= 0.

(ii) φT ∂f

∂λ

(u, λ

)6= 0 (or, equivalently, φ

Tc 6= 0)

In this case,(u, λ

)is a limit point. Note that for (4.127) to hold it is now necessary

that

[dλ

ds

]

s=s

= 0.

ME280B May 7, 2019


In extraordinary circumstances, it is possible that φTc = 0 and

[dλ

ds

]

s=s

= 0. Such a

stability point is simultaneously a bifurcation and limit point.

In the computational investigation of stability, the key tasks are:

(a) To detect stability points and classify them as limit or bifurcation points,

(b) To develop methods which allow the analyst to follow specific non-linear solution paths

past stability points. These are referred to as continuation methods.

(c) To develop methods which effect switching from one solution path to another in bifur-

cation problems. These are referred to as branch-switching methods.

Here, attention is focused on continuation methods. The classical approach is to define an

extended system of the form

f(u)− λc = 0 ,

g(u, λ) = 0 ,(4.128)

where g : RN×R→ R is a given continuously differentiable function. The second of the above

equations is a global constraint equation which completely characterizes the continuation

method.

The extended system may be solved using the classical Newton-Raphson method. Here,

a typical iteration is

[f(u(k)

)− λ(k)c

]+K

(u(k)

)∆u(k) −∆λ(k)c = 0

g(u(k), λ(k)

)+

[∂g

∂u

(u(k), λ(k)

)]T∆u(k) +

[∂g

∂λ

(u(k), λ(k)

)]∆λ(k) = 0

(4.129)

or, put in matrix form,

K(u(k)

)−c[

∂g

∂u

(u(k), λ(k)

)]T ∂g

∂λ

(u(k), λ(k)

)

[∆u(k)

∆λ(k)

]= −

[f(u(k)

)− λ(k)c

g(u(k), λ(k)

)], (4.130)

from where one derives the update relations

u(k+1) = u(k) +∆u(k) , λ(k+1) = λ(k) +∆λ(k) . (4.131)

A crucial observation here is that the tangent stiffness of the extended system of non-linear

equations (4.128) may be non-singular, even when K(u(k)

)is singular. At the same time,

May 7, 2019 ME280B


this tangent stiffness matrix of the extended system is neither symmetric (even whenK(u(k)

)

is) nor banded.

If both K(u(k)

)and the stiffness matric of the extended system are non-singular, then

one may solve (4.130)1 for ∆u(k) according to

∆u(k) = −K−1(u(k)

) [f(u(k)

)−(λ(k) +∆λ(k)

)c]

(4.132)

and then substitute the above to (4.130)2 to find that

− ∂g

∂u

(u(k), λ(k)

)K−1

(u(k)

) [f(u(k)

)−(λ(k) +∆λ(k)

)c]

+∂g

∂λ

(u(k), λ(k)

)∆λ(k) = −g

(u(k), λ(k)

). (4.133)

Solving (4.133) for ∆λ(k) leads to

∆λ(k) = −[∂g

∂λ

(u(k), λ(k)

)+∂g

∂u

(u(k), λ(k)

)K−1

(u(k)

)c

]−1

g(u(k), λ(k)

)− ∂g

∂u

(u(k), λ(k)

)K−1

[f(u(k)

)− λ(k)c

], (4.134)

with ∆u(k) subsequently computed by substituting ∆λ(k) from (4.134) in (4.132). This

solution procedure is referred to a bordering algorithm and the quantity inside the square

bracket in (4.134) is known as the Schur complement of K(u(k)

)in the extended tangent

stiffness matrix in (4.130).

It can be shown8 that the necessary and sufficient conditions for the tangent stiff-

ness matrix of the extended system to be non-singular when K(u(k)

)is singular with

ker[K(u(k)

)]= φ 6= 0, that is, when there is a single non-trivial null eigenvector, are

c /∈ range[K(u(k)

)],

∂g

∂u

(u(k), λ(k)

)/∈ range

[KT

(u(k)

)]. (4.135)

Indeed, in this case

K(u(k)

)x 6= c , (4.136)

for any x ∈ RN , therefore if ψ is the null eigenvector of KT , that is,

KT(u(k)

)ψ = 0 , (4.137)

8See Section 4.1.1 in H.B. Keller, Numerical Methods in Bifurcation Problems, Springer-Verlag, Berlin,

1987, and also D.W. Decker and H.B Keller, Multiple limit point bifurcation, J. Math. Anal. Appl., 75:417–

430, (1980).

ME280B May 7, 2019

http://www.math.tifr.res.in/~publ/ln/tifr79.pdf

https://www.sciencedirect.com/science/article/pii/0022247X80900906


it follows, in view of the arbitrariness of x, that

0 = ψTK(u(k)

)x 6= ψTc (4.138)

or

ψTc 6= 0 , (4.139)

as in Figure 4.19.

c

Kx

ψ

Figure 4.19. Relation between the loading vector c and the null eigenvector ψ of KT

Note here that if K is symmetric (hence, the null eigenvectors φ of K and ψ of KT

coincide), then it follows from the above and the defining property of bifurcations that at a

bifurcation point(u, λ

), there is a φ 6= 0, such that

K(u, λ

)φ = 0 , φTc 6= 0 . (4.140)

Therefore, if K is symmetric, then the extended system is singular at any bifurcation point.

Returning to the original condition (4.135)2, note that

KT(u(k)

)y 6= ∂g

∂u

(u(k), λ(k)

), (4.141)

for any y ∈ RN . Therefore, given the null eigenvector φ of K

(u(k)

), it follows as earlier

from the arbitrariness of y that

0 = φTKT(u(k)

)y 6= φT ∂g

∂u

(u(k), λ(k)

)(4.142)

or

φT ∂g

∂u

(u(k), λ(k)

)6= 0 . (4.143)

Now, return to the extended system (4.130) and pre-multiply the first set of equations

with ψT to get

ψTK(u(k)

)∆u(k) −ψTc∆λ(k) = −ψT

[f(u(k)

)− λ(k)c

], (4.144)

May 7, 2019 ME280B


so that, in light of (4.137) and (4.139), one may deduce that

∆λ(k) =ψT[f(u(k)

)− λ(k)c

]

ψTc. (4.145)

This implies that (4.130)2 takes the form

K(u(k)

)∆u(k) = −

[f(u(k)

)− λ(k)c

]+ψT[f(u(k)

)− λ(k)c

]

ψTcc . (4.146)

Since K(u(k)

)is singular, one may stipulate that the (unique) solution ∆u(k) of the above

system is of the form

∆u(k) = ∆u(k)p + z(k)φ , (4.147)

where ∆u(k)p is a particular solution obtained by fixing one of the degrees of freedom and

solving for the rest, and z(k) is a scalar to be determined. Substituting (4.147) to (4.130)2,

one gets

[∂g

∂u

(u(k), λ(k)

)]T (∆u(k)

p + z(k)φ)+∂g

∂λ

(u(k), λ(k)

)∆λ(k) = −g

(u(k), λ(k)

), (4.148)

from where z(k) is determined in light of (4.143). Thus, the exact solution(∆u(k),∆λ(k)

)

is obtained as a function of φ, ψ and ∆u(k)p . These arrays can be computed without a

substantially higher effort than required for the original bordering algorithm.

Finally, when ker[K(u(k), λ(k)

)]=

n∑

i=1

αiφi, n > 1, that is, when the tangent stiffness

matrix of the original system possesses more than one non-trivial null eigenvectors, then it

can be shown that the tangent stiffness matrix of the extended system is always singular.

Next, several choices for the constraint function g(u, λ) are reviewed in detail.

4.4.1 Force control

Here, the constraint function takes the form

g(u, λ) = λ− λ , (4.149)

where λ is a specified positive scalar, see Figure 4.20.

In this case, the extended system becomes[K(u(k)

)−c

0T 1

][∆u(k)

∆λ(i)

]= −

[f(u(k)

)− λ(k)c

λ(k) − λ

](4.150)

ME280B May 7, 2019


λ

λ

u

Figure 4.20. Force control in (u, λ)-space

It follows from the above that

K(u(k)

)∆u(k) = −

[f(u(k)

)− λc

]. (4.151)

Clearly, force control fails when K(u(i))becomes singular. This is consistent with (4.135)2,

whose violation implies that 0T ∈ range[KT

(u(i))], see Figure 4.21.

LB

I

II

u u

λλ

limit point bifurcation point

Figure 4.21. Range of validity of force control in (u, λ)-space

Force control is easily implemented by updating λ = λ(s) at the end of each time step.

4.4.2 Displacement control

Here, the constraint function is defined as

g(u, λ) = eTl u− ul , (4.152)

where ul is a specified scalar and el is an N -dimensional column vector given by

[el] = [0 . . . 0 . . . 1︸︷︷︸l−th entry

. . . 0]T , (4.153)

see Figure 4.22. Essentially, the motion of the body is externally constrained by (4.152)

May 7, 2019 ME280B


λ

ul ul

Figure 4.22. Displacement control in (ul, λ)-space

so that the extended system be non-singular and the equilibrium path is traced for the

prescribed displacement ul.

Under displacement control the extended system (4.130) becomes[K(u(k)

)−c

eTl 0

][∆u(k)

∆λ(k)

]= −

[f(u(k)

)− λ(k)c

eTl u(k) − ul

](4.154)

It follows from (4.135)2 that displacement control fails if there is a vector x ∈ RN , such

that K(u(k)

)x = el, in which case displacement control along el is unable to remove the

singularity of the extended system, see Figure 4.23.

ul

λ

Figure 4.23. Range of validity of displacement control in (ul, λ)-space

The implementation of displacement control in finite element codes is trivial, since it

merely amounts to imposing displacement boundary conditions, as routinely done for Dirich-

let boundary conditions.

More generally, one may impose a set of M constraints of the form

gm(u, λ) = eTm(u)− um , m = 1, 2, . . . ,M , (4.155)

where M < N . This would lead to an extended system with N +M equations and M load

vectors λm along directions cm.

ME280B May 7, 2019


4.4.3 Arc-length control

The idea of arc-length methods is impose a constraint to the arc-length s of the equilibrium

path, defined as

s =

∫ds =

∫ √duTdu+ τ 2dλ2 , (4.156)

where τ is a scaling parameter that effects consistency of units in (4.156), see Figure 4.24.

ds

udu

τλ

τdλ

Figure 4.24. Arc-length in (u, τλ)-space

Arc-length methods fix an s = s0 at a given solution point (un, τλn), and seek to determine

a new equilibrium point under this constraint. Various arc-length methods are possible by

means of different approximations to the actual arc-length of the solution curve (since the

latter is not known in advance).

In the spherical arc-length method around the solution point (un, τλn), the constraint

function takes the form

g(u, λ) = (u− un)T (u− un) + τ 2(λ− λn)2 − s20 , (4.157)

see Figure 4.25.

uun

τλ

τλns0

Figure 4.25. Spherical arc-length in (u, τλ)-space

May 7, 2019 ME280B


A simplified version of the spherical arc-length method is obtained by linearizing the

constraint function along the tangent [tn] to the solution path at (un, τλn), where

[tn] =

[unt − un

τ(λnt − λn) ,

](4.158)

with magnitude s0, hence,

(unt − un)T (unt − un) + τ 2(λnt − λn)2 = s20 , (4.159)

see Figure 4.26. If follows that the constraint function at (un, τλn) becomes

tn

uun unt

τλ

τλn

τλnt

Figure 4.26. Normal-plane arc-length in (u, τλ)-space

g(u, λ) = (unt − un)T (u− un) + τ 2(λnt − λn)(λ− λn)− s20 . (4.160)

In view of (4.159), one may also write

g(u, λ) = (unt − un)T (u− un) + τ 2(λnt − λn)(λ− λn)

− (unt − un)T (unt − un) + τ 2(λnt − λn)(λnt − λn)

= (unt − un)T (u− unt) + τ 2(λnt − λn)(λ− λnt) , (4.161)

so that g(u, λ) = 0 implies that [tn] is orthogonal to

[u− unt

τ(λ− λnt)

], see Figure 4.26. Owing

to this property, the preceding method is referred to as the normal-plane arc-length method.

Given (4.160), the extended system now takes the form

[K(u(k)

)−c

(unt − un)T τ 2(λnt − λn)

][∆u(k)

∆λ(k)

]= −

[f(u(k)

)− λ(k)c

g(u(k), λ(k))

]. (4.162)

ME280B May 7, 2019


Its main advantage over its spherical arc-length counterpart is that all extra terms in the

extended stiffness matrix are constant.

The implementation of normal-plane arc-length in connection with the Newton-Raphson

method is summarized in Algorithm 6 below, following the so-called Riks-Wempner method9

Algorithm 6: Riks-Wempner method

Data: state at time tn, s0

Solve K(un)∆u∗n = ∆λ∗nc

for any ∆λ∗n > 0 (ascending part of curve) or ∆λ∗n < 0 (descending part of curve);

Use (4.158) to write [t∗n] =

[∆u∗

n

τ∆λ∗n

];

Compute [tn] = α∗[t∗n], where α∗ enforces (4.159);

Solve (4.162) using the Newton-Raphson method;

Note that the preceding algorithm requires that K(un) be non-singular and also that

the extended system in (4.159) be non-singular during the Newton-Raphson iterations. As

already argued, the latter is possible even when tracing bifurcation points, as long as the

continuation method does not “hit” the singularity at one the iterations.

A simple brute-force computational method for detecting and classifying stability points

with the aid of arc-length continuation is described in Algorithm 8 below.

9E. Riks, The application of Newton’s method to the problem of elastic stability. ASME J. Appl. Mech.,

39:1060–1065, (1972).

May 7, 2019 ME280B

http://appliedmechanics.asmedigitalcollection.asme.org/article.aspx?articleid=1400638

126 4.5. COMPUTATIONAL TREATMENT OF CONSTRAINTS

Algorithm 7: A method for detecting and classifying stability points

Data: state at time tn, s0, tol, tolk, tolp, maxit

Form K(un);

Perform an LU decomposition on K(un) and store the signs of the diagonal terms in U;

while k ≤ maxit

Use arc-length with s = s0 to advance solution to (un+1, λn+1);

Perform an LU decomposition on K(un+1) and store the signs of the diagonal terms

in U;

if no change in the signs of the diagonal terms of U from n to n+ 1 then

return (un+1, λn+1) and advance time (no stability points detected);

else if change to exactly one diagonal term of U from n to n+ 1 thenUse bisection to bracket stability point until u such that |detK(u)| .= tolk

Use subspace iteration to find the null eigenvector φ of K(u)

if∣∣∣φT

c∣∣∣ ≥ tolp then

(u, λ) is a limit point

else

(u, λ) is a bifurcation point

end

elseSet k ← k + 1, s0 ← s0/2 and repeat

end

end

Note, in connection to the preceding algorithm, that the LU decomposition of a square

matrix such as K is unique as long as the matrix is non-singular.

There exist alternative procedures for the determination of stability points, which are

not discussed here.

4.5 Computational Treatment of Constraints

The motion of a continuum is often subject to constraints. Typical examples of such con-

straints are noted below.

Example 4.5.1: IncompressibilityCertain rubber-like solid materials are practically incompressible, that is their deformation is always

ME280B May 7, 2019

Computational Treatment of Constraints 127

volume-preserving. In light of (1.10), this implies that their motion is subject to the constraint

detF = 1 , (4.163)

or, equivalently in terms of the principal stretches in (1.30),

λ1λ2λ3 = 1 . (4.164)

Many fluids (especially, liquids) are also practically incompressible. In this case, taking into consid-eration (1.52), it is readily concluded that an incompressible fluid is subject to the constraint

trL = divv = 0 . (4.165)

Example 4.5.2: Inextensibility along a given directionSome solids are reinforced by inextensible rod-like structures along a given direction, say M, in thereference configuration. In this case, no stretching is permitted along M, which leads to the constraint

λ2M = M ·CM = 1 . (4.166)

Example 4.5.3: Impenetrability of matterAccording to this fundamental postulate, two material points may not occupy the same position in spaceat the same time. In the context of a deformable solid coming to contact with a rigid foundation, as inFigure 4.27, the gap (penetration) function g, defined as

g = (xf − x) · n (4.167)

must satisfy the constraint conditiong ≥ 0 . (4.168)

nx

xf

rigid foundation

Figure 4.27. Contact of deformable solid with rigid foundation

Constraints are classified in various ways. Here, attention is placed on the distinction

between:

May 7, 2019 ME280B


(a) equality (bilateral) and inequality (unilateral) constraints

Equality (resp. inequality) constraints are mathematically expressed by equalities

(resp. inequalities). For instance, incompressibility is an equality constraint, while

impenetrability is an inequality constraint.

(b) internal and external constraints

Internal constraints are expressed in terms of measures of deformation (or rate of

deformation), while external constraints cannot. For instance, in view of (4.163),

(4.165) and (4.168), incompressibility is an internal constraint, while impenetrability

is an external constraint.

(c) rheonomous and scleronomous constraints

Rheonomous constraints depend explicitly on time, while scleronomous constraints do

not. For instance, all time-dependent (resp. time-independent) Dirichlet boundary

conditions may be thought of as rheonomous (resp. scleronomous) constraints.

Consider a bilateral scleronomous constraint written for a solid in the form

c (u(X, t)) = 0 (4.169)

and assumed to apply on a part C0 of either the domain R0 or the external boundary ∂R0.

While the constraint is expressed in (4.169) in terms of the displacement field, it may be

thought of as either internal or external.

When the motion of the solid is subject to (4.169), the space of constraint admissible

displacements becomes

Uc =u : R0 × R→ R

3 | u (X, t) = u (X, t) on Γu , c(u) = 0 on C0 × R.

(4.170)

In the context of finite element approximations, one may choose to construct subsets Uchof Uc, and satisfy (weakly) the balance laws over Uch. Such a so-called primal method may

be impractical if the constraint condition is difficult to enforce by the choice of admissible

displacements.

An alternative, so called dual method retains the original space of admissible functions Uand introduces a new Lagrange multiplier field. To this end, recall that the constraint (4.169)

is imposed on kinematic variables and rewrite the linear momentum balance equations (3.42)

ME280B May 7, 2019


as∫

R0

ξ ·ρ0a dV +

∫

R0

∂ξ

∂X·P dV =

∫

R0

ξ ·ρ0b dV +

∫

Γq0

ξ ·p dA−∫

C0

Dc(u, ξ)·Λ dV , (4.171)

where Λ = Λ(X, t) is a Lagrange multiplier field on C0×R. Note that the term in (4.171)

associated with the constraint could be a surface rather than a volume integral. Also note

that, since c(u) = 0 for all admissible displacement fields, then

Dc(u, ξ) = 0 , (4.172)

and the Lagrange multiplier field Λ which enforces the constraint is workless, in the sense

that ∫

C0

Dc(u, ξ) ·Λ dV = 0 . (4.173)

The preceding observation does not generally hold for rheonomous constraints.

Example 4.5.4: Incompressibility in dual methodsTaking into account (2.60) and (4.163), the constraint integral term in (4.173) becomes

∫

R0

Dc(u, ξ) ·Λ dV =

∫

R0

J∂ξ

∂X· F−TΛ dV . (4.174)

One may now combine this term with the stress-divergence term in (4.171) to conclude that∫

R0

∂ξ

∂X·P dV +

∫

R0

J∂ξ

∂X· F−TΛ dV =

∫

R0

∂ξ

∂X· (P+ ΛF−T ) dV (4.175)

=

∫

R

∂ξ

∂x· (T+ Λi) dv , (4.176)

where use is made of (1.73) and the chain rule. The preceding relation shows that the scalar Lagrangemultiplier field Λ may be interpreted as a pressure. Ideally, the Cauchy stress T should be decomposed toa pressure-independent part (which remains constitutively specified) and a pressure part, which coincideswith the Lagrange multiplier Λ.

A similar analysis applies to incompressibility in fluids, where, according to (4.165),∫

RDc(v, ξ) ·Λ dv =

∫

Rtr

(∂ξ

∂x

)Λ dv =

∫

R

∂ξ

∂x· Λi dv . (4.177)

Example 4.5.5: Tied sliding in dual methodsConsider a special case of the impenetrability constraint in which a given part C is a deformable body’sboundary is in persistent contact with a rigid foundation, but is allowed to slide on it. This is the case oftied sliding. Now, the constraint condition (4.168) is imposed as an equality. Hence, in light of (4.167),the constraint integral term in (4.173) becomes

∫

C0

Dc(u, ξ) ·Λ dA =

∫

C0

ξ · Λn dA , (4.178)

May 7, 2019 ME280B


therefore the scalar Lagrange multiplier field may be interpreted as the magnitude of the normal traction(pressure) necessary to resist penetration or separation of the two bodies.

The extended equations of motion now read

∫

R0

ξ · ρ0a dV +

∫

R0

∂ξ

∂X·P dV

−∫

R0

ξ · ρ0b dV −∫

Γq0

ξ · p dA+

∫

C0

Dc(u, ξ) ·Λ dV = 0 , (4.179)

∫

C0

θ · c(u)dV = 0 , (4.180)

where θ = θ(X, t) is an arbitrary weight field on C0 × R.

In discrete form, one may write

∫

C0

Dc(u, ξ) ·Λ dV.= ξ

Td(u)Λ ,

∫

C0

θ · c(u) dV .= θ

Tg(u) ,

(4.181)

where u, ξ ∈ RN and where Λ, θ ∈ R

M with N > M , consequently d : RN 7→ RN ×R

M and

g : RN 7→ RM . Taking into account (4.181), one may write

∫

C0

θ ·Dc(u, ξ) dV.= θ

TDg(u)ξ = ξ

TDTg(u)θ

= ξTd(u)θ

T,

(4.182)

therefore

d(u) = DTg(u) . (4.183)

Given (4.179), (4.180), and (4.181), the discrete equations of motion become

ξT[f(u) + d(u)Λ

]= 0 ,

θTg(u) = 0 ,

(4.184)

or, after accounting for the arbitrariness of ξ and θ,

f(u) + d(u)Λ = 0 ,

g(u) = 0 .(4.185)

ME280B May 7, 2019


Note that in the dual method, the original N equations for the unconstrained problem with u

as unknowns are replaced by N +M equations for the constrained problem with u and Λ

as unknowns.

To use the Newton-Raphson method in (4.185), write

f(u(k)

)+K

(u(k)

)∆u(k) + d

(u(k)

)Λ

(k)+DTd

(u(k)

)Λ

(k)∆u(k) + d

(u(k)

)∆Λ

(k)= 0

g(u(k)

)+Dg

(u(k)

)∆u(k) = 0

(4.186)

where, as usual,

f(u(k) +∆u(k)

) .= f

(u(k)

)+K

(u(k)

)∆u(k) ,

D

[∫

C0

Dc(u, ξ) ·Λ(k) dV

] (u(k),∆u(k)

) .= ξ

TDTd

(u(k)

)Λ

(k)∆u(k)

(4.187)

and also, upon recalling (4.183),

D

[∫

C0

θ · c(u)dV] (

u(k),∆u(k)) .

= θTDg(u(k))Λ

(k)∆u(k)

.= ∆u(k) T

d(u(k)

)θ = θ

TdT(u(k)

)∆u(k) . (4.188)

The linearized system of equations (4.186) may be put in bordered matrix form as[K(u(k)

)+DTd

(u(k)

)Λ

(k)d(u(k)

)

dT(u(k)

)0

][∆u(k)

∆Λ(k)

]= −

[f(u(k)

)+ d

(u(k)

)Λ

(k)

g(u(k)

)]. (4.189)

The extended tangent stiffness matrix in (4.189)above is symmetric provided bothK(u(k)

)

and DTd(u(k)

)Λ

(k)are symmetric. However, this stiffness matrix is not banded. In fact,

given its structure, a special equation solver is required to take care of zero diagonals, as the

classical Gauss elimination procedure without pivoting would generally fail.

Dual methods enable the use of the original (unconstrained) space of admissible dis-

placements U and its finite element subspaces Uh, which is convenient. The price for this

convenience is in the need to solve an extended system of equations, including the Lagrange

multipliers Λ ∈ L, where L is the space of admissible multiplier fields. Also, the choice of

discrete admissible fields Uh and Lh is subject to substantial restrictions and one needs to

exercise caution in order to ensure solvability of (4.189).

In the special case of linear elastic response subject to a constraint which is also linear

in u (or F), the matrix form of the extended system becomes[K d

dT 0

][u

Λ

]= −

[f (0)

g (0)

], (4.190)

May 7, 2019 ME280B


where K and d are constant matrices. This would be the case for the problem of incom-

pressible linear elasticity.

A corresponding analysis applies to a constrained problem in fluid dynamics.

The treatment of constraints by Lagrange multipliers in continuum mechanics is often

motivated by the formalization of constrained non-linear optimization problems of the general

form: find the extremum of a functional G(u), subject to constraint condition c(u) = 0.

In solid mechanics, under severe restrictions, there exist variational theorems, according to

which the equations of motion are derived as extrema of a functional G(u) to zero. For

example, in linear elastostatics, one may define the total potential energy functional G(u) of

a body occupying the region R0 as

G(u) =1

2

∫

R0

σ(u) · ε(u) dV −∫

R0

u · ρ0b dV −∫

Γq0

u · p dV . (4.191)

Here σ and ε are the stress and strain of the infinitesimal theory. It is easy to confirm

that the equations of motion are recovered by setting DG(u, ξ) = 0, where, under mild

assumptions, G attains an infimum (which reduces to a minimum for the discrete case) at

the solution u∗, that is G(u∗) = infU G(u), where U is the space of admissible displacements.

A corresponding variational principle may be also established for the problem of finite elastic

deformations, see Section 5.2.

In this restricted context, the Lagrange multiplier method of (4.179) and (4.180) can

be interpreted as a constrained extremization method. To this end, define the Lagrange

multiplier functional GL as

GL(u,Λ) = G(u) +

∫

C0

Λ · c(u) dV . (4.192)

Then, the preceding extended system is recovered by setting

DGL(u, ξ;Λ,θ) = 0 . (4.193)

Indeed, this saddle-point problem may be equivalently expressed as

DGL(u, ξ;Λ,θ) = DGL(u, ξ)︸︷︷︸Λ fixed

+DGL(Λ,θ)︸︷︷︸u fixed

= 0 , (4.194)

which implies that

DGL(u, ξ) = 0 , DGL(Λ,θ) = 0 . (4.195)

ME280B May 7, 2019


It is then immediately clear that (4.195)1,2 coincide with (4.179) and (4.180). Various ap-

proximations of the above Lagrange multiplier method are possible to bypass the need for

the direct solution of the extended system of N + M equations. Three such regulariza-

tions discussed here are the classical penalty, the perturbed Lagrangian, and the augmented

Lagrangian methods

In the classical penalty method10, one considers the functional GP defined as

GP (u, ε) = G(u) +1

2

∫

C0

εc(u) · c(u) dV , (4.196)

The perturbed Lagrangian method is associated with the functional GPL given as

GPL(u,Λ; ε) = G(u) +

∫

C0

Λ · c(u) dV 1

2ε

∫

C0

Λ ·Λ dV , (4.197)

where, again, ε > 0 is a penalty parameter. Lastly, the augmented Lagrangian method is

defined in terms of the functional GAL written as

GAL(u,Λ; ε) = G(u) +

∫

C0

Λ · c(u) dV +1

2

∫

C0

εc(u) · c(u) dV . (4.198)

In all three functionals, ε > 0 is a penalty parameters.

First, examine the classical penalty method and assume that G(u) attains an infimum

at the solution u∗ of the equations of motion, such that c(u) = 0. Then, it can be proved

conditional upon sufficient smoothness of the involved functions that if u∗ε is such that

DGP (u∗ε, ξ) = 0 (4.199)

and GP (u, ε) attains an infimum at u∗ε, then

limε→∞

u∗ε = u∗ . (4.200)

While the formal proof of this result is omitted here,11 a simple intuitive argument can

be made by observing that as ε → ∞ the penalty term would become unbounded unless

c(u) = 0, which is necessary to attain an infimum of GP .

Upon setting the differential of GP in the direction ξ to zero, one finds that

DG(u, ξ) +

∫

C0

ε [Dc(u, ξ)] · c(u) dV = 0 . (4.201)

10See R. Courant, Variational methods for the solution of problems of equilibrium and vibration, Bulletin

of the American Mathematical Society, 49:1-23, (1943).11See p. 366 in D.G. Luenberger, Linear and Nonlinear Programming, 2nd edition, Addison-Wesley,

Reading, (1984).

May 7, 2019 ME280B

https://www.taylorfrancis.com/books/e/9780429175480/chapters/10.1201/b16924-5


In discrete form, this translates to

ξT[f (u) + εd (u)g (u)] = 0 , (4.202)

where use is made of (4.181). A typical step of the Newton-Raphson method would then be

ξT[f(u(k)

)+K

(u(k)

)∆u(k) + εd

(u(k)

)g(u(k)

)

εDTd

(u(k)

)g(u(k)

)+ d

(u(k)

)Dg

(u(k)

)∆u(k)

]= 0 (4.203)

or, upon recalling (4.183), it follows that

[K(u(k)

)+ ε

DTd

(u(k)

)+ g

(u(k)

)d(u(k)

)dT(u(k)

)]∆u(k)

= −f(u(k)

)− εd

(u(k)

)g(u(k)

). (4.204)

Clearly, the classical penalty method requires the solution of only N algebraic equations

and u∗ε satisfies the constraint only approximately, see Figure 4.28. The extended stiffness

uu∗0 u∗1 u∗

c(u∗) = 0

GP

ε = 0

ε1 > 0

ε 7→ ∞

Figure 4.28. Penalty functional and the enforcement of the constraint c = 0

matrix in (4.204) will be symmetric if K itself is symmetric. This is immediately concluded

from the identity

D

D

[∫1

2εc(u) · c(u)dV

](u, ξ)

(u,∆u)

= D

D

[∫1

2εc(u) · c(u)dV

](u,∆u)

(u, ξ) , (4.205)

ME280B May 7, 2019


as the order of differentiation in the directions ξ and ∆u does not matter, conditional upon

sufficient smoothness of the integral in u. At the same time, the form of the extended

stiffness matrix in (4.204) reveals that ill-conditioning is bound to occur for large values of ε.

Next, consider the perturbed Lagrangian method. Here, extremization of the functional

in (4.197) leads to the set of equations

DG(u, ξ) +

∫

C0

[Dc(u, ξ)] ·Λ dV = 0 (4.206)

and ∫

C0

[c(u)− 1

εΛ

]· θ dV = 0 . (4.207)

The second of the above equations is strongly satisfied when

Λ = εc(u) (4.208)

inR0, in which case the first equation becomes reduces to the equation (4.201) of the classical

penalty method. Alternatively, one may choose to enforce (4.207) weakly, resulting in

θT[g(u)− 1

εΛ

]= 0 , (4.209)

which lead to the discrete equations

Λ = εg(u) . (4.210)

Depending on the interpolation of the multiplier field, it may be possible to efficiently de-

termine Λ at the element level. In such a case, the perturbed Lagrangian method would be

an attractive alternative to the classical penalty method.

Finally, consider the augmented Lagrangian method, which may be thought of as a combi-

nation of the Lagrange multiplier and the classical penalty method. Here, the extremization

of the functional in (4.198) leads to the equations

DG(u, ξ) +

∫

C0

[Dc(u, ξ)] ·Λ dV +

∫

C0

ε [Dc(u, ξ)] · c(u) dV = 0 (4.211)

and ∫

C0

c(u, ξ) · θ dV = 0 . (4.212)

May 7, 2019 ME280B


A common algorithmic implementation of the augmented Lagrangian method is the so-

called Uzawa algorithm12 as follows:

Algorithm 8: Uzawa’s algorithm for updated Lagrangian methods

Data: state at time tn, ε, ζ, maxits, tol

Set k = 0, Λ0 = 0 and u0 = un;

while k ≤ maxit and |c(uk)| ≥ tol

Solve discrete counterpart of (4.211) for ∆uk;

Set uk+1 = uk +∆uk;

Set Λk+1 = Λk + ζc(uk);

Set k ← k + 1end

The above algorithm can be proven to converge to the desired solution given any ζ, such

that 0 < ζ < 2ε, for a class of minimization problems with linear constraints.13

The augmented Lagrangian functional with the Uzawa algorithm requires the (repeated)

solution of only N equations. However, in contrast to the classical penalty method, ill-

conditioning of the tangent stiffness matrix is altogether avoided, since the penalty parameter

ε is kept fixed at relatively small. In addition, the constraint equation is satisfied exactly

despite the fixed value of the penalty parameter.

While the classical penalty method and its two variants are mathematically justified only

for under the severe restriction of the existence of a variational problem for the momentum

balance equations, they are used heuristically and with great success on a wide array of

problems in computational mechanics.

A critical question in designing a robust finite element approximation of a continuum-

mechanical problem in the presence of constraints is the choice of admissible displacement (or

velocity) fields and of Lagrange multplier fields. To address this question, one may consider,

for simplicity, a linearized counterpart of the extended system (4.189), in the form[K d

dT 0

][u

Λ

]= −

[f

g

], (4.213)

where nowK, d, f , and g are constant arrays. It can be shown14 that necessary and sufficient

12H. Uzawa, Iterative methods for concave programming. In K.J. Arrow,L. Hurwicz and H. Uzawa editors,

Studies in linear and nonlinear programming, Stanford University Press, Palo Alto, (1958).13See p. 48 of R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator Splitting

Methods in Non-linear Mechanics, SIAM, Philadelphia, 1989.14See Section 3 in F. Brezzi and K.-J. Bathe, A discourse on the stability conditions for mixed finite

ME280B May 7, 2019

https://www.sciencedirect.com/science/article/pii/004578259090157H


conditions for the above system to be solvable are that: (i) N ≥ M and (ii) if v ∈ kerd

and wTKv = 0 for all w ∈ kerd, then v = 0. These conditions ensure that the solution

to (4.213) exists, but do not guarantee that it will be a convergent solution, say, in the sense

of h-refinement. Regardless, condition (i) is an easy one to check when contemplating a dual

method for the solution of a constrained problem.

Example 4.5.6: Constraint counts for incompressibilityConsider a square block meshed using n×n 4-node elements (n > 1) and assume that it is used to modelan incompressible continuum (solid or fluid). Let the displacements or velocities be fully prescribed onthe boundary (except for a single point). In this case, the number of uknowns displacement or velocitydegrees of freedom is

N = 2(n− 1)2 − 2 . (4.214)

Also, assume that the Lagrange multiplier field is piecewise constant in each element and free ofspecification (except for one point). In this case, the number of uknown Lagrange multiplier degrees offreedom is

M = n2 − 1 . (4.215)

Clearly, in this case N > M so the preceding solvability condition (i) is met. Moreover, it is immediatelyclear from (4.214) and (4.216) that, under uniform h-refinement,

limn→∞

N

M= 2 . (4.216)

This ratio is considered optimal, as it equals the ratio of linear momentum equations to constraintequations in the continuum problem.

element formulations, Computer Methods in Applied Mechanics and Engineering, 82:27–57, (1990).

May 7, 2019 ME280B

https://www.sciencedirect.com/science/article/pii/004578259090157H

Chapter 5

Constitutive Modeling of Deformable

Continua

5.1 Incompressible Newtonian Viscous Fluid

In this problem, the balance of linear momentum is solved together with the condition of

incompressibility, that is,

divT+ ρb = ρ

(∂v

∂t+∂v

∂xv

)in R ,

div v = 0 in R .

(5.1)

These equations are subject to the boundary conditions

vi = vi on Γui

Tijnj = ti on Γqi

(5.2)

where, Γui∩ Γqi = ∅, Γu =

⋃3i=1 Γui

, Γq =⋃3

i=1 Γqi , and ∂R = Γu ∪ Γq, as well as the initial

condition

v(x, 0) = v0(x) , (5.3)

where div v0 = 0. Note that if Γq = ∅, then the boundary velocity v should satisfy∫

∂R

v · n da =

∫

R

div v dv = 0 . (5.4)

The Cauchy stress T is written as

T = −pi+ 2µD , (5.5)

138

Incompressible Newtonian Viscous Fluid 139

where, as argued in Section 4.5, p is the Lagrange multiplier which enforced the incompress-

ibility condition (5.1)2 and µ is the viscosity coefficient (assumed here to be constant). Given

the preceding constitutive law, one may write

divT = div(−pi+ 2µD) = − grad p+ 2µ divD = − grad p+ 2µ div

(∂v

∂x

)s

. (5.6)

Starting from the weak form of linear momentum balance in (3.184), rewrite the stress-

divergence term as

∫

R

(∂ξ

∂x

)s

·[−pi+ 2µ

(∂v

∂x

)s]dv =

∫

R

[−p div ξ +

(∂ξ

∂x

)s

· 2µ(∂v

∂x

)s]dv . (5.7)

Therefore, the weak form of linear momentum balance (3.184) takes the form

∫

R

ξ · ρ(∂v

∂t+∂v

∂xv

)dv +

∫

R

[−p div ξ +

(∂ξ

∂x

)s

· 2µ(∂v

∂x

)s]dv

=

∫

R

ξ · ρb dv +∫

Γq

ξ · t da . (5.8)

In addition, the weak form of the incompressibility condition may be expressed as

∫

R

σ(div v)dv = 0 . (5.9)

Before proceeding with the numerical approximation of the preceding weak forms, recall

the Helmholtz-Hodge decomposition, according to which any vector field v : R → E3 can be

uniquely decomposed into a solenoidal and an irrotational part, that is,

v = vso + vir , (5.10)

where vso satisfies

div vsol = 0 in R ,

vsol · n = 0 on R ,(5.11)

while for vir there exists a scalar function φ : R → R, such that

virr = gradφ , (5.12)

hence

curlvirr = curl gradφ = 0 . (5.13)

May 7, 2019 ME280B

140 Constitutive Modeling of Deformable Continua

A proof of this result can be found elsewhere.1

A very efficient and straightforward method for solving the incompressible Navier-Stokes

equation using finite elements is based on a classical work in the context of finite differences.2

In order to appreciate this approach, it is introduced first using the strong forms. To this

end, suppose that the velocity and pressure fields have already been determined to be vn

and pn, respectively, at time tn, respectively, and the same fields are to be determined at

time tn+1 = tn +∆tn. Therefore, it is known that

ρ

(∂vn

∂t+∂vn

∂xvn

)= − grad pn + 2µ div

(∂vn

∂x

)s

+ ρbn ,

div vn = 0 ,

(5.14)

everywhere in Rn. To advance the solution in time to tn+1, first determine a predictor

velocity v∗n+1, such that

ρ

(v∗n+1 − vn

∆tn+∂vn

∂xvn

)= 2µ div

(∂vn

∂x

)s

+ ρbn+1 in R ,

v∗n+1 = vn+1 on Γu ,

t∗n+1 = 0 on Γq .

(5.15)

Note that the predictor velocity ignores completely any forces due to change in the pressure,

as well as any external tractions. As a result, it does not necessarily satisfy the incom-

pressibility condition. Subsequently, a corrector step is applied to account for the pressure,

as

ρvn+1 − v∗

n+1

∆tn= − grad pn+1 . (5.16)

Note that in the above equation, both vn+1 and pn+1 are unknown, so this equation cannot

be solved directly. Equation (5.16) may be readily rewritten as

v∗n+1 = vn+1 +

∆tnρ

grad pn+1 . (5.17)

If vn+1 · n = 0 on ∂R, then (5.17) represents precisely the Helmholtz-Hodge decomposition

in (5.10–5.13). Hence, in this case, vn+1 and pn+1 are obtained by projecting v∗n+1 onto

its solenoidal and irrotational parts. Even in the case of more general boundary conditions

1See, e.g., Section 6.3.1 in P. Papadopoulos, Introduction to Continuum Mechanics, ME185 course notes,

Berkeley, 2017.2A. Chorin. Numerical solution of the Navier-Stokes equations. Math. Comp., 22:745–762, (1968).

ME280B May 7, 2019

http://csml.berkeley.edu/Notes/download-ME185.php

http://www.ams.org/journals/mcom/1968-22-104/S0025-5718-1968-0242392-2/home.html

Incompressible Newtonian Viscous Fluid 141

for vn+1, one may take the divergence of both sides of (5.17) and observe (5.1)2 to deduce

the Laplace equation

div v∗n+1 =

∆tnρ

div(grad pn+1) in Rn+1 , (5.18)

for pn+1, subject to boundary conditions

∆tnρ

grad pn+1 · n = 0 on Γu ,

[−pn+1i+ 2µ

(∂vn

∂x

)s]n = tn+1 on Γq .

(5.19)

Note that (5.19)1 by taking the normal projection of (5.17) and recalling (5.15)2. Likewise,

(5.19)2 is obtained from (1.76) and (5.5), where the rate-of-deformation tensor is evaluated

at v = vn (or, possibly, v = v∗n+1) to avoid an implicit dependence on (the yet unknown)

vn+1. It follows that (5.19)1,2 are now respectively the Neumann and Dirichlet boundary

conditions for the Laplace equation (5.18). After pn+1 is calculated from (5.18) and (5.19),

the velocity vn+1 is determined from (5.16).

Turning to the Galerkin-based finite element interpretation,3 this procedure translates to

finding the predictor velocity v∗n+1 from

∫

R

ξn+1 · ρ(∂v∗

n+1

∂t+∂vn

∂xvn

)dv =

∫

R

[−(∂ξn+1

∂x

)s

· 2µ(∂vn

∂x

)s

+ ξn+1 · ρbn+1

]dv ,

(5.20)

where∂v∗

n+1

∂t

.=

v∗n+1−vn

∆tn, with v∗

n+1 = vn+1 on Γu and ξn+1 = 0 on Γu. Subsequently, solve

the weak counterpart of (5.16) together with the incompressibility constraint (5.1)2. Upon

using the divergence theorem for the former, the system of equations may be expressed as∫

R

ξn+1 · ρvn+1 − v∗

n+1

∆tndv =

∫

R

div ξn+1 · pn+1 dv +

∫

Γq

ξn+1 · tn+1 dv ,

∫

R

σn+1 div vn+1 dv = 0 ,

(5.21)

with vn+1 · n = vn+1 · n and ξn+1 · n = 0 on Γu. Note that the tangential component of the

velocity does not affect incompressibility.

Neglecting the element representation for brevity, proceed with the global matrix form

of this two-step algorithm. The predictor step is expressed as

1

∆tn[M]

([v∗

n+1]− [vn])+ [A(vn)] + [K][vn] = [Fb

n+1] , (5.22)

3J. Donea, S. Giuliani, H. Laval, and L. Quartapelle, Finite element solution of the unsteady Navier-Stokes

equations by a fractional step method, Comp. Meth. Appl. Mech. Engrg., 30:53–73, (1982).

May 7, 2019 ME280B




subject to [v∗n+1] = [ˆvn+1] on Γu. In (5.22), [M] is the global mass matrix (symmetric,

positive-definite), [A(vn)] is the global advection vector (a non-linear function of vn), [K]

is the global diffusion matrix (symmetric, positive-semidefinite), and [Fbn+1] is the global

external force due to prescribed body forces only. This system is now solved for [v∗n+1]. The

solution becomes trivially simple to compute if [M] is rendered diagonal, as earlier. Either

way, one may write

[v∗n+1] = [vn]−∆tn[M]−1 ([A(vn)] + [K][vn]) + ∆tn[M]−1[Fb

n+1] . (5.23)

For the corrector step, write the matrix counterpart of (5.21) as

1

∆tn[M]

([vn+1]− [v∗

n+1])− [C][pn+1] = [Ft

n+1] ,

[C]T [vn+1] = [Gn+1] ,

(5.24)

where [C] is the matrix emanating from the divergence operator and [Ftn+1] is the external

force vector due to prescribed tractions only. Note that the right-hand side of (5.24)2 becomes

non-zero upon accounting for boundary conditions because it needs to account for the discrete

counterpart of the boundary condition vn+1 · n = vn+1 · n on Γu.

The pressure [pn+1] may be determined from the system (5.24) as

[C]T [M]−1[C][pn+1] =1

∆tn

([Gn+1]− [C]T [v∗

n+1])− [C]T [M]−1[Ft

n+1] , (5.25)

where [C]T [M]−1[C] is symmetric and under sufficiently general conditions has the rank of

[pn+1]. Lastly, the velocity [vn+1] is determined as

[vn+1] = [v∗n+1] + ∆tn[M]−1

([Ft

n+1] + [C][pn+1]). (5.26)

Therefore, obtaining the new state vectors [vn+1] and [pn+1] requires the solution of a sym-

metric linear algebraic system for [pn+1] and performing two updates for [v∗n+1] and [vn+1].

5.2 Nonlinear Elasticity

Recall the mechanical energy balance equation (1.80), and admit the existence of a strain

energy function ψ (F) per unit mass, such that∫

P

T ·D dv =

∫

P

ρψ dv . (5.27)

ME280B May 7, 2019

Nonlinear Elasticity 143

Recalling the result on work-conjugacy of stress and strain or deformation measures, one

may also write∫

P

ρψ dv =

∫

P0

ρ0ψ dV =

∫

P0

P · F dV =

∫

P0

S · E dV . (5.28)

It follows from (5.28) that

∫

P0

(ρ0∂ψ

∂F−P

)· F dV = 0 , (5.29)

so that, using a standard argument,

P = ρ0∂ψ

∂F. (5.30)

This constitutive law characterizes a general homogeneous hyperelastic (or Green-elastic)

material.

The strain energy function is subject to invariance requirements under superposed rigid-

body motions, namely

ψ = ψ (F) = ψ(F+)

= ψ (QF) , (5.31)

for all proper orthogonal Q. It can be easily argued that this implies

ψ = ψ (F) = ψ (U) = ψ (C) = ψ (E) . (5.32)

Therefore, one may conclude from (5.28)2,4 that

∫

P0

(2ρ0

∂ψ

∂C− S

)· E dV = 0 , (5.33)

therefore,

S = 2ρ0∂ψ

∂C= ρ0

∂ψ

∂E. (5.34)

If the stress response of the hyperelastic material is further assumed isotropic, then, given

any orthogonal tensor Q,

ψ = ψ (F) = ψ (FQ) = ψ (C) = ψ(QTCQ

). (5.35)

Recalling the standard representation theorem for isotropic scalar functions of a tensor vari-

able, one may conclude that

ψ = ψ (IC, IIC, IIIC) (5.36)

May 7, 2019 ME280B


where IC,IIC, IIIC are the three scalar invariants of C. In view of (5.34)1 and (5.36), one

may conclude, with the aid of the chain rule, that

S = 2ρ0

[∂ψ

∂IC

∂IC∂C

+∂ψ

∂IIC

∂IIC∂C

+∂ψ

∂IIIC

∂IIIC∂C

]. (5.37)

The identities

∂IC∂C

=∂

∂C(trC) = I ,

∂IIC∂C

=∂

∂C

[1

2

(trC)2 − trC2

]= ICI−C ,

∂IIIC∂C

= IIICC−1 ,

(5.38)

can be shown to hold. Indeed, the first two may be confirmed by direct calculation, while the

third one follows immediately from (2.55). In view of (5.38), the constitutive equation (5.37)

takes the form

S = 2ρ0

[(∂ψ

∂IC+ IC

∂ψ

∂IIC

)I− ∂ψ

∂IICC+

∂ψ

∂IIICIIICC

−1

]. (5.39)

Example 5.2.1: Kirchhoff-Saint Venant materialIn this model, the classical stress-strain law of elasticity at infinitesimal strains is written in terms of thesecond Piola-Kirchhoff stress and the Lagrangian strain, that is,

S = 2µE+ λ(trE)I , (5.40)

where λ and µ are material constants. This is the Kirchhoff-Saint Venant (or generalized Hookean)law. The preceding law can be expressed equivalently as

S = µ(C− I) + λ1

2(IC − 3)I =

[1

2λ(IC − 3)− µ

]I+ µC . (5.41)

This constitutive law is derivable from a strain-energy function defined as

ρ0ψ(IC, IIC, IIIC) =1

8λ(IC − 3)2 + frac14µ

(I2C − 2IC − 2IIC

). (5.42)

Recalling (1.86), it follows from (5.41) and (1.14) that

T =1

J

[1

2λ(IB − 3)− µ

]B+

1

JµB2 . (5.43)

Clearly, the Kirchhoff-Saint Venant law reduces to the constitutive equations of isotropic linear elasticityfor infinitesimal deformations.

The Kirchhoff-Saint Venant material law is used to describe materials that undergo a moderateamount of elastic deformation.

ME280B May 7, 2019


Example 5.2.2: Compressible neo-Hookean materialHere, the strain energy takes the form

ρ0ψ(IC, IIC, IIIC) =µ

2(IC − 3)− µ ln J +

λ

2(J − 1)2 , (5.44)

where λ and µ are material parameters. Taking into account (5.39) that

S = µ(I−C−1) + λJ(J − 1)C−1 . (5.45)

Pushing the preceding equation forward to the current configuration, it follows that

T =1

Jµ(B− i) + λ(J − 1)i . (5.46)

This constitutive law is consistent with linearized isotropic elasticity. Indeed, recalling (2.54) and (2.64),it can be shown that

L [T;u]X = σ

= µDB(X,u) + λDJ(X,u)I = 2µε+ λ(tr ε)I , (5.47)

as expected.The compressible neo-Hookean material law is used to describe materials that may undergo large

elastic deformations, such as rubber.

Under quasi-static conditions and under mild assumptions on the strain energy func-

tion Ψ, there exists a variational theorem for nonlinear elasticity, according to which the

total potential functional G(u), defined as

G(u) =

∫

R0

ρ0Ψ dV −∫

R0

u · ρ0b dV −∫

Γq,0

u · p dA , (5.48)

attains a local minimum at u which satisfies equilibrium.4

Many nonlinearly elastic materials, such as dense rubber, are incompressible or nearly

incompressible. Here, write the constraint of incompressibility as

c(F) = detF− 1 = 0 . (5.49)

To impose incompressibility, it is important to recognize that it is an internal constraint

that affects the form of the strain energy function. In particular, in view of the discussion in

Section 4.5, write the strain energy functional Ψc of the incompressible elastic material as

ρ0Ψc(F, p) = ρ0Ψ(F) + pc(F) , (5.50)

4See Chapter 11 in M.E. Gurtin, Topics in Finite Elasticity, SIAM, Philadelphia, 1981.

May 7, 2019 ME280B

https://epubs.siam.org/doi/book/10.1137/1.9781611970340


where p is a Lagrange multplier, and proceed as usual to defining the stress starting with

the definition ∫

P0

ρ0Ψc dV =

∫

P0

P · F dV . (5.51)

Recalling that c = 0, this implies that∫

P0

[ρ0

(∂Ψ

∂F+

p

ρ0

∂c

∂F

)−P

]· F dV = 0 , (5.52)

hence, in view of (2.55),

P = ρ0∂Ψ

∂F+ pJF−T . (5.53)

Taking into account (1.86) and (5.53), it follows that

S = ρ0F−1∂Ψ

∂F+ pJC−1 (5.54)

and

T = ρ∂Ψ

∂FFT + pi . (5.55)

The last equation demonstrates that the Lagrange multiplier is a pressure term on the Cauchy

stress.

Example 5.2.3: Compressible neo-Hookean materialStart with the strain energy function of the compressible neo-Hookean model in (5.44) and assumeincompressibility, which reduces the function to

ρ0ψ =µ

2(IC − 3) . (5.56)

Taking into account (5.53-5.55), one may write

P =µ

2

∂

∂F

tr(FTF

)+ pJF−T = µF+ pF−T , (5.57)

S = F−1P = µI+ pC−1 , (5.58)

andT = µB+ pi , (5.59)

subject to c = 0. Alternatively, one may start with the stress relations (5.45) and (5.46), and appendthe pressure term. For instance, for the Cauchy stress, one would deduce the expression

T = µ(B− i) + pi = µB+ (p− µ)i , (5.60)

which suggests that the same meaning of the pressure as in (5.59) to within a constant µ.In the nearly incompressible case, one would use the original stain-energy function of (5.44), but

view λ as a penalty parameter by letting it grow toward infinity to satisfy the constraint of incompress-ibility in approximate fashion.

ME280B May 7, 2019


From the standpoint of finite element implementation, it is important to observe that

the material stiffness of the hyperelastic material model is symmetric. Indeed, starting from

the term which is associated with the stress-divergence in linear momentum balance, recall

that

D

[∫

Ωe0

∂ξ

∂X· (FS) dV

](u,∆u) =

∫

Ωe0

(∂∆u

∂X

)T∂ξ

∂X· S dV

︸︷︷︸geometric term

+

∫

Ωe0

∂ξ

∂X· FDS(u,∆u) dV

︸︷︷︸material term

.

(5.61)

As already argued in Section 4.3, the geometric term yields a symmetric stiffness provided ξ

and ∆u are interpolated by the same functions. For a hyperelastic solid, the material term

also yields a symmetric stiffness. This is because, according to (5.34),

∂S

∂E= ρ0

∂2Ψ

∂E2(5.62)

hence, it is symmetric. Therefore, in light of (2.41),

∫

Ωe0

∂ξ

∂X· FDS(u,∆u) dV =

∫

Ωe0

FT ∂ξ

∂X·DS(u,∆u) dV =

∫

Ωe0

1

2

[FT ∂ξ

∂X+

(∂ξ

∂X

)T

F

]· ∂S∂E

DE(u,∆u) dV =

∫

Ωe0

1

4

[FT ∂ξ

∂X+

(∂ξ

∂X

)T

F

]· ∂S∂E

[FT ∂∆u

∂X+

(∂∆u

∂X

)T

F

]dV ,

(5.63)

which proves the symmetry.

May 7, 2019 ME280B

Index

acceleration, 2adaptive remeshing, 75advective remapping, 74ALE

mixed variables, 67algorithmic tangent modulus, 100arc-length

normal-place, 124spherical, 123

augmented Lagrangian method, 133axial vector, 9

BFGS method, 111bifurcation point, 113, 116bordering algorithm, 118branch-switching method, 117

Cauchy tetrahedron, 12Cauchy’s lemma, 12Cauchy-Green deformation tensor

left, 4right, 4

centered-difference method, 56classical penalty method, 133configuration

current, 1reference, 2

consistent differentiation, 100constraint

bilateral, 128external, 128internal, 128rheonomous, 128scleronomous, 128unilateral, 128

continuation method, 117contraction, 87control volume, 45

dead load, 97

derivativeFrechet, 23Gateaux, 22

differentialFrechet, 23Gateaux, 21

differentiationautomatic, 91symbolic, 91

divergence theorem, 10dual method, 128

energy norm, 106equilibrium path, 115equipotential relaxation, 78error

“absolute”, 106“relative”, 106

Eulerian remapping, see also advective remapping

fixed point, 86fixed-point iteration, 88follower load, 54, 97force vector

externalelement, 53global, 53

gap function, 127generalized Hookean law, see also Kirchhoff-Saint

Venant lawgradient, 10Green-elastic material, see also hyperelastic mate-

rial

heat flux, 15heat flux vector, 15heat supply, 15Helmholtz-Hodge decomposition, 139hyperelastic material, 143

148

INDEX 149

implicit integration, 55integration

explicit, 56internal energy, 15inverse function theorem, 3

Kelvin-Voigt solid, 99kinetic energy, 14Kirchhoff-Saint Venant law, 144

Lagrange multiplier functional, 132Lagrange’s criterion of materiality, 44limit point, 113, 116line search, 107linear part

function, 23Lipschitz constant, 94Lipschitz continuity, 94loading

proportional, 113localization theorem, 10

mass, 10mass density, 10mass matrix

element, 53global, 53

material surface, 44matrix norm

natural, 87mean-value theorem, 86mechanical energy balance theorem, 14mesh rezoning, see also adaptive remeshingmesh time derivative, 66momentum balance

semi-discrete form, 54motion, 3

Jacobian, 3

Nanson’s formula, 4Newmark method, 55Newton method

generalized, 92modified, 91

Newton-Kantorovich theorem, 95non-linearity

geometric, 20material, 20

operator-split procedure, 72

perturbed Lagrangian method, 133

polar decompositionleft, 6right, 6

polar decomposition theorem, 5primal method, 128principal directions, 6

quasi-Newton method, 108quasi-Newton property, 109quasi-static, 54

radius of attraction, 95rate-of-deformation tensor, 8remainder

function, 23remeshing, 64residual norm, 106Reynolds’ transport theorem, 10Riks-Wempner method, 125

saddle-point problem, 132Schur complement, 118secant property, see also quasi-Newton propertysemi-discretization, 47set

convex, 86Sherman-Morrison-Woodbury formula, 109solution path, see also equilibrium pathspectral representation theorem, 6stability point, 113, 115stiffness

geometric, 102material, 103

strain tensorEulerian, 5Hencky, 7Lagrangian, 5

stress power, 14stress tensor

Cauchy, 12first Piola-Kirchhoff, 13second Piola-Kirchhoff, 15

stress-divergence vectorelement, 53global, 53

stretch, 4stretch tensor

left, 6right, 6

strong form, 38strong solution, 40

May 7, 2019 ME280B

150 Index

tangent stiffness matrixglobal, 101

tangent vector, 38tensor

adjugate, 30identityreferential, 5spatial, 5

proper-orthogonal, 6referential, 4spatial, 4symmetric, 4two-point, 3

tied sliding, 129total Lagrangian formulation, 56

total potential energy functional, 132turning point, see also limit point

updated Lagrangian formulation, 56Uzawa algorithm, 136

variation, 41velocity, 2velocity gradient, 8virtual power, 41virtual work, 41volumetric distortion, 78vorticity tensor, 8

weak form, 39weak solution, 40

ME280B May 7, 2019

finite element methods for non-linear continua · 2019. 5. 7. · an inﬁnitesimal material area...

Documents